'1' 'IPR006139' '\

A number of NAD-dependent 2-hydroxyacid dehydrogenases which seem to be specific for the D-isomer of their substrate\ have been shown to be functionally and structurally related. The catalytic domain contains a number of conserved charged residues which may play a role in the catalytic mechanism. The NAD-binding domain is described in

\ ' '2' 'IPR001078' '\ This domain is found in the lipoamide acyltransferase component of the branched-chain alpha-keto acid dehydrogenase complex, which catalyses the overall conversion of alpha-keto acids to acyl-CoA and carbon dioxide PUBMED:8487300. The complex contains multiple copies of three enzymatic components: branched-chain alpha-keto acid decarboxylase (E1), lipoamide\ acyltransferase (E2) and lipoamide dehydrogenase (E3). The domain is also found in the dihydrolipoamide succinyltransferase component of the 2-oxoglutarate dehydrogenase complex.\ These proteins contain one to three copies of a lipoyl binding domain followed by the catalytic domain.\ ' '3' 'IPR005238' '\

2-phosphosulpholactate phosphatase (ComB) is a magnesium-dependent acid phosphatase that catalyzes the second step in coenzyme M (CoM; 2-mercaptoethanesulphonic acid) biosynthesis, namely, the hydrolysis of (2R)-2-phospho-3-sulpholactate to yield (2R)-3-sulpholactate and phosphate. CoM is an essential cofactor that acts as the terminal methyl carrier in methanogenesis PUBMED:11589710. Homologues of ComB have been identified in all available cyanobacterial genome sequences and in genomes from phylogenetically diverse bacteria and archaea. However, many of these organisms lack homologues of other CoM biosynthetic genes. ComB has a complex alpha/beta topology. The monomer is composed of two domains thought to be related by a common ancestral gene, plus a C-terminal helical and beta-hairpin region PUBMED:16927339.

\ ' '4' 'IPR005123' '\

This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily PUBMED:11276424. This family includes the C-terminal of prolyl 4-hydroxylase alpha subunit. The holoenzyme has the activity () catalysing the reaction:

\

\

The full enzyme consists of an alpha2 beta2 complex with the alpha subunit contributing most of the active site PUBMED:7753822. The family also includes lysyl hydroxylases, isopenicillin synthases and AlkB.

\ \ ' '5' 'IPR002225' '\ The enzyme 3 beta-hydroxysteroid dehydrogenase/5-ene-4-ene \ isomerase (3 beta-HSD) catalyses the oxidation and isomerisation \ of 5-ene-3 beta-hydroxypregnene and 5-ene-hydroxyandrostene \ steroid precursors into the corresponding 4-ene-ketosteroids necessary\ for the formation of all classes of steroid hormones.\ \ 3Beta_HSD\ ' '6' 'IPR007513' '\ Members of this family are short proteins that are rich in aspartate, glutamate, lysine and arginine. Although the function of these proteins is unknown, they are found to be ubiquitously expressed PUBMED:9731538.\ ' '7' 'IPR006683' '\

This family contains a wide variety of enzymes, principally thioesterases. It includes 4HBT, which catalyses the final step in the biosynthesis of 4-hydroxybenzoate from 4-chlorobenzoate in the soil-dwelling microbe Pseudomonas CBS-3, and various cytosolic long-chain acyl-CoA thioester hydrolases. Long-chain acyl-CoA hydrolases hydrolyse palmitoyl-CoA to CoA and palmitate; they also catalyse the hydrolysis of other long-chain fatty acyl-CoA thioesters.

\ ' '8' 'IPR008334' '\

5\'-nucleotidases PUBMED:1637327 are enzymes that catalyze the hydrolysis of\ phosphate esterified at carbon 5\' of the ribose and deoxyribose portions of\ nucleotide molecules. 5\'-nucleotidase is a ubiquitous enzyme found in a wide\ variety of species and which occurs in different cellular locations. The extracellular 5\'-nucleotidase from mammals and Discopyge ommata (Electric ray) isozyme is a homodimeric disulphide-bonded glycoprotein attached to the membrane by a GPI-anchor, and requires zinc for its activity. Vibrio parahaemolyticus 5\'-nucleotidase (gene nutA) is bound to the membrane by a lipid chain, and requires chloride and magnesium ions for its activity. It is involved in degrading extracellular 5\'-nucleotides for nutritional needs.

\ \

Periplasmic bacterial 5\'-nucleotidase (gene ushA), also known\ as UDP-sugar hydrolase, can degrade UDP-glucose and other nucleotide diphosphate sugars. It produces sugar-1-phosphate, which can then be used by the cell. UshA seems to require cobalt for its activity.\ 5\'-Nucleotidases are evolutionarily related to the periplasmic bacterial 2\',3\'-cyclic-nucleotide 2\'-phosphodiesterase (gene cpdB), which catalyzes two consecutive reactions: it first converts 2\',3\'-cyclic-nucleotide to 3\'-nucleotide and then acts as a 3\'-nucleotidase; and to mosquito apyrase (ATP-diphosphohydrolase) PUBMED:7846038, which catalyzes the hydrolysis of ATP into AMP and facilitates hematophagy by preventing ADP-dependent platelet aggregation in the host.

\ \

CD73 (also called ecto-5\'-nucleotidase) possesses the enzymatic activity of a 5\'-nucleotidase and catalyses the dephosphorylation of purine and pyrimidine ribo- and deoxyribonucleoside monophosphates to their corresponding nucleosides. Triggering of lymphocyte CD73 with mAb causes phosphorylation and dephosphorylation of certain, as yet unknown, protein substrates PUBMED:9015312. A possible function for CD73 is to regulate the availability of adenosine for interaction with cell surface adenosine receptors by converting AMP to adenosine. In common with other GPI-anchored surface proteins, CD73 can mediate costimulatory signals in T cell activation PUBMED:2550543.

\ \

This entry is the C-terminal domain of 5\'-nucleotidases.\

\ ' '9' 'IPR017978' '\

G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

\

GPCR family 3 receptors (also known as family C) are structurally similar to other GPCRs, but do not show any significant sequence similarity and thus represent a distinct group. Structurally they are composed of four elements: an N-terminal signal sequence; a large hydrophilic extracellular agonist-binding region containing several conserved cysteine residues which could be involved in disulphide bonds; a shorter region containing seven transmembrane domains; and a C-terminal cytoplasmic domain of variable length PUBMED:17266540. Family 3 members include the metabotropic glutamate receptors, the extracellular calcium-sensing receptors, the gamma-amino-butyric acid (GABA) type B receptors, and the vomeronasal type-2 receptors PUBMED:1309649, PUBMED:8255296, PUBMED:10773016, PUBMED:9292726. As these receptors regulate many important physiological processes, they are potentially promising targets for drug development.

\

This entry represents the C-terminal region of family 3 GPCRs, which contains the seven-transmembrane region. The seven TM regions assemble to form a docking pocket that binds molecules such as cyclamate and lactisole, which consequently confer the taste of sweetness PUBMED:16076846.

\ ' '11' 'IPR002589' '\

The Macro or A1pp domain is a module of about 180 amino acids which can bind ADP-ribose, an NAD metabolite or related ligands. The domain was described originally in association with ADP-ribose 1\'\'-phosphate (Appr-1\'\'-P) processing activity (A1pp) of the yeast YBR022W protein PUBMED:10550052. The domain is also called Macro domain as it is the C-terminal domain of mammalian core histone macro-H2A PUBMED:11343911, PUBMED:12842467. Macro domain proteins can be found in eukaryotes, in (mostly pathogenic) bacteria, in archaea and in ssRNA viruses, such as coronaviruses, Rubella and Hepatitis E viruses. In vertebrates the domain occurs e.g. in histone macroH2A, in predicted poly-ADP-ribose polymerases (PARPs) and in B aggressive lymphoma (BAL) protein. The macro domain can be associated with catalytic domains, such as PARP, or sirtuin. The Macro domain can recognise ADP-ribose or in some cases poly-ADP-ribose, which can be involved in ADP-ribosylation reactions that occur in important processes, such as chromatin biology, DNA repair and transcription regulation PUBMED:15902274. The human macroH2A1.1 Macro domain binds an NAD metabolite O-acetyl-ADP-ribose PUBMED:15965484. The Macro domain has been suggested to play a regulatory role in ADP-ribosylation, which is involved in inter- and intracellular signaling, transcriptional regulation, DNA repair pathways and maintenance of genomic stability, telomere dynamics, cell differentiation and proliferation, and necrosis and apoptosis.

\

The 3D structure of the Macro domain has a mixed alpha/beta fold of a mixed beta sheet sandwiched between four helices. Several Macro domain only domains are shorter than the structure of AF1521 and lack either the first strand or the C-terminal helix 5. Well conserved residues form a hydrophobic cleft and cluster around the AF1521-ADP-ribose binding site PUBMED:12842467, PUBMED:15902274, PUBMED:15965484, PUBMED:16912299.

\ ' '12' 'IPR008154' '\

Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer\'s disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques PUBMED:16301322, PUBMED:16364896. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down\'s syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer\'s symptoms.

\

APP is important for neurogenesis and neuronal regeneration, either through the intact protein, or through its many breakdown products PUBMED:16406235. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, a short hydrophobic transmembrane domain, and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility PUBMED:16406235. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death PUBMED:12611883. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters.

\

APP can be processed by different sets of enzymes:

\

\ \

This entry represents an extracellular domain that is usually found at the N-terminal of amyloidogenic glycoproteins such as amyloid-beta precursor protein (APP, or A4).

\ \

More information about these proteins can be found at Protein of the Month: Amyloid-beta Precursor Protein.

\ ' '13' 'IPR001365' '\

Adenosine deaminase catalyzes the hydrolytic deamination of adenosine into \ inosine, and AMP deaminase catalyzes the hydrolytic deamination of AMP into IMP.\ It has been shown PUBMED:1998686 that these two \ enzymes share three regions of sequence similarity; these regions are centred\ on residues which are proposed to play an important role in the catalytic mechanism of \ these two enzymes.

\ ' '14' 'IPR013057' '\ This transmembrane region is found in many amino acid transporters, including UNC-47 and MTR. UNC-47 encodes a vesicular gamma-aminobutyric acid (GABA) transporter (VGAT) and is predicted to have 10 transmembrane domains UNC47_CAEEL PUBMED:9349821. MTR is an N system amino acid transporter protein involved in methyltryptophan resistance MTR_NEUCR. Other members of this family include proline transporters and amino acid transporters whose specificity has not yet been identified.\ ' '15' 'IPR003959' '\

AAA ATPases (ATPases Associated with diverse cellular Activities) form a large protein family and play a number of roles in the cell including cell-cycle regulation, protein proteolysis and disaggregation, organelle biogenesis and intracellular transport. Some of them function as molecular chaperones, subunits of proteolytic complexes or independent proteases (FtsH, Lon). They also act as DNA helicases and transcription factors PUBMED:17201069.

\ \

AAA ATPases belong to the AAA+ superfamily of ring-shaped P-loop NTPases, which act via the energy-dependent unfolding of macromolecules PUBMED:15037233, PUBMED:16828312. There are six major clades of AAA domains (proteasome subunits, metalloproteases, domains D1 and D2 of ATPases with two AAA domains, the MSP1/katanin/spastin group and BCS1 and its homologues), as well as a number of deeply branching minor clades PUBMED:15037233.

\ \

They form oligomeric assemblies (often hexamers) with a ring-shaped structure and a central pore. These proteins produce a molecular motor that couples ATP binding and hydrolysis to changes in conformational states that act upon a target substrate, either translocating or remodelling it PUBMED:16919475.

\ \ \

They are found in all living organisms and share a highly conserved AAA domain called the AAA module. This domain is responsible for ATP binding and hydrolysis. It contains 200-250 residues, including two classical motifs, Walker A (GX4GKT) and Walker B (HyDE) PUBMED:17201069.
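
As a rough illustration of the Walker A consensus quoted above (GX4GKT, where X4 stands for any four residues), the sketch below scans a protein sequence with a regular expression. The function name and the example fragment are hypothetical and were invented for illustration; the pattern follows the motif exactly as written in this entry, so it would miss variants that other sources allow.

```python
import re

# Walker A motif as written in the text: G, any four residues, then G-K-T.
WALKER_A = re.compile(r"G.{4}GKT")

def find_walker_a(sequence):
    """Return (start, matched_text) for each Walker A match in a protein sequence."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(sequence.upper())]

# Hypothetical sequence fragment, invented for illustration only.
example = "MKLVGPPGTGKTLLAKAV"
print(find_walker_a(example))
```

A match only indicates that the consensus string is present; it is not evidence of a functional AAA module on its own.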

\ \

The functional variety seen between AAA ATPases is in part due to their extensive number of accessory domains and factors, and to their variable organisation within oligomeric assemblies, in addition to changes in key functional residues within the ATPase domain itself.

\

More information about these proteins can be found at Protein of the Month: AAA ATPases.

\ ' '16' 'IPR001048' '\

This entry contains proteins with various specificities and includes the aspartate, glutamate and uridylate kinase families. In prokaryotes and plants the synthesis of the essential amino acids lysine and threonine is predominantly regulated by feed-back inhibition of aspartate kinase (AK) and dihydrodipicolinate synthase (DHPS). In Escherichia coli, thrA, metLM, and lysC encode aspartokinase isozymes that show feedback inhibition by threonine, methionine, and lysine, respectively PUBMED:10220897. The lysine-sensitive isoenzyme of aspartate kinase from spinach leaves has a subunit composition of 4 large and 4 small subunits PUBMED:9584993.

\

In plants, although the control of carbon fixation and nitrogen assimilation has been studied in detail, relatively little is known about the regulation of carbon and nitrogen flow into amino acids. The metabolic regulation of expression of an Arabidopsis thaliana aspartate kinase/homoserine dehydrogenase (AK/HSD) gene, which encodes two linked key enzymes in the biosynthetic pathway of aspartate family amino acids, has been studied PUBMED:9501134. The conversion of aspartate into either the storage amino acid asparagine or aspartate family amino acids may be subject to a coordinated, reciprocal metabolic control, and this biochemical branch point is a part of a larger, coordinated regulatory mechanism of nitrogen and carbon storage and utilization.

\ ' '17' 'IPR004147' '\

This entry includes ABC1 from yeast PUBMED:1648478 and AarF from Escherichia coli PUBMED:9422602. These proteins have a nuclear or mitochondrial subcellular location in eukaryotes. The exact molecular function of these proteins is not clear; however, yeast ABC1 suppresses a cytochrome b mRNA translation defect and is essential for electron transfer in the bc1 complex PUBMED:1648478, and E. coli AarF is required for ubiquinone production PUBMED:9422602. It has been suggested that members of the ABC1 family are novel chaperonins PUBMED:1648478. These proteins are unrelated to the ABC transporter proteins.

\ ' '18' 'IPR002912' '\

The ACT domain is found in a variety of contexts and is proposed to be a conserved regulatory binding fold. ACT domains are linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. The archetypical ACT domain is the C-terminal regulatory domain of 3-phosphoglycerate dehydrogenase (3PGDH), which folds with a ferredoxin-like topology. A pair of ACT domains form an eight-stranded antiparallel sheet with two molecules of the allosteric inhibitor serine bound in the interface. Biochemical exploration of a few other proteins containing ACT domains supports the suggestion that these domains contain the archetypical ACT structure PUBMED:11751050.

\ \ ' '19' 'IPR006693' '\

The alpha/beta hydrolase fold is common to several hydrolytic enzymes of widely differing phylogenetic origin and\ catalytic function. The core of each enzyme is similar: an alpha/beta sheet, not barrel, of eight beta-strands connected by alpha-helices PUBMED:1409539. This entry describes a closely associated region, which is found in a number of lipases.

\ ' '20' 'IPR000182' '\

Histone acetylation is carried out by a class of enzymes known as histone acetyltransferases\ (HATs), which catalyze the transfer of an acetyl group from acetyl-CoA to the lysine epsilon-amino\ groups on the N-terminal tails of histones PUBMED:12801725. An early indication that HATs were involved in transcription\ came from the observation that in actively transcribed regions of chromatin, histones tend to be\ hyperacetylated, whereas in transcriptionally silent regions histones are hypoacetylated. The histone acetyltransferases are divided into five families. These include the Gcn5-related\ acetyltransferases (GNATs); the MYST (for \'MOZ, Ybf2/Sas3, Sas2 and Tip60\')-related HATs;\ p300/CBP HATs; the general transcription factor HATs, which include the TFIID subunit TAF250;\ and the nuclear hormone-related HATs SRC1 and ACTR (SRC3).\ \ The GCN5-related N-acetyltransferase superfamily includes such enzymes as the histone acetyltransferases GCN5 and Hat1, the elongator complex subunit Elp3,\ the mediator-complex subunit Nut1, and Hpa2 PUBMED:9175471.

\

Many GNATs share several functional domains, including an N-terminal region of variable length, an\ acetyltransferase domain that encompasses the conserved sequence motifs described above, a\ region that interacts with the coactivator Ada2, and a C-terminal bromodomain that is believed to\ interact with acetyl-lysine residues. Members of the GNAT family are important for the regulation of cell growth and development. In\ mice, knockouts of Gcn5L are embryonic lethal. Yeast Gcn5 is needed for normal progression\ through the G2-M boundary and mitotic gene expression. The importance of GNATs is\ probably related to their role in transcription and DNA repair.

\

The yeast GCN5 (yGCN5) transcriptional coactivator functions as a histone acetyltransferase (HAT) to promote transcriptional activation. The crystal structure of the yeast histone acetyltransferase Hat1-acetyl coenzyme A (AcCoA) complex shows that Hat1 has an elongated, curved structure, and the AcCoA molecule is bound in a cleft on the concave surface of the protein, marking the active site of the enzyme. A channel of variable width and depth that runs across the protein is probably the binding site for the histone substrate PUBMED:9727486. The central protein core associated with AcCoA binding appears to be structurally conserved among a superfamily of N-acetyltransferases, including yeast histone acetyltransferase 1 and Serratia marcescens aminoglycoside 3-N-acetyltransferase PUBMED:10430873.

\ ' '21' 'IPR008278' '\

These proteins transfer the 4\'-phosphopantetheine (4\'-PP) moiety from coenzyme A (CoA) to the invariant serine of the pp-binding domain. This post-translational modification renders holo-ACP capable of acyl group activation via thioesterification of the cysteamine thiol of 4\'-PP PUBMED:7559576. This superfamily consists of two subtypes: the ACPS type, such as ACPS_ECOLI, and the Sfp type, such as SFP_BACSU. The structure of the Sfp type is known PUBMED:10581256 and shows that the active site accommodates a magnesium ion. The most highly conserved regions of the alignment are involved in binding the magnesium ion.

\ ' '22' 'IPR000472' '\ Transforming growth factor-beta (TGF-beta) forms a family with other\ growth factors. The receptors for most of the \ members of this growth factor family are related. These proteins are\ receptor-type kinases of Ser/Thr type, which have a single\ transmembrane domain and a specific hydrophilic Cys-rich ligand-binding domain PUBMED:9023056, PUBMED:8047140, PUBMED:8909794. The C-terminal part of the extracellular\ domain is conserved. Some of the receptors of this family contain subclass-specific\ N-terminal extensions of this homology domain. The type I receptors also possess 7 extracellular residues\ preceding the cysteine box.\ ' '23' 'IPR006090' '\

Mammalian acyl-CoA dehydrogenases are enzymes that catalyse the first step in each cycle of beta-oxidation in mitochondria. Acyl-CoA dehydrogenases PUBMED:3326738, PUBMED:2777793, PUBMED:8034667 catalyze the alpha,beta-dehydrogenation of acyl-CoA thioesters to the corresponding trans 2,3-enoyl CoA-products with concomitant reduction of enzyme-bound FAD. Reoxidation of the flavin involves transfer of electrons to ETF (electron transferring flavoprotein). These enzymes are homodimers containing one molecule of FAD.

The monomeric enzyme is folded into three domains of approximately equal size. The N-terminal and the C-terminal domains are mainly alpha-helices packed together, and the middle domain consists of two orthogonal beta-sheets. The flavin ring is buried in the crevice between two alpha-helical domains and the beta-sheet of one subunit, and the adenosine pyrophosphate moiety is stretched into the subunit junction with one formed by two C-terminal domains PUBMED:8356049. The C-terminal domain of Acyl-CoA dehydrogenase is an all-alpha, four-helical up-and-down bundle.

\ ' '24' 'IPR002656' '\

This entry contains a range of acyltransferase enzymes as well as yet uncharacterised proteins from Caenorhabditis elegans. It also includes the protein OatA. The pathogenic bacterium Staphylococcus aureus is able to cause persistent infections due to its ability to resist the immune defence system. Lysozyme, a cell wall-lytic enzyme, is one of the first defence compounds induced in serum and tissues after the onset of infection.

\ \

S. aureus achieves complete resistance to lysozyme action by O-acetylating its peptidoglycan (PG) through the O-acetyltransferase OatA PUBMED:16861647, PUBMED:17676995. Staphylococci are one of the few bacterial genera that are resistant to lysozyme, and they tend to colonise the skin and mucosa of humans and animals PUBMED:15661003. OatA is an integral membrane protein. This entry also includes NolL proteins. NolL-dependent acetylation is specific for the fucosyl penta-N-acetylglucosamine species. In addition, the NolL protein caused elevated production of lipo-chitin oligosaccharides (LCOs). The NolL protein obtained from Rhizobium loti (Mesorhizobium loti) functions as an acetyl transferase PUBMED:10755312.

\ ' '25' 'IPR008968' '\

Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport PUBMED:15261670. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors PUBMED:17449236, PUBMED:11598180.

\

AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes PUBMED:15107467. AP2 associates with the plasma membrane and is responsible for endocytosis PUBMED:12952931. AP3 is responsible for protein trafficking to lysosomes and other related organelles PUBMED:16542748. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins PUBMED:11080148. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface PUBMED:17254016.

\

This entry represents the C-terminal domain of the mu subunit from various clathrin adaptors (AP1, AP2 and AP3) PUBMED:11583591. The C-terminal domain has an immunoglobulin-like beta-sandwich fold consisting of 9 strands in 2 sheets with a Greek key topology, similar to that found in cytochrome f and certain transcription factors PUBMED:11583591. The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear PUBMED:1761056. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle PUBMED:1761056.

\

More information about these proteins can be found at Protein of the Month: Clathrin.

\ ' '26' 'IPR013149' '\ Alcohol dehydrogenase (ADH) catalyzes the reversible oxidation of\ alcohols to their corresponding aldehyde or ketone with the concomitant reduction of NAD:\ \ Currently three structurally and catalytically different types of alcohol\ dehydrogenases are known:\
    \
  1. Zinc-containing \'long-chain\' alcohol dehydrogenases.
  2. Insect-type, or \'short-chain\' alcohol dehydrogenases.
  3. Iron-containing alcohol dehydrogenases.
\ Zinc-containing ADH\'s PUBMED:3622514, PUBMED:1593644 are dimeric or tetrameric enzymes that bind two\ atoms of zinc per subunit. One of the zinc atoms is essential for catalytic\ activity while the other is not. Both zinc atoms are coordinated by either\ cysteine or histidine residues; the catalytic zinc is coordinated by two\ cysteines and one histidine. Zinc-containing ADH\'s are found in bacteria,\ mammals, plants, and in fungi. In many species there is more than one isozyme\ (for example, humans have at least six isozymes, yeast have three, etc.). A\ number of other zinc-dependent dehydrogenases are closely related to zinc\ ADH PUBMED:8504864 and are included in this family.\ \

In addition, this family includes NADP-dependent quinone oxidoreductase,\ an enzyme found in bacteria (gene qor), in yeast and in mammals where, in some\ species such as rodents, it has been recruited as an eye lens protein and is\ known as zeta-crystallin PUBMED:8486156. The sequence of quinone oxidoreductase is\ distantly related to that of other zinc-containing alcohol dehydrogenases and it\ lacks the zinc-ligand residues. The torpedo fish and mammalian synaptic vesicle\ membrane protein vat-1 is related to qor.

\ \

This entry represents the cofactor-binding domain of these enzymes, which is normally found towards the C-terminus. Structural studies indicate that it forms a classical Rossmann fold that reversibly binds NAD(H) PUBMED:12962626, PUBMED:12627956, PUBMED:7602590.

\ ' '27' 'IPR003833' '\

Allophanate hydrolase catalyses the second reaction in an ATP-dependent, two-step degradation of urea to ammonia and CO2. This follows the action of the biotin-containing urea carboxylase. Saccharomyces cerevisiae can use urea as a sole nitrogen source via this degradation pathway PUBMED:6124544. In yeast, the fusion of allophanate hydrolase to urea carboxylase is called urea amidolyase.

\ \

In bacteria, the second step in the urea degradation pathway is also the ATP-dependent allophanate hydrolase. The gene encoding this enzyme is found adjacent to the urea carboxylase gene PUBMED:15796980. Allophanate hydrolase has strict substrate specificity, as analogues of allophanate are not hydrolysed by it PUBMED:15796980.

\ \

This domain represents subunit 1 of allophanate hydrolase (AHS1), which is found in urea carboxylase.

\ ' '28' 'IPR005613' '\

Aip3p/Bud6p is a regulator of cell and cytoskeletal polarity in Saccharomyces cerevisiae (Baker\'s yeast) that was previously identified as an actin-interacting protein. Actin-interacting protein 3 (Aip3p) localizes at the cell cortex where cytoskeleton assembly must be achieved to execute polarised cell growth, and deletion of AIP3 causes gross defects in cell and cytoskeletal polarity. Aip3p localisation is mediated by the secretory pathway; mutations in early- or late-acting components of the secretory apparatus lead to Aip3p mislocalisation PUBMED:10679021.

\ ' '29' 'IPR011079' '\

Alanine racemase plays a role in providing the D-alanine required for cell wall biosynthesis by isomerising L-alanine to D-alanine. Proteins containing this domain are found in both prokaryotes and eukaryotes PUBMED:1676385, PUBMED:7871888. The molecular structure of alanine racemase from Bacillus stearothermophilus (Geobacillus stearothermophilus) was determined by X-ray crystallography to a resolution of 1.9 A PUBMED:9063881. The alanine racemase monomer is composed of two domains, an eight-stranded alpha/beta barrel at the N-terminus, and a C-terminal domain essentially composed of beta-strands. The pyridoxal 5\'-phosphate (PLP) cofactor lies in and above the mouth of the alpha/beta barrel and is covalently linked via an aldimine linkage to a lysine residue, which is at the C-terminus of the first beta-strand of the alpha/beta barrel.

\ ' '30' 'IPR001608' '\

Alanine racemase plays a role in providing the D-alanine required for cell wall biosynthesis by isomerising L-alanine to D-alanine. Proteins containing this domain are found in both prokaryotes and eukaryotes PUBMED:1676385,PUBMED:7871888. The molecular structure of alanine racemase from Bacillus stearothermophilus was determined by X-ray crystallography to a resolution of 1.9 A PUBMED:9063881. The alanine racemase monomer is composed of two domains, an eight-stranded alpha/beta barrel at the N-terminus, and a C-terminal domain essentially composed of beta-strands. The pyridoxal 5\'-phosphate (PLP) cofactor lies in and above the mouth of the alpha/beta barrel and is covalently linked via an aldimine linkage to a lysine residue, which is at the C-terminus of the first beta-strand of the alpha/beta barrel.

\

This domain is also found in the PROSC (proline synthetase co-transcribed bacterial homolog) family of proteins, which are not known to have alanine racemase activity.

\ ' '31' 'IPR000674' '\

Aldehyde oxidase catalyses the conversion of an aldehyde in the presence of oxygen and water to an acid and hydrogen peroxide. The enzyme is a homodimer, and requires FAD, molybdenum and two 2Fe-2S clusters as cofactors. Xanthine dehydrogenase catalyses the oxidation of xanthine to urate, and also requires FAD, molybdenum and two 2Fe-2S clusters as cofactors. This activity is often found in a bifunctional enzyme that also has xanthine oxidase activity. The enzyme can be converted from the dehydrogenase form to the oxidase form irreversibly by proteolysis or reversibly through oxidation of sulphydryl groups.

\ ' '32' 'IPR008274' '\

Aldehyde oxidase catalyses the conversion of an aldehyde in the presence of oxygen and water to an acid and hydrogen peroxide. The enzyme is a homodimer, and requires FAD, molybdenum and two 2Fe-2S clusters as cofactors. Xanthine dehydrogenase catalyses the oxidation of xanthine to urate, and also requires FAD, molybdenum and two 2Fe-2S clusters as cofactors. This activity is often found in a bifunctional enzyme that also has xanthine oxidase activity. The enzyme can be converted from the dehydrogenase form to the oxidase form irreversibly by proteolysis or reversibly through oxidation of sulphydryl groups.

\ ' '33' 'IPR001395' '\

The aldo-keto reductase family includes a number of related monomeric \ NADPH-dependent oxidoreductases, such as aldehyde reductase, aldose\ reductase, prostaglandin F synthase, xylose reductase, rho crystallin, and\ many others PUBMED:2498333. All possess a similar structure, with a beta-alpha-beta fold \ characteristic of nucleotide binding proteins PUBMED:2105951.\ The fold comprises a parallel beta-8/alpha-8-barrel, which contains a \ novel NADP-binding motif. The binding site is located in a large,\ deep, elliptical pocket in the C-terminal end of the beta sheet, the \ substrate being bound in an extended conformation. The hydrophobic\ nature of the pocket favours aromatic and apolar substrates over highly\ polar ones PUBMED:1621098.

Binding of the NADPH coenzyme causes a massive\ conformational change, reorienting a loop, effectively locking the\ coenzyme in place. This binding is more similar to FAD- than to\ NAD(P)-binding oxidoreductases PUBMED:1447221.

\

Some proteins of this entry contain a K+ ion channel beta chain regulatory domain; these are reported to have oxidoreductase activity PUBMED:10884227.

\ ' '34' 'IPR008183' '\

Aldose 1-epimerase () (mutarotase) is the enzyme responsible for the anomeric interconversion of D-glucose and other aldoses between their alpha- and beta-forms.

\

The sequences of mutarotase from two bacteria, Acinetobacter calcoaceticus and Streptococcus thermophilus, are available PUBMED:1694527. Extensive sequence similarities also indicate that a mutarotase domain is present in the C-terminal half of the fungal GAL10 protein, whose N-terminal part encodes UDP-glucose 4-epimerase.

\ ' '35' 'IPR004856' '\

N-linked (asparagine-linked) glycosylation of proteins is mediated by a highly conserved pathway in eukaryotes, in which a lipid (dolichol\ phosphate)-linked oligosaccharide is assembled at the endoplasmic reticulum membrane prior to the transfer of the oligosaccharide\ moiety to the target asparagine residues. This oligosaccharide is composed of Glc(3)Man(9)GlcNAc(2). The addition of the three\ glucose residues is the final series of steps in the synthesis of the oligosaccharide precursor. Alg6 transfers the first glucose residue,\ and Alg8 transfers the second one PUBMED:8016100. In the human alg6 gene, a C-T transition, which causes Ala333 to be replaced with Val, has\ been identified as the cause of a congenital disorder of glycosylation, designated as type Ic OMIM:603147 PUBMED:10359825.

\ ' '36' 'IPR001952' '\

This entry represents alkaline phosphatases () (ALP), which act as non-specific phosphomonoesterases to hydrolyse phosphate esters, optimally at high pH. The reaction mechanism involves the attack of a serine alkoxide on a phosphorus of the substrate to form a transient covalent enzyme-phosphate complex, followed by the hydrolysis of the serine phosphate. Alkaline phosphatases are found in all kingdoms of life, with the exception of some plants. Alkaline phosphatases are metalloenzymes that exist as a dimer, each monomer binding metal ions. The metal ions they carry can differ, although zinc and magnesium are the most common. For example, Escherichia coli alkaline phosphatase (encoded by phoA) requires the presence of two zinc ions bound at the M1 and M2 metal sites, and one magnesium ion bound at the M3 site PUBMED:15938627. However, alkaline phosphatases from Thermotoga maritima and Bacillus subtilis require cobalt for maximal activity PUBMED:11910033.

\ \

In mammals, there are four alkaline phosphatase isozymes: placental, placental-like (germ cell), intestinal and tissue-nonspecific (liver/bone/kidney). All four isozymes are anchored to the outer surface of the plasma membrane by a covalently attached glycosylphosphatidylinositol (GPI) anchor PUBMED:17520090. Human alkaline phosphatases have four metal binding sites: two for zinc, one for magnesium, and one for calcium ion. Placental alkaline phosphatase (ALPP or PLAP) is highly polymorphic, with at least three common alleles PUBMED:11124260. Its activity is down-regulated by a number of effectors such as L-phenylalanine, 5\'-AMP and p-nitrophenyl-phosphonate (PNPPate) PUBMED:15946677. The placental-like isozyme (ALPPL or PLAP-like) is elevated in germ cell tumours. The intestinal isozyme (ALPI or IAP) has the ability to detoxify lipopolysaccharide and prevent bacterial invasion across the gut mucosal barrier PUBMED:18292227. The tissue-nonspecific isozyme (ALPL) may play a role in skeletal mineralisation. Defects in ALPL are a cause of hypophosphatasia, including infantile-type (OMIM:241500), childhood-type (OMIM:241510) and adult-type (OMIM:146300). Hypophosphatasia is an inherited metabolic bone disease characterised by defective skeletal mineralisation PUBMED:17719863.

\ \

This entry also contains the related enzyme streptomycin-6-phosphate phosphatase () (encoded by strK) from Streptomyces species. This enzyme is involved in the synthesis of the antibiotic streptomycin, specifically cleaving both streptomycin-6-phosphate and, more slowly, streptomycin-3-phosphate PUBMED:1654502.

\ ' '37' 'IPR006048' '\

O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

\ \

Alpha-amylase is classified as family 13 of the glycosyl hydrolases and is present in archaea, bacteria, plants and animals. Alpha-amylase is an essential enzyme in alpha-glucan metabolism, acting to catalyse the hydrolysis of alpha-1,4-glucosidic bonds of glycogen, starch and related polysaccharides. Although all alpha-amylases possess the same catalytic function, they can vary with respect to sequence. In general, they are composed of three domains: a TIM barrel containing the active site residues and chloride ion-binding site (domain A), a long loop region inserted between the third beta strand and the alpha-helix of domain A that contains calcium-binding site(s) (domain B), and a C-terminal beta-sheet domain that appears to show some variability in sequence and length between amylases (domain C) PUBMED:11141191. Amylases have at least one conserved calcium-binding site, as calcium is essential for the stability of the enzyme. Chloride binding activates the enzyme, which acts by a two-step mechanism involving a catalytic nucleophile base (usually an Asp) and a catalytic proton donor (usually a Glu) that are responsible for the formation of the beta-linked glycosyl-enzyme intermediate.

\

This entry represents the all-beta domain that is found in several alpha-amylases, usually at the C-terminus, and which forms a Greek key beta-barrel fold in these enzymes PUBMED:7877175.

\

More information about this protein can be found at Protein of the Month: alpha-Amylase.

\ ' '38' 'IPR007798' '\ This family consists of mammalian Ameloblastin precursor (Amelin) proteins. Matrix proteins of tooth enamel consist mainly of amelogenin but also of non-amelogenin proteins, which, although their volumetric percentage is low, have an important role in enamel mineralization. One of the non-amelogenin proteins is ameloblastin, also known as amelin and sheathlin. Ameloblastin (AMBN) is one of the enamel sheath proteins which is thought to have a role in determining the prismatic structure of growing enamel crystals PUBMED:11867231.\ ' '39' 'IPR002502' '\ This family includes zinc amidases that have N-acetylmuramoyl-L-alanine\ amidase activity. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls\ (preferentially: D-lactyl-L-Ala). The Bacteriophage T7\ structure is known and shows that two of the conserved histidines\ are zinc binding.\ ' '40' 'IPR002508' '\

Autolysin hydrolyses the link between N-acetylmuramoyl residues and L-amino acid residues in certain bacterial cell wall glycopeptides.

\

The cell wall envelope of Gram-positive bacteria is a macromolecular, exoskeletal organelle that is assembled and turned over at designated sites. The cell wall also functions as a surface organelle that allows Gram-positive pathogens to interact with their environment, in particular the tissues of the infected host. All of these functions require that surface proteins and enzymes be properly targeted to the cell wall envelope. Two basic mechanisms, cell wall sorting and targeting, have been identified. Cell wall sorting is the covalent attachment of surface proteins to the peptidoglycan via a C-terminal sorting signal that contains a consensus LPXTG sequence. More than 100 proteins that possess cell wall-sorting signals, including the M proteins of Streptococcus pyogenes, protein A of Staphylococcus aureus, and several internalins of Listeria monocytogenes, have been identified. Cell wall targeting involves the noncovalent attachment of proteins to the cell surface via specialised binding domains. Several of these wall-binding domains appear to\ interact with secondary wall polymers that are associated with the peptidoglycan, for example teichoic acids and polysaccharides. Proteins that are targeted to the cell surface include muralytic enzymes such as autolysins, lysostaphin, and phage lytic enzymes. Other examples of targeted proteins are the surface S-layer proteins of bacilli and clostridia, as well as virulence factors required for the pathogenesis of L. monocytogenes (internalin B) and Streptococcus pneumoniae (PspA) infections PUBMED:10066836.

\ ' '41' 'IPR002937' '\ This entry consists of various amine oxidases, including maize polyamine oxidase (PAO) PUBMED:9598979, L-amino acid oxidases (LAO) and various flavin-containing monoamine oxidases (MAO). The aligned region includes the flavin binding site of these\ enzymes.\ In vertebrates, MAO plays an important role in regulating the intracellular levels of amines via their oxidation; these include various neurotransmitters, neurotoxins and trace amines PUBMED:9162023. In lower eukaryotes\ such as Aspergillus and in bacteria, the main role of amine oxidases is to provide a source of ammonium PUBMED:7770050.\ PAOs in plants, bacteria and protozoa oxidise spermidine and spermine to an aminobutyral, diaminopropane and hydrogen peroxide and are involved in the catabolism of polyamines PUBMED:9598979.\ Other members of this family include tryptophan 2-monooxygenase, putrescine oxidase, corticosteroid binding proteins and antibacterial glycoproteins.\ ' '42' 'IPR000873' '\

A number of prokaryotic and eukaryotic enzymes, which appear to act via an ATP-dependent covalent binding of AMP to their substrate, share a region of sequence similarity PUBMED:2118102, PUBMED:2911486, \ PUBMED:2254270. This region is a Ser/Thr/Gly-rich domain that is further characterised by a conserved Pro-Lys-Gly triplet. The family of enzymes includes luciferase, long-chain fatty acid CoA ligase, acetyl-CoA synthetase and various other closely related synthetases.

\ ' '43' 'IPR007865' '\

Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

\ \

In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

\ \ \

In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

\ \ \

This domain is associated with the N-terminal region of aminopeptidase P (X-Pro aminopeptidase I and II) and related sequences. It is not found associated with the methionyl aminopeptidase 1 or methionyl aminopeptidase 2 families. The domain is structurally very similar PUBMED:9520390 to the creatinase N-terminal domain; however, little or no sequence similarity exists between the two domains.

\ \

The sequences belong to MEROPS peptidase family M24B, clan MG.

\ ' '44' 'IPR002007' '\

Peroxidases are haem-containing enzymes that use hydrogen peroxide as\ the electron acceptor to catalyse a number of oxidative reactions.

\ \

Peroxidases are found in bacteria, fungi, plants and animals. On the basis\ of sequence similarity, a number of animal haem peroxidases can be\ categorised as members of a superfamily: myeloperoxidase (MPO); eosinophil\ peroxidase (EPO); lactoperoxidase (LPO); thyroid peroxidase (TPO);\ prostaglandin H synthase (PGHS); and peroxidasin PUBMED:8062820, PUBMED:7922023, PUBMED:2840655.

\

MPO plays a major role in the oxygen-dependent microbicidal system of neutrophils. EPO from eosinophilic granulocytes \ participates in immunological reactions, and potentiates tumor necrosis \ factor (TNF) production and hydrogen peroxide release by human monocyte-derived macrophages PUBMED:2548579, PUBMED:7774640. In the main, MPO (and possibly EPO) utilises Cl- ions and H2O2 to form hypochlorous acid (HOCl), which can effectively kill\ bacteria or parasites. In secreted fluids, LPO catalyses the oxidation of thiocyanate ions (SCN-) by H2O2, producing the weak oxidising agent \ hypothiocyanite (OSCN-), which has bacteriostatic activity PUBMED:6295491. TPO uses \ I- ions and H2O2 to generate iodine, and plays a central role in the \ biosynthesis of thyroid hormones T(3) and T(4).

\

To date, the 3D structures of MPO and PGHS have been reported. MPO is a \ homodimer: each monomer consists of a light (A or B) and a heavy (C or D) \ chain resulting from post-translational excision of 6 residues from the \ common precursor. Monomers are linked by a single inter-chain disulphide. \ Each monomer includes a bound calcium ion PUBMED:1320128. PGHS exists as a symmetric \ dimer, each monomer of which consists of 3 domains: an N-terminal epidermal\ growth factor (EGF) like module; a membrane-binding domain; and a large\ C-terminal catalytic domain containing the cyclooxygenase and the peroxidase \ active sites. The catalytic domain shows striking structural similarity to \ MPO. The cyclooxygenase active site, which catalyses the formation of \ prostaglandin G2 (PGG2) from arachidonic acid, resides at the apex of a \ long hydrophobic channel, extending from the membrane-binding domain to the\ centre of the molecule. The peroxidase active site, which catalyses the\ reduction of PGG2 to PGH2, is located on the other side of the molecule, at\ the haem binding site PUBMED:8121489. Both MPO and the catalytic domain of PGHS are \ mainly alpha-helical, 19 helices being identified as topologically and\ spatially equivalent; PGHS contains 5 additional N-terminal helices that\ have no equivalent in MPO. In both proteins, three Asn residues in each\ monomer are glycosylated.

\ ' '45' 'IPR000020' '\

Complement components C3, C4 and C5 are large glycoproteins that have important functions in the immune response and host defence PUBMED:1431125. They have a wide variety of biological activities and are proteolytically activated by cleavage at a specific site, forming a- and b-fragments PUBMED:2777798. A-fragments form distinct structural domains of approximately 76 amino acids, coded for by a single exon within the complement protein gene. The C3a, C4a and C5a components are referred to as anaphylatoxins PUBMED:2777798, PUBMED:3081348: they cause smooth muscle contraction, histamine release from mast cells, and enhanced vascular permeability PUBMED:3081348. They also mediate chemotaxis, inflammation, and generation of cytotoxic oxygen radicals PUBMED:3081348. The proteins are highly hydrophilic, with a mainly alpha-helical structure held together by 3 disulphide bridges PUBMED:3081348.

\ \

Fibulins are secreted glycoproteins that become incorporated into a fibrillar extracellular matrix when expressed by cultured cells or added exogenously to cell monolayers PUBMED:2269669, PUBMED:12778127. The five known members of the family share an elongated structure and many calcium-binding sites, owing to the presence of tandem arrays of epidermal growth factor-like domains. They have overlapping binding sites for several basement-membrane proteins, tropoelastin, fibrillin, fibronectin and proteoglycans, and they participate in diverse supramolecular structures. The amino-terminal domain I of fibulin consists of three anaphylatoxin-like (AT) modules, each approximately 40 residues long and containing four or six cysteines. The structure of an AT module was determined for the complement-derived anaphylatoxin C3a, and was found to be a compact alpha-helical fold that is stabilised by three disulphide bridges in the pattern Cys1-4, Cys2-5 and Cys3-6 (where Cys is cysteine). The bulk of the remaining portion of the fibulin molecule is a series of nine EGF-like repeats PUBMED:8245130.

\ ' '46' 'IPR001828' '\ This describes a ligand binding domain and includes extracellular ligand binding domains of a wide range of receptors, as well as the bacterial amino acid binding proteins of known structure PUBMED:8011339.\ ' '47' 'IPR002110' '\

The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers PUBMED:8108379. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it.

\ \

The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures PUBMED:8875926, PUBMED:9353127, PUBMED:9461436, PUBMED:9865693. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure PUBMED:8875926, PUBMED:12461176.

\ \ \ ' '48' 'IPR018502' '\

The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner PUBMED:1646719. They are distributed ubiquitously in different tissues and cell types of higher and lower eukaryotes, including mammals, fish, birds, Drosophila melanogaster (Fruit fly), Xenopus laevis (African clawed frog), Caenorhabditis elegans\ , Dictyostelium discoideum (Slime mold) and Neurospora crassa PUBMED:9797403, PUBMED:9165068. Annexins are absent from yeasts and prokaryotes PUBMED:15059252. The plant annexins are somewhat distinct from those found in other taxa PUBMED:9165068.

\ \

Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long PUBMED:1646719. Each individual annexin repeat (sometimes referred to as an endonexin fold) is folded into five alpha-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic \'type 2\' motif for binding calcium ions with the sequence \'GxGT-[38 residues]-D/E\'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition.

\ \

Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication PUBMED:9797403.

\ ' '49' 'IPR005561' '\

ANTAR (AmiR and NasR transcription antitermination regulators) is an RNA-binding domain found in bacterial transcription antitermination regulatory proteins PUBMED:11796212. This domain has been detected in various response regulators of two-component systems, which are structured around two proteins, a histidine kinase and a response regulator. This domain is also found in one-component sensory regulators from a variety of bacteria. Most response regulators interact with DNA; however, ANTAR-containing regulators interact with RNA. The majority of the domain consists of a coiled-coil.

\ ' '50' 'IPR006818' '\

This family includes the yeast and human ASF1 proteins. These proteins have histone chaperone activity PUBMED:14680630. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers PUBMED:10759893.

\ ' '52' 'IPR004094' '\

Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

\

This group of serine protease inhibitors belongs to MEROPS inhibitor family I15, clan IO. They inhibit serine peptidases of the S1 family PUBMED:14705960 and are characterised by a well conserved pattern of cysteine residues. Many of the proteins that belong to this family are anti-coagulants.

\ ' '53' 'IPR001471' '\

Pathogenesis-related genes transcriptional activator binds to the GCC-box pathogenesis-related promoter element and activates the plant\'s defence genes. Ethylene, chemically the simplest plant hormone, participates in a number of stress responses and developmental processes: e.g., fruit ripening, inhibition of stem and root elongation, promotion of seed germination and flowering, senescence of leaves and flowers, and sex determination PUBMED:7732375. DNA sequence elements that confer ethylene responsiveness have been shown to contain two 11bp GCC boxes, which are necessary and sufficient for transcriptional control by ethylene. Ethylene responsive element binding proteins (EREBPs) have now been identified in a variety of plants. The proteins share a similar domain of around 59 amino acids, which interacts directly with the GCC box in the ERE.

\ ' '54' 'IPR003374' '\ This prokaryotic family of lipoproteins is related to ApbE, from Salmonella typhimurium. ApbE is involved in thiamine synthesis PUBMED:9473043. More specifically, it may be involved in the conversion of aminoimidazole ribotide (AIR) to 4-amino-5-hydroxymethyl-2-methyl pyrimidine (HMP) during the biosynthesis of the pyrimidine moiety of thiamine.\ ' '55' 'IPR007192' '\

The anaphase-promoting complex is composed of eight protein subunits, including BimE (APC1), CDC27 (APC3), CDC16 (APC6), and CDC23 (APC8). This entry is for CDC23.

\ ' '56' 'IPR007239' '\ Macroautophagy is a bulk degradation process induced by starvation in eukaryotic cells. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. No molecule involved in autophagy has yet been identified in higher eukaryotes PUBMED:9852036. The pre-autophagosomal structure contains at least five Apg proteins: Apg1p, Apg2p, Apg5p, Aut7p/Apg8p and Apg16p. It is found in the vacuole PUBMED:11689437. The C-terminal glycine of Apg12p is conjugated to a lysine residue of Apg5p via an isopeptide bond. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. Autophagy protein 16 (Apg16) has been shown to bind to Apg5 and is required for the function of the Apg12p-Apg5p conjugate PUBMED:10406794. Autophagy protein 5 (Apg5) is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway PUBMED:10712513.\ This entry represents autophagy protein 5 (Apg5).\ ' '57' 'IPR007680' '\ Arabinosyltransferase is involved in the arabinogalactan (AG) biosynthesis pathway in mycobacteria. AG is a component of the macromolecular assembly of the mycolyl-AG-peptidoglycan complex of the cell wall. This enzyme has important clinical applications as it is believed to be the target of the antimycobacterial drug Ethambutol PUBMED:8876238.\ ' '58' 'IPR004313' '\

The two acireductone dioxygenase enzymes (ARD and ARD\', previously known as E-2 and E-2\') from Klebsiella pneumoniae share the same amino acid sequence Q9ZFE7, but bind different metal ions: ARD binds Ni2+, ARD\' binds\ Fe2+ PUBMED:9880484. ARD and ARD\' can be experimentally interconverted by removal of the bound metal ion and reconstitution with\ the appropriate metal ion. The two enzymes share the same substrate, 1,2-dihydroxy-3-keto-5-(methylthio)pentene, but\ yield different products. ARD\' yields the alpha-keto precursor of methionine (and formate), thus forming part of the\ ubiquitous methionine salvage pathway that converts 5\'-methylthioadenosine (MTA) to methionine. This pathway is\ responsible for the tight control of the concentration of MTA, which is a powerful inhibitor of polyamine biosynthesis and\ transmethylation reactions PUBMED:11371200. ARD yields methylthiopropanoate, carbon monoxide and formate, and thus prevents the\ conversion of MTA to methionine. The role of the ARD catalysed reaction is unclear: methylthiopropanoate is cytotoxic,\ and carbon monoxide can activate guanylyl cyclase, leading to increased intracellular cGMP levels PUBMED:11371200, PUBMED:9880484.

\

This family also\ contains other proteins, whose functions are not well characterised.

\ ' '59' 'IPR001164' '\

This entry describes a family of small GTPase activating proteins, for example ARF1-directed GTPase-activating protein, the cycle control GTPase\ activating protein (GAP) GCS1 which is important for the regulation of\ the ADP ribosylation factor ARF, a member of the Ras superfamily of GTP-binding\ proteins PUBMED:9446556. The GTP-bound form of ARF is essential for the maintenance of normal\ Golgi morphology; it participates in recruitment of coat proteins which are\ required for budding and fission of membranes. Before the fusion with an\ acceptor compartment the membrane must be uncoated. This step requires the\ hydrolysis of GTP associated with ARF. These proteins contain a characteristic zinc finger motif\ (Cys-x2-Cys-x(16,17)-Cys-x2-Cys) which displays some similarity to the C4-type\ GATA zinc finger. The ARFGAP domain displays no obvious similarity to other GAP\ proteins.

\ \ The 3D structure of the ARFGAP domain of the PYK2-associated protein beta has\ been solved PUBMED:10601011. It consists of a three-stranded beta-sheet surrounded by 5\ alpha helices. The domain is organised around a central zinc atom which is\ coordinated by 4 cysteines. The ARFGAP domain is clearly\ unrelated to the other GAP protein structures, which are exclusively helical.\ Classical GAP proteins accelerate GTPase activity by supplying an arginine\ finger to the active site. The crystal structure of ARFGAP bound to ARF\ revealed that the ARFGAP domain does not supply an arginine to the active site,\ which suggests a more indirect role of the ARFGAP domain in the GTPase\ hydrolysis PUBMED:10102276.

\ \

The Rev protein of human immunodeficiency virus type 1 (HIV-1) facilitates\ nuclear export of unspliced and partly-spliced viral RNAs PUBMED:7637788. Rev contains\ an RNA-binding domain and an effector domain; the latter is believed to \ interact with a cellular cofactor required for the Rev response and hence\ HIV-1 replication. Human Rev interacting protein (hRIP) specifically\ interacts with the Rev effector. The amino acid sequence of hRIP is \ characterised by an N-terminal, C-4 class zinc finger motif.

\ ' '60' 'IPR001606' '\

Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure PUBMED:10838570. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, ARID proteins can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini.\

\ \ \

The human SWI-SNF complex protein p270 is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1 PUBMED:10757798. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of\ DNA interactions PUBMED:10757798.

\ ' '61' 'IPR006773' '\

This is a family of eukaryotic proteins, many of which are believed to be involved in cell adhesion. Members are involved in gastrulation and also in metastasis formation and the progression of cancer. Experimental evidence suggests that these proteins are transmembrane and possibly glycoproteins PUBMED:10610020, PUBMED:10919708.

\ ' '62' 'IPR000225' '\

The armadillo (Arm) repeat is an approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila melanogaster segment polarity gene armadillo involved in signal transduction through wingless. Animal Arm-repeat proteins function in various processes, including intracellular signalling and cytoskeletal regulation, and include such proteins as beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumour suppressor protein, and the nuclear transport factor importin-alpha, amongst others PUBMED:9770300. A subset of these proteins is conserved across eukaryotic kingdoms. In higher plants, some Arm-repeat proteins function in intracellular signalling like their mammalian counterparts, while others have novel functions PUBMED:12946625.

\

The 3-dimensional fold of an armadillo repeat is known from the crystal structure of beta-catenin, where the 12 repeats form a superhelix of alpha helices with three helices per unit PUBMED:9298899. The cylindrical structure features a positively charged groove, which presumably interacts with the acidic surfaces of the known interaction partners of beta-catenin.

\ \ ' '63' 'IPR011021' '\

G protein-coupled receptors are a large family of signalling molecules that respond to a wide variety of extracellular stimuli. The receptors relay the information encoded by the ligand through the activation of heterotrimeric G proteins and intracellular effector molecules. To ensure the appropriate regulation of the signalling cascade, it is vital to properly inactivate the receptor. This inactivation is achieved, in part, by the binding of a soluble protein, arrestin, which uncouples the receptor from the downstream G protein after the receptors are phosphorylated by G protein-coupled receptor kinases. In addition to the inactivation of G protein-coupled receptors, arrestins have also been implicated in the endocytosis of receptors and cross talk with other signalling pathways. Arrestin (retinal S-antigen) is a major protein of the retinal rod outer segments. It interacts with photo-activated phosphorylated rhodopsin, inhibiting or \'arresting\' its ability to interact with transducin PUBMED:15335861. The protein binds calcium, and shows similarity in its C-terminus to alpha-transducin and other purine nucleotide-binding proteins. In mammals, arrestin is associated with autoimmune uveitis.

\

Arrestins comprise a family of closely-related proteins that includes beta-arrestin-1 and -2, which regulate the function of beta-adrenergic receptors by binding to their phosphorylated forms, impairing their capacity to activate G(S) proteins; cone photoreceptor C-arrestin (arrestin-X) PUBMED:7720881, which could bind to phosphorylated red/green opsins; and Drosophila phosrestins I and II, which undergo light-induced phosphorylation, and probably play a role in photoreceptor transduction PUBMED:8452755, PUBMED:1517224, PUBMED:2158671.

\

The crystal structure of bovine retinal arrestin comprises two domains of antiparallel beta-sheets connected through a hinge region and one short alpha-helix on the back of the amino-terminal fold PUBMED:9495348. The binding region for phosphorylated light-activated rhodopsin is located at the N-terminal domain, as indicated by the docking of the photoreceptor to the three-dimensional structure of arrestin.

\

The N-terminal domain consists of an immunoglobulin-like beta-sandwich structure. This entry represents proteins with immunoglobulin-like domains that are similar to those found in arrestin.

\ ' '64' 'IPR011022' '\

G protein-coupled receptors are a large family of signalling molecules that respond to a wide variety of extracellular stimuli. The receptors relay the information encoded by the ligand through the activation of heterotrimeric G proteins and intracellular effector molecules. To ensure the appropriate regulation of the signalling cascade, it is vital to properly inactivate the receptor. This inactivation is achieved, in part, by the binding of a soluble protein, arrestin, which uncouples the receptor from the downstream G protein after the receptors are phosphorylated by G protein-coupled receptor kinases. In addition to the inactivation of G protein-coupled receptors, arrestins have also been implicated in the endocytosis of receptors and cross talk with other signalling pathways. Arrestin (retinal S-antigen) is a major protein of the retinal rod outer segments. It interacts with photo-activated phosphorylated rhodopsin, inhibiting or \'arresting\' its ability to interact with transducin PUBMED:15335861. The protein binds calcium, and shows similarity in its C-terminus to alpha-transducin and other purine nucleotide-binding proteins. In mammals, arrestin is associated with autoimmune uveitis.

\

Arrestins comprise a family of closely-related proteins that includes beta-arrestin-1 and -2, which regulate the function of beta-adrenergic receptors by binding to their phosphorylated forms, impairing their capacity to activate G(S) proteins; cone photoreceptor C-arrestin (arrestin-X) PUBMED:7720881, which could bind to phosphorylated red/green opsins; and Drosophila phosrestins I and II, which undergo light-induced phosphorylation, and probably play a role in photoreceptor transduction PUBMED:8452755, PUBMED:1517224, PUBMED:2158671.

\

The crystal structure of bovine retinal arrestin comprises two domains of antiparallel beta-sheets connected through a hinge region and one short alpha-helix on the back of the amino-terminal fold PUBMED:9495348. The binding region for phosphorylated light-activated rhodopsin is located at the N-terminal domain, as indicated by the docking of the photoreceptor to the three-dimensional structure of arrestin.

\

The C-terminal domain consists of an immunoglobulin-like beta-sandwich structure. This entry represents proteins with immunoglobulin-like domains that are similar to those found in arrestin.

\ ' '65' 'IPR007042' '\ This conserved, predominantly C-terminal region is found in a number of proteins including arsenite-resistance protein 2, which is thought to play a role in arsenite resistance PUBMED:10069470. Arsenite is a carcinogenic compound which can act as a comutagen by inhibiting DNA repair.\ ' '66' 'IPR019887' '\

The many bacterial transcription regulation proteins which bind DNA through a\ \'helix-turn-helix\' motif can be classified into subfamilies on the basis of\ sequence similarities. One such family is the AsnC/Lrp subfamily PUBMED:7770911. The Lrp family of transcriptional regulators appears to be widely distributed among bacteria and\ archaea, as an important regulatory system of the amino acid metabolism and related processes PUBMED:12675791.

Members of the Lrp family are small DNA-binding proteins with molecular masses of around 15 kDa. Target promoters often contain a number of binding sites that typically lack obvious inverted repeat elements, and to which binding is usually co-operative. LrpA from Pyrococcus furiosus is the first Lrp-like protein for which a three-dimensional structure has been solved. In the crystal structure LrpA forms an octamer consisting of four dimers. The structure revealed that the N-terminal part of the protein consists of a helix-turn-helix (HTH) domain, a fold generally involved in DNA binding. The C-terminus of Lrp-like proteins has a beta-fold, where the two alpha-helices are located at one side of the four-stranded antiparallel beta-sheet. LrpA forms a homodimer mainly through interactions between the beta-strands of this C-terminal domain, and an octamer through further interactions between the second alpha-helix and fourth beta-strand of the motif. Hence, the C-terminal domain of Lrp-like proteins appears to be involved in ligand-response and activation PUBMED:12675791.

\ ' '67' 'IPR007803' '\

The alpha-ketoglutarate-dependent dioxygenase aspartyl (asparaginyl) beta-hydroxylase () specifically hydroxylates one aspartic or asparagine residue in certain epidermal growth factor-like domains of a number of proteins. Its action may be due to histidine-675, which, when mutated to an alanine residue, causes the loss of enzymatic activity in the protein PUBMED:8041771.

\ \

The activity of an invertebrate alpha-ketoglutarate-dependent aspartyl/asparaginyl beta-hydroxylase, which posttranslationally hydroxylates specific aspartyl or asparaginyl residues within epidermal growth factor-like modules PUBMED:2187868, was found to be similar to that of the purified mammalian aspartyl/asparaginyl beta-hydroxylase with respect to cofactor requirements, stereochemistry and substrate sequence specificity PUBMED:1449478. This enzyme requires Fe2+ as a cofactor. Some vitamin K-dependent coagulation factors, as well as synthetic peptides based on the structure of the first epidermal growth factor domain of human coagulation factor IX or X, can act as acceptors.

\ ' '68' 'IPR015942' '\

This entry represents a group of related proteins that includes aspartate racemase, glutamate racemase, hydantoin racemase and arylmalonate decarboxylase.

\ \

Aspartate racemase and glutamate racemase are two evolutionarily related bacterial enzymes that do not seem to require a cofactor for their activity PUBMED:8385993. Glutamate racemase, which interconverts L-glutamate and D-glutamate, is required for the biosynthesis of peptidoglycan and some peptide-based antibiotics such as gramicidin S. In addition to characterised aspartate and glutamate racemases, this family also includes a hypothetical protein from Erwinia carotovora and one from Escherichia coli (ygeA).

\

Two conserved cysteines are present in the sequence of these enzymes. They are expected to play a role in catalytic activity by acting as bases in proton abstraction from the substrate.

\ ' '69' 'IPR001506' '\

Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.
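
As an illustration of how the short HEXXH motif can be located in a sequence, here is a minimal sketch, assuming X stands for any of the standard one-letter residue codes. The function name and the toy peptide are hypothetical. Because the motif is relatively common, as noted above, each hit is only a candidate site and says nothing on its own about metal binding; the stricter ten-residue definition is deliberately not encoded here, since its residue classes would have to be assumed.

```python
import re

# A lookahead is used so that overlapping candidates are all reported.
HEXXH = re.compile("(?=(HE[A-Z]{2}H))")


def find_hexxh(seq):
    """Return (0-based start, matched text) for every HEXXH candidate."""
    return [(m.start(), m.group(1)) for m in HEXXH.finditer(seq.upper())]


# Synthetic peptide used only to exercise the pattern.
toy_peptide = "GGVAIHELGHSLGL"
print(find_hexxh(toy_peptide))
```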

\ \

In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

\ \ \

In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

\ \ \

This group of metallopeptidases belongs to the MEROPS peptidase family M12, subfamily M12A (astacin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH PUBMED:7674922.

\ \ \ \ \

The astacin () family of metalloendopeptidases encompasses a range of proteins\ found in hydra to humans, in mature and developmental systems PUBMED:7670368. Their\ functions include activation of growth factors, degradation of polypeptides,\ and processing of extracellular proteins PUBMED:7670368. The proteins are synthesised\ with N-terminal signal and pro-enzyme sequences, and many contain multiple\ domains C-terminal to the protease domain. They are either secreted from\ cells, or are associated with the plasma membrane.

\ \

The astacin molecule adopts a kidney shape, with a deep active-site cleft\ between its N- and C-terminal domains PUBMED:8445658. The zinc ion, which lies at the\ bottom of the cleft, exhibits a unique penta-coordinated mode of binding,\ involving 3 histidine residues, a tyrosine and a water molecule (which is\ also bound to the carboxylate side chain of Glu93) PUBMED:8445658. The N-terminal\ domain comprises 2 alpha-helices and a 5-stranded beta-sheet. The overall\ topology of this domain is shared by the archetypal zinc-endopeptidase\ thermolysin. Astacin protease domains also share common features with\ serralysins, matrix metalloendopeptidases, and snake venom proteases; they\ cleave peptide bonds in polypeptides such as insulin B chain and bradykinin,\ and in proteins such as casein and gelatin; and they have arylamidase\ activity PUBMED:7670368.

\ ' '70' 'IPR017956' '\

AT hooks are DNA-binding motifs with a preference for A/T rich regions. These motifs are found in a variety of proteins, including the high mobility group (HMG) proteins PUBMED:11406267, in DNA-binding proteins from plants PUBMED:8790293 and in hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex PUBMED:17081121.

\ \

High mobility group (HMG) proteins are a family of relatively low molecular weight non-histone components in chromatin PUBMED:11406267. HMG-I and HMG-Y (HMGA) are proteins of about 100 amino acid residues which are produced by the alternative splicing of a single gene. HMG-I/Y proteins bind preferentially to the minor groove of AT-rich regions in double-stranded DNA in a non-sequence specific manner PUBMED:1692833, PUBMED:8414980. It is suggested that these proteins could function in nucleosome phasing and in the 3\' end processing of mRNA transcripts. They are also involved in the transcription regulation of genes containing, or in close proximity to, AT-rich regions.

\ ' '71' 'IPR004130' '\ Members of this family are found in a range of archaea and eukaryotes and have hypothesised ATP binding activity.\ ' '72' 'IPR000793' '\

ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

\ \ \

This entry represents the alpha and beta subunits found in the F1, V1, and A1 complexes of F-, V- and A-ATPases, respectively (sometimes called the A and B subunits in V- and A-ATPases). The F-ATPases (or F1F0-ATPases), V-ATPases (or V1V0-ATPases) and A-ATPases (or A1A0-ATPases) are composed of two linked complexes: the F1, V1 or A1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0, V0 or A0 complex that forms the membrane-spanning pore. The F-, V- and A-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis PUBMED:11309608, PUBMED:15629643.

\

In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP, which are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit PUBMED:12745923.

\

In V- and A-ATPases, the alpha/A and beta/B subunits of the V1 or A1 complex are homologous to the alpha and beta subunits in the F1 complex of F-ATPases, except that the alpha subunit is catalytic and the beta subunit is regulatory.

\

The alpha/A and beta/B subunits can each be divided into three regions, or domains, centred around the ATP-binding pocket, and based on structure and function, where the central region is the nucleotide-binding domain () PUBMED:12885621. This entry represents the C-terminal domain of the alpha/A/beta/B subunits, which forms a left-handed superhelix composed of 4-5 individual helices. The C-terminal domain can vary between the alpha and beta subunits, and between different ATPases PUBMED:11309608.

\

More information about this protein can be found at Protein of the Month: ATP Synthases.

\ ' '73' 'IPR004100' '\

ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

\ \ \

This entry represents the alpha and beta subunits found in the F1, V1, and A1 complexes of F-, V- and A-ATPases, respectively (sometimes called the A and B subunits in V- and A-ATPases). The F-ATPases (or F1F0-ATPases), V-ATPases (or V1V0-ATPases) and A-ATPases (or A1A0-ATPases) are composed of two linked complexes: the F1, V1 or A1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0, V0 or A0 complex that forms the membrane-spanning pore. The F-, V- and A-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis PUBMED:11309608, PUBMED:15629643.

\

In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP, which are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit PUBMED:12745923.

\

In V- and A-ATPases, the alpha/A and beta/B subunits of the V1 or A1 complex are homologous to the alpha and beta subunits in the F1 complex of F-ATPases, except that the alpha subunit is catalytic and the beta subunit is regulatory.

\

The alpha/A and beta/B subunits can each be divided into three regions, or domains, centred around the ATP-binding pocket, and based on structure and function, where the central region is the nucleotide-binding domain () PUBMED:12885621. This entry represents the N-terminal domain of the alpha/A/beta/B subunits, which forms a closed beta-barrel with Greek-key topology.

\

More information about this protein can be found at Protein of the Month: ATP Synthases.

\ ' '74' 'IPR006808' '\

ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

\ \ \

F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

\

This entry represents the G subunit found in the F0 complex of F-ATPases in mitochondria. The function of subunit G is currently unknown. There is no counterpart in chloroplast or bacterial F-ATPases identified so far PUBMED:8011660.

\

More information about this protein can be found at Protein of the Month: ATP Synthases.

\ ' '75' 'IPR005520' '\

This domain is found in attacin and sarcotoxin, but not diptericin (which shares similarity to the C-terminal region of attacin). All these proteins are insect antibacterial proteins which are induced by the fat body and subsequently secreted into the hemolymph where they act synergistically to kill the invading microorganism PUBMED:7772280.

\ ' '76' 'IPR005546' '\

Secretion of protein products occurs by a number of different pathways in bacteria. One of these pathways, known as the type IV pathway, was first described for the IgA1 protease PUBMED:3027577. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C-terminus of the proteins it occurs in. The N-terminus contains the variable passenger domain that is translocated across the membrane. Once the passenger domain is exported, it is cleaved auto-catalytically in some proteins; in others a different peptidase is used, and in some cases no cleavage occurs PUBMED:9778731. In those proteins where the cleavage is auto-catalytic, the peptidase domains belong to MEROPS peptidase families S6 and S8.

\ ' '77' 'IPR004776' '\

This entry represents a mostly uncharacterised family of membrane transport proteins found in eukaryotes, bacteria and archaea. Most characterised members of this family are the PIN components of auxin efflux systems from plants. These carriers are saturable, auxin-specific, and localized to the basal ends of auxin transport-competent cells PUBMED:16054428, PUBMED:15564124. Plants typically possess several of these proteins, each displaying a unique tissue-specific expression pattern. They are expressed in almost all plant tissues including vascular tissues and roots, and influence many processes including the establishment of embryonic polarity, plant growth, apical hook formation in seedlings and the photo- and gravitropic responses. These plant proteins are typically 600-700 amino acyl residues long and exhibit 8-12 transmembrane segments.

\ ' '78' 'IPR006158' '\

The cobalamin (vitamin B12) binding domain can bind two different forms of the cobalamin cofactor, with cobalt bonded either to a methyl group (methylcobalamin) or to 5\'-deoxyadenosine (adenosylcobalamin). Cobalamin-binding domains are mainly found in two families of enzymes present in animals and prokaryotes, which perform distinct kinds of reactions at the cobalt-carbon bond. Enzymes that require methylcobalamin carry out methyl transfer reactions. Enzymes that require adenosylcobalamin catalyse reactions in which the first step is the cleavage of adenosylcobalamin to form cob(II)alamin and the 5\'-deoxyadenosyl radical, and thus act as radical generators. In both types of enzymes the B12-binding domain uses a histidine to bind the cobalt atom of cobalamin cofactors. This histidine is embedded in a DXHXXG sequence, the most conserved primary sequence motif of the domain PUBMED:16042603, PUBMED:9242908, PUBMED:14527323. Proteins containing the cobalamin-binding domain include:

\ \

\ \

The core structure of the cobalamin-binding domain is characterised by a five-stranded alpha/beta (Rossmann) fold, which consists of 5 parallel beta-strands surrounded by 4-5 alpha helices in three layers (alpha/beta/alpha) PUBMED:7992050. Upon binding cobalamin, important elements of the binding site appear to become structured, including an alpha-helix that forms on one side of the cleft accommodating the nucleotide \'tail\' of the cofactor. In cobalamin, the cobalt atom can be either free (dmb-off) or bound to dimethylbenzimidazole (dmb-on) according to the pH. When bound to the cobalamin-binding domain, the dimethylbenzimidazole ligand is replaced by the active histidine (His-on) of the DXHXXG motif. The replacement of dimethylbenzimidazole by histidine allows switching between the catalytic and activation cycles PUBMED:8805541. In methionine synthase the cobalamin cofactor is sandwiched between the cobalamin-binding domain and an approximately 90-residue N-terminal domain forming a helical bundle comprising two pairs of antiparallel helices PUBMED:8805541.
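
To make the role of the DXHXXG motif concrete, the following minimal sketch reports the position of the histidine within each match, since that histidine is the residue described as binding the cobalt atom. It assumes X stands for any standard residue; the function name and the toy sequence are hypothetical, and a match alone does not establish cobalamin binding.

```python
import re

# D-x-H-x-x-G, with x assumed to be any standard one-letter residue code.
B12_MOTIF = re.compile("D[A-Z]H[A-Z]{2}G")


def candidate_cobalt_ligands(seq):
    """Return the 0-based index of the histidine in each D-x-H-x-x-G match."""
    return [m.start() + 2 for m in B12_MOTIF.finditer(seq.upper())]


# Synthetic sequence used only to exercise the pattern.
toy_b12 = "MKTLGGDVHDIGKNIV"
print(candidate_cobalt_ligands(toy_b12))
```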

\ \

In methionine synthase, there is a second, adjacent domain involved in cobalamin binding that forms a 4-helical bundle cap (); in the conversion to the active conformation of this enzyme, the 4-helical cap rotates to allow the cobalamin cofactor to bind the activation domain () PUBMED:11731805.

\ ' '79' 'IPR003759' '\

Cobalamin-dependent methionine synthase () is a large modular protein that catalyses methyl transfer from methyltetrahydrofolate (CH3-H4folate) to homocysteine. During the catalytic cycle, it supports three distinct methyl transfer reactions, each involving the cobalamin (vitamin B12) cofactor and a substrate bound to its own functional unit PUBMED:11731805. The cobalamin cofactor plays an essential role in this reaction, accepting the methyl group from CH3-H4folate to form methylcob(III)alamin, and in turn donating the methyl group to homocysteine to generate methionine and cob(I)alamin.

\

Methionine synthase is a large enzyme composed of four structurally and functionally distinct modules: the first two modules bind homocysteine and CH3-H4folate, the third module binds the cobalamin cofactor and the C-terminal module binds S-adenosylmethionine. The cobalamin-binding module is composed of two structurally distinct domains: a 4-helical bundle cap domain (residues 651-740 in the Escherichia coli\ enzyme) and an alpha/beta B12-binding domain (residues 741-896) (). The 4-helical bundle forms a cap over the alpha/beta domain, which acts to shield the methyl ligand of cobalamin from solvent PUBMED:8939751. Furthermore, in the conversion to the active conformation of this enzyme, the 4-helical cap rotates to allow the cobalamin cofactor to bind the activation domain (). The alpha/beta domain is a common cobalamin-binding motif, whereas the 4-helical bundle domain with its methyl cap is a distinctive feature of methionine synthases.

\

This entry represents the 4-helical bundle cap domain. This domain is also present in other shorter proteins that bind to B12, and is always found N-terminal to the alpha/beta B12-binding domain.

\ ' '80' 'IPR003340' '\ Two DNA binding proteins, RAV1 and RAV2 from Arabidopsis thaliana, contain two distinct amino acid sequence domains found only in higher plant species. The N-terminal regions of RAV1 and RAV2 are homologous to the AP2 DNA-binding domain present in a family of transcription factors, while the C-terminal region exhibits homology to the highly conserved C-terminal domain, designated B3, of VP1/ABI3 transcription factors PUBMED:9862967. The AP2 and B3-like domains of RAV1 bind autonomously to the CAACA and CACCTG motifs, respectively, and together achieve a high affinity and specificity of binding. It has been suggested that the AP2 and B3-like domains of RAV1 are connected by a highly flexible structure enabling the two domains to bind to the CAACA and CACCTG motifs in various spacings and orientations PUBMED:9862967.\ ' '81' 'IPR002191' '\

The fliL operon of Escherichia coli contains seven genes (including fliO, fliP, fliQ and fliR) involved in the biosynthesis and functioning of the flagellar organelle PUBMED:8282695. The fliO, fliP, fliQ and fliR genes encode highly hydrophobic polypeptides. The fliQ gene product, a small integral membrane protein that contains two putative transmembrane (TM) regions, is required for the assembly of the rivet at the earliest stage of flagellar biosynthesis.

\

Proteins sharing an evolutionary relationship with FliQ have been found in a range of bacteria: these include Yop translocation protein S from Yersinia pestis PUBMED:8300512; surface antigen-presentation protein SpaQ from Salmonella typhimurium and Shigella flexneri PUBMED:8404849; and probable translocation protein Y4YM from Rhizobium sp. (strain NGR234) PUBMED:9163424. All of these members export proteins that lack signal peptides across the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways PUBMED:7814323.

\ ' '82' 'IPR003824' '\

This is a family of small, highly hydrophobic proteins. Over-expression of this protein in Escherichia coli is associated with bacitracin resistance PUBMED:8389741, and the protein was originally proposed to be an undecaprenol kinase called bacA. BacA protein, however, does not show undecaprenol phosphokinase activity PUBMED:15138271. It is now known to be an undecaprenyl pyrophosphate phosphatase () and is renamed UppP. It is not the only protein associated with bacitracin resistance PUBMED:15946938, PUBMED:15778224.

\ ' '83' 'IPR002372' '\ Pyrrolo-quinoline quinone (PQQ) is a redox coenzyme, which serves as a cofactor\ for a number of enzymes (quinoproteins) and particularly for some bacterial\ dehydrogenases PUBMED:2549854, PUBMED:2572081. A number of bacterial quinoproteins belong to this family.\ \

Enzymes in this group have repeats of a beta propeller.

\ ' '84' 'IPR005158' '\

Found in the DNRI/REDD/AFSR family of regulators, this region of AFSR () along with the C-terminal region is capable of independently directing actinorhodin production. It is important for the formation of secondary metabolites.

\ ' '85' 'IPR001025' '\

The BAH (bromo-adjacent homology) family contains proteins such as eukaryotic DNA (cytosine-5) methyltransferases, the origin recognition complex 1 (Orc1) proteins, as well as several proteins involved in transcriptional regulation. The BAH domain appears to act as a protein-protein interaction module specialised in gene silencing, as suggested for example by its interaction within yeast Orc1p with the silent information regulator Sir1p. The BAH module might therefore play an important role by linking DNA methylation, replication and transcriptional regulation PUBMED:10100640.

\ ' '86' 'IPR001107' '\ The band 7 protein is an integral membrane protein which is thought to regulate\ cation conductance. A variety of proteins belong to this family. These include the\ prohibitins, cytoplasmic anti-proliferative proteins and stomatin, an erythrocyte membrane protein. Bacterial HflC protein also belongs\ to this family.\ ' '87' 'IPR004711' '\

Proteins in this entry are often encoded near genes involved in the transport of benzoate into the cell PUBMED:11053377, PUBMED:1885518. They contain several transmembrane regions and show some similarity to members of the aromatic acid:H(+) symporter (AAHS) family, a subgroup of the major facilitator superfamily (MFS). Therefore, they are predicted to be benzoate:H(+) symporters, though this has not been established experimentally.

\ \ ' '88' 'IPR004210' '\

The BESS domain has been named after the three proteins that originally defined the domain: BEAF (Boundary element associated factor 32) PUBMED:7781065, Suvar(3)7 PUBMED:2107402 and Stonewall PUBMED:8631271. The BESS domain is 40 amino acid residues long and is predicted to be composed of three alpha helices; as such, it might be related to the myb/SANT HTH domain. The BESS domain directs a variety of protein-protein interactions, including interactions with itself, with Dorsal, and with a TBP-associated factor. It is found in a single copy in Drosophila proteins and is often associated with the MADF domain PUBMED:1731341, PUBMED:9528796, PUBMED:12459265.

\

Proteins known to contain a BESS domain include:

\ \

\ ' '89' 'IPR003344' '\ Proteins that contain this domain are found in a variety of bacterial and phage surface proteins such as intimins. Intimin is a bacterial cell-adhesion molecule that mediates the intimate bacterial host-cell interaction. It contains three domains; two immunoglobulin-like domains and a C-type lectin-like module implying that carbohydrate recognition may be important in intimin-mediated cell adhesion PUBMED:10201396.\ ' '90' 'IPR005482' '\

Acetyl-CoA carboxylase is found in all animals, plants, and bacteria and catalyzes the first committed step in fatty acid synthesis. It is a\ multicomponent enzyme containing a biotin carboxylase activity, a biotin carboxyl carrier protein, and a carboxyltransferase\ functionality. The\ "B-domain" extends from the main body of the subunit where it folds into two alpha-helical regions and three strands of beta-sheet.\ Following the excursion into the B-domain, the polypeptide chain folds back into the body of the protein where it forms an\ eight-stranded antiparallel beta-sheet. In addition to this major secondary structural element, the C-terminal domain also contains a\ smaller three-stranded antiparallel beta-sheet and seven alpha-helices PUBMED:7915138.

\ ' '91' 'IPR002860' '\

Members of this family contain multiple BNR (bacterial neuraminidase repeat) repeats or Asp-boxes. The repeats are short; however, they are never found closer than 40 residues apart, suggesting that the repeat is structurally longer. These repeats are found in a variety of non-homologous proteins, including bacterial ribonucleases, sulphite oxidases, reelin, netrins, sialidases, neuraminidases, some lipoprotein receptors, and a variety of glycosyl hydrolases PUBMED:11266614.

\ ' '92' 'IPR000515' '\

Bacterial binding protein-dependent transport systems PUBMED:3527048, PUBMED:2229036 are multicomponent systems typically composed of a periplasmic substrate-binding protein, one or two reciprocally homologous integral inner-membrane proteins and one or two peripheral membrane ATP-binding proteins that couple energy to the active transport system. The integral inner-membrane proteins translocate the substrate across the membrane. It has been shown PUBMED:3000770, PUBMED:7934906 that most of these proteins contain a conserved region located about 80 to 100 residues from their C-terminal extremity. This region seems PUBMED:1738314 to be located in a cytoplasmic loop between two transmembrane domains. Apart from the conserved region, the sequences of these proteins are quite divergent, and they have a variable number of transmembrane helices; however, they can be classified into seven families, termed araH, cysTW, fecCD, hisMQ, livHM, malFG and oppBC.

\ ' '93' 'IPR003142' '\

This C-terminal domain has an SH3-like barrel fold, the function of which is unknown. It is found associated with prokaryotic bifunctional transcriptional repressors PUBMED:2642476 and eukaryotic enzymes involved in biotin utilization PUBMED:7842009, PUBMED:9173880.

\ \

In Escherichia coli the biotin operon repressor (BirA) is a bifunctional protein. BirA acts both as the acetyl-CoA carboxylase biotin holoenzyme synthetase () and as the biotin operon repressor. DNA sequence analysis of mutations indicates that the helix-turn-helix DNA binding region is located at the N-terminus, while mutations affecting enzyme function, although mapping over a large region, are found mainly in the central part of the protein\'s primary sequence PUBMED:2642476.

\ ' '94' 'IPR004143' '\ This domain is found in biotin protein ligase, lipoate-protein ligase A and B. Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Each organism probably has only one BPL. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine PUBMED:10470036. Lipoate-protein ligase A (LPLA) catalyses the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes PUBMED:8206909.\ ' '95' 'IPR003406' '\

The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

\ \

This is the glycosyltransferase family 14, a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme () and core-2 branching enzyme (). I-branching enzyme, an integral membrane protein, converts linear into branched poly-N-acetyllactosaminoglycans in the glycosylation pathway, and is responsible for the production of the blood group I-antigen during embryonic development PUBMED:8449405. Core-2 branching enzyme, also an integral membrane protein, forms crucial side-chain branches in O-glycans in the glycosylation pathway PUBMED:9915862.

\ ' '96' 'IPR001357' '\

The BRCT domain (after the C-terminal domain of a breast cancer susceptibility protein) is found predominantly in proteins involved in cell cycle checkpoint \ functions responsive to DNA damage PUBMED:9034168, for example as found in the breast cancer DNA-repair protein BRCA1. The domain is an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain PUBMED:14576433.

\

A chitin biosynthesis protein from \ yeast also seems to belong to this group.

\ ' '97' 'IPR007084' '\

The BRICHOS family is defined by a 100 amino acid region found in a variety of proteins implicated in dementia, respiratory distress and cancer, including BRI-2, Chondromodulin-I (ChM-I), CA11, and surfactant protein C PUBMED:12114016. In several cases, the BRICHOS region is located in the propeptide region that is removed after proteolytic processing. This domain could be involved in the complex post-translational processing of these proteins.

\ ' '98' 'IPR007109' '\

The Brix domain is found in a number of eukaryotic proteins including some from Saccharomyces cerevisiae and Homo sapiens, Arabidopsis thaliana Peter Pan-like protein and several hypothetical proteins.

\

There are six (one archaean and five eukaryotic) protein families which have a similar domain architecture with a central globular Brix domain. They have optional N-terminal and obligatory C-terminal segments, both of which have charged low-complexity regions PUBMED:11406393.

\

Proteins from the Imp4/Brix superfamily appear to be involved in ribosomal RNA processing, which is essential for the functioning of all cells. The N- and C-terminal halves of a member of the superfamily, Mil, show significant structural similarity to one another. This suggests an origin by means of an ancestral duplication. Both halves have the same fold as the anticodon-binding domain of class IIa aminoacyl-tRNA synthetases, with greater conservation seen in the N-terminal half. Structural evidence suggests that the Imp4/Brix superfamily proteins could bind single-stranded segments of RNA along a concave surface formed by the N-terminal half of their beta-sheet and a central alpha-helix PUBMED:15654320.

\ ' '99' 'IPR004328' '\

The BRO1 domain has about 390 residues and occurs in a number of eukaryotic proteins such as yeast BRO1 and human PDCD6IP/Alix that are involved in protein targeting to the vacuole or lysosome. The BRO1 domain of fungal and mammalian proteins binds with multivesicular body components (ESCRT-III proteins) such as yeast Snf7 and mammalian CHMP4b, and can function to target BRO1 domain-containing proteins to endosomes PUBMED:15935782, PUBMED:14583093, PUBMED:15944343.

\ \

The BRO1 domain has a boomerang shape composed of 14 alpha-helices and 3 beta-sheets. It contains a TPR-like substructure in the central part PUBMED:15935782. The C-terminus is less conserved. This domain is found in a number of signal transduction proteins. The Saccharomyces cerevisiae protein Bro1p is required for sorting endocytic cargo to the lumen of multivesicular bodies (MVBs). Alix appears to be the mammalian orthologue of Bro1p PUBMED:18434552.

\ \

Alix is also involved in the ESCRT pathway, which facilitates membrane fission events during enveloped virus budding, multivesicular body formation, and cytokinesis. To promote HIV budding and cytokinesis, the ALIX protein must bind and recruit CHMP4 subunits of the ESCRT-III complex. The Bro1 domain of ALIX binds specifically to C-terminal residues of the human CHMP4 proteins PUBMED:18511562, PUBMED:17696968. Likewise, the Homo sapiens Brox protein has a Bro1 domain. CHMP4 proteins are components of the endosomal sorting complex required for transport III (ESCRT-III); Bro1 domain-containing proteins interact with them via their Bro1 domains and play roles in sorting of ubiquitinated cargoes PUBMED:18190528. Alix also binds to the nucleocapsid (NC) domain of HIV-1 Gag. Alix and the Bro1 domain can be specifically packaged into viral particles via the NC PUBMED:18032513.\

\

Myopic is the Drosophila homologue of the Bro1-domain tyrosine phosphatase HD-PTP, and it promotes epidermal growth factor receptor (EGFR) signalling PUBMED:18434417. The Caenorhabditis elegans Bro1-domain protein, ALX-1, interacts with LIN-12/Notch. The EGO-2 protein also contains a Bro1 domain. Notch-type signalling mediates numerous inductive events during development PUBMED:17603118.

\ ' '100' 'IPR001487' '\ Bromodomains are found in a variety of mammalian, invertebrate and yeast DNA-binding proteins PUBMED:1350857. Bromodomains can interact with\ acetylated lysine PUBMED:9175470.\ In some proteins, the classical bromodomain has diverged to such an\ extent that parts of the region are either missing or contain an insertion\ (e.g., mammalian protein HRX, Caenorhabditis elegans hypothetical protein ZK783.4, yeast protein YTA7). The bromodomain may occur as a single copy, or in duplicate.\

The precise function of the domain is unclear, but it may be involved in\ protein-protein interactions and may play a role in assembly or activity\ of multi-component complexes involved in transcriptional activation PUBMED:7580139.

\ ' '101' 'IPR005607' '\

The BSD domain is a domain of about 60 residues named after the BTF2-like\ transcription factors, Synapse-associated proteins and DOS2-like proteins in\ which it is found. It is also found in several hypothetical\ proteins. The BSD domain occurs in one or two copies in a variety of species\ ranging from primitive protozoa to humans. It can be found associated with other\ domains such as the BTB domain (see ) or the U-box in multidomain\ proteins. The function of the BSD domain is as yet unknown PUBMED:11943536.

\ \

Secondary structure prediction indicates the presence of three predicted alpha\ helices, which probably form a three-helical bundle in small domains. The\ third predicted helix contains neighbouring phenylalanine and tryptophan\ residues - less common amino acids that are invariant in all the BSD domains\ identified and that are the most striking sequence features of the domain PUBMED:11943536.\

\ Some proteins known to contain one or two BSD domains are listed below:\ \
  • Mammalian TFIIH basal transcription factor complex p62 subunit (GTF2H1).
  • \
  • Yeast RNA polymerase II transcription factor B 73 kDa subunit (TFB1), the\ homologue of BTF2.
  • \
  • Yeast DOS2 protein. It is involved in single-copy DNA replication and\ ubiquitination.
  • \
  • Drosophila synapse-associated protein SAP47.
  • \
  • Mammalian SYAP1.
  • \
  • Various Arabidopsis thaliana (Mouse-ear cress) hypothetical proteins.
  • \ ' '102' 'IPR004324' '\ Members of this family are transmembrane proteins. Several are Leishmania putative proteins that are thought to be\ pteridine transporters PUBMED:10589984, PUBMED:7984172. This family also contains five putative Arabidopsis thaliana proteins of unknown\ function as well as two predicted prokaryotic proteins (from the cyanobacteria Synechocystis and Synechococcus).\ ' '103' 'IPR013069' '\

    The BTB (for BR-C, ttk and bab) PUBMED:7938017 or POZ (for Pox virus and Zinc finger)\ PUBMED:7958847 domain is present near the N terminus of a fraction of zinc finger\ () proteins and in proteins that contain the motif such as Kelch and a family of pox virus proteins.\ The BTB/POZ domain mediates homomeric dimerisation and in some instances\ heteromeric dimerisation PUBMED:7958847.\ The structure of the dimerised PLZF BTB/POZ domain has been solved and\ consists of a tightly intertwined homodimer. The central scaffolding of\ the protein is made up of a cluster of alpha-helices flanked by short\ beta-sheets at both the top and bottom of the molecule PUBMED:9770450.\ POZ domains from several zinc finger proteins have been shown to mediate\ transcriptional repression and to interact with components of histone\ deacetylase co-repressor complexes including N-CoR and SMRT PUBMED:9019154, PUBMED:9824158, PUBMED:9765306.\ The POZ or BTB domain is also known as BR-C/Ttk or ZiN.

    \ ' '104' 'IPR005137' '\

    Photosystem I (PSI) is a large protein complex embedded within the photosynthetic thylakoid membrane. It consists of 11 subunits, ~100 chlorophyll a molecules, 2 phylloquinones, and 3 Fe4S4-clusters. The three dimensional structure of the PSI complex has been resolved at 2.5 A PUBMED:11418848, which allows the precise localisation of each cofactor. PSI together with photosystem II (PSII) catalyses the light-induced steps in oxygenic photosynthesis - a process found in cyanobacteria, eukaryotic algae (e.g. red algae, green algae) and higher plants.

    \

    To date, three thylakoid proteins involved in the stable accumulation of PSI have been identified: BtpA PUBMED:9045660, Ycf3 PUBMED:9321389, PUBMED:9314531, and Ycf4 () PUBMED:9321389. Because translation of the psaA and psaB mRNAs, which encode the two reaction centre polypeptides of PSI, is not affected in mutant strains lacking functional ycf3 and ycf4, the products of these two genes appear to act at a post-translational step of PSI biosynthesis.\ These gene products are therefore involved either in the stabilisation or in the assembly of the PSI complex. However, their exact roles remain unknown. The BtpA protein appears to act at the level of PSI stabilisation PUBMED:10806238. It is an extrinsic membrane protein located on the cytoplasmic side of the thylakoid membrane PUBMED:10103064, PUBMED:10806238. Homologs of BtpA are found in the crenarchaeota and euryarchaeota, where their function remains unknown. The Ycf4 protein is firmly associated with the thylakoid membrane, presumably through a transmembrane domain PUBMED:9321389. Ycf4 co-fractionates with a protein complex larger than PSI upon sucrose density gradient centrifugation of solubilised thylakoids PUBMED:9321389. The Ycf3 protein is loosely associated with the thylakoid membrane and can be released from the membrane with sodium carbonate. This suggests that Ycf3 is not part of a stable complex and that it probably interacts transiently with its partners PUBMED:11752384. Ycf3 contains a number of tetratrico peptide repeats (TPR, ); TPR is a structural motif present in a wide range of proteins, which mediates protein-protein interactions.

    \ \ ' '105' 'IPR004826' '\

    There are several different types of Maf transcription factors with different roles in the cell. MafG and MafH are small Mafs which lack a putative transactivation domain. They behave as transcriptional repressors when they dimerize among themselves. However, they also serve as transcriptional activators by dimerizing with other (usually larger) basic-zipper proteins and recruiting them to specific DNA-binding sites. Maf transcription factors contain a conserved basic region leucine zipper (bZIP) domain, which mediates their dimerization and DNA binding. Neural retina-specific leucine zipper proteins also belong to this family. Together with the basic region, the Maf extended homology region (EHR), conserved only within the Maf family, defines the DNA binding specific to Mafs. This structure enables Mafs to make a broader area of contact with DNA and to recognise longer DNA sequences. In particular, the two residues at the beginning of helix H2 are positioned to recognise the flanking region PUBMED:11875518. Small Maf proteins heterodimerize with Fos and may act as competitive repressors of the NF-E2 transcription factor.

    \ \

    In mouse, Maf1 may play an early role in axial patterning. Defects in these proteins are a cause of autosomal dominant retinitis pigmentosa.

    \ ' '106' 'IPR000008' '\ The C2 domain is a Ca2+-dependent membrane-targeting module found in many cellular proteins involved in signal transduction or membrane trafficking. C2 domains are unique among membrane targeting domains in that they show a wide range of lipid selectivity for the major components of cell membranes, including phosphatidylserine and phosphatidylcholine. This C2 domain is about 116 amino-acid residues and is located between the two copies of\ the C1 domain in Protein Kinase C (that bind phorbol esters and diacylglycerol) (see )\ and the protein kinase catalytic domain (see ). Regions with\ significant homology PUBMED:7559667 to the C2-domain have been found in many proteins.\ The C2 domain is thought to be involved in calcium-dependent phospholipid\ binding PUBMED:8253763 and in membrane targeting processes such as subcellular localisation.

    The 3D structure of the\ C2 domain of synaptotagmin has been reported\ PUBMED:7697723; the domain forms an eight-stranded beta sandwich constructed around a \ conserved 4-stranded motif, designated a C2 key PUBMED:7697723. Calcium binds in\ a cup-shaped depression formed by the N- and C-terminal loops of the\ C2-key motif. Structural analyses of several C2 domains have shown them to consist of similar tertiary structures in which three Ca2+-binding loops are located at the end of an 8-stranded antiparallel beta sandwich.

    \ ' '108' 'IPR004010' '\ Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source PUBMED:11084361. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions PUBMED:11292341.\ ' '109' 'IPR006941' '\ CAF1 is an RNase of the DEDD superfamily, and a subunit of the Ccr4-Not complex that mediates 3\' to 5\' mRNA deadenylation. The major pathways of mRNA turnover in eukaryotes initiate with shortening of the poly(A) tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Caf1p is required for normal mRNA deadenylation in vivo and localises to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent poly(A)-specific exonuclease activity. Some members of this family contain a single-stranded nucleic acid binding domain, R3H.\ ' '110' 'IPR018145' '\

    This family includes the Helicobacter pylori protein CagE (see examples), which together with other proteins from the\ cag pathogenicity island (PAI), encodes a type IV transporter secretion system. The precise role of CagE is not known,\ but studies in animal models have shown that it is essential for pathogenesis in Helicobacter pylori induced gastritis and\ peptic ulceration PUBMED:11104802. Indeed, the expression of the cag PAI has been shown to be essential for stimulating human gastric\ epithelial cell apoptosis in vitro PUBMED:11447179.

    \

    Similar type IV transport systems are also found in other bacteria. This family includes proteins from the trb and Vir conjugal transfer systems in\ Agrobacterium tumefaciens and homologues of VirB proteins from other species.

    \ ' '111' 'IPR003673' '\

    CoA-transferases are found in organisms from all kingdoms of life. They catalyse reversible transfer reactions of coenzyme A groups from CoA-thioesters to free acids. There are at least three families of CoA-transferases, which differ in sequence and reaction mechanism:

    \ \

    This entry represents family III CoA-transferases.

    \ ' '112' 'IPR004178' '\ Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM) PUBMED:11323678. CaM binds to the SK channel through this CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of\ CaM. The structure of this domain complexed with CaM is known PUBMED:11323678. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other.\ ' '113' 'IPR000938' '\

    Cytoskeleton-associated proteins (CAP) are made of three distinct parts: an N-terminal section that is most probably globular and contains the CAP-Gly domain, a large central region predicted to be in an alpha-helical coiled-coil conformation and, finally, a short C-terminal globular domain. The CAP-Gly \ domain is a conserved, glycine-rich domain of about 42 residues found in some CAPs PUBMED:8480366. Proteins known to contain this domain include restin (also known as cytoplasmic linker protein-170 or CLIP-170), a 160 kDa protein associated with intermediate filaments and that links endocytic vesicles to microtubules; vertebrate dynactin (150 kDa dynein-associated polypeptide; DAP) and Drosophila glued, a major component of activator I; yeast protein BIK1, which seems to be required for the formation or\ stabilisation of microtubules during mitosis and for spindle pole body fusion during conjugation; yeast protein NIP100 (NIP80); human protein CKAP1/TFCB; Schizosaccharomyces pombe protein alp11 and Caenorhabditis elegans hypothetical protein F53F4.3. The latter proteins contain an N-terminal ubiquitin domain and a C-terminal \ CAP-Gly domain.

    \ \

    The crystal structure of the CAP-Gly domain of C. elegans F53F4.3 protein, solved by single wavelength sulphur-anomalous phasing, revealed a novel protein fold containing three beta-sheets. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove. Residues in the groove are highly conserved as measured from the information content of the aligned sequences. The C-terminal tail of another molecule in the crystal is bound in this groove PUBMED:12221106.

    \ ' '114' 'IPR000022' '\

    Members of this family include biotin-dependent carboxylases\ PUBMED:8102604, PUBMED:8366018.\ The carboxyl transferase domain carries out the following reaction:\ transcarboxylation from biotin to an acceptor molecule. There are\ two recognised types of carboxyl transferase. One of them uses acyl-CoA\ and the other uses a 2-oxo acid as the acceptor molecule of carbon dioxide. \ All of the members in this family utilise acyl-CoA as the acceptor\ molecule.

    \ ' '115' 'IPR000542' '\

    A number of eukaryotic acetyltransferases can, on the basis of sequence similarities, be grouped together into a family. These enzymes include:

    \

    \ ' '116' 'IPR005043' '\ Mammalian cellular apoptosis susceptibility (CAS) proteins are homologous to the yeast chromosome-segregation protein, CSE1 PUBMED:7479798. This family aligns the C-terminal\ halves (approximately). CAS is involved in both cellular apoptosis and proliferation PUBMED:8639641, PUBMED:8610099. Apoptosis is inhibited in CAS-depleted cells, while the expression of CAS\ correlates to the degree of cellular proliferation. Like CSE1, it is essential for the mitotic checkpoint in the cell cycle (CAS depletion blocks the cell in the G2 phase),\ and has been shown to be associated with the microtubule network and the mitotic spindle PUBMED:8610099, as is the protein MEK, which is thought to regulate the intracellular\ localization (predominantly nuclear vs. predominantly cytosolic) of CAS. In the nucleus, CAS acts as a nuclear transport factor in the importin pathway PUBMED:9323134. The\ importin pathway mediates the nuclear transport of several proteins that are necessary for mitosis and further progression. CAS is therefore thought to affect the cell\ cycle through its effect on the nuclear transport of these proteins PUBMED:9323134. Since apoptosis also requires the nuclear import of several proteins (such as P53 and\ transcription factors), it has been suggested that CAS also enables apoptosis by facilitating the nuclear import of at least a subset of these essential proteins PUBMED:9497270.\ ' '117' 'IPR004014' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    P-ATPases (sometimes known as E1-E2 ATPases) () are found in bacteria and in a number of eukaryotic plasma membranes and organelles PUBMED:9419228. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.

    \

    This entry represents the conserved N-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain. In gastric H+/K+-ATPases, this domain undergoes reversible sequential phosphorylation inducing conformational changes that may be important for regulating the function of these ATPases PUBMED:12480547, PUBMED:12529322.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '118' 'IPR002524' '\

    Members of this family are integral membrane proteins that\ are found to increase tolerance to divalent metal ions such\ as cadmium, zinc, and cobalt. These proteins are considered to\ be efflux pumps that remove these ions from cells PUBMED:9696746, PUBMED:8829543; however, others are implicated in ion uptake PUBMED:1508175. The\ family has six predicted transmembrane domains. Members of the family are variable\ in length because of variably sized inserts, often containing low-complexity sequence.

    \ ' '119' 'IPR006433' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The peptidase families associated with clan U- have an unknown catalytic mechanism as the protein fold of the active site domain and the active site residues have not been reported.

    \

    This group of peptidases belongs to MEROPS peptidase family U35 (clan U-). This family contains the prohead protease from HK97 and related phage and prophage. It is generally encoded next to the gene for the capsid protein that it processes, and in some cases may be fused to it. This family does not show similarity to the prohead protease of Bacteriophage T4.

    \ ' '120' 'IPR002557' '\

    The Peritrophin-A domain is found in chitin binding proteins, particularly the peritrophic matrix proteins of insects and animal chitinases PUBMED:9651363, PUBMED:8621536, PUBMED:9256413. Copies of the domain are also found in some baculoviruses. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains PUBMED:9651363.

    \ ' '121' 'IPR001956' '\

    This domain is involved in cellulose binding PUBMED:1490597 and is found\ associated with a wide range of bacterial glycosyl hydrolases. The structure for\ this domain is known PUBMED:8918451; it forms a beta sandwich.

    \ ' '122' 'IPR003305' '\ The 1,4-beta-glucanase CenC from Cellulomonas fimi contains two\ cellulose-binding domains, CBD(N1) and CBD(N2), arranged in tandem at its\ N-terminus. These homologous CBDs are distinct in their selectivity for binding amorphous and not crystalline cellulose PUBMED:10704194.\ Multidimensional heteronuclear nuclear magnetic resonance (NMR) spectroscopy\ was used to determine the tertiary structure of the 152 amino acid N-terminal\ cellulose-binding domain from C. fimi 1,4-beta-glucanase CenC\ (CBDN1) PUBMED:8916925. The tertiary\ structure of CBDN1 is strikingly similar to that of the bacterial\ 1,3-1,4-beta-glucanases, as well as other sugar-binding proteins with jelly-roll folds.\ ' '123' 'IPR005084' '\

    Carbohydrate-binding modules (CBMs) of microbial glycoside hydrolases play a central role in the recycling of photosynthetically fixed carbon through their binding to specific plant structural polysaccharides PUBMED:11598143. CBMs can recognise both crystalline and amorphous cellulose forms PUBMED:15136030. CBMs are the most common non-catalytic modules associated with enzymes active in plant cell-wall hydrolysis. Many putative CBMs have been identified by amino acid sequence alignments but only a few representatives have been shown experimentally to have a carbohydrate-binding function PUBMED:15210353.

    \ \

    binds both beta-1,4-glucan and beta-1,3-1,4-mixed linked glucans. binds to xylan and xylooligosaccharides. CBM25 has a starch-binding function. binds to amorphous cellulose and soluble beta-1,4-glucans, with a minimal binding requirement of cellotriose and optimal affinity for cellohexaose. Family 17 CBMs appear to have a very shallow binding cleft that may be more accessible to cellulose chains in non-crystalline cellulose than the deeper binding clefts of family 4 CBMs PUBMED:11733998. CBM28 does not compete with CBM17 modules when binding to non-crystalline cellulose but does have a "beta-jelly roll" topology, which is similar in structure to the CBM17 domains. Sequence and structural conservation in families 17 and 28 suggests that they have evolved through gene duplication and subsequent divergence PUBMED:15136030.

    \ \

    The carbohydrate-binding module, family 6 PUBMED: was previously known as cellulose-binding domain family VI (CBD VI). The cellulose-binding function has been\ demonstrated in one case on amorphous cellulose and xylan. Some of these modules also bind beta-1,3-glucan.

    \ ' '124' 'IPR000644' '\

    CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes PUBMED:9020585, PUBMED:16275737.

    \ \

    Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains PUBMED:14722619. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP PUBMED:14722619. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking PUBMED:14521953, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking PUBMED:14718533. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular\ metabolites PUBMED:14722619, PUBMED:14722609.

    \ \ \

    Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern PUBMED:10200156. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations PUBMED:17851531.

    \ \

    In humans, mutations in conserved residues within CBS domains cause a variety of hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).

    \ ' '125' 'IPR004017' '\ This domain is usually found in two copies per protein. It contains up to four conserved cysteines. The group includes proteins characterised as: \ heterodisulphide reductase, subunit B (HrdB);\ \ succinate dehydrogenase, subunit C (SdhC, ); \ \ Fe-S oxidoreductase; \ glycerol-3-phosphate dehydrogenase subunit C (Anaerobic GlpC, ); and \ glycolate oxidase iron-sulphur subunit (GlcF) PUBMED:8606183.\ ' '126' 'IPR003544' '\ Within mitochondria and bacteria, a family of related proteins is involved\ in the assembly of periplasmic c-type cytochromes: these include CycK PUBMED:7665469, CcmF PUBMED:7635817,PUBMED:9043133, NrfE PUBMED:8057835 and CcbS PUBMED:8389979. These proteins may play a role in \ guidance of apocytochromes and haem groups for their covalent linkage \ by the cytochrome-c-haem lyase. Members of the family are probably integral\ membrane proteins, with up to 16 predicted transmembrane (TM) helices. \ \

    The gene products of the hel and ccl loci have been shown to be required\ specifically for the biogenesis of c-type cytochromes in the Gram-negative\ photosynthetic bacterium Rhodobacter capsulatus PUBMED:1310666. Genetic and molecular\ analyses show that the hel locus contains at least 4 genes, helA, helB, helC\ and orf52. HelA is similar to the ABC transporters and helA, helB, and\ helC are proposed to encode an export complex PUBMED:8057835. It is believed that the\ hel-encoded proteins are required for the export of haem to the periplasm,\ where it is subsequently ligated to the c-type apocytochromes PUBMED:1310666. However,\ while CcmB and CcmC have the potential to interact with CcmA, the 3 gene \ products probably associating to form a complex with (CcmA)2-CcmB-CcmC\ stoichiometry, the substrate for the putative CcmABC-transporter is probably\ neither haem nor c-type apocytochromes PUBMED:9043133. Hydropathy analysis suggests\ the presence of 6 TM domains.

    \ ' '127' 'IPR007237' '\

    This family includes the CD20 protein and the beta subunit of the high affinity receptor for IgE Fc. The high affinity receptor for IgE is a tetrameric structure consisting of a single IgE-binding alpha subunit, a single beta subunit, and two disulphide-linked gamma subunits. The alpha subunit of Fc epsilon RI and most Fc receptors are homologous members of the Ig superfamily. By contrast, the beta and gamma subunits from Fc epsilon RI are not homologous to the Ig superfamily. Both molecules have four putative transmembrane segments and a probable topology where both N- and C-termini protrude into the cytoplasm PUBMED:2531187.

    \ ' '128' 'IPR007852' '\

    Paf1 is an RNA polymerase II-associated protein in yeast, which defines a complex that is distinct from the Srb/Mediator holoenzyme.\ The Paf1 complex, which also contains Cdc73, Ctr9, Hpr1, Ccr4, Rtf1 and Leo1, is required for full expression of a subset of yeast genes, particularly those responsive to signals from the Pkc1/MAP kinase cascade. The complex appears to play an essential role in RNA elongation PUBMED:12242279.

    \ ' '129' 'IPR000462' '\ A number of phosphatidyltransferases, which are all involved in phospholipid\ biosynthesis and that share the property of catalyzing the displacement of CMP\ from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond\ and concomitant breaking of a phosphoric anhydride bond, share a conserved\ sequence region PUBMED:3031032, PUBMED:1848238.\ These enzymes are proteins of 200 to 400 amino acid residues. The\ conserved region contains three aspartic acid residues and is located in the\ N-terminal section of the sequences.\ ' '130' 'IPR001547' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 5 comprises enzymes with several known activities: endoglucanase; beta-mannanase; exo-1,3-glucanase; endo-1,6-glucanase; xylanase; endoglycoceramidase.

    \ \

    The microbial degradation of cellulose and xylans requires several types of enzymes. Fungi and bacteria produce a spectrum of cellulolytic enzymes (cellulases) and xylanases which, on the basis of sequence similarities, can be classified into families. One of these families is known as the cellulase family A PUBMED:2806912 or as the glycosyl hydrolases family 5 PUBMED:1747104. One of the conserved regions in this family contains a conserved glutamic acid residue which is potentially involved PUBMED:1677466 in the catalytic mechanism.

    \ ' '131' 'IPR006695' '\ Centromere Protein B (CENP-B) is a DNA-binding protein localized to the centromere. Within the N-terminal 125 residues, there is a DNA-binding region, which binds to a corresponding 17bp CENP-B box sequence. CENP-B dimers either bind two separate DNA molecules or alternatively, they may bind two CENP-B boxes on one DNA molecule, with the intervening stretch of DNA forming a loop structure. The CENP-B DNA-binding domain consists of two repeating domains, RP1 and RP2. This family corresponds to RP1, which has been shown to consist of four helices in a helix-turn-helix structure PUBMED:9451007.\ ' '132' 'IPR006823' '\ This family represents a group of neutral/alkaline ceramidases found in both bacteria and eukaryotes PUBMED:10753931, PUBMED:10781606, PUBMED:10593963.\ ' '133' 'IPR005559' '\

    CG-1 domains are highly conserved domains of about 130 amino-acid residues containing a predicted bipartite NLS and named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein PUBMED:8075408. CG-1 domains are associated with CAMTA proteins (for CAlModulin-binding Transcription Activator) that are transcription factors containing a calmodulin-binding domain and ankyrins PUBMED:11925432.

    \ ' '134' 'IPR001715' '\

    The calponin homology domain (also known as CH-domain) is a superfamily of actin-binding domains found in both cytoskeletal proteins and signal transduction proteins PUBMED:7589522. It comprises the following groups of actin-binding domains:\

    \

    A comprehensive review of proteins containing this type of actin-binding domain is given in PUBMED:7584474.

    \

    The CH domain is involved in actin binding in some members of the family. However, in calponins there is evidence that the CH domain is not involved in actin-binding activity PUBMED:9625744. Most proteins have two copies of the CH domain; however, some proteins such as calponin and the human vav proto-oncogene have only a single copy. The structure of an example CH-domain has recently been solved PUBMED:9164454.

    \ ' '135' 'IPR007514' '\ Members of this family are probably coiled-coil proteins that are similar to the CHD5 (Congenital heart disease 5) protein. The exact molecular function of these eukaryotic proteins is unknown.\ ' '136' 'IPR004302' '\ Entomopoxviruses are a class of insect viruses whose virions are embedded in cytoplasmic occlusion bodies. The major component of these protective complexes is a protein called spheroidin/spindolin. Intermolecular disulphide bonds have been shown to play major roles in the formation and structure of these viral occlusion bodies PUBMED:2327073 some of which are spindle body proteins.\ ' '137' 'IPR000618' '\

    Insect cuticle is composed of proteins and chitin. The cuticular proteins seem to be specific to the type of\ cuticle (flexible or stiff) that occurs at stages of insect development. The proteins found in the flexible\ cuticle of larvae and pupae of different insects share a conserved C-terminal section PUBMED:2462055; such a\ region is also found in the soft endocuticle of adult insects PUBMED:1997327 as well as in other cuticular\ proteins, including in arachnids PUBMED:9014336. In addition, cuticular proteins share hydrophobic regions\ dominated by tetrapeptide repeats (A-A-P-A/V), which are presumed to be functionally important PUBMED:1997327,\ PUBMED:9066122. Many insect cuticle proteins also include a 35-36 amino acid motif known as the R and R consensus. An extended form of this motif has been shown PUBMED:11520687 to bind chitin. It has no sequence similarity to the cysteine-containing chitin-binding domain of chitinases and some peritrophic membrane proteins, suggesting that arthropods have two distinct classes of chitin-binding proteins: those with the chitin-binding domains found in lectins, chitinases and peritrophic membranes (cysCBD), and those with the type of chitin-binding domains found in cuticular proteins (non-cysCBD) PUBMED:11520687.

    \

    The cuticle protein signature has been found in locust cuticle proteins 7 (LM-7), 8 (LM-8), 19\ (LM-19) and endocuticle structural glycoprotein ABD-4; Hyalophora cecropia (Cecropia moth) cuticle proteins 12 and 66;\ Drosophila melanogaster (Fruit fly) larval cuticle proteins I, II, III and IV (LCP1 to LCP4); Drosophila pupal cuticle proteins PCP,\ EDG-78E and EDG-84E; Manduca sexta (Tobacco hawkmoth) cuticle protein LCP-14; Tenebrio molitor (Yellow mealworm) cuticle proteins ACP-20, A1A, A2B\ and A3A; and Araneus diadematus (Spider) cuticle proteins ACP 11.9, ACP 12.4, ACP 12.6, ACP 15.5 and ACP 15.7.

    \ ' '138' 'IPR007051' '\ CHORD represents a Zn binding domain. Silencing of the Caenorhabditis elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development. The CHORD domain is sometimes found N-terminal to the CS domain in metazoan proteins, but occurs separately from the CS domain in plants. This association is thought to be indicative of a functional interaction between CS and CHORD domains PUBMED:10571178.\ ' '139' 'IPR003508' '\

    This domain consists of caspase-activated (CAD) nucleases, which induce DNA fragmentation and chromatin condensation during apoptosis, and the cell death activator proteins CIDE-A and CIDE-B, which are inhibitors of CAD nuclease. The two proteins interact through the region defined by the method signatures.

    \ ' '140' 'IPR003492' '\

    Batten\'s disease, the juvenile variant of neuronal ceroid lipofuscinosis\ (NCL), is a recessively inherited disorder affecting children of 5-10\ years of age. The disease is characterised by progressive loss of vision,\ seizures and psychomotor disturbances. Biochemically, the disease is\ characterised by lysosomal accumulation of hydrophobic material, mainly ATP\ synthase subunit C, largely in the brain but also in other tissues. The disease is fatal within a decade PUBMED:7553855.

    \

    Mutations in the CLN3 gene are believed to cause Batten\'s disease PUBMED:7553855. The\ CLN3 gene, with a predicted 438-residue product, maps to chromosome p16p12.1. The gene contains at least 15 exons spanning 15kb and is highly conserved in mammals PUBMED:2142158. A 1.02kb deletion in the CLN3 gene, occurring in either one or both alleles, is found in 85% of Batten disease chromosomes causing a frameshift generating a predicted translated product of 181 amino acid residues PUBMED:7553855, PUBMED:10191115. 22 other mutations, including deletions, insertions and point mutations, have been\ reported. It has been suggested that such mutations result in severely\ truncated CLN3 proteins, or affect its structure/conformation PUBMED:7553855, PUBMED:9311735.

    \

    CLN3 proteins, which are believed to associate in complexes, are heavily\ glycosylated lysosomal membrane proteins PUBMED:10191115, containing complex Asn-linked\ oligosaccharides PUBMED:2142158. Extensive glycosylation is important for the stability\ of these lysosomal proteins in the highly hydrolytic lysosomal lumen. Lysosomal\ sequestration of active lysosomal enzymes, transport of degraded molecules\ from the lysosomes, and fusion and fission between lysosomes and other\ organelles. The CLN3 protein is a 43kDa, highly hydrophobic, multi-transmembrane (TM),\ phosphorylated protein PUBMED:10191115. Hydrophobicity analysis predicts 6-9 TM\ segments, suggesting that CLN3 is a TM protein that may function as a\ chaperone or signal transducer. The majority of putative phosphorylation\ sites are found in the N-terminal domain, encompassing 150 residues PUBMED:10191114.\ Phosphorylation is believed to be important for membrane compartment \ interaction, in the formation of functional complexes, and in regulation \ and interactions with other proteins PUBMED:1482112.

    \

    CLN3 contains several motifs that may undergo lipid post-translational\ modifications (PTMs). PTMs contribute to targeting and anchoring of modified\ proteins to distinct biological membranes PUBMED:7716512. There are three general \ classes of lipid modification: N-terminal myristoylation, C-terminal \ prenylation, and palmitoylation of cysteine residues. Such modifications \ are believed to be a common form of PTM occurring in 0.5% of all cellular\ proteins, including brain tissue PUBMED:10191112. The C terminus of the CLN3 contains\ various lipid modification sites: C435, target for prenylation; G419, \ target for myristoylation; and C414, target for palmitoylation PUBMED:9384607.\ Prenylation results in protein hydrophobicity, influences interaction with\ upstream regulatory proteins and downstream effectors, facilitates protein-protein interaction (multisubunit assembly) and promotes anchoring to\ membrane lipids. The prenylation motif, Cys-A-A-X, is highly conserved\ within CLN3 protein sequences of different species PUBMED:10191112.\ Species with known CLN3 protein homologues include: Homo sapiens, Canis familiaris, Mus musculus, Saccharomyces cerevisiae and Drosophila melanogaster.

    \ ' '141' 'IPR004176' '\ This short domain is found in one or two copies at the amino terminus of ClpA and ClpB proteins from bacteria and eukaryotes. The function of these domains is uncertain but they may form a protein binding site PUBMED:10982797. The proteins are thought to be subunits of ATP-dependent proteases which act as chaperones to target the proteases to substrates.\ ' '142' 'IPR003333' '\

    Methyl transfer from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalyzed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol, for regulatory purposes. The various aspects of the role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes, including gene regulation and differentiation, are well documented.

    \

    This entry represents cyclopropane-fatty-acyl-phospholipid synthase, which is closely related to methyltransferases.

    \

    Cyclopropane-fatty-acyl-phospholipid synthase or CFA synthase catalyses the reaction:

    \ \

    The major mycolic acid produced by Mycobacterium tuberculosis contains two cis-cyclopropanes in the meromycolate chain. Cyclopropanation may contribute to the structural integrity of the cell wall complex PUBMED:7592990.

    \ ' '144' 'IPR001180' '\

    Based on sequence similarities a domain of homology has been identified in the following proteins PUBMED:10391936:

    \ \

    This domain, called the citron homology domain, is often found after cysteine rich and pleckstrin homology (PH) domains at the C-terminal end of the proteins PUBMED:10391936. It acts as a regulatory domain and could be involved in macromolecular interactions PUBMED:10391936, PUBMED:9135144.

    \ ' '145' 'IPR000595' '\ Proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues PUBMED:14638413,\ PUBMED:10550204, PUBMED:1710853. The best studied of these proteins is the prokaryotic catabolite gene activator (also\ known as the cAMP receptor protein) (gene crp) where such a domain is known to be composed of three alpha-helices and\ a distinctive eight-stranded, antiparallel beta-barrel structure. There are six invariant amino acids in this domain,\ three of which are glycine residues that are thought to be essential for maintenance of the structural integrity of\ the beta-barrel. cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic\ nucleotide-binding domain. The cAPK\'s are composed of two different subunits, a catalytic chain and a regulatory chain,\ which contains both copies of the domain. The cGPK\'s are single chain enzymes that include the two copies of the domain\ in their N-terminal section. Vertebrate cyclic nucleotide-gated ion-channels also contain this domain. Two such\ cation channels have been fully characterised; one is found in rod cells where it plays a role in visual signal\ transduction.\ ' '146' 'IPR003781' '\ This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases,\ malate and ATP-citrate ligases.\ ' '147' 'IPR004165' '\ Coenzyme A (CoA) transferases belong to an evolutionarily conserved PUBMED:1624453, PUBMED:9325289 family of enzymes catalyzing the reversible transfer of CoA from one carboxylic acid to another. They have been identified in many prokaryotes and in mammalian tissues. The bacterial enzymes are heterodimers of two subunits (A and B) of about 25 kDa each, while eukaryotic SCOT consists of a single chain which is colinear with the two bacterial subunits.\ ' '148' 'IPR003672' '\ This family contains a domain common to the cobN protein and to magnesium protoporphyrin chelatase. 
CobN may play a role in cobalt insertion reactions and is implicated in the conversion of precorrin-2 to cobyrinic acid in cobalamin biosynthesis PUBMED:1655697. Magnesium protoporphyrin chelatase is involved in\ chlorophyll biosynthesis as the third subunit of light-independent protochlorophyllide reductase in bacteria and plants PUBMED:8385667.\ ' '149' 'IPR003495' '\

    Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adocobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase PUBMED:12869542. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase PUBMED:17163662.

    \ \

    There are at least two distinct cobalamin biosynthetic pathways in bacteria PUBMED:11153269:

    \ \

    Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways) PUBMED:11215515. There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.

    \ \

    CobW proteins are generally found proximal to the trimeric cobaltochelatase subunit CobN, which is essential for vitamin B12 (cobalamin) biosynthesis PUBMED:12869542. They contain a P-loop nucleotide-binding loop in the N-terminal domain and a histidine-rich region in the C-terminal portion suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. CobW might be involved in cobalt reduction leading to cobalt(I) corrinoids.

    \

    This entry represents CobW-like proteins, including P47K (), a Pseudomonas chlororaphis protein needed for nitrile hydratase expression PUBMED:7765511, and urease accessory protein UreG, which acts as a chaperone in the activation of urease upon insertion of nickel into the active site PUBMED:17309280.

    \ \ ' '150' 'IPR002018' '\ Higher eukaryotes have many distinct esterases. Among the different types are\ those which act on carboxylic esters. Carboxyl-esterases have\ been classified into three categories (A, B and C) on the basis of\ differential patterns of inhibition by organophosphates. The sequence of a\ number of type-B carboxylesterases indicates PUBMED:3163407, PUBMED:1862088, PUBMED:8453375 that the majority are evolutionarily related. As is the case for lipases and serine proteases, the catalytic apparatus of\ esterases involves three residues (catalytic triad): a serine, a glutamate or\ aspartate and a histidine.\ ' '151' 'IPR002486' '\

    The function of this domain is unknown. It is found in the N-terminal\ region of nematode cuticle collagens. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins PUBMED:2753356, PUBMED:7828882.

    \ ' '152' 'IPR008160' '\ Members of this family belong to the collagen superfamily PUBMED:8240831.\ Collagens are generally extracellular structural proteins\ involved in formation of connective tissue structure.\ The sequence is predominantly repeats of the G-X-Y triplet, and the polypeptide chains\ form a triple helix. The first position of the repeat is\ glycine; the second and third positions can be any residue\ but are frequently proline and hydroxyproline. Collagens\ are post-translationally modified by proline hydroxylase\ to form the hydroxyproline residues. Defective\ hydroxylation is the cause of scurvy.\

    Some members of the collagen superfamily are not involved\ in connective tissue structure but share the same triple\ helical structure.

    \ ' '153' 'IPR004477' '\

    This family is defined to identify a pair of paralogous 3\' exoribonucleases in Escherichia coli, plus the set of proteins apparently orthologous to one or the other in other eubacteria. VacB was characterised originally as required for the expression of virulence genes, but is now recognised as the exoribonuclease RNase R (Rnr). Its paralog in Escherichia coli and Haemophilus influenzae is designated exoribonuclease II (Rnb). Both are involved in the degradation of mRNA, and consequently have strong pleiotropic effects that may be difficult to disentangle. Both these proteins share domain-level similarity (RNB, S1) with a considerable number of other proteins, and full-length similarity scoring below the trusted cut off to proteins associated with various phenotypes but uncertain biochemistry; it may be that these latter proteins are also 3\' exoribonucleases.

    \ ' '154' 'IPR007763' '\

    This entry represents the 17.2kDa subunit from NADH:ubiquinone oxidoreductase and its homologues PUBMED:14741580. This subunit is believed to be one of the 36 structural complex I proteins.

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '155' 'IPR001268' '\

    The 30 kDa subunit from NADH:ubiquinone oxidoreductase is found in both eukaryotes and prokaryotes. In mammals and in Neurospora crassa, it is nuclear-encoded as a precursor form with a transit peptide, while in Paramecium (protein P1) and in Dictyostelium discoideum (Slime mold) it is mitochondrial-encoded, and in various higher plants it is chloroplast-encoded. It is also present in bacteria.

    \

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '156' 'IPR004214' '\

    Cone snail toxins, conotoxins, are small neurotoxic peptides with disulphide connectivity that target ion-channels or G-protein coupled receptors. Based on the number and pattern of disulphide bonds and biological activities, conotoxins can be classified into several families PUBMED:11478951. Omega, delta and kappa families of conotoxins have a knottin or inhibitor cysteine knot scaffold. The knottin scaffold is a very special disulphide-through-disulphide knot, in which the III-VI disulphide bond crosses the macrocycle formed by two other disulphide bonds (I-IV and II-V) and the interconnecting backbone segments, where I-VI indicates the six cysteine residues starting from the N-terminus.

    \

    The disulphide bonding network, as well as specific amino acids in inter-cysteine loops, provide the specificity of conotoxins PUBMED:10988292. The cysteine arrangements are the same for omega, delta and kappa families, even though omega conotoxins are calcium channel blockers, whereas delta conotoxins delay the inactivation of sodium channels, and kappa conotoxins are potassium channel blockers PUBMED:11478951. Mu conotoxins have two types of cysteine arrangements, but the knottin scaffold is not observed. Mu conotoxins target the voltage-gated sodium channels PUBMED:11478951, and are useful probes for investigating voltage-dependent sodium channels of excitable tissues PUBMED:2410412. Alpha conotoxins have two types of cysteine arrangements PUBMED:1390774, and are competitive nicotinic acetylcholine receptor antagonists.

    \ ' '157' 'IPR007715' '\ Coq4p was shown to peripherally associate with the matrix face of the mitochondrial inner membrane. The putative mitochondrial- targeting sequence present at the N terminus of the polypeptide efficiently imports it to mitochondria. The function of Coq4p is unknown, although its presence is required to maintain a steady-state level of Coq7p, another component of the Q biosynthetic pathway PUBMED:11469793.\ ' '158' 'IPR005170' '\

    This small domain is found at the C-terminus of a family of proteins that also contain CBS domains. It is also found at the C-terminus of some Na+/H+ antiporters and in CorC, which is involved in magnesium and cobalt efflux. The function of this domain is uncertain, but it might be involved in modulating the transport of ion substrates.

    \ ' '159' 'IPR003302' '\ SPRR genes (formerly SPR) encode a novel class of polypeptides (small proline rich proteins) that are strongly induced during differentiation of human epidermal keratinocytes in vitro and in vivo. The most characteristic\ feature of the SPRR gene family resides in the structure of the central segments of the encoded polypeptides that are built up from tandemly repeated units of either eight (SPRR1 and SPRR3) or nine (SPRR2) amino\ acids with the general consensus XKXPEPXX where X is any amino acid PUBMED:8325635.\ ' '160' 'IPR003780' '\ Cytochrome aa3 is one of two terminal oxidase complexes in the Bacillus subtilis\ electron transport chain. CtaA is required for cytochrome aa3 biosynthesis and sporulation in B. subtilis PUBMED:2549006. In yeast the COX15 protein is required for cytochrome c oxidase assembly.\ ' '161' 'IPR004203' '\ Cytochrome c oxidase, a 13 sub-unit complex () is the terminal oxidase in the mitochondrial electron transport chain. This\ family is composed of cytochrome c oxidase subunit IV. The Dictyostelium discoideum (Slime mould) member of this family is called COX VI. The Saccharomyces cerevisiae protein YGX6_YEAST appears to be the yeast COX IV subunit.\ ' '162' 'IPR005171' '\

    Cytochrome c oxidase (COX) is a multi-subunit enzyme complex that catalyzes the final step of electron transfer through the respiratory chain on the mitochondrial inner membrane. Bacterial cytochrome c oxidases generally consist of four different subunits, I to IV. This family is composed of cytochrome c oxidase subunit IV from prokaryotes, which is present in a cleft formed by subunits I and III. Subunit IV assists the binding of a copper ion, CuB, to subunit I during biosynthesis or assembly of the oxidase complex PUBMED:8663126.

    \ ' '163' 'IPR004204' '\ Cytochrome c oxidase, a 13 subunit complex, is the terminal oxidase in the mitochondrial electron transport chain. This\ family is composed of cytochrome c oxidase subunit VIc.\ ' '164' 'IPR004202' '\ Cytochrome c oxidase, a 13 sub-unit complex, is the terminal oxidase in the mitochondrial electron transport chain. This\ family is composed of cytochrome c oxidase subunit VIIc. The yeast member of this family is called COX VIII\ ' '165' 'IPR007604' '\ This entry represents a conserved region in the CP2 transcription factor family.\ ' '166' 'IPR002423' '\

    Partially folded polypeptide chains, either newly made by ribosomes or emerging from mature proteins unfolded by stress, run the risk of aggregating with one another to the detriment of the organism. Folding of newly synthesised polypeptides in the crowded cellular environment requires the assistance of molecular chaperone proteins, such as the large bacterial chaperonins GroEL and GroES.

    GroEL and GroES prevent aggregation by encapsulating individual chains within the so-called \'Anfinsen cage\' provided by the GroEL-GroES complex, where they can fold in isolation from one another PUBMED:12654267. GroEL consists of two heptameric rings of identical ATPase subunits stacked back to back, containing a cage in each ring. Each subunit consists of three domains. The equatorial domain contains the nucleotide binding site and is connected by a flexible intermediate domain with the apical domain. The latter presents several hydrophobic amino-acid side chains at the top of the ring, orientated towards the cavity of the cage. These side chains are involved in binding either a partially folded polypeptide chain or a single molecule of GroES.

    \

    The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers PUBMED:2897629. These \'helper\' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins PUBMED:1349837, which include 10 kDa and 60 kDa proteins. These are found in abundance in prokaryotes, chloroplasts and mitochondria. They are required for normal cell growth (as demonstrated by the fact that no temperature sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade PUBMED:2897629), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions PUBMED:1349837.

    \ \

    The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between 6 and 8 identical subunits, whereas the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits PUBMED:2897629. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The cpn10 and cpn60 oligomers also require Mg2+-ATP in order to interact to form a functional complex, although the mechanism of this interaction is as yet unknown PUBMED:1350777. This chaperonin complex is essential for the correct folding and assembly of polypeptides into oligomeric structures, of which the chaperonins themselves are not a part PUBMED:1349837. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.

    \

    The 60 kDa form of chaperonin is the immunodominant antigen of patients with Legionnaire\'s disease PUBMED:1672279, and is thought to play a role in the protection of the Legionella bacteria from oxygen radicals within macrophages. This hypothesis is based on the finding that the cpn60 gene is upregulated in response to hydrogen peroxide, a source of oxygen radicals. Cpn60 has also been found to display strong antigenicity in many bacterial species PUBMED:1347461, and has the potential for inducing immune protection against unrelated bacterial infections. The RuBisCO subunit binding protein (which has been implicated in the assembly of RuBisCO) and cpn60 have been found to be evolutionary homologues, the RuBisCO subunit binding protein having the C-terminal Gly-Gly-Met repeat found in all bacterial cpn60 sequences. Although the precise function of this repeat is unknown, it is thought to be important as it is also found in 70 kDa heat-shock proteins PUBMED:1672279. The crystal structure of Escherichia coli GroEL has been resolved to 2.8A PUBMED:7935790. The TCP-1 family of proteins act as molecular chaperones for tubulin, actin and probably some other proteins. They are weakly, but significantly, related to the cpn60/groEL chaperonin family.

    \ ' '167' 'IPR005480' '\

    Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine () or ammonia (), and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates PUBMED:10387030, PUBMED:11212301. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate PUBMED:8916922. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase () (ACC), propionyl-CoA carboxylase () (PCCase), pyruvate carboxylase () (PC) and urea carboxylase ().

    \

    Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis, however certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain PUBMED:10089390. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites PUBMED:12379099. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.

    \

    Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein PUBMED:7907330. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP PUBMED:17397987. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia PUBMED:17451989. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains PUBMED:.

    \ \ \

    This entry represents the oligomerisation domain found in the large subunit of carbamoyl phosphate synthases as well as in certain other carboxy phosphate domain-containing enzymes.

    \ ' '168' 'IPR017447' '\ The function of the CS domain is unknown. The CS domain is sometimes found C-terminal to the CHORD domain () in metazoan proteins, but occurs separately from the CHORD domain in plants. This association is thought to be indicative of a functional interaction between CS and CHORD domains PUBMED:10571178.\ ' '169' 'IPR002474' '\

    Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine () or ammonia (), and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates PUBMED:10387030, PUBMED:11212301. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate PUBMED:8916922. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase () (ACC), propionyl-CoA carboxylase () (PCCase), pyruvate carboxylase () (PC) and urea carboxylase ().

    \

    Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis, however certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain PUBMED:10089390. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites PUBMED:12379099. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.

    \

    Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein PUBMED:7907330. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP PUBMED:17397987. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia PUBMED:17451989. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains PUBMED:.

    \ \ \

    This entry represents the N-terminal domain of the small subunit of carbamoyl phosphate synthase. The small subunit catalyses the hydrolysis of glutamine to ammonia, which is in turn used by the large chain to synthesize carbamoyl phosphate. The small subunit has a 3-layer beta/beta/alpha structure, and is thought to be mobile in most proteins that carry it. The C-terminal domain of the small subunit of CPSase has glutamine amidotransferase activity.

    \ ' '170' 'IPR004871' '\ This family includes a region that lies towards the C-terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa)\ subunit. CPSF is involved in mRNA polyadenylation and binds the AAUAAA conserved sequence in pre-mRNA. CPSF has also been\ found to be necessary for splicing of single-intron pre-mRNAs PUBMED:11421366. The function of the aligned region is unknown but may be involved\ in RNA/DNA binding.\ ' '171' 'IPR002059' '\ When Escherichia coli is exposed to a temperature drop from 37 to 10 degrees\ centigrade, a 4-5 hour lag phase occurs, after which growth is resumed at\ a reduced rate PUBMED:1912512. During the lag phase, the expression of around 13\ proteins, which contain specific DNA-binding regions PUBMED:2247479, is increased\ 2-10 fold. These so-called \'cold shock\' proteins are thought to help the\ cell to survive in temperatures lower than optimum growth temperature, by\ contrast with heat shock proteins, which help the cell to survive in\ temperatures greater than the optimum, possibly by condensation of the\ chromosome and organization of the prokaryotic nucleoid PUBMED:1912512.\ A conserved domain of about 70 amino acids has been found in prokaryotic and\ eukaryotic DNA-binding proteins PUBMED:1622933, PUBMED:2184368, PUBMED:8022259. This domain is known as the\ \'cold-shock domain\' (CSD), part of which is highly similar PUBMED:1614871 to the RNP-1 RNA-binding motif.\ ' '172' 'IPR007533' '\ Cytochrome c oxidase assembly protein is essential for the assembly of functional cytochrome oxidase protein. In eukaryotes it is an integral protein of the mitochondrial inner membrane. Cox11 is essential for the insertion of Cu(I) ions to form the CuB site. This is essential for the stability of other structures in subunit I, for example haems a and a3, and the magnesium/manganese centre. Cox11 is probably only required in sub-stoichiometric amounts relative to the structural units PUBMED:10617659. 
The C-terminal region of the protein is known to form a dimer. Each monomer coordinates one Cu(I) ion via three conserved cysteine residues (111, 208 and 210) in Saccharomyces cerevisiae (). Met 224 is also thought to play a role in copper transfer or stabilising the copper site PUBMED:12063264.\ ' '173' 'IPR000374' '\ Phosphatidate cytidylyltransferase () PUBMED:2995359, PUBMED:8557688, PUBMED:9083091 (also known as CDP-\ diacylglycerol synthase) (CDS) is the enzyme that catalyzes the synthesis of\ CDP-diacylglycerol from CTP and phosphatidate (PA):\ \ CDP-diacylglycerol is an\ important branch point intermediate in both prokaryotic and eukaryotic\ organisms. CDS is a membrane-bound enzyme.\ ' '174' 'IPR007274' '\

    The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionines probably coordinate copper during the process of metal transport.

    \ ' '175' 'IPR001117' '\

    Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper (II) ion is reduced to copper (I). In blue copper proteins such as cupredoxin, the copper (I) ion form is stabilised by a constrained His2Cys coordination environment.

    \

    Multicopper oxidases oxidise their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre; dioxygen binds to the trinuclear centre and, following the transfer of four electrons, is reduced to two molecules of water PUBMED:16234932. There are three spectroscopically different copper centres found in multicopper oxidases: type 1 (or blue), type 2 (or normal) and type 3 (or coupled binuclear) PUBMED:2404764, PUBMED:1995346. Multicopper oxidases consist of 2, 3 or 6 of these homologous domains, which also share homology to the cupredoxins azurin and plastocyanin. Structurally, these domains consist of a cupredoxin-like fold, a beta-sandwich consisting of 7 strands in 2 beta-sheets, arranged in a Greek-key beta-barrel PUBMED:11867755. Multicopper oxidases include:

    \

    \

    In addition to the above enzymes there are a number of other proteins that are similar to the multi-copper oxidases in terms of structure and sequence, some of which have lost the ability to bind copper. These include: copper resistance protein A (copA) from a plasmid in Pseudomonas syringae; domain A of (non-copper binding) blood coagulation factors V (Fa V) and VIII (Fa VIII) PUBMED:3052293; yeast FET3 required for ferrous iron uptake PUBMED:8293473; yeast hypothetical protein YFL041w; and the fission yeast homologue SpAC1F7.08.

    \ \

    This entry represents multicopper oxidase type 1 (blue) domains. These domains are also present in proteins that have lost the ability to bind copper.

    \ ' '176' 'IPR014783' '\ Copper type II, ascorbate-dependent monooxygenases PUBMED:2792366 are a class of enzymes\ that requires copper as a cofactor and which uses ascorbate as an electron\ donor. This family contains two related enzymes, Dopamine-beta-monooxygenase ()\ and Peptidyl-glycine alpha-amidating monooxygenase ().\ There are a few regions of sequence similarities between these two enzymes,\ two of these regions contain clusters of conserved histidine residues which\ are most probably involved in binding copper.\ ' '177' 'IPR000323' '\ Copper type II, ascorbate-dependent monooxygenases PUBMED:2792366 are a class of enzymes\ that requires copper as a cofactor and which uses ascorbate as an electron\ donor. This family contains two related enzymes, Dopamine-beta-monooxygenase ()\ and Peptidyl-glycine alpha-amidating monooxygenase ().\ There are a few regions of sequence similarities between these two enzymes,\ two of these regions contain clusters of conserved histidine residues which\ are most probably involved in binding copper.\ ' '178' 'IPR003892' '\ This domain may be involved in binding ubiquitin-conjugating enzymes (UBCs). CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2.\ ' '179' 'IPR006045' '\

    This family represents the conserved barrel domain of the \'cupin\' superfamily (\'cupa\' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant.

    \ ' '180' 'IPR006973' '\ This family represents Cwf15/Cwc15 (from Schizosaccharomyces pombe and Saccharomyces cerevisiae respectively) and their homologues. The function of these proteins is unknown, but they form part of the spliceosome and are thus thought to be involved in mRNA splicing PUBMED:11884590.\ ' '181' 'IPR006768' '\

    This group of sequences contain a conserved C-terminal domain which is found in the Schizosaccharomyces pombe (Fission yeast) protein Cwf19 () and its homologues. Cwf19 is part of the Cdc5p complex involved in mRNA splicing PUBMED:11884590. This domain is found in association with , which is generally C-terminal and adjacent to this domain.

    \ ' '182' 'IPR006767' '\

    This group of sequences contain a conserved C-terminal domain which is found in the Schizosaccharomyces pombe (Fission yeast) protein Cwf19 () and its homologues. Cwf19 is part of the Cdc5p complex involved in mRNA splicing PUBMED:11884590. This domain is found in association with , which is generally N-terminal and adjacent to this domain.

    \ ' '183' 'IPR002619' '\ This domain has no known function. It is found in several Caenorhabditis elegans proteins. The domain contains 6 conserved cysteines that probably form three disulphide bridges.\ ' '184' 'IPR006765' '\

    Aromatic polyketides are assembled by a type II (iterative) polyketide synthase in bacteria. Iterative type II polyketide synthases produce polyketide chains of variable but defined length from a\ specific starter unit and a number of extender units. They also specify the initial regiospecific folding and cyclization pattern of nascent\ polyketides either through the action of a cyclase (CYC) subunit or through the combined action of site-specific ketoreductase \ and CYC subunits. Additional CYCs and other modifications may be necessary to produce linear aromatic polyketides.

    \ \

    This family represents a number of cyclases involved in polyketide synthesis in a number of actinobacterial species.

    \ \

    TcmI () catalyses an aromatic rearrangement in the biosynthetic pathway of tetracenomycin C from Streptomyces coelicolor. The protein is a homodimer where each subunit forms a beta-alpha-beta fold belonging to the ferredoxin fold superfamily PUBMED:15231835. Four strands of antiparallel sheets and a layer of alpha helices create a cavity which was proposed to be the active site. This structure shows strong topological similarity to a polyketide monooxygenase () from S. coelicolor which functions in the actinorhodin biosynthetic pathway PUBMED:12514126. It was suggested, therefore, that this fold is well suited to serve as a framework for rearrangements and chemical modification of polyaromatic substrates.

    \ ' '185' 'IPR006671' '\

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles PUBMED:12910258, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    \

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi\'s sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus PUBMED:11056549.

    \ \ Cyclins contain two domains of similar all-alpha fold, of which this entry is associated with the N-terminal domain.\ ' '186' 'IPR003158' '\

    The photosynthetic apparatus in non-oxygenic bacteria consists of light-harvesting (LH) protein-pigment complexes LH1 and LH2, which use carotenoid and bacteriochlorophyll as primary donors PUBMED:11005826. LH1 acts as the energy collection hub, temporarily storing the energy before its transfer to the photosynthetic reaction centre (RC) PUBMED:15329728. Electrons are transferred from the primary donor via an intermediate acceptor (bacteriopheophytin) to the primary acceptor (quinone Qa), and finally to the secondary acceptor (quinone Qb), resulting in the formation of ubiquinol QbH2. RC uses the excitation energy to shuffle electrons across the membrane, transferring them via ubiquinol to the cytochrome bc1 complex in order to establish a proton gradient across the membrane, which is used by ATP synthetase to form ATP PUBMED:16931113, PUBMED:12872158, PUBMED:2676514.

    \ \

    The core complex is anchored in the cell membrane, consisting of one unit of RC surrounded by LH1; in some species there may be additional subunits PUBMED:11095707. RC consists of three subunits: L (light), M (medium), and H (heavy). Subunits L and M provide the scaffolding for the chromophore, while subunit H contains a cytoplasmic domain PUBMED:8027023. In Rhodopseudomonas viridis, there is also a non-membranous tetrahaem cytochrome (4Hcyt) subunit on the periplasmic surface.

    \ \

    In the purple bacterium Rhodocyclus gelatinosus (Rhodopseudomonas gelatinosa), a high potential Fe-S protein (HiPIP) acts as an electron donor to reaction centre-bound cyt bc1 under anaerobic conditions in the light, while cyt c acts as a soluble electron carrier under aerobic conditions in the dark in order to re-reduce the oxidized electron donor PUBMED:15155756.

    \ ' '187' 'IPR007732' '\

    Flavocytochrome b558 is the catalytic core of the respiratory-burst oxidase, an enzyme complex that catalyzes the\ NADPH-dependent reduction of O2 into the superoxide anion O2- in phagocytic cells. Flavocytochrome b558 is anchored in the plasma membrane. It is a heterodimer that consists of a large glycoprotein gp91phox (phox for phagocyte oxidase) (beta subunit) and a\ small protein p22phox (alpha subunit). The other components of the respiratory-burst oxidase are water-soluble proteins of cytosolic\ origin, namely p67phox, p47phox, p40phox and Rac. Upon cell stimulation, they assemble with the membrane-bound\ flavocytochrome b558 which becomes activated and generates O2- PUBMED:8798532.\

    \ ' '188' 'IPR011127' '\ This entry represents the N-terminal region of the D-alanine--D-alanine ligase enzyme () which is thought to be involved in substrate binding PUBMED:10908650. D-Alanine is one of the central molecules of the cross-linking step of peptidoglycan assembly. There are three enzymes involved in the D-alanine branch of peptidoglycan biosynthesis: the pyridoxal phosphate-dependent D-alanine racemase (Alr), the ATP-dependent D-alanine:D-alanine ligase (Ddl), and the ATP-dependent D-alanine:D-alanine-adding enzyme (MurF) PUBMED:9054558.\ ' '189' 'IPR000846' '\

    Dihydrodipicolinate reductase catalyzes the second step in the biosynthesis of diaminopimelic acid and lysine, the NAD or NADP-dependent reduction of 2,3-dihydrodipicolinate into 2,3,4,5-tetrahydrodipicolinate.

    \ \

    In Escherichia coli and Mycobacterium tuberculosis, dihydrodipicolinate reductase has equal specificity for NADH and NADPH; however, in Thermotoga maritima it has a greater affinity for NADPH PUBMED:18250105. In addition, the enzyme is inhibited by high concentrations of its substrate, which consequently acts as a feedback control on the lysine biosynthesis pathway. In T. maritima, the enzyme also lacks N-terminal and C-terminal loops which are present in the enzymes of the former two organisms.

    \ ' '190' 'IPR002602' '\ This domain has no known function. It is found in several\ Caenorhabditis elegans proteins. The domain contains 12 conserved\ cysteines that probably form six disulphide bridges.\ This domain is found associated with Ig and\ Fibronectin, type III domains.\ ' '191' 'IPR007708' '\

    This presumed domain is found at the C terminus of lariat debranching enzyme. This domain is always found in association with a metallo-phosphoesterase domain . RNA lariat debranching enzyme is capable of digesting a variety of branched nucleic acid substrates and multicopy single-stranded DNAs. The enzyme degrades intron lariat structures during splicing.

    \ ' '192' 'IPR004146' '\ This short domain is rich in cysteines and histidines. The pattern of conservation is similar to that found in DAG_PE-bind (), therefore we have termed this domain DC1 for divergent C1 domain. This domain probably also binds to two zinc ions. The function of proteins with this domain is uncertain; however, this domain may bind to molecules such as diacylglycerol. This family is found in plant proteins.\ ' '193' 'IPR002125' '\

    Cytidine deaminase () (cytidine aminohydrolase) catalyzes the hydrolysis of cytidine into uridine and ammonia while deoxycytidylate deaminase () (dCMP deaminase) hydrolyzes dCMP into dUMP. Both enzymes are known to bind zinc and to require it for their catalytic activity PUBMED:1567863, PUBMED:8428902. These two enzymes do not share any sequence similarity with the exception of a region that contains three conserved histidine and cysteine residues which are thought to be involved in the binding of the catalytic zinc ion.

    \

    Such a region is also found in other proteins PUBMED:8061614, PUBMED:8203015:

    \ \ ' '194' 'IPR007722' '\ This presumed domain is always found to the N-terminal side of the NUDIX hydrolase domain . This domain appears to be specific to mRNA decapping protein 2 and its close homologues. This region has been termed Box A PUBMED:12218187.\ ' '195' 'IPR003533' '\

    X-linked lissencephaly is a severe brain malformation affecting males.\ Recently it has been demonstrated that the doublecortin gene is implicated in\ this disorder PUBMED:9489699. Doublecortin was found to bind to the microtubule cytoskeleton. In vivo and in vitro assays show that Doublecortin stabilises microtubules and causes bundling PUBMED:10441322. Doublecortin is a basic protein with an iso-electric point of 10, typical of microtubule-binding proteins. However, its sequence contains no known microtubule-binding domain(s).

    \ \

    The detailed sequence analysis of Doublecortin and Doublecortin-like proteins allowed the identification of an evolutionarily conserved Doublecortin (DC) domain. This domain is found at the N-terminus of proteins and consists of one or two tandemly repeated copies of a region of around 80 amino acids. It has been suggested that the first DC domain of Doublecortin binds tubulin and enhances microtubule polymerisation PUBMED:10749977.

    \ ' '196' 'IPR005112' '\

    This region is always found associated with . It is predicted to form a globular domain that is completely alpha helical PUBMED:11563850. Although not statistically supported it has been suggested that this domain may be similar to members of the Rho/Rac/Cdc42 GEF family PUBMED:11563850.

    \ ' '197' 'IPR004022' '\ This domain is predicted to be a DNA binding domain. The DDT domain is named after DNA binding homeobox and Different Transcription factors. It is found in foetal Alzheimer antigen and several hypothetical and uncharacterised proteins.\ ' '198' 'IPR011545' '\

    Members of this family include the DEAD and DEAH box helicases. Helicases are involved in unwinding nucleic acids. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.

    \ ' '199' 'IPR000488' '\

    The death domain (DD) is a homotypic protein interaction module composed of a bundle of six alpha-helices. DD is related in sequence and structure to the death effector domain (DED, see ) and the caspase recruitment domain (CARD, see ), which work in similar pathways and show similar interaction properties PUBMED:11504623. DDs bind each other, forming oligomers. Mammals have numerous and diverse DD-containing proteins PUBMED:7482697. Within these proteins, the DD domains can be found in combination with other domains, including: CARDs, DEDs, ankyrin repeats (), caspase-like folds, kinase domains, leucine zippers (), leucine-rich repeats (LRR) (), TIR domains (), and ZU5 domains () PUBMED:15226512.

    \

    Some DD-containing proteins are involved in the regulation of apoptosis and inflammation through their activation of caspases and NF-kappaB, which typically involves interactions with TNF (tumour necrosis factor) cytokine receptors PUBMED:14585074, PUBMED:14601641. In humans, eight of the over 30 known TNF receptors contain DD in their cytoplasmic tails; several of these TNF receptors use caspase activation as a signalling mechanism. The DD mediates self-association of these receptors, thus giving the signal to downstream events that lead to apoptosis. Other DD-containing proteins, such as ankyrin, MyD88 and pelle, are probably not directly involved in cell death signalling. DD-containing proteins also have links to innate immunity, communicating with Toll family receptors through bipartite adapter proteins such as MyD88 PUBMED:12691620.

    \ \ ' '200' 'IPR006720' '\

    The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing PUBMED:1699826. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa).

    Alternative splicing generates different carboxy terminal ends in different protein isoforms. This domain is the most C-terminal region that is present in the main isoforms.

    \ ' '201' 'IPR001194' '\

    The human serine- and leucine-rich DENN protein possesses an RGD cellular adhesion motif and a leucine-zipper-like motif associated with protein dimerization, and shows partial homology to the receptor binding domain of tumor necrosis factor alpha. DENN is virtually identical to MADD, a human MAP kinase-activating death domain protein that interacts with type I tumor necrosis factor receptor. DENN displays significant homology to Rab3 GEP, a rat GDP/GTP exchange protein specific for Rab3 small G proteins implicated in intracellular vesicle trafficking. DENN also exhibits strong similarity to Caenorhabditis elegans AEX-3, which interacts with Rab3 to regulate synaptic vesicle release PUBMED:9796103. The DENN domain is always flanked on both sides by more divergent domains, known as uDENN () and dDENN (), which could play a key role in DENN function.

    \ ' '202' 'IPR000591' '\

    This is a domain of unknown function present in signalling proteins including dishevelled, Egl-10, and pleckstrin proteins. Segment polarity dishevelled protein is required to establish coherent arrays of polarized cells and segments in embryos, and plays a role in wingless signalling. Egl-10 regulates G-protein signalling in the central nervous system. Mammalian regulators of G-protein signalling also contain these domains, and regulate signal transduction by increasing the GTPase activity of G-protein alpha subunits, thereby driving them into their inactive GDP-bound form.

    \ ' '203' 'IPR003156' '\

    This domain is often found adjacent to the DHH domain, found in the RecJ-like phosphoesterase family , and is called DHHA1 for DHH associated domain. DHHA1 is diagnostic of DHH subfamily 1 members PUBMED:9478130. This domain is also found in alanyl tRNA synthetase e.g. , suggesting that it may have an RNA binding function. The domain is about 60 residues long and contains a conserved GG motif.

    \ ' '204' 'IPR007006' '\

    Members of this entry are glycosyltransferases, belonging to the ALG10 family. The majority of the members are annotated as alpha-1,2 glucosyltransferases. The ALG10 protein from Saccharomyces cerevisiae (Baker\'s yeast) encodes the alpha-1,2 glucosyltransferase of the endoplasmic reticulum. This protein has been characterised in rat as potassium channel regulator 1 PUBMED:9722534.

    \ ' '205' 'IPR018444' '\ Dilute encodes a novel type of myosin heavy chain, with a tail, or C-terminal, region that has elements of both type II (alpha-helical coiled-coil) and type I (non-coiled-coil) myosin heavy chains. \ The DIL non alpha-helical domain is found in dilute myosin heavy chain proteins and other myosins. In mouse the dilute protein may play a role in the elaboration, maintenance, or function of cellular processes of melanocytes and neurons PUBMED:1996138.\ The MYO2 protein of Saccharomyces cerevisiae is implicated in vectorial vesicle transport and is homologous to the dilute protein over practically its entire length PUBMED:2016335.\ ' '206' 'IPR004265' '\ This family contains a number of proteins which are induced during disease response in plants.\ ' '207' 'IPR003351' '\

    Wnt proteins constitute a large family of secreted signalling molecules that\ are involved in intercellular signalling during development. The name \ derives from the first 2 members of the family to be discovered: int-1 \ (mouse) and wingless (Wg) (Drosophila) PUBMED:9891778. It is now recognised that Wnt signalling controls many cell fate decisions in a variety of different organisms, including mammals. Wnt signalling has been implicated in \ tumourigenesis, early mesodermal patterning of the embryo, morphogenesis of \ the brain and kidneys, regulation of mammary gland proliferation and \ Alzheimer\'s disease PUBMED:10967351.

    \ \

    Wnt signal transduction proceeds initially via binding to their cell\ surface receptors - the so-called frizzled proteins. This activates the\ signalling functions of B-catenin and regulates the expression of specific\ genes important in development PUBMED:10733430. More recently, however, several non-canonical Wnt signalling pathways have been elucidated that act independently of B-catenin. In both cases, the transduction mechanism\ requires dishevelled protein (Dsh), a cytoplasmic phosphoprotein that acts\ directly downstream of frizzled PUBMED:12072470. In addition to its role in Wnt signalling, Dsh is also involved in generating planar polarity in Drosophila and has been implicated in the Notch signal transduction cascade. Three human and mouse homologues of Dsh have been cloned (DVL-1 to 3); it is \ believed that these proteins, like their Drosophila counterpart, are \ involved in signal transduction. Human and murine orthologues share more \ than 95% sequence identity and are each 40-50% identical to Drosophila Dsh.

    \ \

    Sequence similarity amongst Dsh proteins is concentrated around three \ conserved domains: at the N-terminus lies a DIX domain (mutations \ mapping to this region reduce or completely disrupt Wg signalling); a PDZ \ (or DHR) domain, often found in proteins involved in protein-protein \ interactions, lies within the central portion of the protein (point \ mutations within this module have been shown to have little effect on \ Wg-mediated signal transduction); and a DEP domain is located towards the C-terminus and is conserved among a set of proteins that regulate various \ GTPases (whilst genetic and molecular assays have shown this module to be \ dispensable for Wg signalling, it is thought to be important in planar \ polarity signalling in flies PUBMED:12072470).

    \ \

    This domain is specific to the signalling protein dishevelled. In Drosophila melanogaster, the dishevelled segment polarity protein is required to establish coherent arrays of polarized cells and segments in embryos. It plays a role in wingless signalling, possibly through the reception of the wingless signal by target cells and subsequent redistribution of arm protein in response to that signal in embryos. The domain is found adjacent to the PDZ domain (), often in conjunction with DEP () and DIX ().

    \ ' '208' 'IPR004190' '\ The DNA polymerase processivity factor is a replisome sliding clamp subunit, which is responsible for tethering the catalytic subunit of DNA polymerase to the DNA during high speed replication. The crystal structure of the Bacteriophage RB69 sliding clamp has been solved. It has shown that the peptide binds to the sliding clamp at the same position as that of a replication inhibitor peptide bound to PCNA. This suggests that the replication inhibitor protein p21CIP1 competes with eukaryotic polymerases for the same binding pocket on the clamp PUBMED:10535734.\ ' '209' 'IPR002818' '\

    This signature defines a diverse group of protein families which include proteins involved in RNA-protein interaction regulation,\ thiamine biosynthesis, Ras-related signal transduction, and those with protease activity. Examples of annotation are:

    \ \

    \ ' '210' 'IPR001275' '\ This domain was first discovered in the doublesex proteins of Drosophila melanogaster and is also seen in proteins from Caenorhabditis elegans PUBMED:9490411. In D. melanogaster the \ doublesex gene controls somatic sexual differentiation by producing alternatively spliced mRNAs encoding related sex-specific polypeptides PUBMED:8978051. These proteins are believed to function as transcription factors on downstream sex-determination genes, especially on neuroblast differentiation and yolk protein gene transcription PUBMED:1907913, PUBMED:3046751. The DM domain binds DNA as a dimer, allowing the recognition of pseudopalindromic sequences PUBMED:8978051, PUBMED:9927589, PUBMED:10898790. The NMR analysis of the DSX DM domain PUBMED:10898790 revealed a novel zinc module containing \'intertwined\' CCHC and HCCC \ zinc-binding sites. The recognition of the DNA requires the carboxy-terminal basic\ tail, which contacts the minor groove of the target sequence.\ ' '211' 'IPR012310' '\

    This domain belongs to a more diverse superfamily, including the catalytic domain of the mRNA capping enzyme () and NAD-dependent DNA ligase () PUBMED:8653795.

    \ ' '212' 'IPR012309' '\

    This region is found in many but not all ATP-dependent DNA ligase enzymes (). It is thought to constitute part of the catalytic core of ATP dependent DNA ligase PUBMED:9016621.

    \ ' '213' 'IPR012308' '\

    This region is found in many but not all ATP-dependent DNA ligase enzymes (). It is thought to be involved in DNA binding and in catalysis. In human DNA ligase I (), and in Saccharomyces cerevisiae (Baker\'s yeast) (), this region was necessary for catalysis, and separated from the amino terminus by targeting elements. In Vaccinia virus () this region was not essential for catalysis, but its deletion decreased the affinity for nicked DNA and decreased the rate of strand joining at a step subsequent to enzyme-adenylate formation PUBMED:9016621.

    \ ' '214' 'IPR006050' '\

    DNA photolyases are enzymes that bind to DNA containing pyrimidine dimers:\ on absorption of visible light, they catalyse dimer splitting into the\ constituent monomers, a process called photoreactivation PUBMED:6325459. This is a DNA\ repair mechanism, repairing mismatched pyrimidine dimers induced by\ exposure to ultra-violet light PUBMED:3000886. The precise mechanisms involved in\ substrate binding, conversion of light energy to the mechanical energy\ needed to rupture the cyclobutane ring, and subsequent release of the\ product are uncertain PUBMED:6325459. Analysis of DNA lyases has revealed the presence\ of an intrinsic chromophore, all monomers containing a reduced FAD moiety,\ and, in addition, either a reduced pterin or 8-hydroxy-5-diazaflavin as a\ second chromophore PUBMED:3000886, PUBMED:2110564. Either chromophore may act as the primary photon\ acceptor, peak absorptions occurring in the blue region of the spectrum\ and in the UV-B region, at a wavelength around 290nm PUBMED:2110564.

    This domain binds a light harvesting cofactor.

    \ ' '215' 'IPR006134' '\

    DNA is the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the\ most important events in the life cycle of a cell. This function is performed by DNA-directed DNA-polymerases )\ by adding nucleotide triphosphate (dNTP) residues to the 3\'-end of the growing chain of DNA, using a complementary DNA\ chain as a template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins\ may also be used for the de novo synthesis of a DNA chain. Even though there are 2 different methods of priming, these are\ mediated by 2 very similar polymerase classes, A and B, with similar methods of chain elongation. \ \ A number of DNA polymerases have been grouped under the designation of DNA polymerase family B. Six regions\ of similarity (numbered from I to VI) are found in all or a subset of the B family polymerases. The most conserved region (I)\ includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known. However, it has been suggested\ that it may be involved in binding a magnesium ion. All sequences in the B family contain a characteristic DTDS motif, and\ possess many functional domains, including a 5\'-3\' elongation domain, a 3\'-5\' exonuclease domain PUBMED:8679562, a DNA binding domain,\ and binding domains for both dNTPs and pyrophosphate PUBMED:9757117.

    \

    This region of DNA polymerase B appears to consist of more than one structural domain, possibly including elongation,\ DNA-binding and dNTP binding activities PUBMED:9757117.

    \ ' '216' 'IPR006133' '\

    DNA is the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the\ most important events in the life cycle of a cell. This function is performed by DNA-directed DNA-polymerases )\ by adding nucleotide triphosphate (dNTP) residues to the 3\'-end of the growing chain of DNA, using a complementary DNA\ chain as a template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins\ may also be used for the de novo synthesis of a DNA chain. Even though there are 2 different methods of priming, these are\ mediated by 2 very similar polymerase classes, A and B, with similar methods of chain elongation. \ \ A number of DNA polymerases have been grouped under the designation of DNA polymerase family B. Six regions\ of similarity (numbered from I to VI) are found in all or a subset of the B family polymerases. The most conserved region (I)\ includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known. However, it has been suggested\ that it may be involved in binding a magnesium ion. All sequences in the B family contain a characteristic DTDS motif, and\ possess many functional domains, including a 5\'-3\' elongation domain, a 3\'-5\' exonuclease domain PUBMED:8679562, a DNA binding domain,\ and binding domains for both dNTPs and pyrophosphate PUBMED:9757117.

    \

    This domain has 3\' to 5\' exonuclease activity and adopts a ribonuclease H type fold PUBMED:8679562.

    \ ' '217' 'IPR007218' '\

    DNA polymerase is responsible for effective DNA replication. The function of the delta subunit 4 of DNA polymerase is not yet known.

    \ ' '218' 'IPR007693' '\

    The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This N-terminal domain is required both for interaction with other proteins in the primosome and for DnaB helicase activity. This domain has a multi-helical structure that forms an orthogonal bundle PUBMED:10404598.

    \ ' '219' 'IPR007694' '\

    The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis.

    \ ' '220' 'IPR001623' '\

    The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein PUBMED:8016869. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called \'J\' domain) of about 70 amino acids, a glycine-rich region (\'G\' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif (\'CRR\' domain) and a C-terminal region of 120 to 170 residues.

    \

    Such a structure is shown in the following schematic representation:\

    \
      +------------+-+-------+-----+-----------+--------------------------------+\
      | N-terminal | | Gly-R |     | CXXCXGXG  | C-terminal                     |\
      +------------+-+-------+-----+-----------+--------------------------------+\
    

    \

    It is thought that the \'J\' domain of DnaJ mediates the interaction with the DnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins PUBMED:1585456, either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones PUBMED:15170475.

    \ ' '221' 'IPR013050' '\

    The DOMON domain is a 110-125 residue long domain which has been identified\ in the physiologically important enzyme dopamine beta-monooxygenase and in\ several other secreted and transmembrane proteins from both plants and\ animals. It has been named after DOpamine beta-MOnooxygenase N-terminal\ domain. The DOMON domain can be found in one to four copies and in association\ with other domains, such as the Cu-ascorbate dependent monooxygenase domain,\ the epidermal growth factor domain, the trypsin inhibitor-like domain (TIL), the SEA domain and the Reelin domain.\ The architectures of the DOMON domain proteins strongly suggest a function in\ extracellular adhesion PUBMED:11551777.

    \

    \ The sequence conservation is predominantly centred around patches of\ hydrophobic residues. The secondary structure prediction of the DOMON domain\ points to an all-beta-strand fold with seven or eight core strands supported\ by a buried core of conserved hydrophobic residues. There is a characteristic\ motif with two small positions (Gly or Ser) corresponding to a conserved turn\ immediately C-terminal to strand three. It has been proposed that the DOMON\ domain might form a beta-sandwich structure, with the strands distributed into\ two beta sheets as is seen in many extracellular adhesion domains such as the\ immunoglobulin, fibronectin type III, cadherin and PKD domains PUBMED:11551777.

    \ ' '222' 'IPR007255' '\ Dor1 is involved in vesicle targeting to the yeast Golgi apparatus and complexes with a number of other trafficking proteins, which include Sec34 and Sec35 PUBMED:11703943.\ ' '223' 'IPR007301' '\

     is a subunit of the terminal quinol oxidase present in the plasma membrane of Acidianus ambivalens, with a calculated molecular mass of 20.4 kDa PUBMED:15306018. Thiosulphate:quinone oxidoreductase (TQO) catalyses one of the early steps in elemental sulphur oxidation. A novel TQO enzyme was purified from the thermo-acidophilic archaeon A. ambivalens and shown to consist of a large subunit (DoxD) and a smaller subunit (DoxA). The two DoxD- and DoxA-like subunits are fused together in a single polypeptide in .

    \ ' '224' 'IPR002469' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This domain defines serine peptidases belonging to MEROPS peptidase family S9 (clan SC), subfamily S9B (dipeptidyl-peptidase IV). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. This domain is an alignment of the region to the N-terminal side of the active site, which is found in .

    \ \ \

    CD26 () is also called adenosine deaminase-binding protein (ADA-binding protein) or dipeptidylpeptidase IV (DPP IV ectoenzyme). The exopeptidase cleaves off N-terminal X-Pro or X-Ala dipeptides from polypeptides (dipeptidyl peptidase IV activity). CD26 serves as the costimulatory molecule in T cell activation and is an associated marker of autoimmune diseases, adenosine deaminase-deficiency and HIV pathogenesis.

    \ \

    Dipeptidyl peptidase IV (DPP IV) is responsible for the removal of N-terminal dipeptides sequentially from polypeptides having unsubstituted N termini, provided that the penultimate residue is proline. The enzyme catalyses the reaction:\ \ It is a type II membrane protein that forms a homodimer.

    \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).\

    \ \ ' '225' 'IPR007858' '\

    This motif is about 40 residues long and is probably formed of two alpha-helices. It is found in the Dpy-30 proteins, hence the motif\'s name. Dpy-30 from \ Caenorhabditis elegans is an essential component of dosage compensation machinery and loss of dpy-30 activity results in XX-specific lethality; in XO animals, Dpy-30 is required for developmental processes other than dosage compensation PUBMED:7588066. In yeast, the homologue of DPY-30, Saf19p, functions as part of the Set1 complex that is necessary for the methylation of histone H3 at lysine residue 4; Set1 is a key part of epigenetic developmental control PUBMED:11752412. There is also a human homologue of Dpy-30 PUBMED:16260194. This Dpy-30 region may be a dimerisation motif analogous to that found in the cAMP-dependent protein kinase regulator, type II PKA, R subunit .

    \ ' '226' 'IPR001774' '\ Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of\ the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1,\ influence different cell fate decisions during development PUBMED:8575327. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain PUBMED:1657403, PUBMED:7716513.\ ' '227' 'IPR002593' '\ This domain has no known function. It is found in several\ Caenorhabditis elegans proteins. The domain contains 6 conserved\ cysteines that probably form three disulphide bridges.\ ' '228' 'IPR000340' '\

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPase; ) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation PUBMED:9818190, PUBMED:14625689. The PTP superfamily can be divided into four subfamilies PUBMED:12678841:

    \

    \

    Based on their cellular localisation, PTPases are also classified as:

    \

    \

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif PUBMED:9646865. Functional diversity between PTPases is endowed by regulatory domains and subunits.

    \ \

    This entry represents dual specificity protein-tyrosine phosphatases. Ser/Thr and Tyr dual specificity phosphatases are a group of enzymes with both Ser/Thr () and tyrosine specific protein phosphatase () activity, able to remove either the serine/threonine- or tyrosine-bound phosphate group from a wide range of phosphoproteins, including a number of enzymes which have been phosphorylated\ under the action of a kinase. Dual specificity protein phosphatases (DSPs) regulate mitogenic signal transduction and control the cell cycle. The crystal structure of a human DSP, vaccinia H1-related phosphatase (or VHR), has been determined at 2.1 angstrom resolution PUBMED:8650541. A shallow active site pocket in VHR allows for the hydrolysis of phosphorylated serine, threonine, or tyrosine protein residues, whereas the deeper active site of protein tyrosine phosphatases (PTPs) restricts substrate specificity to only phosphotyrosine. Positively charged crevices near the active site may explain the enzyme\'s preference for substrates with two phosphorylated residues. The VHR structure defines a conserved structural scaffold for both DSPs and PTPs. A "recognition region" connecting helix alpha1 to strand beta1 may determine differences in substrate specificity between VHR, the PTPs, and other DSPs.

    \

    These proteins may also have inactive phosphatase domains, and depending on the domain composition this loss of catalytic activity has different effects on protein function. Inactive single domain phosphatases can still specifically bind substrates and protect against dephosphorylation, while the inactive domains of tandem phosphatases can be further subdivided into two classes. Those which bind phosphorylated tyrosine residues may recruit multi-phosphorylated substrates for the adjacent active domains and are more conserved, while the other class has accumulated several variable amino acid substitutions and has completely lost tyrosine binding capability. The second class shows a release of evolutionary constraint for the sites around the catalytic centre, which emphasises a difference in function from the first group. There is a region of higher conservation common to both classes, suggesting a new regulatory centre PUBMED:14739250.

    \ ' '229' 'IPR001159' '\ The DsRBD domain is found in a variety of RNA-binding proteins with different\ structures and exhibiting a diversity of functions PUBMED:8036511.\ It is involved in localisation of at least five different mRNAs in the early Drosophila embryo and is found in the interferon-induced protein kinase in humans, which is part of the cellular response to dsRNA.\ ' '230' 'IPR002804' '\

    The function of one of the archeases from the hyperthermophile Pyrococcus abyssi has been determined. The gene encoding the archease (PAB1946) is located in a bicistronic operon immediately upstream from a second open reading frame (PAB1947), which encodes a tRNA m5C methyltransferase. The methyltransferase catalyses m5C formation at several cytosines within tRNAs, with a preference for C49; the specificity of the methyltransferase reaction is increased by the archease. The archease exists in monomeric and oligomeric states, with only the oligomeric forms able to bind the methyltransferase. Binding prevents aggregation and hinders dimerisation of the methyltransferase-tRNA complex PUBMED:17470432.

    \ \

    The function of this family of archeases as chaperones is supported by structural analysis of from Methanobacterium thermoautotrophicum, which shows homology to heat shock protein 33, which is a chaperone protein that inhibits the aggregation of partially denatured proteins PUBMED:11854485.

    \ \ \

    The archease superfamily of proteins is represented in all three domains of life. Archease genes are generally located adjacent to genes encoding proteins involved in DNA or RNA processing and have therefore been predicted to be modulators or chaperones involved in DNA or RNA metabolism. Many of the roles of archeases remain to be established experimentally.

    \ ' '231' 'IPR004401' '\

    The function of this protein is unknown. It is restricted to bacteria and a few plants, such as Arabidopsis. The plant form contains an additional N-terminal region that may serve as a transit peptide and shows a close relationship to the cyanobacterial member, suggesting that it is a chloroplast protein. Members of this family are found in a single copy per bacterial genome, but are broadly distributed. A crystal structure of one member, YbaB from Haemophilus influenzae, revealed a core structure consisting of two layers, alpha/beta; YbaB forms a tight dimer with a 3-layer structure, beta/alpha/beta PUBMED:12486730. YbaB is co-transcribed with RecR, which appears to protect DNA strands of the replication fork when it is blocked by DNA damage. A deletion of the YbaB operon resulted in increased sensitivity to DNA-damaging agents compared with the wild-type strain.

    \ ' '232' 'IPR003732' '\

    This homodimeric enzyme appears able to cleave any D-amino acid (and glycine, which does not have distinct D/L forms) from charged tRNA. The name reflects characterization with respect to D-Tyr on tRNA(Tyr) as established in the literature, but substrate specificity seems much broader.

    \ ' '233' 'IPR003735' '\

    This entry describes proteins of unknown function.

    \ ' '234' 'IPR003740' '\

    This entry describes proteins of unknown function.

    \ ' '235' 'IPR003791' '\

    This entry describes proteins of unknown function.

    \ ' '236' 'IPR003793' '\ This is an uncharacterised domain found in proteins of unknown function.\ ' '237' 'IPR004119' '\ This family includes proteins of unknown function. All known members of this group are proteins from Drosophila and Caenorhabditis elegans.\ ' '238' 'IPR004245' '\ Members of this family are uncharacterised with a long conserved region that may contain several domains.\ ' '239' 'IPR008166' '\

    This family contains Caenorhabditis elegans proteins of unknown function.

    \ ' '240' 'IPR004253' '\ This domain of unknown function is found in Arabidopsis thaliana and other plant proteins.\ ' '241' 'IPR004127' '\

    Prefoldin (PFD) is a chaperone that interacts exclusively with type II chaperonins, hetero-oligomers lacking an obligate co-chaperonin that are found only in eukaryotes (chaperonin-containing T-complex polypeptide-1 (CCT)) and archaea. Eukaryotic PFD is a multi-subunit complex containing six polypeptides in the molecular mass range of 14-23 kDa. In archaea, on the other hand, PFD is composed of two types of subunits, two alpha and four beta. The six subunits associate to form two back-to-back up-and-down eight-stranded barrels, from which hang six coiled coils. Each subunit contributes one (beta subunits) or two (alpha subunits) beta hairpin turns to the barrels. The coiled coils are formed by the N and C termini of an individual subunit. Overall, this unique arrangement resembles a jellyfish. The eukaryotic PFD hexamer is composed of six different subunits; however, these can be grouped into two alpha-like (PFD3 and -5) and four beta-like (PFD1, -2, -4, and -6) subunits based on amino acid sequence similarity with their archaeal counterparts. Eukaryotic PFD has a six-legged structure similar to that seen in the archaeal homologue PUBMED:11106732, PUBMED:12456645. This family contains the archaeal alpha subunit, eukaryotic prefoldin subunits 3 and 5 and the UXT (ubiquitously expressed transcript) family. \ \

    \ \

    Eukaryotic PFD has been shown to bind both actin and tubulin co-translationally. The chaperone then delivers the target protein to CCT, interacting with the chaperonin through the tips of the coiled coils. No authentic target proteins of any archaeal PFD have been identified, to date.

    \ ' '244' 'IPR004314' '\

    This domain is found in a number of Arabidopsis thaliana and other plant proteins of unknown function. A small number of the proteins that contain this domain are annotated as carboxyl-terminal proteinase-like.

    \ ' '245' 'IPR004320' '\ This family represents plant proteins of unknown function.\ ' '246' 'IPR004145' '\ This domain is only found in fly proteins. It is found associated with YLP motifs () in some proteins.\ ' '247' 'IPR004158' '\ The function of the plant proteins constituting this family is unknown.\ ' '248' 'IPR004159' '\

    Members of this family of hypothetical plant proteins are putative methyltransferases.

    \ ' '249' 'IPR004353' '\

    Members of this family have been called SAND proteins PUBMED:15647795 although these proteins do not contain a SAND domain. In Saccharomyces cerevisiae a protein complex of Mon1 and Ccz1 functions with the small GTPase Ypt7 to mediate vesicle trafficking to the vacuole PUBMED:12364329, PUBMED:11169758. The Mon1/Ccz1 complex is conserved in eukaryotic evolution and members of this family (previously known as DUF254) are distant homologues to domains of known structure that assemble into cargo vesicle adapter (AP) complexes PUBMED:10025966, PUBMED:17075139.

    \ ' '250' 'IPR004882' '\

    This family consists of several LUC7 protein homologues that are restricted to eukaryotes. LUC7 has been shown to be a U1 snRNA associated protein PUBMED:10631324 with a role in splice site recognition PUBMED:11170747. The entry contains human and mouse LUC7 like (LUC7L) proteins PUBMED:10500099 and human cisplatin resistance-associated overexpressed protein (CROP) PUBMED:11804584.

    \ ' '251' 'IPR002902' '\ This domain is found in plants and it has no known function. It is found in serine/threonine kinases, associated with the Eukaryotic protein kinase domain . The domain contains four conserved cysteines.\ ' '252' 'IPR004883' '\

    The lateral organ boundaries (LOB) gene is expressed at the adaxial base of initiating lateral organs and encodes a plant-specific protein of unknown function. The N-terminal half of the LOB protein contains a conserved domain of approximately 100 amino acids (the LOB domain) that is present in 42 other Arabidopsis thaliana proteins and in proteins from a variety of other plant species. Genes encoding LOB domain (LBD) proteins are expressed in a variety of temporal- and tissue-specific patterns, suggesting that they may function in diverse processes PUBMED:12068116.\ The LOB domain contains conserved blocks of amino acids that identify the LBD gene family. In particular, a conserved C-x(2)-C-x(6)-C-x(3)-C motif, which is a defining feature of the LOB domain, is present in all LBD proteins. It is possible that this motif forms a new zinc finger PUBMED:12068116.
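The C-x(2)-C-x(6)-C-x(3)-C spacing given above can be checked against a sequence by translating the pattern into a regular expression. The following is a minimal sketch, assuming one-letter amino acid codes; the converter is a hypothetical helper that handles only single residue letters and x(n) gaps, and the toy sequence is invented for illustration.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a regular expression.

    Only two element kinds are handled: a single residue letter and x(n).
    """
    parts = []
    for element in pattern.split("-"):
        element = element.strip()
        if element.lower().startswith("x(") and element.endswith(")"):
            # x(n) means exactly n residues of any kind.
            parts.append(".{" + str(int(element[2:-1])) + "}")
        elif len(element) == 1 and element.isalpha():
            parts.append(element.upper())
        else:
            raise ValueError("unsupported pattern element: " + element)
    return "".join(parts)


LOB_MOTIF = re.compile(prosite_to_regex("C-x(2)-C-x(6)-C-x(3)-C"))

# Toy sequence invented for illustration; not taken from a real LBD protein.
toy = "MAACAACKFLRRKCAPECAF"
match = LOB_MOTIF.search(toy)
print(match.group(0) if match else None)
```

Matching the cysteine spacing is only a first filter; the text above identifies the LBD family by several conserved blocks, not by this motif alone.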

    \ \ ' '254' 'IPR004950' '\

    This family of proteins, from Caenorhabditis species, has not been characterised, though a number are annotated as \'serpentine receptor, class r\' proteins.

    \ ' '255' 'IPR004951' '\ This family consists of proteins of unknown function found in Caenorhabditis species.\ ' '256' 'IPR004987' '\

    This is a family of proteins of unknown function.

    \ ' '257' 'IPR005071' '\ This family of worm proteins has no known function.\ ' '258' 'IPR005024' '\

    This is a family of eukaryotic proteins which are variously described as either hypothetical protein, developmental protein or related to yeast SNF7. The family contains human CHMP1. CHMP1 (CHromatin Modifying Protein; CHarged Multivesicular body Protein), is encoded by an alternative open reading frame in the PRSM1 gene PUBMED:8863740 and is conserved in both complex and simple eukaryotes. CHMP1 contains a predicted bipartite nuclear localisation signal and distributes as distinct forms to the cytoplasm and the nuclear matrix in all cell lines tested.

    \

    Human CHMP1 is strongly implicated in multivesicular body formation. A multivesicular body is a vesicle-filled endosome that targets proteins to the interior of lysosomes. Immunocytochemistry and biochemical fractionation localise CHMP1 to early endosomes and CHMP1 physically interacts with SKD1/VPS4, a highly conserved protein directly linked to multivesicular body sorting in yeast. Similar to the action of a mutant SKD1 protein, over expression of a fusion derivative of human CHMP1 dilates endosomal compartments and disrupts the normal distribution of several endosomal markers. Genetic studies in Saccharomyces cerevisiae (Baker\'s yeast) further support a conserved role of CHMP1 in vesicle trafficking. Deletion of CHM1, the budding yeast homologue of CHMP1, results in defective sorting of carboxypeptidases S and Y and produces abnormal, multi-lamellar prevacuolar compartments. This phenotype classifies CHM1 as a member of the class E vacuolar protein sorting genes PUBMED:11559747.

    \ ' '259' 'IPR005044' '\

    This family consists of proteins of unknown function found in Caenorhabditis species.

    \ ' '260' 'IPR005034' '\

    This domain is found in members of the Dicer protein family of dsRNA nucleases. This entry represents a dsRNA-binding domain. RNA interference (RNAi) is an ancient gene-silencing process that plays a fundamental role in diverse eukaryotic functions including viral defence, chromatin remodelling, genome rearrangement, developmental timing, brain morphogenesis, and stem cell maintenance. All RNAi pathways require the multidomain ribonuclease Dicer, which initiates RNAi by cleaving double-stranded RNA (dsRNA) substrates into small fragments ~25 nucleotides in length. A typical eukaryotic Dicer consists of a helicase domain (), a domain of unknown function, and a PAZ domain () at the amino (N)-terminus as well as two ribonuclease III domains () and a dsRNA-binding domain (dsRBD) () at the carboxy (C)-terminus. The domain of unknown function of ~100 amino acids is predicted to adopt the canonical alpha-beta-beta-beta-alpha fold found in all dsRBDs PUBMED:16410517, PUBMED:16954143, PUBMED:17277330, PUBMED:17666393.

    \ ' '261' 'IPR005046' '\

    This is a family of proteins of unknown function. Many contain a tandem peptide repeat sequence of 25 or 26 residues, found in predicted surface proteins (often lipoproteins) from Listeria monocytogenes, Listeria innocua, Enterococcus faecalis (Streptococcus faecalis), Lactobacillus plantarum, Mycoplasma mycoides, Helicobacter hepaticus, and other species.

    \ ' '262' 'IPR001534' '\

    This new apparently nematode-specific protein family has been called family 2 PUBMED:9417907. The proteins show weak similarity to transthyretin (formerly called prealbumin) which transports thyroid hormones. The specific function of this protein is unknown.

    \ ' '263' 'IPR005174' '\

    This family of proteins is found in plants. The function of the proteins is unknown.

    \ ' '264' 'IPR005176' '\

    The eukaryotic defective in cullin neddylation (DCN) protein family may contribute to neddylation of cullin components of SCF-type E3 ubiquitin ligase complexes. These multi-protein complexes are required for polyubiquitination and subsequent degradation of target proteins by the 26S proteasome PUBMED:15988528. Proteins in the DCN family include:\

    \

    \ \

    This entry represents a domain found within DCN family proteins. Its function is unknown, but it has been suggested that it has the features of a basic helix-loop-helix leucine zipper (bHLH-ZIP) domain PUBMED:10831844. It is often found in association with a UBA-like domain ().

    \ ' '265' 'IPR005178' '\

    This is a family of mainly hypothetical proteins of no known function.

    \ ' '266' 'IPR005182' '\

    This domain is found in an uncharacterised family of membrane proteins. One to three copies are found in each protein, with each copy flanked by transmembrane helices.

    \ ' '267' 'IPR005183' '\

    This domain is found in a small family of bacterial secreted proteins with no known function. It is also found in Paramecium bursaria Chlorella virus 1 (PBCV-1). This domain is short and found in one or two copies. The domain has a conserved HH motif that may be functionally important.

    \ ' '268' 'IPR005508' '\ This is a family of proteins from Arabidopsis thaliana (Mouse-ear cress) with uncharacterised function.\ ' '269' 'IPR005514' '\

    This is a family of uncharacterised proteins from Caenorhabditis elegans.

    \ ' '270' 'IPR005528' '\

    This is a small domain found in a family of Streptomyces proteins, which are annotated as \'putative secreted protein\'. The domain occurs singly or as a pair and many have two cysteines that may form a disulphide bridge.

    \ ' '271' 'IPR005560' '\ This entry is a small cysteine-rich repeat. The cysteines mostly follow a C-X(2)-C-X(3)-C-X(2)-C-X(3) pattern, though they often appear at other positions in the repeat as well.\ ' '272' 'IPR005629' '\

    This family consists of the beta-glucan synthesis-associated proteins KRE6 and SKN1. Beta1,6-Glucan is a key component of the yeast cell wall, interconnecting cell wall proteins, beta1,3-glucan, and chitin. It has been postulated that the synthesis of beta1,6-glucan begins in the endoplasmic reticulum with the formation of protein-bound primer structures and that these primer structures are extended in the Golgi complex by two putative glucosyltransferases that are functionally redundant, Kre6 and Skn1. This is followed by maturation steps at the cell surface and by coupling to other cell wall macromolecules PUBMED:10601196.

    \ \ ' '273' 'IPR007137' '\ This domain normally occurs as tandem repeats; however it is found as a single copy in the Saccharomyces cerevisiae (Baker\'s yeast) DNA-binding nuclear protein YCR593 ().\ ' '274' 'IPR007139' '\ This motif is found singly or as up to five tandem repeats in a small set of bacterial proteins. There are two or three alpha-helices, and possibly a beta-strand.\ ' '275' 'IPR007153' '\

    The structure of a member of this family from the hyperthermophilic archaeon Pyrobaculum aerophilum contains a modified histidine residue which is interpreted as stable phosphorylation. In vitro binding studies confirmed that adenosine and AMP but not ADP or ATP bind to the protein PUBMED:16737961.

    \ ' '276' 'IPR007158' '\ The proteins in this family are around 200 amino acids long with the exception of that has an additional 100 amino acids at its N terminus. The function of these bacterial proteins is unknown; however, they do contain several conserved histidines and aspartates that might form a metal-binding site.\ ' '277' 'IPR007165' '\ These proteins are predicted transmembrane proteins with probably four transmembrane spans. The function of these bacterial proteins is unknown. The sequences do not appear to contain any conserved polar residues that could form an active site.\ ' '278' 'IPR007180' '\ This domain is specific to the human splicing factor 3b subunit 2 and its orthologs.\ ' '279' 'IPR004378' '\ The Mycobacterium tuberculosis paralogous family 11 groups a number of related hypothetical proteins from this organism. The function of these proteins is not yet known.\ ' '280' 'IPR005240' '\

    This family of conserved hypothetical proteins has no known function. It includes potential integral membrane proteins.

    \ ' '281' 'IPR007284' '\

    This group of proteins contains one or more copies of the ground-like domain, which is specific to Caenorhabditis elegans and Caenorhabditis briggsae. It has been proposed that proteins containing the ground-like domain may bind and modulate the activity of Patched-like membrane molecules, reminiscent of the modulating activities of neuropeptides PUBMED:10523520.

    \ ' '282' 'IPR006696' '\ This is a potential integral membrane protein with no known function.\ ' '284' 'IPR007367' '\ This is a family of uncharacterised proteins.\ ' '285' 'IPR002724' '\

    Arginine decarboxylase () catalyses the interconversion of arginine and agmatine plus carbon dioxide PUBMED:12623016. It requires a pyruvoyl group for its activity. Archaeoglobus fulgidus contains three copies of this 80-residue domain, all of which are very closely related.

    \ ' '286' 'IPR007427' '\ This entry contains proteins that are predicted to be integral membrane proteins with multiple transmembrane domains.\ ' '287' 'IPR006502' '\

    This family of uncharacterised plant proteins is defined by a region found toward the C-terminus. This region is strongly conserved (greater than 30% sequence identity between most pairs of members) but flanked by highly divergent regions including stretches of low-complexity sequence.

    \ ' '289' 'IPR006869' '\ This is a conserved region found in uncharacterised proteins from Caenorhabditis elegans and Arabidopsis thaliana (Mouse-ear cress).\ ' '290' 'IPR007573' '\ This is a family of related proteins that is plant specific.\ ' '291' 'IPR007613' '\ This is a domain which occurs in several uncharacterised plant proteins. It is predicted to contain several transmembrane helices and is usually found together with a cytochrome domain ().\ ' '292' 'IPR007590' '\ This is a family of eukaryotic proteins with undetermined function.\ ' '293' 'IPR007592' '\ This is a family of uncharacterised proteins.\ ' '294' 'IPR006702' '\ This family of plant proteins contains a domain that may have a catalytic activity. It has a conserved arginine and aspartate that could form an active site. These proteins are predicted to contain 3 or 4 transmembrane helices.\ ' '295' 'IPR002744' '\ This family includes prokaryotic proteins of unknown\ function. The family also includes PhaH ()\ from Pseudomonas putida. PhaH forms a complex with\ PhaF (), PhaG () and PhaI (),\ which hydroxylates phenylacetic acid to 2-hydroxyphenylacetic\ acid PUBMED:9600981. So members of this family may all be components\ of ring hydroxylating complexes.\ ' '296' 'IPR007632' '\ This family contains several uncharacterised eukaryotic proteins.\ ' '297' 'IPR007656' '\ This is a family of uncharacterised proteins.\ ' '298' 'IPR006736' '\

    This family consists of several uncharacterised plant proteins which share a conserved region.

    \ ' '299' 'IPR006735' '\ This family represents several uncharacterised eukaryotic proteins.\ ' '300' 'IPR006745' '\

    This family contains proteins from the Eukaryota; functionally they are uncharacterised.

    \ ' '301' 'IPR006769' '\ This family represents a conserved region found in several uncharacterised eukaryotic proteins.\ ' '302' 'IPR006775' '\

    This domain is found in non-lysosomal glucosylceramidases that catalyze the conversion of glucosylceramide to free glucose and ceramide PUBMED:17105727. It is involved in sphingomyelin generation and prevention of glycolipid accumulation and may also catalyze the hydrolysis of bile acid 3-O-glucosides; however, the relevance of such activity in vivo is unclear PUBMED:17080196.

    \ ' '303' 'IPR006816' '\

    This entry represents the ELMO (EnguLfment and Cell MOtility) domain, which is found in a number of eukaryotic proteins involved in the cytoskeletal rearrangements required for phagocytosis of apoptotic cells and cell motility, including CED-12, ELMO-1 and ELMO-2.

    \

    ELMO-1 and ELMO-2 are components of signalling pathways that regulate phagocytosis and cell migration and are mammalian orthologues of the Caenorhabditis elegans gene, ced-12 that is required for the engulfment of dying cells and cell migration. ELMO-1/2 act in association with DOCK1 and CRK. ELMO-1/2 interact with the SH3-domain of DOCK1 via an SH3-binding site to enhance the guanine nucleotide exchange factor (GEF) activity of DOCK1. ELMO-1/2 could be part of a complex with DOCK1 and Rac1 that could be required to activate Rac Rho small GTPases. Regulatory GTPases in the Ras superfamily employ a cycle of alternating GTP binding and hydrolysis, controlled by guanine nucleotide exchange factors and GTPase-activating proteins (GAPs), as essential features of their actions in cells. Within the Ras superfamily, the Arf family is composed of 30 members, including 22 Arf-like (Arl) proteins. The ELMO domain has been proposed to be a GAP domain for ARL2 and other members of the Arf family PUBMED:17452337.

    \ ' '304' 'IPR006836' '\ This family includes several uncharacterised proteins from Caenorhabditis elegans.\ ' '305' 'IPR006461' '\

    This group of sequences is described by a region of about 170 amino acids. These proteins have highly divergent N-terminal regions rich in low complexity sequence. PSI-BLAST reveals no clear similarity to any characterised protein. It is common in plants but is also found in Homo sapiens (Human), Dictyostelium, and Leishmania; at least 12 distinct members are found in Arabidopsis. Most members of this family contain more than 10 per cent Cys, but no Cys residue is invariant across the family.

    \ ' '306' 'IPR006460' '\

    This family of hypothetical plant proteins is defined by a region of about 170 amino acids found at the C terminus. These proteins have highly divergent N-terminal regions rich in low complexity sequence. PSI-BLAST reveals no clear similarity to any characterised protein. At least 12 distinct members are found in Arabidopsis thaliana (Mouse-ear cress).

    \ ' '307' 'IPR006903' '\ This entry represents a conserved region found in a number of uncharacterised eukaryotic proteins.\ ' '308' 'IPR006874' '\ This is a conserved region found in uncharacterised proteins from Caenorhabditis elegans, and is noted to have possible G-protein-coupled receptor-like activity.\ ' '309' 'IPR006887' '\ This is a conserved region which characterises a number of eukaryotic proteins of unknown function.\ ' '310' 'IPR006868' '\ This region is sometimes found at the N terminus of putative plant bZIP proteins . The function of this conserved region is not known.\ ' '311' 'IPR006149' '\

    The EB domain has no known function. It is found in several Caenorhabditis elegans proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with Kunitz domains.

    \ ' '312' 'IPR006867' '\ This conserved region contains a leucine zipper-like domain. The proteins are found only in plants and their functions are unknown.\ ' '313' 'IPR006911' '\

    This entry consists of mammalian proteins of unknown function.

    \ ' '314' 'IPR006943' '\ This conserved region is found in a number of plant proteins of unknown function.\ ' '315' 'IPR006968' '\

    This is a family of proteins of unknown function, restricted to eukaryotes.

    \ ' '316' 'IPR006984' '\ This family comprises uncharacterised eukaryotic proteins.\ ' '317' 'IPR007033' '\ This is a family of hypothetical eukaryotic proteins.\ ' '318' 'IPR007034' '\

    This conserved region is found in a number of eukaryotic proteins, including the ribosome biogenesis protein (BMS) which may act as a molecular switch during maturation of the 40S ribosomal subunit in the nucleolus.

    \ ' '319' 'IPR007751' '\

    This domain is associated with eukaryotic proteins of unknown function, which are hydrolase-like.

    \ ' '320' 'IPR007789' '\ This family contains uncharacterised proteins found in Arabidopsis thaliana.\ ' '321' 'IPR007795' '\ This family contains uncharacterised bacterial membrane proteins of unknown function.\ ' '322' 'IPR007807' '\ This domain is about 350 amino acid residues long and appears to have a P-loop motif, suggesting this is an ATPase. This domain is often N-terminal to a GCN5-related N-acetyltransferase domain .\ ' '323' 'IPR007808' '\ This family of uncharacterised, mostly short, proteins contains a putative zinc binding domain with four conserved cysteines.\ ' '325' 'IPR007866' '\ This family of proteins has no known function. This region may contain transmembrane alpha helices. The domain is found in a variety of metazoan species.\ ' '326' 'IPR000375' '\ Dynamin is a microtubule-associated force-producing protein of 100 kDa\ which is involved in the production of microtubule bundles. At the N terminus of\ dynamin is a GTPase domain (see ),\ and at the C-terminus is a PH domain (see ).\ Between these two domains lies a central region of unknown function.\ ' '327' 'IPR004273' '\

    Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force generating protein of eukaryotic cilia and flagella. The cytoplasmic\ isoform of dynein acts as a motor for the intracellular retrograde motility of\ vesicles and organelles along microtubules.

    \

    Dynein is composed of a number of\ ATP-binding large subunits, intermediate size subunits and small subunits (see ).\ \ This family represents the C-terminal region of dynein heavy chain. The dynein heavy chain also exhibits ATPase activity and\ microtubule binding ability and acts as a motor for the movement of organelles and vesicles along microtubules.

    \ ' '328' 'IPR006314' '\

    A defined member of this superfamily is Dyp, a dye-decolourising peroxidase that lacks a typical haem-binding region PUBMED:10742277. A distinct, uncharacterised branch of this superfamily has a typical twin-arginine dependent signal sequence characteristic of exported proteins with bound redox cofactors.

    \ ' '329' 'IPR003172' '\

    The MD-2-related lipid-recognition (ML) domain is implicated in lipid recognition, particularly in the recognition of pathogen related products. It has an immunoglobulin-like beta-sandwich fold similar to that of E-set Ig domains. This domain is present in the following proteins:\

    \

    \ \ ' '330' 'IPR004953' '\

    A group of microtubule-associated proteins called +TIPs (plus end tracking\ proteins), including EB1 (end-binding protein 1) family proteins, label\ growing microtubule ends specifically in diverse organisms and are implicated\ in spindle dynamics, chromosome segregation, and directing microtubules toward\ cortical sites. EB1 members have a bipartite composition: an N-terminal CH\ domain () that mediates microtubule plus end localization and a C-terminal cargo binding domain (EB1-C) that captures cell polarity\ determinants. The EB1-C domain comprises a unique EB1-like sequence motif that\ acts as a binding site for other +TIP proteins. It interacts with the carboxy\ terminus of the adenomatous polyposis coli (APC) tumor suppressor, a well\ conserved +TIP phosphoprotein with a pivotal function in cell cycle\ regulation. Another binding partner of the EB1-C domain is the well conserved\ +TIP protein dynactin, a component of the large cytoplasmic dynein/dynactin\ complex PUBMED:14614826, PUBMED:15699215, PUBMED:15616574.

    \ \

    The ~80-residue EB1-C domain starts with a long smoothly curved helix\ (alpha1), which is followed by a hairpin connection leading to a short second\ helix (alpha2) running antiparallel to alpha1. The two\ parallel alpha1 helices of the EB1-C domain dimer wrap around each other in a\ slightly left-handed supercoil. The two alpha2 helices run antiparallel to\ helices alpha1 and form a similar fork in the opposite orientation and rotated\ by 90 degrees. As a result, two helical segments from each monomer form a four-helix\ bundle. The side chains forming the hydrophobic core of this bundle are highly\ conserved PUBMED:15699215, PUBMED:15616574, PUBMED:16109370.

    \ \

    Some proteins known to contain an EB1-C domain are listed below:\

    \

    \ ' '331' 'IPR007706' '\

    This family contains EBNA-3A, -3B, and -3C which are latent infection nuclear proteins important for Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4)-induced B-cell immortalisation and the immune response to EBVG infection.

    \ ' '332' 'IPR000640' '\

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome PUBMED:12762045, PUBMED:15922593, PUBMED:12932732. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    \

    Elongation factor EF2 (EF-G) is a G-protein. It brings about the translocation of peptidyl-tRNA and mRNA through a ratchet-like mechanism: the binding of GTP-EF2 to the ribosome causes a counter-clockwise rotation in the small ribosomal subunit; the hydrolysis of GTP to GDP by EF2 and the subsequent release of EF2 causes a clockwise rotation of the small subunit back to the starting position PUBMED:12762009, PUBMED:12762047. This twisting action destabilises tRNA-ribosome interactions, freeing the tRNA to translocate along the ribosome upon GTP-hydrolysis by EF2. EF2 binding also affects the entry and exit channel openings for the mRNA, widening it when bound to enable the mRNA to translocate along the ribosome.

    \

    This entry represents the C-terminal domain found in EF2 (or EF-G) of both prokaryotes and eukaryotes (also known as eEF2), as well as in some tetracycline-resistance proteins. This domain adopts a ferredoxin-like fold consisting of an alpha/beta sandwich with anti-parallel beta-sheets. It resembles the topology of domain III found in these elongation factors, with which it forms the C-terminal block, but these two domains cannot be superimposed PUBMED:12471894. This domain is often found associated with (), which contains the signatures for the N-terminus of the proteins.

    \

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    \ ' '333' 'IPR005517' '\

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome PUBMED:12762045, PUBMED:15922593, PUBMED:12932732. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    \

    Elongation factor EF2 (EF-G) is a G-protein. It brings about the translocation of peptidyl-tRNA and mRNA through a ratchet-like mechanism: the binding of GTP-EF2 to the ribosome causes a counter-clockwise rotation in the small ribosomal subunit; the hydrolysis of GTP to GDP by EF2 and the subsequent release of EF2 causes a clockwise rotation of the small subunit back to the starting position PUBMED:12762009, PUBMED:12762047. This twisting action destabilises tRNA-ribosome interactions, freeing the tRNA to translocate along the ribosome upon GTP-hydrolysis by EF2. EF2 binding also affects the entry and exit channel openings for the mRNA, widening it when bound to enable the mRNA to translocate along the ribosome.

    \

    EF2 has five domains. This entry represents domain IV found in EF2 (or EF-G) of both prokaryotes and eukaryotes. The EF2-GTP-ribosome complex undergoes extensive structural rearrangement for tRNA-mRNA movement to occur. Domain IV, which extends from the \'body\' of the EF2 molecule much like a lever arm, appears to be essential for the structural transition to take place.

    \

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    \ ' '334' 'IPR018248' '\ Many calcium-binding proteins belong to the same evolutionary family and share\ a type of calcium-binding domain known as the EF-hand. This type of\ domain consists of a twelve residue loop flanked on both side by a twelve\ residue alpha-helical domain. In an EF-hand loop the calcium ion is\ coordinated in a pentagonal bipyramidal configuration. The six residues\ involved in the binding are in positions 1, 3, 5, 7, 9 and 12; these residues\ are denoted by X, Y, Z, -Y, -X and -Z. The invariant Glu or Asp at position 12\ provides two oxygens for liganding Ca (bidentate ligand).\ ' '335' 'IPR007316' '\ eIF-3 is a multisubunit complex that stimulates translation initiation in vitro at several different steps. This family corresponds to the gamma subunit of eIF3 PUBMED:7542616, PUBMED:9851972.\ ' '336' 'IPR004701' '\

    The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) PUBMED:8246840, PUBMED:2197982 is a major carbohydrate transport system in bacteria. The PTS catalyses the phosphorylation of incoming sugar substrates concomitantly with their translocation across the cell membrane, which makes it a link between the uptake and metabolism of sugars.

    \ \

    The general mechanism of the PTS is as follows: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred, via a signal transduction pathway, to enzyme I (EI), which in turn transfers it to a phosphoryl carrier, the histidine protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease, a membrane-bound complex known as enzyme II (EII), which transports the sugar into the cell. EII consists of at least three structurally distinct domains: IIA, IIB and IIC PUBMED:1537788. These can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII).

    \ \

    The first domain (IIA or EIIA) carries the first permease-specific phosphorylation site, a histidine which is phosphorylated by phospho-HPr. The second domain (IIB or EIIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the sugar transported. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate concomitantly with the sugar uptake processed by the IIC domain. This third domain (IIC or EIIC) forms the translocation channel and the specific substrate-binding site.

    \ \

    An additional transmembrane domain IID, homologous to IIC, can be found in some PTSs, e.g. for mannose PUBMED:8246840, PUBMED:1537788, PUBMED:7815935, PUBMED:11361063.

    \ \

    The Man family is unique in several respects among PTS permease families.\

  • It is the only PTS family in which members possess a IID protein.
  • It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteinyl residue.
  • Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars.
    \

    The mannose permease of Escherichia coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine.

    \ \

    This family is specific for IIA and IIB components.
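
    The phosphoryl relay described above can be sketched as a simple ordered chain. This is only an illustration of the order of transfers stated in the text; the names `RELAY`, `relay_steps` and `iib_phospho_residue` are hypothetical and do not come from any database or library.

```python
# Order of phosphoryl carriers as described in the text:
# PEP -> EI -> HPr -> IIA -> IIB -> sugar
RELAY = ["PEP", "EI", "HPr", "IIA", "IIB", "sugar"]


def relay_steps():
    """Return each (donor, acceptor) pair along the relay, in order."""
    return list(zip(RELAY, RELAY[1:]))


def iib_phospho_residue(family):
    """Residue type phosphorylated on IIB.

    Per the text, the Man family is the only PTS family whose IIB
    constituent is phosphorylated on a histidyl residue; the others
    use a cysteinyl residue.
    """
    return "histidyl" if family == "Man" else "cysteinyl"


if __name__ == "__main__":
    for donor, acceptor in relay_steps():
        print(f"{donor} -> {acceptor}")
    print("Man family IIB residue:", iib_phospho_residue("Man"))
```

    The family label passed to `iib_phospho_residue` is a free-form string chosen for this sketch; only "Man" is treated specially.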

    \ ' '337' 'IPR000949' '\

    The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex PUBMED:10226007. The domain is usually found N-terminal to a myb-like DNA binding domain and a GATA binding domain. In some instances, ELM2 is also found associated with the ARID DNA binding domain. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain.

    \ \ ' '338' 'IPR004240' '\ The transmembrane 9 superfamily protein (TM9SF) may function as a channel or small molecule transporter. Proteins in this group are endosomal integral membrane proteins.\ ' '339' 'IPR001604' '\ A family of bacterial and eukaryotic endonucleases share the following characteristics: they act on both DNA and RNA, cleave double-stranded and single-stranded nucleic acids and require a divalent ion such as magnesium for their activity. A histidine has been shown PUBMED:8078761 to be essential for the activity of the Serratia marcescens nuclease. This residue is located in a conserved region which also contains an aspartic acid residue that could be implicated in the binding of the divalent ion.\ ' '340' 'IPR001799' '\

    Ephrins are a family of proteins PUBMED:7838529 that are ligands of class V (EPH-related) receptor protein-tyrosine kinases. These receptors and their ligands have been implicated in regulating neuronal axon guidance and in patterning of the developing nervous system, and may also serve a patterning and compartmentalisation role outside of the nervous system.

    \

    Ephrins are membrane-attached proteins of 205 to 340 residues. Attachment appears to be crucial for their normal function. Type-A ephrins are linked to the membrane via a glycosylphosphatidylinositol (GPI)-linkage, while type-B ephrins are type-I membrane proteins.

    \ ' '341' 'IPR007815' '\ This family includes erythromycin esterase enzymes PUBMED:3899861, PUBMED:3523438 that confer resistance to the erythromycin antibiotic.\ ' '342' 'IPR006885' '\ This is a family of pankaryotic NADH-ubiquinone oxidoreductase subunits from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein PUBMED:7947902.\ ' '343' 'IPR006806' '\ This is a family of eukaryotic NADH-ubiquinone oxidoreductase subunits from complex I of the electron transport chain initially identified in Neurospora crassa as a 29.9 kDa protein. The conserved region is found at the N-terminus of the member proteins PUBMED:1830489.\ ' '344' 'IPR006715' '\ The N-terminal of the PEA3 transcription factors is implicated in transactivation and in inhibition of DNA binding PUBMED:9259977. Transactivation is potentiated by activation of the Ras/MAP kinase and protein kinase A signalling cascades. The N-terminal region contains conserved MAP kinase phosphorylation sites PUBMED:9285689.\ ' '345' 'IPR006863' '\ Biogenesis of Fe/S clusters involves a number of essential mitochondrial proteins. Erv1p of Saccharomyces cerevisiae (Baker\'s yeast) mitochondria is required for the maturation of Fe/S proteins in the cytosol. ALR (augmenter of liver regeneration) is a mammalian ortholog of yeast Erv1p. Both Erv1p and full-length ALR are located in the mitochondrial intermembrane space and are thought to operate downstream of the mitochondrial ABC transporter PUBMED:11493598.\ ' '346' 'IPR005135' '\

    This domain is found in a large number of proteins, including magnesium-dependent endonucleases and phosphatases involved in intracellular signalling PUBMED:10838565. Proteins containing this domain include AP endonuclease proteins, DNase I proteins, synaptojanin (an inositol-1,4,5-trisphosphate phosphatase) and sphingomyelinase.

    \ ' '347' 'IPR003761' '\

    Exonuclease VII is composed of two types of non-identical subunit: one large subunit and four small ones PUBMED:6284744. This enzyme catalyses exonucleolytic cleavage in either the 5\'-3\' or 3\'-5\' direction to yield nucleoside 5\'-phosphates.

    \ ' '348' 'IPR013520' '\ This entry includes a variety of exonuclease proteins, such as ribonuclease T PUBMED:8506149 and the epsilon subunit of DNA polymerase III. Ribonuclease T is responsible for the end-turnover of tRNA, and removes the terminal AMP residue from uncharged tRNA. DNA polymerase III is a complex, multichain enzyme responsible for most of the replicative synthesis in bacteria, and also exhibits 3\' to 5\' exonuclease activity.\ ' '349' 'IPR004263' '\ Hereditary multiple exostoses (EXT) is an autosomal dominant disorder that is characterised by the\ appearance of multiple outgrowths of the long bones (exostoses) at their epiphyses PUBMED:9473480. Mutations in two homologous genes, EXT1 and EXT2, are responsible for the EXT\ syndrome. The human and mouse EXT genes have at least two homologs in the invertebrate\ Caenorhabditis elegans, indicating that they do not function exclusively as regulators of bone growth.\ EXT1 and EXT2 have both been shown to encode glycosyltransferases involved in the chain\ elongation step of heparan sulphate biosynthesis PUBMED:9756849.\ ' '351' 'IPR004342' '\

    The EXS domain is named after ERD1/XPR1/SYG1 and proteins containing this motif include the C-terminal of the SYG1 G-protein associated signal transduction protein from Saccharomyces cerevisiae, and sequences that are thought to be Murine leukemia virus (MLV) receptors (XPR1). The N-terminal of these proteins often has an SPX domain PUBMED:9990033.

    \

    While the N-terminal is thought to be involved in signal transduction, the role of the C-terminal is not known. This region of similarity contains several predicted transmembrane helices. This family also includes the ERD1 (ERD: ER retention defective) S. cerevisiae proteins. ERD1 proteins are involved in the localization of endogenous endoplasmic reticulum (ER) proteins. Erd1 null mutants secrete such proteins even though they possess the C-terminal HDEL ER lumen localization label sequence. In addition, null mutants also exhibit defects in the Golgi-dependent processing of several glycoproteins, which led to the suggestion that the sorting of luminal ER proteins actually occurs in the Golgi, with subsequent return of these proteins to the ER via \'salvage\' vesicles PUBMED:2178921.

    \ ' '352' 'IPR006706' '\

    Extensins are homologous hydroxyproline-rich glycoproteins (HRGPs) found in the plant extracellular matrix. The key to the role of HRGPs in cell wall self-assembly and cell extension lies in their chemistry, which is dependent on extensive post-translational modifications (PTMs): hydroxylation, glycosylation, and cross-linking. Repetitive peptide motifs characterise HRGPs.

    \ \ \ ' '353' 'IPR001810' '\

    The F-box domain was first described as a sequence motif found in cyclin-F that interacts with the protein SKP1 PUBMED:8706131, PUBMED:9346238. This relatively conserved structural motif is present in numerous proteins and serves as a link between a target protein and a ubiquitin-conjugating enzyme. The SCF complex (i.e. Skp1-Cullin-F-box) plays a similar role as an E3 ligase in the ubiquitin protein degradation pathway PUBMED:9499404, PUBMED:9635407. Different\ F-box proteins, as part of the SCF complex, recruit particular substrates for ubiquitination through specific protein-protein interaction domains.

    \ \

    Many mammalian F-box proteins contain leucine-rich or WD-40 repeats. However, several F-box\ proteins either have other previously described domains, such as the Sec7 domain found in the FBS protein, or do not contain defined protein-protein\ interaction domains or motifs.

    \ ' '354' 'IPR000421' '\ Blood coagulation factors V and VIII contain a C-terminal, twice repeated,\ domain of about 150 amino acids, which is called F5/8 type C, FA58C, or C1/C2-like\ domain. In the Dictyostelium discoideum (Slime mold) cell adhesion protein discoidin, a related\ domain, named discoidin I-like domain, DLD, or DS, has been found which shares\ a common C-terminal region of about 110 amino acids with the FA58C domain, but\ whose N-terminal 40 amino acids are much less conserved. Similar domains have\ been detected in other extracellular and membrane proteins PUBMED:3092220, PUBMED:8390675, PUBMED:8639264.\ In coagulation factors V and VIII the repeated domains compose part of a\ larger functional domain which promotes binding to anionic phospholipids on\ the surface of platelets and endothelial cells PUBMED:3125864. The C-terminal domain of\ the second FA58C repeat (C2) of coagulation factor VIII has been shown to be\ responsible for phosphatidylserine-binding and essential for activity PUBMED:2110840, PUBMED:7515064.\ It forms an amphipathic alpha-helix, which binds to the membrane PUBMED:7893714.\ FA58C contains two conserved cysteines in most proteins, which link the\ extremities of the domain by a disulphide bond PUBMED:8504111, PUBMED:7613471, PUBMED:8856064. A further disulphide\ bond is located near the C-terminal of the second FA58C domain in MFGM PUBMED:8856064.\
    \
      +------------------------------------------------------------------------+\
      |                                                               +-+      |\
      |                                                               | |      |\
      CxPLGxxQITASxxxxxRLxxxWxxxxWxxxxxxQGxxxxxxxxxxxxGNxxxxxxxxxxRxPxcxcLRxExGC\
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    \'c\': cysteine involved in a disulphide bond in MFGM.\
    \'x\': any amino acid.\
    upper case letters: conserved residues.\
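
    The legend above can be read mechanically. The following is a minimal sketch, assuming the consensus line is treated as a strict pattern, which real FA58C sequences need not satisfy since the spacing and residues are only schematic. The function name `consensus_to_regex` and the toy sequence are hypothetical.

```python
import re

# Consensus line copied from the schematic above (trailing backslash dropped).
CONSENSUS = (
    "CxPLGxxQITASxxxxxRLxxxWxxxxWxxxxxxQGxxxxxxxxxxxxGNxxxxxxxxxxRxPxcxcLRxExGC"
)


def consensus_to_regex(consensus):
    """Translate the legend into a regular expression.

    'x' -> any amino acid ('.')
    'c' -> cysteine ('C'), disulphide-bonded in MFGM per the legend
    upper case letters -> the conserved residue itself
    """
    parts = []
    for ch in consensus:
        if ch == "x":
            parts.append(".")
        elif ch == "c":
            parts.append("C")
        else:
            parts.append(ch)
    return re.compile("".join(parts))


PATTERN = consensus_to_regex(CONSENSUS)

# Toy sequence (not a real protein): every 'x' filled with alanine.
TOY_SEQUENCE = CONSENSUS.replace("x", "A").replace("c", "C")

if __name__ == "__main__":
    print(PATTERN.pattern)
    print(bool(PATTERN.fullmatch(TOY_SEQUENCE)))
```

    In practice, profile-based methods tolerate insertions and substitutions that this literal translation does not.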
    
    \ ' '355' 'IPR005804' '\

    Fatty acid desaturases are enzymes that catalyse the insertion of a double bond at the delta position of fatty acids.

    \ \

    There seem to be two distinct families of fatty acid desaturases which do not seem to be evolutionarily related.

    \ \

    Family 1 is composed of:

    \ \ \

    Family 2 is composed of:

    \ \ \

    This entry contains fatty acid desaturases belonging to Family 1.

    \ ' '356' 'IPR004113' '\

    Some oxygen-dependent oxidoreductases are flavoproteins that contain a covalently bound FAD group which is attached to a histidine via an 8-alpha-(N3-histidyl)-riboflavin linkage. The region around the histidine that binds the FAD group is conserved in these enzymes.

    \ ' '357' 'IPR006094' '\

    Various enzymes use FAD as a co-factor. Most of these enzymes are oxygen-dependent oxidoreductases containing a covalently bound FAD group, which is attached to a histidine via an 8-alpha-(N3-histidyl)-riboflavin linkage. One of these enzymes, vanillyl-alcohol oxidase (VAO), has a solved structure; the alignment includes the\ FAD binding site, called the PP-loop, between residues 99-110 PUBMED:10984479. The FAD molecule is covalently bound in the known\ structure; however, the residue that links to the FAD is not in the alignment. VAO catalyses the oxidation of a wide\ variety of substrates, ranging from aromatic amines to 4-alkylphenols.

    \ ' '358' 'IPR008333' '\

    These sequences contain an oxidoreductase FAD-binding domain.

    \ \

    To date, the 3D-structures of the flavoprotein domain of Zea mays (Maize) nitrate reductase PUBMED:7812715 and of pig NADH:cytochrome b5 reductase PUBMED:7893687 have been solved. The overall fold is similar to that of ferredoxin:NADP+ reductase PUBMED:8027025: the FAD-binding domain (N-terminal) has the topology of an anti-parallel beta-barrel, while the NAD(P)-binding domain (C-terminal) has the topology of a classical pyridine dinucleotide-binding fold (i.e. a central parallel beta-sheet flanked by 2 helices on each side).

    \ ' '359' 'IPR005101' '\

    This entry represents a multi-helical domain composed of two all-alpha subdomains that is found as the C-terminal domain in cryptochrome proteins, as well as at the C-terminal of DNA photolyase where it acts as a FAD-binding domain (the N-terminal of DNA photolyase binds a light-harvesting cofactor).

    \ \

    Photolyases and cryptochromes are related flavoproteins that bind FAD. Photolyases harness the energy of blue light to repair DNA damage by removing pyrimidine dimers. Cryptochromes (CRY1 and CRY2) are blue light photoreceptors that mediate blue light-induced gene expression PUBMED:12535521, PUBMED:15299148.

    \ \

    DNA photolyases are DNA repair enzymes that repair mismatched pyrimidine dimers induced by exposure to ultra-violet light. They bind to UV-damaged DNA containing pyrimidine dimers and, upon absorbing a near-UV photon (300 to 500 nm), they catalyse dimer splitting, breaking the cyclobutane ring joining the two pyrimidines of the dimer so as to split them into the constituent monomers; this process is called photoreactivation. DNA photolyases require two chromophore cofactors for their activity. All monomers contain a reduced FAD moiety, and, in addition, either a reduced pterin or 8-hydroxy-5-deazaflavin as a second chromophore. Either chromophore may act as the primary photon acceptor, peak absorptions occurring in the blue region of the spectrum and in the UV-B region, at a wavelength around 290 nm PUBMED:7604260, PUBMED:15213381.

    \ ' '360' 'IPR004330' '\

    Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to Mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants.

    \ \

    The FRS (FAR1 Related Sequences) family of proteins, which includes FHY3 and FAR1, shares a similar domain structure with Mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represents transcription factors derived from Mutator-like transposases PUBMED:18715961, PUBMED:15591448.

    \ ' '361' 'IPR007397' '\

    Proteins containing this domain are associated with F-box domains, hence the name FBA. This domain is probably involved in binding other proteins that will be targeted for ubiquitination. At least one member is involved in binding to N-glycosylated proteins.

    \ ' '362' 'IPR001060' '\

    The FCH domain is a short conserved region of around 60 amino acids first described as a region of homology between FER and CIP4 proteins PUBMED:9210375. Many proteins containing an FCH domain are involved in the regulation of cytoskeletal rearrangements, vesicular transport and endocytosis. In the CIP4 protein the FCH domain binds to microtubules PUBMED:10713100. The FCH domain is always found N-terminally and is followed by a coiled-coil region.\

    \ \

    Proteins containing an FCH domain can be divided into three classes PUBMED:11994747:\

      \
    1. A subfamily of protein kinases usually associated with an SH2 domain.\
    2. Adaptor proteins usually associated with a C-terminal SH3 domain.\
    3. A subfamily of Rho-GAP proteins.\
    \

    \ ' '363' 'IPR001041' '\

    The ferredoxin protein family are electron carrier proteins with an iron-sulphur cofactor that act in a wide variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending upon the physiological nature of the iron-sulphur cluster(s) and according to sequence similarities.

    \

    This entry represents members of the 2Fe-2S ferredoxin family that have a general core structure consisting of beta(2)-alpha-beta(2), which includes putidaredoxin and terpredoxin, and adrenodoxin PUBMED:11173487, PUBMED:15755454, PUBMED:10220356, PUBMED:12069587. \ They are proteins of around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This conserved region is also found as a domain in various metabolic enzymes and in multidomain proteins, such as aldehyde oxidoreductase (N-terminal), xanthine oxidase (N-terminal), phthalate dioxygenase reductase (C-terminal), succinate dehydrogenase iron-sulphur protein (N-terminal), and methane monooxygenase reductase (N-terminal).

    \ \ ' '364' 'IPR002888' '\ The [2Fe-2S] binding domain is found in a range of enzymes including dehydrogenases, oxidases and oxidoreductases.\

    The aldehyde oxido-reductase (Mop) from the sulphate reducing anaerobic Gram-negative bacterium Desulfovibrio gigas is a homodimer of 907-residue subunits and is a member of the xanthine oxidase family. The protein contains a molybdopterin cofactor (Mo-co) and two different [2Fe-2S] centres. It is folded into four domains of which the first two bind the iron sulphur centres and the last two are involved in Mo-co binding. Mo-co is a molybdenum molybdopterin cytosine dinucleotide. Molybdopterin forms a tricyclic system with the pterin bicycle annealed to a pyran ring. The molybdopterin dinucleotide is deeply buried in the protein. The cis-dithiolene group of the pyran ring binds the molybdenum, which is coordinated by three more (oxygen) ligands PUBMED:7502041.

    \ ' '365' 'IPR007419' '\

    The two Fe ions are each coordinated by two conserved cysteine residues. This domain occurs alone in small proteins such as bacterioferritin-associated ferredoxin (BFD, ). The function of BFD is not known, but it may be a general redox and/or regulatory component involved in the iron storage or mobilisation functions of bacterioferritin in bacteria PUBMED:8639572. This domain is also found in nitrate reductase proteins in association with the nitrite and sulphite reductase 4Fe-4S domain (), nitrite/sulphite reductase ferredoxin-like half domain () and pyridine nucleotide-disulphide oxidoreductase (). It is also found in NifU nitrogen fixation proteins, in association with NifU-like N-terminal domain () and C-terminal domain ().

    \ ' '366' 'IPR001450' '\

    Ferredoxins are iron-sulphur proteins that mediate electron transfer in a range of metabolic reactions; they fall into several subgroups according to the nature of their iron-sulphur cluster(s) PUBMED:3932661, PUBMED:2506358. One group, originally found in bacteria, has been termed "bacterial-type", in which the active centre is a 4Fe-4S cluster. 4Fe-4S ferredoxins may in turn be subdivided into further groups, based on their sequence properties. Most contain at least one conserved domain, including four Cys residues that bind to a 4Fe-4S centre.

    \ \

    During the evolution of bacterial-type ferredoxins, intrasequence gene duplication, transposition and fusion events occurred, resulting in the appearance of proteins with multiple iron-sulphur centres: e.g. dicluster-type (2[4Fe-4S]) and polyferredoxins, iron-sulphur subunits of bacterial succinate dehydrogenase/fumarate reductase, formate hydrogenlyase and formate dehydrogenase complexes, pyruvate-flavodoxin oxidoreductase, NADH:ubiquinone reductase, as well as others. In some bacterial ferredoxins, one of the duplicated domains has lost one or more of the four conserved Cys residues. These domains have either lost their iron-sulphur binding property, or bind to a 3Fe-4S centre instead of a 4Fe-4S centre. 3D structures are now known both for a number of monocluster-type PUBMED:2600971 and dicluster-type PUBMED:7966291 4Fe-4S ferredoxins.

    \

    CAUTION: PRINTS signature in the current entry is known to miss protein matches and should be updated in the near future.

    \ ' '367' 'IPR008331' '\

    Ferritin is one of the major non-haem iron storage proteins in animals, plants, and microorganisms PUBMED:15222465. It consists of a mineral core of hydrated ferric oxide, and a multi-subunit protein shell that encloses the former and assures its solubility in an aqueous environment.

    \

    In animals the protein is mainly cytoplasmic and there are generally two or more genes that encode closely related subunits - in mammals there are two subunits which are known as H(eavy) and L(ight). In plants ferritin is found in the chloroplast PUBMED:2211706.

    \

    This family contains ferritins and other ferritin-like proteins such as members of the DPS family and bacterioferritins.

    \ ' '368' 'IPR013517' '\

    This region contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure PUBMED:7929213. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat PUBMED:7929213. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding PUBMED:8990162. A putative Ca2+ binding motif is found in some of the repeats.

    \ ' '369' 'IPR015425' '\

    Formin homology (FH) proteins play a crucial role in the reorganization of the actin cytoskeleton, which mediates various functions of the cell cortex including motility, adhesion, and cytokinesis PUBMED:10631086. Formins are multidomain proteins that interact with diverse signalling molecules and cytoskeletal proteins, although some formins have been assigned functions within the nucleus. Formins are characterised by the presence of three FH domains (FH1, FH2 and FH3), although members of the formin family do not necessarily contain all three domains PUBMED:12538772. The proline-rich FH1 domain mediates interactions with a variety of proteins, including the actin-binding protein profilin, SH3 (Src homology 3) domain proteins, and WW domain proteins. The FH2 domain is required for the self-association of formin proteins through the ability of FH2 domains to directly bind each other PUBMED:14576350, and may also act to inhibit actin polymerisation PUBMED:14992721. The FH3 domain () is less well conserved and may be important for determining intracellular localisation of formin family proteins. In addition, some formins can contain a GTPase-binding domain (GBD) () required for binding to Rho small GTPases, and a C-terminal conserved Dia-autoregulatory domain (DAD).

    \

    This entry represents the FH2 domain, which was shown by X-ray crystallography to have an elongated, crescent shape containing three helical subdomains PUBMED:15006353.

    \ ' '370' 'IPR003812' '\

    This entry contains the Fic (filamentation induced by cAMP) protein and doc (death on curing) protein. The Fic protein is involved in cell division and is suggested to be involved in the synthesis of PAB or folate, indicating that the Fic protein and cAMP are involved in a regulatory mechanism of cell division via folate metabolism PUBMED:1656497. This family contains a central conserved motif HPFXXGNG in most members. The exact molecular function of these proteins is uncertain. P1 lysogens of Escherichia coli carry the prophage as a stable low copy number plasmid. The frequency with which viable cells cured of prophage are produced is about 10(-5) per cell per generation PUBMED:1656497. A significant part of this remarkable stability can be attributed to a plasmid-encoded mechanism that causes death of cells that have lost P1 PUBMED:8411153. In other words, the lysogenic cells appear to be addicted to the presence of the prophage. The plasmid withdrawal response depends on a gene named doc (death on curing) that is represented by this family.
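
    The conserved motif HPFXXGNG mentioned above can be scanned for with a regular expression, reading X as any residue. This is a minimal sketch under that assumption; the function name `find_fic_motif` and the example sequence are hypothetical, and a literal motif scan is far less sensitive than a profile-based signature.

```python
import re

# HPFXXGNG with X read as "any amino acid".
FIC_MOTIF = re.compile(r"HPF..GNG")


def find_fic_motif(sequence):
    """Return 0-based start positions of HPFXXGNG matches in a protein sequence."""
    return [m.start() for m in FIC_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up peptide used purely for illustration.
    print(find_fic_motif("MKTHPFAEGNGRL"))
```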

    \ ' '371' 'IPR001179' '\

    Synonym(s): Peptidylprolyl cis-trans isomerase

    \ \

    FKBP-type peptidylprolyl isomerases in vertebrates are receptors for the two immunosuppressants FK506 and rapamycin. The drugs inhibit T cell proliferation by arresting two distinct cytoplasmic signal transmission pathways. Peptidylprolyl isomerases accelerate protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. These proteins are found in a variety of organisms.

    \ ' '372' 'IPR001122' '\

    Flaviviruses are small, enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Yellow fever virus, West Nile virus, Tick-borne encephalitis virus, Japanese encephalitis virus, and Dengue virus 2 PUBMED:15378043. Flaviviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope glycoproteins M and E. The virion of these viruses is a nucleocapsid covered by a lipoprotein envelope, where the nucleocapsid is a complex of capsid protein C and mRNA. The capsid protein C is a dimeric alpha-helical protein, and its interaction with RNA is critical for the production of viable virus particles PUBMED:12768036.

    \ ' '373' 'IPR000336' '\

    Flaviviruses are small, enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Yellow fever virus, West Nile virus, Tick-borne encephalitis virus, Japanese encephalitis virus, and Dengue virus 2 PUBMED:15378043. Flaviviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope glycoproteins M and E. Glycoprotein E is a class II viral fusion protein that mediates both receptor binding and fusion. Class II viral fusion proteins are found in flaviviruses and alphaviruses, and are structurally distinct from class I fusion proteins from influenza-type viruses and retroviruses. Glycoprotein E is comprised of three domains: domain I (dimerisation domain) is an 8-stranded beta barrel, domain II (central domain) is an elongated domain composed of twelve beta strands and two alpha helices, and domain III (immunoglobulin-like domain) is an IgC-like module with ten beta strands. This entry represents the Ig-like domain III, which contains a putative receptor-binding loop PUBMED:12759475.

    \ ' '374' 'IPR011999' '\

    Flaviviruses are small, enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Yellow fever virus (YFV), West Nile virus (WNV), Tick-borne encephalitis virus, Japanese encephalitis virus and Dengue virus 2 PUBMED:15378043. Flaviviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope glycoproteins M and E. Glycoprotein E is a class II viral fusion protein that mediates both receptor binding and fusion. Class II viral fusion proteins are found in flaviviruses and alphaviruses, and are structurally distinct from class I fusion proteins from influenza virus and HIV. Glycoprotein E is comprised of three domains: domain I (dimerisation domain) is an 8-stranded beta barrel, domain II (central domain) is an elongated domain composed of twelve beta strands and two alpha helices, and domain III (immunoglobulin-like domain) is an IgC-like module with ten beta strands. This entry represents domains I and II, which are intertwined PUBMED:7753193.

    \

    The glycoprotein E dimers on the viral surface re-cluster irreversibly into fusion-competent trimers upon exposure to low pH, as found in the acidic environment of the endosome. The formation of trimers results in a conformational change in the hinge region of domain II, a key structural element that opens a ligand-binding hydrophobic pocket at the interface between domains I and II. The conformational change results in the exposure of a fusion peptide loop at the tip of domain II, which is required in the fusion step to drive the cellular and viral membranes together by inserting into the membrane PUBMED:12759475.

    \ ' '375' 'IPR001850' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.
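
The link between triad order and clan can be illustrated with a minimal Python sketch. The names `triad_order`, `guess_clan` and `TRIAD_ORDER_TO_CLAN`, and any residue numbers passed in, are illustrative assumptions rather than part of the MEROPS classification procedure; only the three orders quoted above are encoded.

```python
# Illustrative lookup: linear order of catalytic residues -> clan, as quoted in the text.
TRIAD_ORDER_TO_CLAN = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}

def triad_order(his: int, asp: int, ser: int) -> str:
    """Return the one-letter codes of the triad residues sorted by sequence position."""
    residues = [(his, "H"), (asp, "D"), (ser, "S")]
    return "".join(code for _, code in sorted(residues))

def guess_clan(his: int, asp: int, ser: int):
    """Map a triad order to a clan label, or None if the order is not in the table."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(his, asp, ser))
```

With chymotrypsin-style numbering as an assumed input, `guess_clan(57, 102, 195)` yields `PA`. An order outside the table returns `None` rather than a guess, since the linear order only commonly (not always) reflects clan membership.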

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.

    \ \ \

    Flaviviruses produce a polyprotein from the ssRNA genome. The N-terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also shares conserved homology with NTP-binding proteins and the DEAD family of RNA helicases PUBMED:7642575, PUBMED:2174669, PUBMED:8269709.

    \ ' '376' 'IPR000069' '\

    Flaviviruses are small enveloped viruses with virions comprised of\ three proteins called C, M and E PUBMED:8676481, PUBMED:7913359, PUBMED:8437237. The envelope glycoprotein M is made as a precursor, called prM. The precursor portion of the protein is the signal peptide for the entry of the protein into the membrane. prM is cleaved to form M in a late-stage cleavage event. Associated with this cleavage is a change in the infectivity and fusion activity of the virus.

    \ ' '377' 'IPR001157' '\ The Flavivirus genome polypeptide contains the capsid protein C (core protein),\ the matrix protein (envelope protein M), the major envelope protein E, a number\ of small non-structural proteins (NS1, NS2A, NS2B, NS4A and NS4B), helicase and\ RNA-directed polymerase (NS5) PUBMED:9371625.\ ' '378' 'IPR000752' '\ NS2A is a hydrophobic protein about 25 kDa in size, which is cleaved from NS1 by a membrane \ bound host protease PUBMED:7474145. NS2A has been found to associate with the dsRNA within the \ vesicle packages. It has also been found that NS2A associates with the known replicase \ components and so NS2A has been postulated to be part of this replicase complex PUBMED:9636360.\ ' '379' 'IPR000487' '\

    Flaviviruses encode a single polyprotein. This is cleaved into\ three structural and seven non-structural proteins. All but two\ are cleaved by the NS2B-NS3 protease complex PUBMED:9499070, PUBMED:7884844.

    \ ' '380' 'IPR002563' '\

    The FMN-binding domain is found in NAD(P)H-flavin oxidoreductases (flavin reductases), a class of enzymes capable of producing reduced flavin for bacterial bioluminescence and other biological processes, and various other oxidoreductase and monooxygenase enzymes PUBMED:12829278, PUBMED:15461461, PUBMED:11017201.

    \

    This domain consists of a beta-barrel with Greek key topology, and is related to the ferredoxin reductase-like FAD-binding domain. The flavin reductases have a different dimerisation mode than that found in the PNP oxidase-like family, which also carries an FMN-binding domain with a similar topology.

    \ \ ' '381' 'IPR008254' '\

    This domain is found in a number of proteins including flavodoxin and nitric-oxide synthase. Flavodoxins are electron-transfer proteins that function in various electron transport systems. They bind one FMN molecule, which serves as a\ redox-active prosthetic group PUBMED:2597140 and are functionally interchangeable\ with ferredoxins. They have been isolated from prokaryotes, cyanobacteria, and\ some eukaryotic algae. Nitric oxide synthase () produces nitric oxide from L-arginine and NADPH. Nitric oxide acts as a messenger molecule in the body.

    \ ' '382' 'IPR007588' '\

    C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf\'s can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 PUBMED:11361095. C2H2 Znf\'s are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes PUBMED:10664601. Transcription factors usually contain several Znf\'s (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA PUBMED:10940247. C2H2 Znf\'s can also bind to RNA and protein targets PUBMED:18253864.

    \

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif:

    \
    \
    F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH\
    
    \

    where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs PUBMED:12723696. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain.
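
The FLYWCH consensus given above can be approximated as a regular expression for a quick scan of a protein sequence. This is a rough sketch, not the InterPro signature itself: the name `find_flywch` is hypothetical, and X(n) is read as one or more residues, which is an assumption because the entry gives no bounds for n.

```python
import re

# F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH
# Assumption: X(n) means one or more residues of any kind (lazy match).
FLYWCH_PATTERN = re.compile("[FY].+?L.+?[FY].+?W.C.{6,12}C.{17,22}H.H")

def find_flywch(sequence: str):
    """Return (start, end) spans of candidate FLYWCH-like motifs in a protein sequence."""
    return [m.span() for m in FLYWCH_PATTERN.finditer(sequence.upper())]
```

Because the unbounded X(n) stretches make the pattern permissive, any hit is only a candidate and would need checking against the curated signature.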

    \

    FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione S-transferase (GST) domain PUBMED:16944302.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '383' 'IPR005025' '\

    NADPH-dependent FMN reductase () reduces FMN and also reduces riboflavin and FAD, although more\ slowly. Members of this entry catalyse the reaction

    \ \ \ \ \ \ \ ' '384' 'IPR003378' '\

    The Notch receptor is a large, cell surface transmembrane protein involved in a wide variety of developmental processes in higher organisms PUBMED:10221902. It becomes activated when its extracellular region binds to ligands located on adjacent cells. Much of this extracellular region is composed of EGF-like repeats, many of which can be O-fucosylated. A number of these O-fucosylated repeats can in turn be further modified by the action of a beta-1,3-N-acetylglucosaminyltransferase enzyme known as Fringe PUBMED:12417415. Fringe potentiates the activation of Notch by Delta ligands, while inhibiting activation by Serrate/Jagged ligands. This regulation of Notch signalling by Fringe is important in many processes PUBMED:14570055.

    \ \

    Four distinct Fringe proteins have so far been studied in detail: Drosophila Fringe (Dfng) and its three mammalian homologues Lunatic Fringe (Lfng), Radical Fringe (Rfng) and Manic Fringe (Mfng). Dfng, Lfng and Rfng have all been shown to play important roles in developmental processes within their host, though the phenotype of mutants can vary between species, e.g. Rfng mutants are retarded in wing development in chickens, but have no obvious phenotype in mice PUBMED:7954826, PUBMED:12001066, PUBMED:9121551. Mfng mutants have not, so far, been characterised. Biochemical studies indicate that the Fringe proteins are fucose-specific transferases requiring manganese for activity and utilising UDP-N-acetylglucosamine as a donor substrate PUBMED:16221665. The three mammalian proteins show distinct variations in their catalytic efficiencies with different substrates.

    \ \

    This entry consists of Fringe proteins and related glycosyltransferase enzymes including:\

    \

    \ ' '385' 'IPR000539' '\ The frizzled (fz) locus of Drosophila coordinates the cytoskeletons of epidermal cells, producing a parallel array of cuticular hairs and bristles PUBMED:2174014, PUBMED:2493583. In fz mutants, the orientation of individual hairs with respect both to their neighbours and to the organism as a whole is altered. In the wild-type wing, all hairs point towards the distal tip PUBMED:2493583. In the developing wing, fz has 2 functions: it is required for the proximal-distal transmission of an intracellular polarity signal; and it is required for cells to respond to the polarity signal. Fz produces an mRNA that encodes an integral membrane protein with 7 putative transmembrane (TM) domains. This protein should contain both extracellular and cytoplasmic domains, which could function in the transmission and interpretation of polarity information PUBMED:2493583. This signature is usually found downstream of the Fz domain ()\ ' '386' 'IPR002900' '\

    This domain has no known function; it is presumed to be a protein-protein interaction module. It is found in many proteins from Caenorhabditis elegans and Caenorhabditis briggsae.\ The domain is found associated with, and C-terminal to, the cyclin-like F-box.

    \ ' '387' 'IPR002770' '\

    Formylmethanofuran:tetrahydromethanopterin formyltransferase (Ftr) is involved in C1 metabolism in methanogenic archaea, sulphate-reducing archaea and methylotrophic bacteria. It catalyses the following reversible reaction:

    \ \ \

    Ftr from the thermophilic methanogen Methanopyrus kandleri (optimum growth temperature 98 degrees C) is a hyperthermophilic enzyme that is absolutely dependent on the presence of lyotropic salts for activity and thermostability. The crystal structure of Ftr, determined to a reveals a homotetramer composed essentially of two dimers. Each subunit is subdivided into two tightly associated lobes both consisting of a predominantly antiparallel beta sheet flanked by alpha helices forming an alpha/beta sandwich structure. The approximate location of the active site was detected in a region close to the dimer interface PUBMED:9195883. Ftr from the mesophilic methanogen Methanosarcina barkeri and the sulphate-reducing archaeon Archaeoglobus fulgidus have a similar structure PUBMED:12192072.

    \ \

    In the methylotrophic bacterium Methylobacterium extorquens, Ftr interacts with three other polypeptides to form an Ftr/cyclohydrolase complex which catalyses the hydrolysis of formyl-tetrahydromethanopterin to formate during growth on C1 substrates PUBMED:12123819.

    \ ' '388' 'IPR002770' '\

    Formylmethanofuran:tetrahydromethanopterin formyltransferase (Ftr) is involved in C1 metabolism in methanogenic archaea, sulphate-reducing archaea and methylotrophic bacteria. It catalyses the following reversible reaction:

    \ \ \

    Ftr from the thermophilic methanogen Methanopyrus kandleri (optimum growth temperature 98 degrees C) is a hyperthermophilic enzyme that is absolutely dependent on the presence of lyotropic salts for activity and thermostability. The crystal structure of Ftr, determined to a reveals a homotetramer composed essentially of two dimers. Each subunit is subdivided into two tightly associated lobes both consisting of a predominantly antiparallel beta sheet flanked by alpha helices forming an alpha/beta sandwich structure. The approximate location of the active site was detected in a region close to the dimer interface PUBMED:9195883. Ftr from the mesophilic methanogen Methanosarcina barkeri and the sulphate-reducing archaeon Archaeoglobus fulgidus have a similar structure PUBMED:12192072.

    \ \

    In the methylotrophic bacterium Methylobacterium extorquens, Ftr interacts with three other polypeptides to form an Ftr/cyclohydrolase complex which catalyses the hydrolysis of formyl-tetrahydromethanopterin to formate during growth on C1 substrates PUBMED:12123819.

    \ ' '389' 'IPR002877' '\

    RrmJ (FtsJ) is a well conserved heat shock protein present in prokaryotes, archaea, and eukaryotes. RrmJ is responsible for\ methylating 23 S rRNA at position U2552 in the aminoacyl (A)-site of the ribosome PUBMED:11976298. U2552 is one of the five universally conserved A-loop residues and has been\ shown to be methylated at the ribose 2\'-OH group in the majority of organisms investigated so far. This suggests that this modification plays an important role in the\ A-loop function. RrmJ recognises its methylation target only when the 23 S rRNA is present in 50 S ribosomal subunits. This suggests that the RrmJ-mediated methylation must occur late in the maturation process of the\ ribosome. This is in contrast to other known 23 S rRNA modifications that occur in earlier maturation steps.

    \ \

    The 1.5 A crystal structure of RrmJ in complex with its cofactor S-adenosylmethionine revealed that RrmJ has a methyltransferase fold. The active site of RrmJ appears to be formed by a catalytic triad consisting of two lysine residues and the negatively charged aspartate residue. Another highly conserved glutamate residue that is present in the active site of RrmJ appears to play only a minor role in the methyltransfer reaction in vivo PUBMED:10983982.

    \ ' '390' 'IPR003838' '\ Uncharacterised domain in proteins of unknown function. Proteins that contain this domain are often predicted permeases and hypothetical transmembrane proteins.\ ' '391' 'IPR007219' '\

    This domain is found in a number of fungal transcription factors including transcriptional activator xlnR, yeast regulatory protein GAL4, and other transcription proteins regulating a variety of cellular and metabolic processes.

    \ ' '392' 'IPR000306' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two zinc ions PUBMED:8798641. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. FYVE-type domains are divided into two known classes: FYVE domains that specifically bind to phosphatidylinositol 3-phosphate in lipid bilayers and FYVE-related domains of undetermined function PUBMED:15576038. Those that bind to phosphatidylinositol 3-phosphate are often found in proteins targeted to lipid membranes that are involved in regulating membrane traffic PUBMED:11456498, PUBMED:11739631, PUBMED:11509568. Most FYVE domains target proteins to endosomes by binding specifically to phosphatidylinositol-3-phosphate at the membrane surface. By contrast, the CARP2 FYVE-like domain is not optimized to bind to phosphoinositides or insert into lipid bilayers. FYVE domains are distinguished from other zinc fingers by three signature sequences: an N-terminal WxxD motif, a basic R(R/K)HHCR patch, and a C-terminal RVC motif.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '393' 'IPR020067' '\ The Frizzled CRD (cysteine rich domain) is conserved in diverse proteins including several receptor tyrosine kinases\ PUBMED:9637908, PUBMED:9684897, PUBMED:9852758.\ In Drosophila melanogaster, members of the Frizzled family of tissue-polarity genes encode proteins that appear to function as cell-surface receptors for Wnts. The Frizzled genes belong to the seven transmembrane class of receptors (7TMR) and have in their extracellular region a cysteine-rich domain that has been implicated as the Wnt binding domain. Sequence similarity between the cysteine-rich domain of Frizzled and several receptor tyrosine kinases, which have roles in development, include the muscle-specific receptor tyrosine kinase (MuSK), the neuronal specific kinase (NSK2), and ROR1 and ROR2.\ \ The structure of this domain is known and is composed mainly of alpha helices.\ This domain contains ten conserved cysteines that form five disulphide bridges.\ ' '394' 'IPR000467' '\ The D111/G-patch domain PUBMED:10470032 is a short conserved region of about 40 amino acids\ which occurs in a number of putative RNA-binding proteins, including tumor \ suppressor and DNA-damage-repair proteins, suggesting that this\ domain may have an RNA binding function. This domain\ has seven highly conserved glycines.\ A multiple alignment of a small subset of D111/G-patch domains is shown in Fig. 2b\ of PUBMED:10353602.\ ' '395' 'IPR000101' '\

    Gamma-glutamyltranspeptidase () (GGT) PUBMED:2868390 catalyzes the transfer of the gamma-glutamyl moiety of glutathione to an acceptor that may be an amino acid, a peptide or water (forming glutamate). GGT plays a key role in the gamma-glutamyl cycle, a pathway for the synthesis and degradation of glutathione and drug and xenobiotic detoxification PUBMED:1378736. In prokaryotes and eukaryotes, it is an enzyme that consists of two polypeptide chains, a heavy and a light subunit, processed from a single chain precursor by an autocatalytic cleavage. The active site of GGT is known to be located in the light subunit. The sequences of mammalian and bacterial GGT show a number of regions of high similarity PUBMED:2570061. Pseudomonas cephalosporin acylases () that convert 7-beta-(4-carboxybutanamido)-cephalosporanic acid (GL-7ACA) into 7-aminocephalosporanic acid (7ACA) and glutaric acid are evolutionarily related to GGT and also show some GGT activity PUBMED:1358202. Like GGT, these GL-7ACA acylases are also composed of two subunits.

    \ \

    As an autocatalytic peptidase GGT belongs to MEROPS peptidase family T3 (gamma-glutamyltransferase family, clan PB(T)). The active site residue for members of this family and family T1 is C-terminal to the autolytic cleavage site. The type example is gamma-glutamyltransferase 1 from Escherichia coli.

    \ ' '396' 'IPR007246' '\

    GPI (glycosyl phosphatidyl inositol) transamidase is a multiprotein complex required for a terminal step in attaching the glycosylphosphatidylinositol (GPI) anchor to proteins. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase.

    \ ' '397' 'IPR003018' '\ This domain is present in phytochromes and cGMP-specific phosphodiesterases. cGMP-dependent 3\',5\'-cyclic phosphodiesterase () catalyses the conversion of guanosine 3\',5\'-cyclic phosphate to guanosine 5\'-phosphate.\ A phytochrome is a regulatory photoreceptor which exists in 2 forms that are reversibly interconvertible by light, the PR form that absorbs maximally in the red region of the spectrum, and the PFR form that absorbs maximally in the far-red region. This domain is also found in NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. \ NifA interacts with sigma-54.\ ' '398' 'IPR000840' '\

    Retroviral matrix proteins (or major core proteins) are components of envelope-associated capsids, which line the inner surface of virus envelopes and are associated with viral membranes PUBMED:9657938. Matrix proteins are produced as part of Gag precursor polyproteins. During viral maturation, the Gag polyprotein is cleaved into major structural proteins by the viral protease, yielding the matrix (MA), capsid (CA), nucleocapsid (NC), and some smaller peptides. Gag-derived proteins govern the entire assembly and release of the virus particles, with matrix proteins playing key roles in Gag stability, capsid assembly, transport and budding. Although matrix proteins from different retroviruses appear to perform similar functions and can have similar structural folds, their primary sequences can be very different.

    \

    This entry represents matrix proteins from gamma-retroviruses, such as Moloney murine leukemia virus (MoMLV), Feline leukemia virus (FLV), and Feline sarcoma virus (FESV) PUBMED:12467570, PUBMED:9740771. This entry also identifies matrix proteins from several eukaryotic endogenous retroviruses, which arise when one or more copies of the retroviral genome becomes integrated into the host genome PUBMED:12876457.

    \ ' '399' 'IPR002079' '\ The retroviral p12 protein is a proline rich virion structural protein found in the inner coat. The function carried out by\ p12 in assembly and replication is unknown.\ p12 is associated with pathogenicity of the virus PUBMED:7690416.\ ' '400' 'IPR000922' '\

    The D-galactoside binding lectin purified from sea urchin (Anthocidaris crassispina) eggs exists as a disulphide-linked homodimer of two subunits; the dimeric form is essential for hemagglutination activity PUBMED:2001368. The sea urchin egg lectin (SUEL) forms a new class of lectins. Although SUEL was first isolated as a D-galactoside binding lectin, it was later shown that it binds L-rhamnose preferentially PUBMED:2001368, PUBMED:10564781. L-rhamnose and D-galactose share the same hydroxyl group orientation at C2 and C4 of the pyranose ring structure.

    \

    A cysteine-rich domain homologous to the SUEL protein has been identified in the following proteins PUBMED:9261169, PUBMED:9668106, PUBMED:9920906:

    \ \ ' '402' 'IPR004152' '\ The GAT domain is responsible for binding of GGA proteins to several members of the ARF family including ARF1 PUBMED:10747089 and ARF3. The GAT domain stabilises membrane bound ARF1 in its GTP bound state, by interfering with GAP proteins PUBMED:11301005.\ ' '403' 'IPR000679' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents GATA-type zinc fingers (Znf). A number of transcription factors (including erythroid-specific transcription factor and nitrogen regulatory proteins), specifically bind the DNA sequence (A/T)GATA(A/G) PUBMED:2249770 in the regulatory regions of genes. They are consequently termed GATA-binding transcription factors. The interactions occur via highly-conserved Znf domains in which the zinc ion is coordinated by 4 cysteine residues PUBMED:2776214, PUBMED:8332909. NMR studies have shown the core of the Znf to comprise 2 irregular anti-parallel beta-sheets and an alpha-helix, followed by a long loop to the C-terminal end of the finger. The N-terminal part, which includes the helix, is similar in structure, but not sequence, to the N-terminal zinc module of the glucocorticoid receptor DNA-binding domain. The helix and the loop connecting the 2 beta-sheets interact with the major groove of the DNA, while the C-terminal tail wraps around into the minor groove. It is this tail that is the essential determinant of specific binding. Interactions between the Znf and DNA are mainly hydrophobic, explaining the preponderance of thymines in the binding site; a large number of interactions with the phosphate backbone have also been observed PUBMED:8332909. Two GATA zinc fingers are found in the GATA transcription factors. However, there are several proteins which contain only a single copy of the domain.
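
The consensus binding site (A/T)GATA(A/G) quoted above translates directly into a character-class regular expression. The following minimal Python sketch uses a hypothetical helper, `find_gata_sites`, and scans only the strand it is given; reverse-complement matches and overlapping sites are not handled.

```python
import re

# Consensus (A/T)GATA(A/G) written as a character-class regex.
GATA_SITE = re.compile("[AT]GATA[AG]")

def find_gata_sites(dna: str):
    """Return 0-based start offsets of consensus sites on the given strand only."""
    return [m.start() for m in GATA_SITE.finditer(dna.upper())]
```

A sequence match of this kind only flags candidate sites; as the entry notes, binding specificity depends on contacts made by the C-terminal tail, which a string search cannot model.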

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '404' 'IPR000991' '\

    Glutamine amidotransferase (GATase) () activity involves the removal of the ammonia group from a glutamine molecule and its subsequent transfer to a specific substrate, thus creating a new carbon-nitrogen group on the substrate. This activity is found in a range of biosynthetic enzymes, including glutamine amidotransferase, anthranilate synthase component II, p-aminobenzoate, and glutamine-dependent carbamoyl-phosphate synthase (CPSase). Glutamine amidotransferase (GATase) domains can occur either as single polypeptides, as in glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. On the basis of sequence similarities two classes of GATase domains have been identified PUBMED:3298209, PUBMED:6086650: class-I (also known as trpG-type) and class-II (also known as purF-type). Class-I GATase domains are defined by a conserved catalytic triad consisting of cysteine, histidine and glutamate. Class-I GATase domains have been found in the following enzymes: the second component of anthranilate synthase and 4-amino-4-deoxychorismate (ADC) synthase; CTP synthase; GMP synthase; glutamine-dependent carbamoyl-phosphate synthase; phosphoribosylformylglycinamidine synthase II; and the histidine amidotransferase hisH.

    \ \

    These signatures also detect peptidases belonging to MEROPS peptidase family C26 (gamma-glutamyl hydrolase), and non-peptidase homologs belonging to family C56 (PfpI endopeptidase) both of which are members of clan PC(C). Other members of family C56 are found in .

    \ ' '406' 'IPR006075' '\

    Glutamyl-tRNA(Gln) amidotransferase subunit B () PUBMED:9342321 is a microbial enzyme that furnishes a means for formation of correctly charged Gln-tRNA(Gln) through the transamidation of misacylated Glu-tRNA(Gln) in organisms which lack glutaminyl-tRNA synthetase. The reaction takes place in the presence of glutamine and ATP through an activated gamma-phospho-Glu-tRNA(Gln). The enzyme is composed of three subunits: A (an amidase), B and C. It also exists in eukaryotes as a protein targeted to the mitochondria.\

    \ ' '407' 'IPR018027' '\

    The GatB domain, the function of which is uncertain, is associated with aspartyl/glutamyl amidotransferase subunit B and glutamyl amidotransferase subunit E. These are involved in the formation of correctly charged Asn-tRNA(Asn) or Gln-tRNA(Gln) through the transamidation of misacylated Asp-tRNA(Asn) or Glu-tRNA(Gln) in organisms which lack either or both of asparaginyl-tRNA or glutaminyl-tRNA synthetases. The reaction takes place in the presence of glutamine and ATP through an activated phospho-Asp-tRNA(Asn) or phospho-Glu-tRNA(Gln).

    \ ' '408' 'IPR003902' '\

    GCM transcription factors are a family of proteins which contain a GCM motif. The GCM motif is a domain that has been\ identified in proteins belonging to a family of\ transcriptional regulators involved in fundamental developmental processes which comprise Drosophila melanogaster GCM and its mammalian\ homologs PUBMED:8962155, PUBMED:9114061, PUBMED:9580683, PUBMED:10671510. In GCM transcription factors the N-terminal moiety contains a DNA-binding domain of 150 residues. Sequence conservation is\ highest in this GCM domain. In contrast, the C-terminal moiety contains one or two transactivating regions and is only poorly conserved.

    The GCM motif has been shown to be a DNA binding domain that recognises preferentially the nonpalindromic octamer 5\'-ATGCGGGT-3\' PUBMED:8962155, PUBMED:9114061, PUBMED:9580683. The GCM motif contains many conserved basic amino acid residues, seven cysteine residues, and four histidine residues PUBMED:8962155. The conserved cysteines are involved in shaping the overall conformation of the domain, in the process of DNA binding and in the redox regulation of DNA binding PUBMED:9580683. The\ GCM domain is a new class of Zn-containing DNA-binding domain with no similarity to any other DNA-binding domain PUBMED:12682016. The GCM domain consists of a large and\ a small domain tethered together by one of the two Zn ions present in the structure. The large and the small domains comprise five- and three-stranded\ beta-sheets, respectively, with three small helical segments packed against the same side of the two beta-sheets. The GCM domain exercises a novel mode of\ sequence-specific DNA recognition, where the five-stranded beta-pleated sheet inserts into the major groove of the DNA. Residues protruding from the edge strand of\ the beta-pleated sheet and the following loop and strand contact the bases and backbone of both DNA strands, providing specificity for its DNA target site.

    \ ' '409' 'IPR004308' '\ This family represents the catalytic subunit of glutamate-cysteine ligase (), also known as\ gamma-glutamylcysteine synthetase (GCS). This enzyme catalyses the rate limiting step in the biosynthesis of glutathione.\ The eukaryotic enzyme is a dimer of a heavy chain and a light chain with all the catalytic activity exhibited by the heavy\ chain.\ ' '410' 'IPR004129' '\ Glycerophosphoryl diester phosphodiesterases display broad specificity for glycerophosphodiesters; glycerophosphocholine, glycerophosphoethanolamine, glycerophosphoglycerol, and bis(glycerophosphoglycerol) are all hydrolysed by this enzyme.\ ' '411' 'IPR007123' '\

    Gelsolin is a cytoplasmic, calcium-regulated, actin-modulating protein that binds\ to the barbed ends of actin filaments, preventing monomer exchange (end-blocking or\ capping) PUBMED:3023087. It can promote nucleation (the assembly of\ monomers into filaments), as well as sever existing filaments. In addition, this protein\ binds with high affinity to fibronectin. Plasma gelsolin and cytoplasmic gelsolin are\ derived from a single gene by alternate initiation sites and differential splicing.

    \

    Sequence comparisons indicate an evolutionary relationship between gelsolin,\ villin, fragmin and severin PUBMED:2850369. Six large repeating segments\ occur in gelsolin and villin, and 3 similar segments in severin and fragmin. While the\ multiple repeats have yet to be related to any known function of the actin-severing\ proteins, the superfamily appears to have evolved from an ancestral sequence of 120\ to 130 amino acid residues PUBMED:2850369.

    \ ' '412' 'IPR000683' '\

    This group of enzymes utilise NADP or NAD, and is known as the GFO/IDH/MOCA family in UniProtKB/Swiss-Prot. GFO is a glucose--fructose oxidoreductase, which converts D-glucose and D-fructose into D-gluconolactone and D-glucitol in the sorbitol-gluconate pathway. MOCA is a rhizopine catabolism protein which may catalyse the NADH-dependent dehydrogenase reaction involved in rhizopine catabolism. Other proteins belonging to this family include Gal80, a negative regulator for the expression of lactose and galactose metabolic genes; and several hypothetical proteins from yeast, Escherichia coli and Bacillus subtilis.

    \ \

    The oxidoreductase, N-terminal domain is almost always associated with the oxidoreductase, C-terminal domain (see ).

    \ ' '413' 'IPR004104' '\

    Enzymes containing this domain utilise NADP or NAD, and are known as the GFO/IDH/MOCA family in UniProtKB/Swiss-Prot. GFO is a glucose--fructose oxidoreductase, which converts D-glucose and D-fructose into\ D-gluconolactone and D-glucitol in the sorbitol-gluconate pathway. MOCA is a rhizopine catabolism protein which may catalyse the NADH-dependent dehydrogenase reaction involved in rhizopine catabolism. Other proteins belonging to this family include Gal80, a negative regulator for the expression of lactose and\ galactose metabolic genes; and several hypothetical proteins from yeast, Escherichia coli and Bacillus subtilis.

    \

    The oxidoreductase, C-terminal domain is almost always associated with the oxidoreductase, N-terminal domain (see ).

    \ ' '414' 'IPR000160' '\

    This domain appears to be ubiquitous in bacteria and is often linked to a regulatory domain, such as a phosphorylation receiver or oxygen sensing domain. Its function is to synthesize cyclic di-GMP, which is used as an intracellular signalling molecule in a wide variety of bacteria PUBMED:15075296,PUBMED:15716451. Enzymatic activity can be strongly influenced by the adjacent domains. Processes regulated by this domain include exopolysaccharide synthesis, biofilm formation, motility and cell differentiation.

    \ \

    Structural studies of PleD from Caulobacter crescentus show that this domain forms a five-stranded beta sheet surrounded by helices, similar to the catalytic core of adenylate cyclase PUBMED:15569936.

    \ ' '415' 'IPR004993' '\ Transcription of the gene family, GH3, has been shown to be specifically induced by the plant\ hormone auxin. The auxin-responsive GH3 gene promoter is composed of multiple auxin response elements (AuxREs), and each\ AuxRE contributes incrementally to the strong auxin inducibility to the promoter.\ ' '416' 'IPR006204' '\

    The galacto- (), homoserine (), mevalonate () and phosphomevalonate () kinases contain, in their N-terminal section, a conserved Gly/Ser-rich region which is probably involved in the binding of ATP PUBMED:1846667, PUBMED:10562426. This group of kinases has been called \'GHMP\' (from the first letter of their substrates).

    \ ' '417' 'IPR000294' '\

    The GLA (gamma-carboxyglutamic acid-rich) domain contains glutamate residues that have been post-translationally modified by vitamin K-dependent carboxylation to form gamma-carboxyglutamate (Gla) PUBMED:18374189, PUBMED:11818531, PUBMED:18374194. All glutamic acid (Glu) residues present in the GLA domain are potential carboxylation sites; in coagulation proteins, all Glu residues are modified to Gla, while in osteocalcin and matrix Gla proteins only some Glu residues are modified to Gla.

    \ \

    The GLA domain is responsible for the high-affinity binding of calcium ions. It starts at the N-terminal extremity of the mature form of proteins and ends with a conserved aromatic residue; a conserved Gla-x(3)-Gla-x-Cys motif PUBMED:3317405 is found in the middle of the domain which seems to be important for substrate recognition by the carboxylase.

    \ \

    The 3D structure of the GLA domain has been solved PUBMED:7713897, PUBMED:8663165. Calcium ions induce conformational changes in the GLA domain and are necessary for its proper folding. A common structural feature of functional GLA domains is the clustering of N-terminal hydrophobic residues into a hydrophobic patch that mediates interaction with the cell surface membrane PUBMED:8663165.

    \ \

    Proteins known to contain a GLA domain include PUBMED:18373251:

    \ \

    \ ' '418' 'IPR006096' '\

    Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction.

    \ \

    Glutamate dehydrogenases (, , and ) (GluDH) are enzymes that catalyse the NAD- and/or NADP-dependent reversible deamination of L-glutamate into alpha-ketoglutarate PUBMED:1358610, PUBMED:8315654. GluDH isozymes are generally involved with either ammonia assimilation or glutamate catabolism. Two separate enzymes are present in yeasts: the NADP-dependent enzyme, which catalyses the amination of alpha-ketoglutarate to L-glutamate; and the NAD-dependent enzyme, which catalyses the reverse reaction PUBMED:2989290 - this form links the L-amino acids with the Krebs cycle, which provides a major pathway for metabolic interconversion of alpha-amino acids and alpha-keto acids PUBMED:3368458.

    \ \

    Leucine dehydrogenase () (LeuDH) is an NAD-dependent enzyme that catalyses the reversible deamination of leucine and several other aliphatic amino acids to their keto analogues PUBMED:3069133. Each subunit of this octameric enzyme from Bacillus sphaericus contains 364 amino acids and folds into two domains, separated by a deep cleft. The nicotinamide ring of the NAD+ cofactor binds deep in this cleft, which is thought to close during the hydride transfer step of the catalytic cycle.

    \ \

    Phenylalanine dehydrogenase () (PheDH) is an NAD-dependent enzyme that catalyses the reversible deamination of L-phenylalanine into phenyl-pyruvate PUBMED:1880121.

    \ \

    Valine dehydrogenase () (ValDH) is an NADP-dependent enzyme that catalyses the reversible deamination of L-valine into 3-methyl-2-oxobutanoate PUBMED:8320231.

    \

    This entry represents the C-terminal domain of these proteins.

    \ ' '419' 'IPR000971' '\

    Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well studied family that is widely distributed in many organisms PUBMED:17540514. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors PUBMED:16600051. Several functionally different haemoglobins can coexist in the same species. The major types of globins include:

    \

    \ \

    This entry covers most of the globin family of proteins, but it omits some bacterial globins and the protoglobins.

    \

    More information about these proteins can be found at Protein of the Month: Haemoglobin PUBMED:.

    \ ' '420' 'IPR006982' '\

    Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively PUBMED:10357231). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.

    \

    Reaction of amidotransferase domain:

    \ \ \

    Reactions of FMN-binding domain:

    \ \ \ The central domain of glutamate synthase connects the N-terminal amidotransferase domain with the FMN-binding domain and has an alpha/beta overall topology PUBMED:11967268.\ ' '422' 'IPR015868' '\

    Glutaminases () deaminate glutamine to glutamate. In Bacillus subtilis, glutaminase is encoded by glnA, which is part of an operon, glnA-glnT (formerly ybgJ-ybgH), where glnT encodes a glutamine transporter. The glnA-glnT operon is regulated by the 2-component system GlnK-GlnL in response to glutamine PUBMED:15995196. This entry represents the core structural motif of a family of glutaminases that include GlnA, which are characterised by their beta-lactamase-like topology, containing a cluster of alpha-helices and an alpha/beta sandwich.

    \ ' '425' 'IPR007577' '\ The DXD motif is a short conserved motif found in many families of glycosyltransferases, which add a range of different sugars to other sugars, phosphates and proteins. DXD-containing glycosyltransferases all use nucleoside diphosphate sugars as donors and require divalent cations, usually manganese. The DXD motif is expected to play a carbohydrate binding role in sugar-nucleoside diphosphate and manganese dependent glycosyltransferases PUBMED:9653120.\ ' '426' 'IPR000757' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 16 comprises enzymes with a number of known activities; lichenase (); xyloglucan xyloglucosyltransferase (); agarase (); kappa-carrageenase (); endo-beta-1,3-glucanase (); endo-beta-1,3-1,4-glucanase (); endo-beta-galactosidase ().

    \ ' '427' 'IPR001223' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Some members of this family, , belong to the chitinase class II group which includes chitinase, chitodextrinase and the killer toxin of Kluyveromyces lactis. The chitinases hydrolyse chitin oligosaccharides. The family also includes various glycoproteins from mammals; cartilage glycoprotein and the oviduct-specific glycoproteins are two examples.

    \ ' '428' 'IPR001139' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 30 \ comprises enzymes with only one known activity; glucosylceramidase ().

    \ \

    Family 30 encompasses the mammalian glucosylceramidases. Human acid beta-glucosidase (D-glucosyl-N-acylsphingosine glucohydrolase)\ cleaves the glucosidic bonds of glucosylceramide and synthetic beta-glucosides PUBMED:3456607. Any one of over 50 different mutations in the gene of glucocerebrosidase has been found to affect activity of this hydrolase, producing variants of Gaucher disease, the most prevalent lysosomal storage disease PUBMED:3456607, PUBMED:9316290.

    \ ' '429' 'IPR000322' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 31 comprises enzymes with several known activities; alpha-glucosidase (), alpha-galactosidase (); glucoamylase (), sucrase-isomaltase () (); alpha-xylosidase (); alpha-glucan lyase ().

    \

    Glycoside hydrolase family 31 groups a number of glycosyl hydrolases on the basis of sequence\ similarities PUBMED:1747104, PUBMED:1761061, PUBMED:1743281.\ An aspartic acid has been implicated PUBMED:1856189 in the catalytic activity of sucrase,\ isomaltase, and lysosomal alpha-glucosidase.

    \ ' '430' 'IPR001382' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 47 comprises enzymes with only one known activity; alpha-mannosidase ().

    \

    Alpha-mannosidase is involved in the maturation of Asn-linked oligo-saccharides PUBMED:8144580. The enzyme hydrolyses terminal 1,2-linked alpha-D-mannose\ residues in the oligo-mannose oligosaccharide man(9)(glcnac)(2) in a\ calcium-dependent manner. The mannose residues are trimmed away to produce,\ first, man(8)glcnac(2), then a man(5)(glcnac)(2) structure.

    \ ' '431' 'IPR004888' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This is a family of eukaryotic enzymes belonging to glycosyl hydrolase family 63 (). They catalyse the specific cleavage of the\ non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase is the first enzyme in the N-linked oligosaccharide processing pathway.

    \ \ ' '432' 'IPR005154' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This represents a family of alpha-glucuronidases (). Deletion mutants have indicated that the central region is responsible for the catalytic activity. Within this central domain, the invariant Glu and Asp (residues 391 and 364 respectively from Bacillus stearothermophilus (Geobacillus stearothermophilus)) are thought to form the catalytic centre PUBMED:11358519.

    \ ' '433' 'IPR005198' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This is a family of alpha-1,6-mannanases belonging to glycoside hydrolase family 76 ().

    \ ' '434' 'IPR005201' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This group of endo-beta-N-acetylglucosaminidases belongs to the glycoside hydrolase family 85 (). These enzymes work on a broad spectrum of substrates.

    \ ' '435' 'IPR007235' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 28 comprises enzymes with a number of known activities; 1,2-diacylglycerol 3-beta-galactosyltransferase (); 1,2-diacylglycerol 3-beta-glucosyltransferase (); beta-N-acetylglucosamine transferase ().\ Structural analysis suggests the C-terminal domain contains the UDP-GlcNAc binding site.

    \ ' '436' 'IPR001503' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 10 comprises enzymes with two known activities; galactoside 3(4)-L-fucosyltransferase () and galactoside 3-fucosyltransferase ().

    \ \

    The galactoside 3-fucosyltransferases display similarities with the alpha-2 and alpha-6-fucosyltransferases PUBMED:9451017. The biosynthesis of the carbohydrate antigen sialyl Lewis X (sLe(x)) is dependent on the activity of a galactoside 3-fucosyltransferase. This enzyme catalyses the transfer of fucose from GDP-beta-fucose to the 3-OH of N-acetylglucosamine present in lactosamine acceptors PUBMED:9042366.

    \

    Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Galactoside 3(4)-L-fucosyltransferase () belongs to the Lewis blood group system and is associated with Le(a/b) antigen.

    \ ' '437' 'IPR001830' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 20 comprises enzymes with only one known activity; alpha, alpha-trehalose-phosphate synthase [UDP-forming] ().

    \ \ \

    Synthesis of trehalose in the yeast Saccharomyces cerevisiae is catalysed by the trehalose-6-phosphate (Tre6P) synthase/phosphatase complex, which is composed of at least three different subunits encoded by the genes TPS1, TPS2, and TSL1. Tps1 and Tps2 carry the catalytic activities of trehalose synthesis, namely Tre6P synthase (Tps1) and Tre6P phosphatase (Tps2), while Tsl1 has regulatory functions. There is some evidence that Tsl1 and Tps3\ may share a common function with respect to regulation and/or structural stabilisation of the Tre6P synthase/phosphatase complex in exponentially growing, heat-shocked cells PUBMED:9194697.

    \

    OtsA (trehalose-6-phosphate synthase) from Escherichia coli has homology to the full-length TPS1, the N-terminal part of TPS2 and an internal region of TPS3 (TSL1) of yeast PUBMED:8045430.

    \ ' '438' 'IPR002654' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 25 comprises enzymes with only one known activity; as a lipopolysaccharide biosynthesis protein. These enzymes catalyse the transfer of various sugars onto the growing lipopolysaccharide chain during its biosynthesis PUBMED:8817494.

    \ ' '439' 'IPR001675' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 29 () comprises enzymes with a number of known activities; sialyltransferase (), beta-galactosamide alpha-2,6-sialyltransferase (), alpha-N-acetylgalactosaminide alpha-2,6-sialyltransferase (), beta-galactoside alpha-2,3-sialyltransferase (), N-acetyllactosaminide alpha-2,3-sialyltransferase (), alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase (); lactosylceramide alpha-2,3-sialyltransferase (). These enzymes use a nucleotide monophosphosugar as the donor (CMP-NeuA) instead of a nucleotide diphosphosugar.

    \ \

    Sialyltransferase may be responsible for the synthesis of the sequence NEUAC-Alpha-2,3-GAL-Beta-1,3-GALNAC-, found on sugar chains O-linked to Thr or Ser and also as a terminal sequence on certain gangliosides. These enzymes catalyse sialyltransfer reactions during glycosylation, and are type II membrane proteins.

    \ ' '440' 'IPR005027' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 43 comprises enzymes with only one known activity; beta-glucuronyltransferase ().

    \ ' '441' 'IPR002495' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 8 comprises enzymes with a number of known activities; lipopolysaccharide galactosyltransferase (), lipopolysaccharide\ glucosyltransferase 1 (), glycogenin glucosyltransferase (), inositol 1-alpha-galactosyltransferase (). These enzymes have a distant similarity to family GT_24.

    \ ' '442' 'IPR001173' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \ This domain is found in a diverse family of glycosyl transferases that transfer the sugar\ from UDP-glucose, UDP-N-acetyl-galactosamine, GDP-mannose or CDP-abequose, to a range\ of substrates including cellulose, dolichol phosphate and teichoic acids.\ \ ' '443' 'IPR018481' '\

    This entry represents a conserved region found in a family of UDP-GlcNAc/MurNAc: polyisoprenol-P GlcNAc/MurNAc-1-P transferases. Members of the family include eukaryotic N-acetylglucosamine-1-phosphate transferases, which catalyse the conversion of UDP-N-acetyl-D-glucosamine and dolichyl phosphate to UMP and N-acetyl-D-glucosaminyl-diphosphodolichol in the glycosylation pathway;\ and bacterial phospho-N-acetylmuramoyl-pentapeptide-transferases, which catalyse the first step of the lipid cycle reactions in the biosynthesis of cell wall peptidoglycan.

    \ ' '444' 'IPR004360' '\ Glyoxalase I () (lactoylglutathione lyase) catalyzes the first step of the glyoxal pathway. S-lactoylglutathione is then converted by glyoxalase II to lactic acid PUBMED:7684374.\ Glyoxalase I is a ubiquitous enzyme which binds one mole of zinc\ per subunit. The bacterial and yeast enzymes are monomeric while the mammalian one is homodimeric. The sequence of glyoxalase I is well conserved. This domain is found in other related proteins including the Bleomycin resistance protein and dioxygenases, e.g. 4-hydroxyphenylpyruvate dioxygenase.\ ' '445' 'IPR007554' '\ Wall-associated teichoic acids are a heterogeneous class of phosphate-rich polymers that are covalently linked to the cell wall peptidoglycan of Gram-positive bacteria. They consist of a main chain of phosphodiester-linked polyols and/or sugar moieties attached to peptidoglycan via a linkage unit. CDP-glycerol:poly(glycerophosphate) glycerophosphotransferase is responsible for the polymerisation of the main chain of the teichoic acid by sequential transfer of glycerol-phosphate units from CDP-glycerol to the linkage unit lipid PUBMED:10648531.\ ' '446' 'IPR001674' '\

    The amidotransferase family of enzymes utilises the ammonia derived from the hydrolysis of glutamine for a subsequent chemical reaction catalyzed by the same enzyme. The ammonia intermediate does not dissociate into solution during the chemical transformations PUBMED:10387030.\ GMP synthetase is a glutamine amidotransferase from the de novo purine biosynthetic pathway. The C-terminal domain is specific to the GMP synthases . In prokaryotes this domain mediates dimerisation. Eukaryotic GMP synthases are monomers. This domain in eukaryotes includes several large insertions that may form globular domains PUBMED:8548458.

    \ ' '447' 'IPR000203' '\

    This domain has been termed the GPS domain (for GPCR proteolytic site), because it contains a cleavage site in latrophilin PUBMED:9920906. However, this region in latrophilin is found in many otherwise unrelated cell surface receptors PUBMED:10469603. There is currently no evidence that this domain provides a cleavage site in any of the other receptors. However, the peptide bond that is cleaved in latrophilin is between Leu and Thr residues that are conserved in some of the other receptors PUBMED:10469603.

    \ \

    GPS domains are about 50 residues long and contain either 2 or 4 cysteine residues that are likely to form disulphide bridges. Based on conservation of these cysteines the following pairing can be predicted.

    \ \
    \
                                 +-----------------+\
                                 |                 |\
               +-----------------+---------------+ |\
               |                 |               | |\
            XXXCXXXXXXXXXXXXXXXXXCXXXXXXXXXXXXXXXCXCXXLTXXXXXXX\
                                                       ^\
                                                       cleavage site\
    
    \ ' '448' 'IPR004182' '\

    The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. It is normally about 70 amino acids in length. It is thought to be an intracellular protein-binding or lipid-binding signalling domain, which has an important function in membrane-associated processes. Mutations in the GRAM domain of myotubularins cause a muscle disease, which suggests that the domain is essential for the full function of the enzyme PUBMED:11050430. Myotubularin-related proteins are a large subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids PUBMED:14690594.

    \ ' '449' 'IPR005202' '\

    Sequence analysis of the products of the GRAS (GAI, RGA, SCR) gene family indicates that they share a variable N-terminus and a highly conserved C-terminus that contains five recognizable motifs PUBMED:10341448. Proteins in the GRAS family are transcription factors that seem to be involved in development and other processes. Mutation of the SCARECROW (SCR) gene results in a radial pattern defect, loss of a ground tissue layer, in the root. The PAT1 protein is involved in phytochrome A signal transduction PUBMED:10817761.

    \

    \ GRAS proteins contain a conserved region of about 350 amino acids that can be\ divided into 5 motifs, found in the following order: leucine heptad repeat I,\ the VHIID motif, leucine heptad repeat II, the PFYRE motif and the SAW motif\ PUBMED:10341448, PUBMED:14760535. Plant-specific GRAS proteins have parallels in their motif structure to\ the animal Signal Transducers and Activators of Transcription (STAT) family of\ proteins PUBMED:10842311, which also suggests some parallels in their functions.

    \ ' '450' 'IPR000237' '\ The GRIP (golgin-97, RanBP2alpha,Imh1p and p230/golgin-245) domain\ PUBMED:10209120, PUBMED:10209123, PUBMED:10209125\ is found in many large coiled-coil proteins. It has been shown to\ be sufficient for targeting to the Golgi. The GRIP domain contains\ a completely conserved tyrosine residue.\ ' '451' 'IPR004212' '\ This region of sequence similarity is found up to six times in a variety of proteins including GTF2I. It has been suggested that this may be a DNA binding domain PUBMED:9774679, PUBMED:10198167.\ ' '452' 'IPR006169' '\

    Several proteins have recently been shown to contain the 5 structural motifs characteristic\ of GTP-binding proteins PUBMED:1449490. These include murine DRG protein; GTP1 protein\ from Schizosaccharomyces pombe; OBG protein from Bacillus subtilis; and several others.\ Although the proteins contain GTP-binding motifs and are similar to each other, they do\ not share sequence similarity to other GTP-binding proteins, and have thus been classed\ as a novel group, the GTP1/OBG family. As yet, the functions of these proteins are uncertain,\ but they have been shown to be important in development and normal cell metabolism\ PUBMED:8462872, PUBMED:2537815.

    \ ' '453' 'IPR000795' '\ Elongation factors belong to a family of proteins that promote the GTP-dependent binding of aminoacyl tRNA to the A site of ribosomes during protein biosynthesis, and catalyse the translocation of the synthesised protein chain from the A to the P site. The proteins are all relatively similar in the vicinity of their C-termini, and are also highly similar to a range of proteins that includes the nodulation Q protein from Rhizobium meliloti (Sinorhizobium meliloti), bacterial tetracycline resistance proteins PUBMED:2841293 and the omnipotent suppressor\ protein 2 from yeast.

    In both prokaryotes and eukaryotes, there are three distinct types of elongation factors: EF-1alpha (EF-Tu), which binds GTP and an aminoacyl-tRNA and delivers the latter to the A site of ribosomes; EF-1beta (EF-Ts), which interacts with EF-1a/EF-Tu to displace GDP and thus allows the\ regeneration of GTP-EF-1a; and EF-2 (EF-G), which binds GTP and peptidyl-tRNA and translocates the latter from the A site to the P site. In EF-1-alpha, a specific region has been shown PUBMED:3126836 to be involved in a conformational change mediated by the hydrolysis of GTP to GDP. This region is conserved in both EF-1alpha/EF-Tu and EF-2/EF-G and thus seems typical for GTP-dependent proteins which bind non-initiator tRNAs to the ribosome. The GTP-binding protein synthesis factor family also includes the eukaryotic peptide chain release factor GTP-binding subunits PUBMED:7556078 and prokaryotic peptide chain release factor 3 (RF-3) PUBMED:7737996; the prokaryotic GTP-binding protein lepA and its homologue in yeast\ (GUF1) and Caenorhabditis elegans (ZK1236.1); yeast HBS1 PUBMED:1394434; rat statin S1 PUBMED:1709933; and the prokaryotic selenocysteine-specific elongation factor selB PUBMED:2531290.

    \ ' '454' 'IPR004161' '\

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome PUBMED:12762045, PUBMED:15922593, PUBMED:12932732. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    \

    EF1A (also known as EF-1alpha or EF-Tu) is a G-protein. It forms a ternary complex of EF1A-GTP-aminoacyltRNA. The binding of aminoacyl-tRNA stimulates GTP hydrolysis by EF1A, causing a conformational change in EF1A that causes EF1A-GDP to detach from the ribosome, leaving the aminoacyl-tRNA attached at the A-site. Only the cognate aminoacyl-tRNA can induce the required conformational change in EF1A through its tight anticodon-codon binding PUBMED:15680978, PUBMED:12102560. EF1A-GDP is returned to its active state, EF1A-GTP, through the action of another elongation factor, EF1B (also known as EF-Ts or EF-1beta/gamma/delta).

    \

    EF1A consists of three structural domains. This entry represents domain 2 of EF2, which adopts a beta-barrel structure, and is involved in binding to charged tRNA PUBMED:7491491. This domain is structurally related to the C-terminal domain of EF2 (), to which it displays weak sequence matches. This domain is also found in other proteins such as translation initiation factor IF-2 and tetracycline-resistance proteins.

    \

    More information about these proteins can be found at Protein of the Month: Elongation Factors PUBMED:.

    \ \ \ \ ' '455' 'IPR004160' '\

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome PUBMED:12762045, PUBMED:15922593, PUBMED:12932732. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    \

    EF1A (also known as EF-1alpha or EF-Tu) is a G-protein. It forms a ternary complex of EF1A-GTP-aminoacyltRNA. The binding of aminoacyl-tRNA stimulates GTP hydrolysis by EF1A, causing a conformational change in EF1A that causes EF1A-GDP to detach from the ribosome, leaving the aminoacyl-tRNA attached at the A-site. Only the cognate aminoacyl-tRNA can induce the required conformational change in EF1A through its tight anticodon-codon binding PUBMED:15680978, PUBMED:12102560. EF1A-GDP is returned to its active state, EF1A-GTP, through the action of another elongation factor, EF1B (also known as EF-Ts or EF-1beta/gamma/delta).

    \

    EF1A consists of three structural domains. This entry represents the C-terminal domain, which adopts a beta-barrel structure, and is involved in binding to both charged tRNA and to EF1B (or EF-Ts, ) PUBMED:9253415.

    \

    More information about these proteins can be found at Protein of the Month: Elongation Factors PUBMED:.

    \ ' '456' 'IPR002489' '\

    Glutamate synthase (GltS) is a complex iron-sulphur flavoprotein that catalyses the reductive synthesis of L-glutamate from 2-oxoglutarate and L-glutamine via intramolecular channelling of ammonia, a reaction in the bacterial, yeast and plant pathways for ammonia assimilation PUBMED:11188694. GltS is a multifunctional enzyme that functions through three distinct active centres carrying out multiple reaction steps: L-glutamine hydrolysis, conversion of 2-oxoglutarate into L-glutamate, and electron uptake from an electron donor. The active centres are synchronised to avoid the wasteful consumption of L-glutamine PUBMED:11967268. There are three classes of GltS, which share many functional properties: bacterial NADPH-dependent GltS, ferredoxin-dependent GltS from photosynthetic cells, and NAD(P)H-dependent GltS from yeast, fungi and lower animals.

    \

    The dimeric alpha subunits each consist of four domains: N-terminal amidotransferase domain, the central domain, the FMN binding domain and the C-terminal domain. The C-terminal domain forms a right-handed beta-helix that comprises seven helical turns PUBMED:11188694. Each helical turn has a sharp bend that is associated with a repeated sequence motif consisting of G-XX-G-XXX-G. This domain does not contain any residues directly involved in catalysis, but has a crucial structural role.
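
    The repeated G-XX-G-XXX-G motif described above can be located in a sequence with a simple pattern search. The following is a minimal sketch, assuming that X stands for any residue; the function name and the toy sequence are hypothetical and are not taken from any real GltS protein.

```python
import re

# Regex form of the G-XX-G-XXX-G motif: G, any 2 residues, G, any 3 residues, G.
# The pattern is wrapped in a lookahead so that overlapping occurrences are reported.
GXXGXXXG = re.compile(r"(?=(G.{2}G.{3}G))")


def find_motif(sequence):
    """Return (start, matched_text) pairs (0-based start) for every motif occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in GXXGXXXG.finditer(seq)]


# Hypothetical toy sequence used only for illustration.
toy = "AAGLVGDHAGKKGTTTGAA"
hits = find_motif(toy)
for start, text in hits:
    print(start, text)
```

    A match only shows that the spacing of glycines fits the motif; it does not by itself indicate a beta-helix turn.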

    \

    This domain is also found in proteins such as subunit C of formylmethanofuran dehydrogenase, which catalyses the first step in methane formation from carbon dioxide in methanogenic archaea. There are two isoenzymes of formylmethanofuran dehydrogenase: a tungsten-containing isoenzyme (FwdC) and a molybdenum-containing isoenzyme (FmdC). The tungsten isoenzyme is constitutively transcribed, whereas transcription of the molybdenum operon is induced by molybdate PUBMED:9818358.

    \ \ ' '457' 'IPR004011' '\

    The GYR motif is found in several Drosophila melanogaster proteins, in either single or multiple copies. Its function is unknown; however, the presence of completely conserved tyrosine residues suggests that it could be a substrate for tyrosine kinases.

    \ ' '458' 'IPR005114' '\ This short domain is found in multiple copies in bacterial helicase proteins. The domain is predicted to contain 3 alpha helices. The function of this domain may be to bind nucleic acid.\ ' '459' 'IPR007502' '\ This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown; however, it seems likely to be involved in nucleic acid binding.\ ' '460' 'IPR006861' '\ This family includes the HABP4 family of hyaluronan-binding proteins, and the PAI-1 mRNA-binding protein, PAI-RBP1. HABP4 has been observed to bind hyaluronan (a glycosaminoglycan), but it is not known whether this is its primary role in vivo. It has also been observed to bind RNA, but with a lower affinity than that for hyaluronan PUBMED:10887182. PAI-1 mRNA-binding protein specifically binds the mRNA of type-1 plasminogen activator inhibitor (PAI-1), and is thought to be involved in regulation of mRNA stability PUBMED:11001948. However, in both cases, the sequence motifs predicted to be important for ligand binding are not conserved throughout the family, so it is not known whether members of this family share a common function.\ ' '461' 'IPR003106' '\

    This region is a plant specific leucine zipper that is always found\ associated with a homeobox PUBMED:7915839.

    \ ' '462' 'IPR003660' '\

    The HAMP linker domain (present in Histidine kinases, Adenyl cyclases, Methyl-accepting proteins and Phosphatases) is an approximately 50-amino acid alpha-helical region. It is found in bacterial sensor and chemotaxis proteins and in eukaryotic histidine kinases. The bacterial proteins are usually integral membrane proteins and part of a two-component signal transduction pathway. One or several copies of the HAMP domain can be found in association with other domains, such as the histidine kinase domain, the bacterial chemotaxis sensory transducer domain, the PAS repeat, the EAL domain, the GGDEF domain, the protein phosphatase 2C-like domain, the guanylate cyclase domain, or the response regulatory domain. It has been suggested that the HAMP domain possesses a role of regulating the phosphorylation or methylation of homodimeric receptors by transmitting the conformational changes in periplasmic ligand-binding domains to cytoplasmic signalling kinase and methyl-acceptor domains.

    \ ' '463' 'IPR006933' '\

    This family is defined by an N-terminal conserved region found in several huntingtin-associated protein 1 (HAP1) homologues. HAP1 binds to huntingtin in a polyglutamine repeat-length-dependent manner. However, its possible role in the pathogenesis of Huntington\'s disease is unclear. This family also includes a similar N-terminal conserved region from hypothetical protein products of ALS2CR3 genes found in the human juvenile amyotrophic lateral sclerosis critical region 2q33-2q34 PUBMED:11161814.

    \ ' '464' 'IPR003107' '\

    The HAT (Half A TPR) repeat has a repetitive pattern characterised by three aromatic residues with a conserved spacing. They are structurally and sequentially similar to TPRs (tetratricopeptide repeats), though they lack the highly conserved alanine and glycine residues found in TPRs. The number of HAT repeats found in different proteins varies between 9 and 12. HAT-repeat-containing proteins appear to be components of macromolecular complexes that are required for RNA processing PUBMED:9478129. The repeats may be involved in protein-protein interactions. The HAT motif has striking structural similarities to HEAT repeats (), being of a similar length and consisting of two short helices connected by a loop domain, as in HEAT repeats.

    \ \ \ \ ' '465' 'IPR003594' '\

    This domain is found in several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases PUBMED:15105144, heat shock protein HSP90 PUBMED:15292259, PUBMED:14718169, PUBMED:15217611, phytochrome-like ATPases and DNA mismatch repair proteins.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '466' 'IPR011531' '\

    Bicarbonate (HCO3-) transport mechanisms are the principal regulators of pH in animal cells. Such transport also plays a vital role in acid-base movements in the stomach, pancreas, intestine, kidney, reproductive organs and the central nervous system. Functional studies have suggested four different HCO3- transport modes. Anion exchanger proteins exchange HCO3- for Cl- in a reversible, electroneutral manner PUBMED:2289848. Na+/HCO3- co-transport proteins mediate the coupled movement of Na+ and HCO3- across plasma membranes, often in an electrogenic manner PUBMED:. Na+-driven Cl-/HCO3- exchange and K+/HCO3- exchange activities have also been detected in certain cell types, although the molecular identities of the proteins responsible remain to be determined.

    \ \

    Sequence analysis of the two families of HCO3- transporters that have been cloned to date (the anion exchangers and Na+/HCO3- co-transporters) reveals that they are homologous. This is not entirely unexpected, given that they both transport HCO3- and are inhibited by a class of pharmacological agents called disulphonic stilbenes PUBMED:9235899. They share around 25-30% sequence identity, which is distributed along their entire sequence length, and have similar predicted membrane topologies, suggesting they have ~10 transmembrane (TM) domains.
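
    A sequence identity figure such as the one quoted above is computed from a pairwise alignment. The sketch below shows one common way to do this, assuming two pre-aligned sequences of equal length and assuming that columns containing a gap are excluded from the denominator; other conventions exist, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment, not real transporter sequences.
identity = percent_identity("MKT-LLVA", "MRTALL-A")
print(round(identity, 1))
```

    Because the choice of denominator changes the result, identity values are only comparable when they are computed with the same convention.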

    \

    This domain is found at the C-terminus of many bicarbonate transport proteins. It is also found in some plant proteins responsible for boron transport PUBMED:12447444. In these proteins it covers almost the entire length of the sequence.

    \ ' '467' 'IPR006674' '\ This domain is found in a superfamily of enzymes with a predicted or known phosphohydrolase activity. These enzymes appear to be involved in the nucleic acid metabolism, signal transduction and possibly other functions in bacteria, archaea and eukaryotes.\ The fact that all the highly conserved residues in the HD superfamily are histidines or aspartates suggests that coordination of divalent cations is essential for the activity of these proteins PUBMED:9868367.\ ' '468' 'IPR001650' '\

    The domain that defines this group of proteins is found in a wide variety of helicases and helicase-related proteins. It may be that this is not an autonomously folding unit, but an integral part of the helicase.

    \ \

    The eukaryotic translation initiation factor 4A (eIF4A) is a member of the DEA(D/H)-box RNA helicase family. This is a diverse group of proteins that couples an ATPase activity to RNA binding and unwinding. The structure of the carboxyl-terminal domain of eIF4A has been determined to 1.75 A resolution; it has a parallel alpha-beta topology that superimposes, with minor variations, on the structures and conserved motifs of the equivalent domain in other, distantly related helicases PUBMED:11087862.

    \ ' '469' 'IPR003754' '\

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway PUBMED:16564539. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin PUBMED:17227226.

    \

    The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase (), or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase () to charge a tRNA with glutamate, glutamyl-tRNA reductase () to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase () to catalyse a transamination reaction to produce ALA.

    \

    The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase, ) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase, ) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase () to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.

    \

    Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase (). To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase () PUBMED:11215515.
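
    The stages described above can be summarised as an ordered list of enzymatic steps. The following is a minimal sketch that encodes only the steps named in this entry as (substrate, enzyme, product) records and walks them to list the enzymes between two intermediates; the variable and function names are hypothetical, and the abbreviations ALA, GSA and PBG follow the text.

```python
# Each record is (substrate, enzyme, product), taken from the description above.
STEPS = [
    ("succinyl-CoA + glycine", "ALA synthase", "ALA"),
    ("glutamate", "glutamyl-tRNA synthetase", "glutamyl-tRNA"),
    ("glutamyl-tRNA", "glutamyl-tRNA reductase", "GSA"),
    ("GSA", "GSA aminotransferase", "ALA"),
    ("ALA", "PBG synthase", "PBG"),
    ("PBG", "hydroxymethylbilane synthase", "preuroporphyrinogen"),
    ("preuroporphyrinogen", "uroporphyrinogen III synthase", "uroporphyrinogen III"),
    ("uroporphyrinogen III", "uroporphyrinogen III methyltransferase", "precorrin-2"),
    ("uroporphyrinogen III", "uroporphyrinogen III decarboxylase", "coproporphyrinogen III"),
]


def enzyme_route(start, target, steps=STEPS):
    """Return the enzymes leading from start to target, or None if no route exists."""
    if start == target:
        return []
    for substrate, enzyme, product in steps:
        if substrate == start:
            rest = enzyme_route(product, target, steps)  # depth-first walk
            if rest is not None:
                return [enzyme] + rest
    return None


route = enzyme_route("glutamate", "coproporphyrinogen III")
for name in route:
    print(name)
```

    The walk terminates because the steps listed here contain no cycles; a pathway table with reversible or cyclic steps would need a visited set.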

    \ \ \

    This entry represents uroporphyrinogen III synthase () which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the inversion of the final pyrrole unit (ring D) of the linear tetrapyrrole molecule, linking it to the first pyrrole unit (ring A), thereby generating a large macrocyclic structure called uroporphyrinogen III PUBMED:11215515. The enzyme folds into two alpha/beta domains connected by a beta-ladder, the active site being located between the two domains PUBMED:11689424. Congenital erythropoietic porphyria (CEP) is an autosomal recessive inborn error of metabolism that results from the markedly deficient activity of uroporphyrinogen III synthase PUBMED:17270473.

    \ ' '470' 'IPR001199' '\ Cytochromes b5 are ubiquitous electron transport proteins found in animals, plants and\ yeasts PUBMED:2752049. The microsomal and mitochondrial variants are membrane-bound, \ while those from erythrocytes and other animal tissues are water-soluble PUBMED:4030743, PUBMED:8439576.

    The 3D structure of bovine cyt b5 is known, the\ fold belonging to the alpha+beta class, with 5 strands and 5 short helices\ forming a framework for supporting a central haem group PUBMED:1167544. The cytochrome b5 domain is similar to that of a number\ of oxidoreductases, such as plant and fungal nitrate reductases, sulphite oxidase, yeast\ flavocytochrome b2 (L-lactate dehydrogenase) and plant cyt b5/acyl lipid desaturase\ fusion protein.

    \ ' '471' 'IPR012312' '\

    The haemerythrin family is composed of haemerythrin proteins found in invertebrates, and a broader collection of bacterial and archaeal homologues. Haemerythrin is an oxygen-binding protein found in the vascular system and coelomic fluid, or in muscles (myohaemerythrin) in invertebrates PUBMED:2065779. Many of the homologous proteins found in prokaryotes are multi-domain proteins with signal-transducing domains such as the GGDEF diguanylate cyclase domain () and methyl-accepting chemotaxis protein (MCP) signalling domain (). Most haemerythrins are oxygen-carriers with a bound non-haem iron, but at least one example is a cadmium-binding protein, apparently with a role in sequestering toxic metals rather than in binding oxygen. The prokaryote with the most instances of this domain is Magnetococcus sp. MC-1, a magnetotactic bacterium.

    \

    Haemerythrins and myohaemerythrins PUBMED:3681996, PUBMED:2362933 are small proteins of about 110 to 129 amino acid residues that bind two iron atoms. They are left-twisted 4-alpha-helical bundles, which provide a hydrophobic pocket where dioxygen binds as a peroxo species, interacting with adjacent aliphatic side chains via van der Waals forces PUBMED:3856224. In both haemerythrins and myohaemerythrins, the active centre is a binuclear iron complex, bound directly to the protein via 7 amino acid side chains PUBMED:3856224: 5 His, 1 Glu and 1 Asp PUBMED:2065779. Ovohaemerythrin PUBMED:1425663, a yolk protein from the leech Theromyzon tessulatum, seems to belong to this family of proteins; it may play a role in the detoxification of free iron after a blood meal PUBMED:1425663.

    \ \

    This entry represents a haemerythrin/HHE cation-binding motif that occurs as a duplicated domain in haemerythrin and related proteins. This domain binds iron in haemerythrin, but can bind other metals in related proteins, such as cadmium in a Nereis diversicolor protein (). A bacterial protein, , is a regulator of response to NO, which suggests a different set-up for its metal ligands. A protein from Cryptococcus neoformans (Filobasidiella neoformans) that contains haemerythrin/HHE cation-binding motifs is also involved in NO response PUBMED:17661046. A Staphylococcus aureus protein () has been noted to be important when the organism switches to living in environments with low oxygen concentrations; perhaps this protein acts as an oxygen store or scavenger.

    \ ' '472' 'IPR001343' '\ Gram-negative bacteria produce a number of proteins that are secreted into the growth medium by a mechanism that does not require a cleaved N-terminal signal sequence. These proteins, while having different functions, seem to share two properties: they bind calcium and they contain a multiple tandem repeat of a nonapeptide PUBMED:2303029. The nonapeptide is found in a group of bacterial exported proteins that includes haemolysin, cyclolysin, leukotoxin and metallopeptidases belonging to MEROPS peptidase family M10 (clan MA(M)), subfamily 10B (serralysin).

    It has been suggested that the internally \ repeated domain of haemolysin may be involved in Ca-mediated binding to erythrocytes. It has been shown that such a domain is involved in the binding of calcium ions in a parallel beta roll structure PUBMED:8253063.

    \ ' '473' 'IPR001451' '\

    A variety of bacterial transferases contain a repeat structure composed of tandem repeats of a [LIV]-G-X(4) hexapeptide, which, in the tertiary structure of LpxA (UDP N-acetylglucosamine acyltransferase) PUBMED:7481807, has been shown to form a left-handed parallel beta helix. A number of different transferase protein families contain this repeat, such as galactoside acetyltransferase-like proteins PUBMED:11937062, the gamma-class of carbonic anhydrases PUBMED:10924115, and tetrahydrodipicolinate-N-succinyltransferases (DapD), the latter containing an extra N-terminal 3-helical domain PUBMED:11910040.
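
    Tandem copies of the [LIV]-G-X(4) hexapeptide can be detected with a pattern search. The sketch below is a simplified illustration, assuming perfect back-to-back repeats and reporting only non-overlapping, leftmost runs of two or more copies; real hexapeptide-repeat proteins often contain imperfect repeats that this pattern would miss. The function name and toy sequence are hypothetical.

```python
import re

# [LIV], then G, then any 4 residues, repeated at least twice in tandem.
HEXAPEPTIDE_RUN = re.compile(r"(?:[LIV]G.{4}){2,}")


def tandem_runs(sequence):
    """Return (start, repeat_count) for each run of tandem hexapeptide repeats."""
    seq = sequence.upper()
    return [(m.start(), len(m.group(0)) // 6) for m in HEXAPEPTIDE_RUN.finditer(seq)]


# Hypothetical toy sequence with three back-to-back repeats starting at index 2.
toy = "MK" + "IGAAAA" + "VGTTTT" + "LGSSSS" + "PP"
runs = tandem_runs(toy)
print(runs)
```

    Each reported run gives the 0-based start position and the number of consecutive hexapeptides matched from that position.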

    \ ' '474' 'IPR004154' '\ tRNA synthetases, or tRNA ligases, are involved in protein synthesis. This domain is found in histidyl, glycyl, threonyl and prolyl tRNA synthetases PUBMED:10447505; it is probably the anticodon binding domain PUBMED:9115984.\ ' '475' 'IPR005213' '\

    This short (30 amino acids) repeat is found in a number of plant proteins. It contains a conserved HGWP motif, hence its name. The function of these proteins is unknown.

    \ ' '476' 'IPR000445' '\ The HhH motif is a domain of around 20 amino acids present in prokaryotic and\ eukaryotic non-sequence-specific DNA binding proteins PUBMED:7664751, PUBMED:9973609, PUBMED:9987128. \ The HhH motif is similar to, but distinct from, the HtH motif. Both of these\ motifs have two helices connected by a short turn. In the HtH motif the second\ helix binds to DNA with the helix in the major groove. This allows contact\ between specific bases and residues throughout the protein. In the HhH motif\ the second helix does not protrude from the surface of the protein and\ therefore cannot lie in the major groove of the DNA. Crystallographic studies\ suggest that the interaction of the HhH domain with DNA is mediated by amino\ acids located in the strongly conserved loop (L-P-G-V) and at the N-terminal\ end of the second helix PUBMED:7664751. This interaction could involve the formation of\ hydrogen bonds between protein backbone nitrogens and DNA phosphate groups\ PUBMED:8692686. \ The structural difference between the HtH and HhH domains is reflected at the\ functional level: whereas the HtH domain, found primarily in gene regulatory\ proteins, binds DNA in a sequence-specific manner, the HhH domain is rather\ found in proteins involved in enzymatic activities and binds DNA with no\ sequence specificity PUBMED:8692686.\ ' '477' 'IPR001767' '\

    This domain identifies a group of cysteine peptidases corresponding to MEROPS peptidase family C46 (clan CH). The type example is the Hedgehog protein from Drosophila melanogaster (Fruit fly). These are involved in intracellular signalling required for a variety of patterning events during development.

    \ \

    The hedgehog family of proteins self process by a cysteine-dependent mechanism, which is a one-time autolytic cleavage. It is differentiated from a typical peptidase reaction by the fact that the newly-formed carboxyl group\ is esterified with cholesterol, rather than being left free. The three-dimensional structure of the autolytic domain of the hedgehog protein of D. melanogaster shows that it is formed from two divergent copies of a\ module that also occurs in inteins, called a Hint domain PUBMED:9335337,PUBMED:9489693.

    \ ' '478' 'IPR013820' '\

    ATP phosphoribosyltransferase () is the enzyme that catalyzes the first step in the biosynthesis of histidine in bacteria, fungi and plants as shown below. It is a member of the larger phosphoribosyltransferase superfamily of enzymes which catalyse the condensation of 5-phospho-alpha-D-ribose 1-diphosphate with nitrogenous bases in the presence of divalent metal ions PUBMED:11751055.\ \ \ \ Histidine biosynthesis is an energetically expensive process and ATP phosphoribosyltransferase activity is subject to control at several levels. Transcriptional regulation is based primarily on nutrient conditions and determines the amount of enzyme present in the cell, while feedback inhibition rapidly modulates activity in response to cellular conditions. The enzyme has been shown to be inhibited by 1-(5-phospho-D-ribosyl)-ATP, histidine, ppGpp (a signal associated with adverse environmental conditions) and ADP and AMP (which reflect the overall energy status of the cell). As this pathway of histidine biosynthesis is present only in prokaryotes, plants and fungi, this enzyme is a promising target for the development of novel antimicrobial compounds and herbicides.

    \ \

    ATP phosphoribosyltransferase is found in two distinct forms: a long form containing two catalytic domains and a C-terminal regulatory domain, and a short form in which the regulatory domain is missing. The long form is catalytically competent, but in organisms with the short form, a histidyl-tRNA synthetase paralogue, HisZ, is required for enzyme activity PUBMED:10430882.\ This entry represents the catalytic region of this enzyme.

    \ \

    The structures of the long form enzymes from Escherichia coli () and Mycobacterium tuberculosis () have been determined PUBMED:14741209, PUBMED:12511575. The enzyme itself exists in equilibrium between an active dimeric form, an inactive hexameric form and higher aggregates. Interconversion between the various forms is largely reversible and is influenced by the binding of the natural substrates and inhibitors of the enzyme. The two catalytic domains are linked by a two-stranded beta-sheet and together form a "periplasmic binding protein fold". A crevice between these domains contains the active site. The C-terminal domain is not directly involved in catalysis but appears to be involved in the formation of hexamers, induced by the binding of inhibitors such as histidine to the enzyme, thus regulating activity.

    \ ' '479' 'IPR003661' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems are composed of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347, PUBMED:11406410. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.

    \

    HKs can be roughly divided into two classes: orthodox and hybrid kinases PUBMED:8029829, PUBMED:1482126. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK PUBMED:10966457. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain.

    \

    This entry represents the dimerisation and phosphoacceptor domain found in histidine kinases. It has been found in bacterial sensor protein/histidine kinases. Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The homodimeric domain includes the site of histidine autophosphorylation and phosphate transfer reactions. The structure of the homodimeric domain comprises a closed, four-helical bundle with a left-handed twist, formed by two identical alpha-hairpin subunits.

    \ ' '480' 'IPR000286' '\ Histones can be reversibly acetylated on several lysine residues.\ Regulation of transcription is caused in part by this\ mechanism. Histone deacetylases catalyse the removal\ of the acetyl group. Histone deacetylases, acetoin utilization proteins and acetylpolyamine amidohydrolases are all members of this ancient protein superfamily PUBMED:9278492.\ ' '481' 'IPR001092' '\ Basic helix-loop-helix proteins (bHLH) are a group of eukaryotic transcription factors that exert a determinative influence in a variety of developmental pathways. These transcription factors are characterised by a highly evolutionarily conserved bHLH domain that mediates specific dimerisation PUBMED:7553065. They facilitate the conversion of inactive monomers to trans-activating dimers at appropriate stages of development PUBMED:1755826. \

    The bHLH proteins can be classified into discrete categories. One such subdivision according to dimerisation, DNA binding and expression characteristics defines seven groups PUBMED:8018712. Class I proteins form dimers within the group or with class II proteins. Class II can only form heterodimers with class I factors. Class III factors are characterised by the presence of a leucine zipper () adjacent to the bHLH domain. Class IV factors may form homodimers or heterodimers with class III proteins. Class V and class VI proteins act as regulators of class I and class II factors and class VII proteins have a PAS domain ().\
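    The dimerisation rules stated above can be written down as a small lookup table. This is only a sketch: the names STATED_DIMERS and is_stated_dimer are hypothetical, and a pairing that the text does not mention (for example class III with class III) is reported as not stated rather than as impossible.

```python
# Pairings taken directly from the paragraph above. Each pairing is stored as
# a frozenset so that the order of the two partners does not matter.
STATED_DIMERS = {
    frozenset(["I"]),          # class I dimers within the group
    frozenset(["I", "II"]),    # class I with class II
    frozenset(["IV"]),         # class IV homodimers
    frozenset(["III", "IV"]),  # class IV with class III
}


def is_stated_dimer(class_a, class_b):
    """Return True if the text explicitly describes this class pairing."""
    return frozenset([class_a, class_b]) in STATED_DIMERS


if __name__ == "__main__":
    for pair in [("I", "II"), ("II", "II"), ("III", "IV")]:
        print(pair, is_stated_dimer(*pair))
```

    Storing unordered pairs keeps the table symmetric, so a query for class II with class I gives the same answer as class I with class II.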

    \ ' '482' 'IPR006121' '\

    Proteins that transport heavy metals in micro-organisms and mammals share similarities in their sequences and structures.

    \

    These proteins provide an important focus for research, some being involved in bacterial resistance to toxic metals, such as lead and cadmium, while others are involved in inherited human syndromes, such as Wilson\'s and Menkes diseases PUBMED:8091505.

    \

    A conserved domain has been found in a number of these heavy metal transport or detoxification proteins PUBMED:8091505. The domain, which has been termed Heavy-Metal-Associated (HMA), contains two conserved cysteines that are probably involved in metal binding.

    \

    \ Structure solution of the fourth HMA domain of the Menkes copper transporting\ ATPase shows a well-defined structure comprising a four-stranded antiparallel\ beta-sheet and two alpha helices packed in an alpha-beta sandwich fold PUBMED:9437429. This fold is common to other domains and is classified\ as "ferredoxin-like".

    \ \ ' '483' 'IPR002202' '\

    Synonym(s): 3-hydroxy-3-methylglutaryl-coenzyme A reductase, HMG-CoA reductase.

    \ \

    There are two distinct classes of hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase enzymes: class I consists of eukaryotic and most archaeal enzymes (), while class II consists of prokaryotic enzymes () PUBMED:10068515, PUBMED:15535874.

    \ \

    Class I HMG-CoA reductases catalyse the NADP-dependent synthesis of mevalonate from 3-hydroxy-3-methylglutaryl-CoA (HMG-CoA). In vertebrates, membrane-bound HMG-CoA reductase is the rate-limiting enzyme in the biosynthesis of cholesterol and other isoprenoids. In plants, mevalonate is the precursor of all isoprenoid compounds PUBMED:15535874. The reduction of HMG-CoA to mevalonate is regulated by feedback inhibition by sterols and non-sterol metabolites derived from mevalonate, including cholesterol. In archaea, HMG-CoA reductase is a cytoplasmic enzyme involved in the biosynthesis of the isoprenoid side chains of lipids PUBMED:10600463. Class I HMG-CoA reductases consist of an N-terminal membrane domain (lacking in archaeal enzymes), and a C-terminal catalytic region. The catalytic region can be subdivided into three domains: an N-domain (N-terminal), a large L-domain, and a small S-domain (inserted within the L-domain). The L-domain binds the substrate, while the S-domain binds NADP.

    \ \

    Class II HMG-CoA reductases catalyse the reverse reaction of class I enzymes, namely the NAD-dependent synthesis of HMG-CoA from mevalonate and CoA PUBMED:15028676. Some bacteria, such as Pseudomonas mevalonii, can use mevalonate as the sole carbon source. Class II enzymes lack a membrane domain. Their catalytic region is structurally related to that of class I enzymes, but it consists of only two domains: a large L-domain and a small S-domain (inserted within the L-domain). As with class I enzymes, the L-domain binds substrate, but the S-domain binds NAD (instead of NADP in class I).

    \ \

    This entry represents the catalytic region found in both class I and II HMG-CoA reductases. The catalytic regions from both classes share a common overall structural fold, despite low sequence identities of 14-20%. Class I eukaryotic enzymes contain an extra N-terminal domain not represented by this entry.

    \ ' '484' 'IPR003511' '\ The HORMA (for Hop1p, Rev7p and MAD2) domain has been suggested to recognise chromatin states that result from DNA adducts, double stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins. Hop1 is a meiosis-specific protein, Rev7 is required for DNA damage induced mutagenesis, and MAD2 is a spindle checkpoint protein which prevents progression of the cell cycle upon detection of a defect in mitotic spindle integrity.\ ' '485' 'IPR000536' '\

    Steroid or nuclear hormone receptors constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. The receptors function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. Nuclear hormone receptors consist of a highly conserved DNA-binding domain that recognises specific sequences (), connected via a linker region to a C-terminal ligand-binding domain. In addition, certain nuclear hormone receptors have an N-terminal modulatory domain (). The ligand-binding domain acts in response to ligand binding, which causes a conformational change in the receptor to induce a response, thereby acting as a molecular switch to turn on transcriptional activity PUBMED:14973393. For example, after binding of the glucocorticoid receptor to the corticosteroid ligand, the receptor is induced to perform functions ranging from nuclear translocation, oligomerisation, cofactor/kinase/transcription factor association, and DNA binding PUBMED:15193451. The ligand-binding domain is a flexible unit, where the binding of a ligand stabilises its conformation, which in turn favours coactivator binding to modify receptor activity PUBMED:15661830; the coactivator can bind to the activator function 2 (AF2) site at the C-terminal end of the ligand-binding domain PUBMED:15728727. The binding of different ligands can alter the conformation of the ligand-binding domain, which ultimately affects the DNA-binding specificity of the DNA-binding domain. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components. This entry represents the C-terminal ligand-binding domain.

    \ ' '486' 'IPR004925' '\ HpaB encodes part of the 4-hydroxyphenylacetate 3-hydroxylase from Escherichia coli PUBMED:8077235. HpaB is part of a heterodimeric\ enzyme that also requires HpaC. The enzyme is NADH-dependent and uses FAD as the redox chromophore. This family also includes\ PvcC, which may play a role in one of the proposed hydroxylation steps of pyoverdine chromophore biosynthesis PUBMED:10383985.\ ' '487' 'IPR000861' '\

    The HR1 repeat was first described as a three times repeated homology region of the N-terminal non-catalytic part of protein kinase PRK1(PKN) PUBMED:7851406. The first two of these repeats were later shown to bind the small G protein rho PUBMED:8647255, PUBMED:9446575 known to activate PKN in its GTP-bound form. Similar rho-binding domains also occur in a number of other protein kinases and in the rho-binding proteins rhophilin and rhotekin. Recently, the structure of the N-terminal HR1 repeat complexed with RhoA has been determined by X-ray crystallography PUBMED:10619026. It forms an antiparallel coiled-coil fold termed an ACC finger.

    \

    This entry includes domains found within rho-associated protein kinases.

    \ ' '488' 'IPR002121' '\ The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic acid binding. Mutations in the HRDC domain associated with the human BLM gene result in Bloom Syndrome (BS), an autosomal recessive disorder characterised by proportionate pre- and postnatal growth deficiency; sun-sensitive, telangiectatic, hypo- and hyperpigmented skin; predisposition to malignancy; and chromosomal instability PUBMED:9397680.\ ' '489' 'IPR005578' '\ This family includes a number of eukaryotic proteins. It is an integral membrane protein, conserved in at least 1 copy in all sequenced eukaryotes. The gene name in Schizosaccharomyces pombe (Fission yeast) is hrf1+ for Heavy metal Resistance Factor 1.\ ' '490' 'IPR001879' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The secretin-like GPCRs include secretin PUBMED:1646711, calcitonin PUBMED:1658940, parathyroid hormone/parathyroid hormone-related peptides PUBMED:1658941 and vasoactive intestinal peptide PUBMED:1314625, all of which activate adenylyl cyclase and the phosphatidyl-inositol-calcium pathway. These receptors contain seven transmembrane regions, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins (however, there is no significant sequence identity between these families; the secretin-like receptors thus bear their own unique \'7TM\' signature). Their N-terminus is probably located on the extracellular side of the membrane and potentially glycosylated. This N-terminal region contains a long conserved region which allows the binding of large peptidic ligands such as glucagon, secretin, VIP and PACAP; this region contains five conserved cysteine residues which could be involved in disulphide bonds. The C-terminal region of these receptors is probably cytoplasmic. Every receptor gene in this family is encoded on multiple exons, and several of these genes are alternatively spliced to yield functionally distinct products.

    \

    This domain is found in the extracellular part of some of the secretin-like (family 2) GPCRs including the calcitonin receptor; corticotropin releasing factor receptor 1; diuretic hormone receptor; glucagon-like peptide 1 receptor; and parathyroid hormone peptide receptor.

    \ ' '492' 'IPR001387' '\

    This is a large family of DNA-binding helix-turn-helix proteins that includes a bacterial plasmid copy control protein, bacterial methylases, various bacteriophage transcription control proteins and a vegetative specific protein from Dictyostelium discoideum (Slime mould).

    \ ' '493' 'IPR002145' '\

    CopG, also known as RepA, is responsible for the regulation of plasmid copy number. It binds to the repAB promoter and controls synthesis of the plasmid replication initiator protein RepB. Many bacterial transcription regulation proteins bind DNA through a \'helix-turn-helix\' motif; nevertheless, CopG displays a fully defined HTH-motif structure that is involved not in DNA-binding, but in the maintenance of the intrinsic dimeric functional structure and cooperativity PUBMED:9714164, PUBMED:9857196.

    \ ' '494' 'IPR002197' '\

    The Factor for Inversion Stimulation (FIS) protein is a regulator of\ bacterial functions, and binds specifically to weakly related DNA sequences \ PUBMED:7536730,PUBMED:11123690. It activates ribosomal RNA transcription, and is involved in upstream\ activation of rRNA promoters. The\ protein has been shown to play a role in the regulation of virulence factors\ in both Salmonella typhimurium and Escherichia coli PUBMED:11532124. Some of its\ functions include inhibition of the initiation of DNA replication from the\ OriC site, and promotion of Hin-mediated DNA inversion.

    \ \

    \ In its C-terminal extremity, FIS encodes a helix-turn-helix (HTH) DNA-\ binding motif, which shares a high degree of similarity with other HTH\ motifs of more primitive bacterial transcriptional regulators, such as the\ nitrogen assimilation regulatory proteins (NtrC) from species like Azotobacter,\ Rhodobacter and Rhizobium. This has led to speculation that both evolved\ from a single common ancestor PUBMED:9738943.

    \ \

    \ The 3-dimensional structure of the E. coli FIS DNA-binding protein has been\ determined by means of X-ray diffraction to 2.0A resolution PUBMED:1619650,PUBMED:11183780. FIS is\ composed of four alpha-helices tightly intertwined to form a globular dimer\ with two protruding HTH motifs. The 24 N-terminal amino acids are poorly \ defined, indicating that they might act as \'feelers\' suitable for DNA or\ protein (invertase) recognition PUBMED:1619650. Other proteins belonging to this subfamily include:

    \ \ ' '495' 'IPR007150' '\ Hus1, Rad1, and Rad9 are three evolutionarily conserved proteins required for checkpoint control in fission yeast. These proteins are known to form a stable complex in vivo PUBMED:11739777. The Hus1-Rad1-Rad9 complex may form a PCNA-like ring structure, and could function as a sliding clamp during checkpoint control.\ ' '496' 'IPR001494' '\

    The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments.

    \

    Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins PUBMED:12372823, PUBMED:17161424, which is important for importin-beta mediated transport.

    \

    Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. As a result, the N-terminal auto-inhibitory region on importin-alpha is free to loop back and bind to the major NLS-binding site, causing the cargo to be released PUBMED:17170104. There are additional release factors as well.

    \

    More information about these proteins can be found at Protein of the Month: Importins.

    \ \ ' '497' 'IPR002867' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a cysteine-rich (C6HC) zinc finger domain that is present in Triad1, and which is conserved in other proteins encoded by various eukaryotes. The C6HC consensus pattern is:

    \
    \
    C-x(4)-C-x(14-30)-C-x(1-4)-C-x(4)-C-x(2)-C-x(4)-H-x(4)-C\
    
    \
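    The consensus above is written in PROSITE-style notation. As a minimal sketch, assuming that only literal residue letters and x(n) or x(m-n) gaps need to be supported, it can be translated into a regular expression and used to scan a protein sequence. The function names and the synthetic sequence are hypothetical illustrations and are not part of the entry.

```python
import re

# Consensus pattern quoted in the entry text.
C6HC_PATTERN = "C-x(4)-C-x(14-30)-C-x(1-4)-C-x(4)-C-x(2)-C-x(4)-H-x(4)-C"


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a regular expression.

    Supported elements: a single upper-case residue letter, and x(n) or
    x(m-n) for a run of arbitrary residues. Anything else is ignored.
    """
    tokens = re.findall("[A-Z]|x[(][0-9]+(?:-[0-9]+)?[)]", pattern)
    parts = []
    for token in tokens:
        if token.startswith("x"):
            # x(14-30) becomes .{14,30}; x(4) becomes .{4}
            bounds = token[2:-1].replace("-", ",")
            parts.append(".{" + bounds + "}")
        else:
            parts.append(token)
    return "".join(parts)


def find_c6hc(sequence):
    """Return (start, end) spans of non-overlapping C6HC matches."""
    regex = re.compile(prosite_to_regex(C6HC_PATTERN))
    return [match.span() for match in regex.finditer(sequence)]


if __name__ == "__main__":
    # Synthetic sequence built to satisfy the pattern; not a real protein.
    core = ("C" + "A" * 4 + "C" + "A" * 14 + "C" + "A" + "C" + "A" * 4
            + "C" + "A" * 2 + "C" + "A" * 4 + "H" + "A" * 4 + "C")
    print(prosite_to_regex(C6HC_PATTERN))
    print(find_c6hc("MK" + core + "LL"))
```

    A regular expression match only shows that the cysteine and histidine spacing fits the consensus; it says nothing about whether the region actually folds as a zinc finger.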

    The C6HC zinc finger motif is the fourth family member of the zinc-binding RING, LIM, and LAP/PHD fingers. Strikingly, in most of the proteins the C6HC domain is flanked by two RING finger structures. The novel C6HC motif has been called DRIL (double RING finger linked). The strong conservation of the larger tripartite TRIAD (two RING fingers and DRIL) structure indicates that the three subdomains are functionally linked and identifies a novel class of proteins PUBMED:10422847.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '498' 'IPR014757' '\

    The many bacterial transcription regulation proteins which bind DNA through a\ \'helix-turn-helix\' motif can be classified into subfamilies on the basis of\ sequence similarities. One of these subfamilies, called \'iclR\', groups several proteins including:\ \

    \

    \ \

    These proteins have\ a Helix-Turn-Helix motif at the N-terminus that is similar to that of other DNA-binding proteins PUBMED:1840643.

    \ ' '499' 'IPR006847' '\ This region is found in the N-terminal half of translation initiation factor IF-2. It is found in two copies in IF-2 alpha isoforms, and in only one copy in the N-terminally truncated beta and gamma isoforms PUBMED:1764105. Its function is unknown.\ ' '500' 'IPR019814' '\

    Initiation factor 3 (IF-3) (gene infC) is one of the three factors required for the \ initiation of protein biosynthesis in bacteria. IF-3 is thought to function as a \ fidelity factor during the assembly of the ternary initiation complex, which consists of \ the 30S ribosomal subunit, the initiator tRNA and the messenger RNA. IF-3 is a basic\ protein that binds to the 30S ribosomal subunit PUBMED:8405963. The chloroplast initiation factor IF-3(chl) is a protein that \ enhances the poly(A,U,G)-dependent binding of the initiator tRNA to chloroplast ribosomal\ 30S subunits, and whose central section is evolutionarily related to the sequence of \ bacterial IF-3 PUBMED:8144528.

    \ ' '501' 'IPR007701' '\ Interferon-related developmental regulator (IFRD1) is the human homologue of the Rattus norvegicus early response protein PC4 and its murine homologue TIS7 PUBMED:9050919. The exact function of IFRD1 is unknown but it has been shown that PC4 is necessary for muscle differentiation and that it might have a role in signal transduction. This entry also contains IFRD2 and its murine equivalent SKMc15, which are highly expressed soon after gastrulation and in the hepatic primordium, suggesting an involvement in early hematopoiesis PUBMED:9722946.\ ' '502' 'IPR001126' '\

    In Escherichia coli, UV and many chemicals appear to cause mutagenesis by a process of translesion synthesis that requires DNA polymerase III and the SOS-regulated proteins UmuD, UmuC and RecA. This machinery allows replication to continue through DNA lesions, and therefore avoids lethal interruption of DNA replication after DNA damage PUBMED:9560379. UmuC is a well conserved protein in prokaryotes, with a homologue in yeast species.

    \ \

    Proteins currently known to belong to this family are listed below:\

    \

    \ ' '503' 'IPR006921' '\

    This domain, primarily C-terminal, is found in a family of proteins thought to be involved in regulating gene activity in the proliferative and/or differentiative pathways induced by NGF PUBMED:9722946.

    \ ' '504' 'IPR006849' '\ Members of this family are components of the elongator multi-subunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation PUBMED:10024884.\ ' '505' 'IPR001093' '\ Synonym(s): Inosine-5\'-monophosphate dehydrogenase, Inosinic acid dehydrogenase; \ Synonym(s): Guanosine 5\'-monophosphate oxidoreductase \ \ \

    This entry contains two related enzymes, IMP dehydrogenase and GMP reductase. These enzymes adopt a TIM barrel structure.

    \ \

    IMP dehydrogenase () (IMPDH) catalyzes the rate-limiting reaction of de novo GTP biosynthesis, the NAD-dependent oxidation of IMP into XMP PUBMED:2902093.\ \ \ \ IMP dehydrogenase is associated with cell proliferation and is a possible target for cancer chemotherapy. Mammalian and bacterial IMPDHs are tetramers of identical chains. There are two IMP dehydrogenase isozymes in humans PUBMED:1969416. IMP dehydrogenase nearly always contains a long insertion that has two CBS domains within it.

    \ \

    GMP reductase () catalyzes the irreversible and NADPH-dependent reductive deamination of GMP into IMP PUBMED:2904262.\ \ \ \ It converts nucleobase, nucleoside and nucleotide derivatives of G to A nucleotides, and maintains intracellular balance of A and G nucleotides.

    \ ' '506' 'IPR005635' '\

    This region of the inner centromere protein has been found to be necessary and sufficient for binding to aurora-related kinase. This interaction has been implicated in the coordination of chromosome segregation with cell division in yeast.

    \ ' '507' 'IPR007306' '\

    This enzyme () modifies exclusively the initiator tRNA in position 64 using 5\'-phosphoribosyl-1\'-pyrophosphate as the modification donor. As the initiator tRNA participates both in the initiation and elongation of translation, the 2\'-O-ribosyl phosphate modification discriminates the initiator tRNAs from the elongator tRNAs. \

    \ ' '508' 'IPR003308' '\

    Retroviral integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains: an N-terminal zinc binding domain, a central catalytic core and a C-terminal DNA-binding domain PUBMED:11743009, PUBMED:11101216. It is often found as part of the POL polyprotein.

    \ ' '509' 'IPR005821' '\

    This group of proteins is found in sodium, potassium, and calcium ion channel proteins. The proteins have 6 transmembrane helices in which the last two helices flank a loop which determines ion selectivity. In some Na channel proteins the domain is repeated four times, whereas in others (e.g. K channels) the protein forms a tetramer in the membrane. A bacterial structure of the protein is known for the last two helices but is not included in the Pfam family due to it lacking the first four helices.

    \ ' '510' 'IPR005522' '\

    ArgRIII has been demonstrated to be an inositol polyphosphate kinase PUBMED:10574768 which catalyses the reaction\ \ .

    \ ' '511' 'IPR000048' '\

    Calmodulin (CaM) is recognised as a major calcium sensor and orchestrator of regulatory events through its interaction with a diverse group of cellular proteins. Three classes of recognition motifs exist for many of the known CaM binding proteins: the IQ motif as a consensus for Ca2+-independent binding, and two related motifs for Ca2+-dependent binding, termed\ 1-8-14 and 1-5-10 based on the position of conserved hydrophobic residues PUBMED:9141499.
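    The Ca2+-dependent motifs are named after the positions of their conserved hydrophobic residues, so a candidate site can be located by checking those positions in a sliding window. The sketch below does this for the 1-5-10 spacing. The residue set FILVW is an assumption, since the text does not list which residues count as hydrophobic, and the function name and test sequence are hypothetical.

```python
# Assumed hydrophobic residue set; the entry text does not specify one.
HYDROPHOBIC = set("FILVW")


def find_spaced_hydrophobic_motif(sequence, positions=(1, 5, 10)):
    """Return 0-based start indices of candidate motif windows.

    A window qualifies when the residues at every 1-based position listed
    in positions belong to HYDROPHOBIC. Other positions are not checked.
    """
    window = max(positions)
    hits = []
    for start in range(len(sequence) - window + 1):
        if all(sequence[start + p - 1] in HYDROPHOBIC for p in positions):
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Synthetic peptide with F, L and V at relative positions 1, 5 and 10.
    print(find_spaced_hydrophobic_motif("GGFKKKLKKKKVGG"))
```

    Other spacings can be checked by passing a different positions tuple. A hit only marks a candidate site; it is not evidence of calmodulin binding on its own.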

    \ \

    The regulatory domain of scallop myosin is a three-chain protein complex that\ switches on this motor in response to Ca2+ binding. Side-chain interactions link the two light chains in tandem to adjacent segments of the heavy chain bearing the IQ-sequence motif. The Ca2+-binding site is a novel EF-hand motif on the essential light chain and is stabilised by linkages involving the heavy chain and both light chains, accounting for the requirement of all three chains for Ca2+ binding and regulation in the intact myosin molecule PUBMED:8127365.

    \ ' '512' 'IPR013521' '\

    Inwardly-rectifying potassium channels (Kir) are the principal class of two-TM domain potassium channels. They are characterised by the property of inward-rectification, which is described as the ability to allow large inward currents and smaller outward currents. Inwardly rectifying potassium channels (Kir) are responsible for regulating diverse processes including: cellular excitability, vascular tone, heart rate, renal salt flow, and insulin release PUBMED:10102275. To date, around twenty members of this superfamily have been cloned, which can be grouped into six families by sequence similarity, and these are designated Kir1.x-6.x PUBMED:7580148, PUBMED:10449331.

    \

    Cloned Kir channel cDNAs encode proteins of ~370-500 residues; both N- and C-termini are thought to be cytoplasmic, and the N-terminus lacks a signal sequence. Kir channel alpha subunits possess only 2TM domains linked with a P-domain. Thus, Kir channels share similarity with the fifth and sixth domains, and the P-domain, of the other families. It is thought that four Kir subunits assemble to form a tetrameric channel complex, which may be hetero- or homomeric PUBMED:10102275.

    \

    Potassium channels are the most diverse group of the ion channel family\ PUBMED:1772658, PUBMED:1879548. They are important in shaping the action potential, and in neuronal excitability and plasticity PUBMED:2451788. The potassium channel family is\ composed of several functionally distinct isoforms, which can be broadly\ separated into 2 groups PUBMED:2555158: the practically non-inactivating \'delayed\' group and the rapidly inactivating \'transient\' group.

    \

    These are all highly similar proteins, with only small amino acid\ changes causing the diversity of the voltage-dependent gating mechanism,\ channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or\ other second messengers PUBMED:2448635. In eukaryotic cells, K+ channels\ are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes PUBMED:1373731. In prokaryotic cells, they play a role in the\ maintenance of ionic homeostasis PUBMED:11178249.

    \

    All K+ channels discovered so far possess a core of \ alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has\ been termed the K+ selectivity sequence.\ In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane.\ However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains.\ The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK)\ PUBMED:11178249, PUBMED:. The 2TM domain family comprises inward-rectifying K+ \ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.

    \ ' '513' 'IPR002404' '\ Insulin receptor substrate-1 proteins contain both a pleckstrin homology\ domain and a phosphotyrosine binding (PTB) domain. These domains facilitate \ interaction with the activated tyrosine-phosphorylated insulin receptor.\ The PTB domain is situated towards the N terminus. Two arginines in this domain are responsible for\ hydrogen bonding to phosphotyrosine residues on an Ac-LYASSNPApY-NH2 peptide\ in the juxtamembrane region of the insulin receptor. Further interactions\ via \'bridged\' water molecules are coordinated by an Asn and a Ser residue\ PUBMED:8646778.\

    The PTB domain has a compact, 7-stranded beta-sandwich structure, capped by\ a C-terminal helix. The substrate peptide fits into an L-shaped surface\ cleft formed from the C-terminal helix and strands 5 and 6 PUBMED:8599766.

    \ ' '514' 'IPR004193' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ Enzymes containing this domain belong to family 13 () of the glycosyl hydrolases. This domain is found in a range of enzymes that act on branched substrates, i.e. isoamylase, pullulanase and\ branching enzyme. Isoamylase hydrolyses 1,6-alpha-D-glucosidic branch linkages in glycogen, amylopectin and\ dextrin; 1,4-alpha-glucan branching enzyme functions in the formation of 1,6-glucosidic linkages of glycogen; and\ pullulanase is a starch-debranching enzyme.\ ' '515' 'IPR000868' '\ This is a family of hydrolase enzymes. Isochorismatase, also known as 2,3-dihydro-2,3-dihydroxybenzoate\ synthase, catalyses the conversion of isochorismate, in the presence of water, to 2,3-dihydro-2,3-dihydroxybenzoate\ and pyruvate.\ ' '516' 'IPR013129' '\ Jumonji protein is required for neural tube formation in mice PUBMED:7758946. There is evidence of domain swapping within the jumonji family of transcription factors PUBMED:10838566. This domain is often associated with jmjN (see ) and belongs to the Cupin superfamily PUBMED:10838566.\ ' '517' 'IPR006155' '\ Human genes containing triplet repeats can markedly expand in length, leading\ to neuropsychiatric disease. Expansion of triplet repeats explains the\ phenomenon of anticipation, i.e. the increasing severity or earlier age of\ onset in successive generations in a pedigree PUBMED:8325628.\ A novel gene containing CAG repeats has been identified and mapped to\ chromosome 14q32.1, the genetic locus for Machado-Joseph disease (MJD).\ Normally, the gene contains 13-36 CAG repeats, but most clinically diagnosed\ patients and all affected members of a family with the clinical and \ pathological diagnosis of MJD show expansion of the repeat number to \ 68-79 PUBMED:7874163. Similar abnormalities in related genes may give rise to diseases\ similar to MJD. 
\ MJD is a neurodegenerative disorder characterised by cerebellar ataxia, \ pyramidal and extra-pyramidal signs, peripheral nerve palsy, external \ ophthalmoplegia, facial and lingual fasciculation and bulging. The disease\ is autosomal dominant, with late onset of symptoms, generally after the\ fourth decade.\ ' '518' 'IPR003131' '\

    Potassium channels are the most diverse group of the ion channel family\ PUBMED:1772658, PUBMED:1879548. They are important in shaping the action potential, and in neuronal excitability and plasticity PUBMED:2451788. The potassium channel family is\ composed of several functionally distinct isoforms, which can be broadly\ separated into 2 groups PUBMED:2555158: the practically non-inactivating \'delayed\' group and the rapidly inactivating \'transient\' group.

    \

    These are all highly similar proteins, with only small amino acid\ changes causing the diversity of the voltage-dependent gating mechanism,\ channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or\ other second messengers PUBMED:2448635. In eukaryotic cells, K+ channels\ are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes PUBMED:1373731. In prokaryotic cells, they play a role in the\ maintenance of ionic homeostasis PUBMED:11178249.

    \

    All K+ channels discovered so far possess a core of \ alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has\ been termed the K+ selectivity sequence.\ In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane.\ However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains.\ The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK)\ PUBMED:11178249, PUBMED:. The 2TM domain family comprises inward-rectifying K+ \ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.

    \

    The Kv family can be divided into several subfamilies on the basis of sequence similarity and function. Four of these subfamilies, Kv1 (Shaker), Kv2 (Shab), Kv3 (Shaw) and Kv4 (Shal), consist of pore-forming alpha subunits that associate with different types of beta subunit. Each alpha subunit comprises six hydrophobic TM domains with a P-domain between the fifth and sixth, which partially resides in the membrane. The fourth TM domain has positively charged residues at every third residue and acts as a voltage sensor, which triggers the conformational change that opens the channel pore in response to a displacement in membrane potential PUBMED:10712896. More recently, 4 new electrically-silent alpha subunits have been cloned: Kv5 (KCNF), Kv6 (KCNG), Kv8 and Kv9 (KCNS). These subunits do not themselves possess any functional activity, but appear to form heteromeric channels with Kv2 subunits, and thus modulate Shab channel activity PUBMED:9305895. When highly expressed, they inhibit channel activity, but at lower levels show more specific modulatory actions.

    \

    The N-terminal, cytoplasmic tetramerization domain (T1) of voltage-gated potassium channels encodes molecular determinants for subfamily-specific assembly of alpha-subunits into functional tetrameric channels PUBMED:9886290. This domain is found in a subset of a larger group of proteins that contain the BTB/POZ domain.

    \ ' '519' 'IPR003855' '\ This is a family of potassium (K+) transporters that are conserved across phyla, with bacterial (KUP) PUBMED:8226635, yeast (HAK) PUBMED:7621817, and plant (AtKT) PUBMED:9350997 sequences as members.\ ' '520' 'IPR002777' '\

    Prefoldin (PFD) is a chaperone that interacts exclusively with type II chaperonins, hetero-oligomers lacking an obligate co-chaperonin that are found only in eukaryotes (chaperonin-containing T-complex polypeptide-1 (CCT)) and archaea. Eukaryotic PFD is a multi-subunit complex containing six polypeptides in the molecular mass range of 14-23 kDa. In archaea, on the other hand, PFD is composed of two types of subunits, two alpha and four beta. The six subunits associate to form two back-to-back up-and-down eight-stranded barrels, from which hang six coiled coils. Each subunit contributes one (beta subunits) or two (alpha subunits) beta hairpin turns to the barrels. The coiled coils are formed by the N and C termini of an individual subunit. Overall, this unique arrangement resembles a jellyfish. The eukaryotic PFD hexamer is composed of six different subunits; however, these can be grouped into two alpha-like (PFD3 and -5) and four beta-like (PFD1, -2, -4, and -6) subunits based on amino acid sequence similarity with their archaeal counterparts. Eukaryotic PFD has a six-legged structure similar to that seen in the archaeal homologue PUBMED:11106732, PUBMED:12456645. This family contains the archaeal beta subunit, eukaryotic prefoldin subunits 1, 2, 4 and 6.

    \ \

    Eukaryotic PFD has been shown to bind both actin and tubulin co-translationally. The chaperone then delivers the target protein to CCT, interacting with the chaperonin through the tips of the coiled coils. No authentic target proteins of any archaeal PFD have been identified, to date.

    \ ' '521' 'IPR006652' '\

    Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified PUBMED:8453663. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP PUBMED:8453663 and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin PUBMED:7593276, PUBMED:7822422, and in galactose oxidase from the fungus Dactylium dendroides PUBMED:8126718, PUBMED:2002850. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold PUBMED:8182749.

    \ \

    The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila PUBMED:7593276. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase PUBMED:8126718.

    \ \

    This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.

    \ ' '522' 'IPR018111' '\

    The K homology (KH) domain was first identified in the human heterogeneous\ nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids\ that is present in a wide variety of quite diverse nucleic acid-binding\ proteins PUBMED:8036511. It has been shown to bind RNA PUBMED:9302998, PUBMED:10369774. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently PUBMED:8036511.

    \

    According to structural analyses PUBMED:9302998, PUBMED:10369774, PUBMED:11160884, the KH domain can be separated into two groups. The first group, type-1, contains a beta-alpha-alpha-beta-beta-alpha structure, whereas in type-2 the last two beta-sheets are located in the N-terminal part of the domain (alpha-beta-beta-alpha-alpha-beta). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-1 KH domain include bacterial polyribonucleotide nucleotidyltransferases (); vertebrate fragile X mental retardation protein 1 (FMR1); eukaryotic heterogeneous nuclear ribonucleoprotein K (hnRNP K), one of at least 20 major proteins that are part of hnRNP particles in mammalian cells; mammalian poly(rC) binding proteins; Artemia salina glycine-rich protein GRP33; yeast PAB1-binding protein 2 (PBP2); vertebrate vigilin; and human high-density lipoprotein binding protein (HDL-binding protein).

    \

    More information about these proteins can be found at Protein of the Month: RNA Exosomes PUBMED:.

    \ ' '523' 'IPR014030' '\

    Beta-ketoacyl-ACP synthase (KAS) PUBMED:3076376 is the enzyme that catalyzes\ the condensation of malonyl-ACP with the growing fatty acid chain. It is found as a component\ of a number of enzymatic systems, including fatty acid synthetase (FAS), which catalyzes the\ formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH; the \ multi-functional 6-methylsalicylic acid synthase (MSAS) from Penicillium patulum PUBMED:2209605, which is\ involved in the biosynthesis of a polyketide antibiotic; polyketide antibiotic synthase enzyme\ systems; Emericella nidulans multifunctional protein Wa, which is involved in the biosynthesis\ of conidial green pigment; Rhizobium nodulation protein nodE, which probably acts as a \ beta-ketoacyl synthase in the synthesis of the nodulation Nod factor fatty acyl chain; and yeast\ mitochondrial protein CEM1. The condensation reaction is a two-step process: first the acyl\ component of an activated acyl primer is transferred to a cysteine residue of the enzyme, and\ it is then condensed with an activated malonyl donor with the concomitant release of carbon\ dioxide.

    \ \

    This entry represents the N-terminal domain of beta-ketoacyl-ACP synthases.

    \ ' '524' 'IPR014031' '\

    Beta-ketoacyl-ACP synthase (KAS) PUBMED:3076376 is the enzyme that catalyzes\ the condensation of malonyl-ACP with the growing fatty acid chain. It is found as a component\ of a number of enzymatic systems, including fatty acid synthetase (FAS), which catalyzes the\ formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH; the \ multi-functional 6-methylsalicylic acid synthase (MSAS) from Penicillium patulum PUBMED:2209605, which is\ involved in the biosynthesis of a polyketide antibiotic; polyketide antibiotic synthase enzyme\ systems; Emericella nidulans multifunctional protein Wa, which is involved in the biosynthesis\ of conidial green pigment; Rhizobium nodulation protein nodE, which probably acts as a \ beta-ketoacyl synthase in the synthesis of the nodulation Nod factor fatty acyl chain; and yeast\ mitochondrial protein CEM1. The condensation reaction is a two-step process: first the acyl\ component of an activated acyl primer is transferred to a cysteine residue of the enzyme, and\ it is then condensed with an activated malonyl donor with the concomitant release of carbon\ dioxide.

    \ \

    This entry represents the C-terminal domain of beta-ketoacyl-ACP synthases. The active site is contained in a cleft between the N- and C-terminal domains, with residues from both domains contributing to substrate binding and catalysis PUBMED:11152607.

    \ ' '525' 'IPR001752' '\

    Kinesin PUBMED:8542443, PUBMED:2142876, PUBMED:14732151 is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule\'s plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds.

    \

    The heavy chain is composed of three structural domains: a large globular N-terminal domain, which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP and to bind and move on microtubules); a central alpha-helical coiled coil domain that mediates heavy chain dimerisation; and a small globular C-terminal domain, which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles.

    \

    A number of proteins have been recently found that contain a domain similar to that of the kinesin \'motor\' domain PUBMED:8542443, PUBMED:1832505:\

    \

    The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd where it is located in the C-terminal section.

    \

    The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near positions 80 to 90; the C-terminal half of the domain is involved in microtubule binding.

    \ ' '526' 'IPR005824' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and in the bacterial transcription antitermination protein NusG PUBMED:8987397.

    \ ' '527' 'IPR001909' '\

    The Krueppel-associated box (KRAB) is a domain of around 75 amino acids that\ is found in the N-terminal part of about one third of eukaryotic Krueppel-type\ C2H2 zinc finger proteins (ZFPs) PUBMED:14519192. It is enriched in charged amino acids and can be divided into subregions A and B, which are predicted to fold into two amphipathic alpha-helices. The KRAB A and B boxes can be separated by variable spacer segments and many KRAB proteins contain only the A box PUBMED:2023909.

    \

    The functions currently known for members of the KRAB-containing protein family include transcriptional repression of RNA polymerase I, II, and III promoters, binding and splicing of RNA, and control of nucleolus function. The KRAB domain functions as a transcriptional repressor when tethered to the template DNA by a DNA-binding domain. A sequence of 45 amino acids in the KRAB A subdomain has been shown to be necessary and sufficient for transcriptional repression. The B box does not repress by itself but does potentiate the repression exerted by the KRAB A subdomain PUBMED:8183939, PUBMED:8183940. Gene silencing requires the binding of the KRAB domain to the RING-B box-coiled coil (RBCC) domain of the KAP-1/TIF1-beta corepressor. As KAP-1 binds to the heterochromatin proteins HP1, it has been proposed that the KRAB-ZFP-bound target gene could be silenced following recruitment to heterochromatin PUBMED:10653693, PUBMED:10748030.

    \

    KRAB-ZFPs probably constitute the single largest class of transcription factors within the human genome PUBMED:10360839. Although the function of KRAB-ZFPs is largely unknown, they appear to play important roles during cell differentiation and development. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B.

    \ ' '528' 'IPR007851' '\

    The Kri1 protein is also known as KRR1-interacting protein 1. The Saccharomyces cerevisiae member of this family is found to be required for the assembly of preribosomal 40S subunits in the nucleolus PUBMED:11027267. KRR1 is highly expressed in dividing cells and its expression ceases almost completely when cells enter the stationary phase.

    \ \

    This entry represents a subgroup of the KRR1 interacting protein 1.

    \ ' '529' 'IPR006164' '\

    The Ku heterodimer is composed of Ku70 and Ku80 (or Ku86), 70 kDa and 80 kDa subunits of an ATP-dependent DNA helicase, which contributes to genomic integrity through its ability to bind DNA double-stranded breaks and facilitate repair by the non-homologous end-joining pathway. This is the central DNA-binding beta-barrel domain and is found in both the Ku70 and Ku80 proteins. Ku makes only a few contacts with the sugar-phosphate backbone, and none with the DNA bases, but it fits sterically to major and minor groove contours forming a ring that encircles duplex DNA, cradling two full turns of the DNA molecule. By forming a bridge between the broken DNA ends, Ku acts to structurally support and align the DNA ends, to protect them from degradation, and to prevent promiscuous binding to unbroken DNA. Ku effectively aligns the DNA, while still allowing access of polymerases, nucleases and ligases to the broken DNA ends to promote end joining PUBMED:11483577.

    \ ' '530' 'IPR005160' '\

    The Ku heterodimer (composed of Ku70 and Ku80 ) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the C-terminal arm. This alpha helical region embraces the beta-barrel domain of the opposite subunit PUBMED:11493912.

    \ ' '531' 'IPR005161' '\

    The Ku heterodimer (composed of Ku70 and Ku80 ) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the N-terminal alpha/beta domain. This domain makes only a small contribution to the dimer interface. The domain comprises a six-stranded beta sheet of the Rossmann fold PUBMED:10191092.

    \ ' '532' 'IPR002223' '\

    The majority of the sequences having this domain belong to the MEROPS inhibitor family I2, clan IB (the Kunitz/bovine pancreatic trypsin inhibitor family). They inhibit proteases of the S1 family PUBMED:14705960 and are restricted to the metazoa, with a single exception: Amsacta moorei entomopoxvirus. They are short (~50 residue)\ alpha/beta proteins with few secondary structures. The fold is constrained\ by 3 disulphide bonds. The type example for this family is aprotinin (bovine pancreatic trypsin inhibitor) PUBMED:1714504 (or \ basic protease inhibitor), but the family includes numerous other members\ PUBMED:1703675, PUBMED:1593645, PUBMED:8159751, PUBMED:1304909, such as snake venom basic protease; mammalian inter-alpha-trypsin\ inhibitors; trypstatin, a rodent mast cell inhibitor of trypsin; a domain\ found in an alternatively-spliced form of Alzheimer\'s amyloid beta-protein;\ domains at the C-termini of the alpha(1) and alpha(3) chains of type VII\ and type VI collagens; and tissue factor pathway inhibitor precursor.

    \ ' '533' 'IPR003973' '\

    Potassium channels are the most diverse group of the ion channel family\ PUBMED:1772658, PUBMED:1879548. They are important in shaping the action potential, and in neuronal excitability and plasticity PUBMED:2451788. The potassium channel family is\ composed of several functionally distinct isoforms, which can be broadly\ separated into 2 groups PUBMED:2555158: the practically non-inactivating \'delayed\' group and the rapidly inactivating \'transient\' group.

    \

    These are all highly similar proteins, with only small amino acid\ changes causing the diversity of the voltage-dependent gating mechanism,\ channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or\ other second messengers PUBMED:2448635. In eukaryotic cells, K+ channels\ are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes PUBMED:1373731. In prokaryotic cells, they play a role in the\ maintenance of ionic homeostasis PUBMED:11178249.

    \

    All K+ channels discovered so far possess a core of \ alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has\ been termed the K+ selectivity sequence.\ In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane.\ However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains.\ The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK)\ PUBMED:11178249, PUBMED:. The 2TM domain family comprises inward-rectifying K+ \ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.

    \

    The Kv family can be divided into several subfamilies on the basis of sequence similarity and function. Four of these subfamilies, Kv1 (Shaker), Kv2 (Shab), Kv3 (Shaw) and Kv4 (Shal), consist of pore-forming alpha subunits that associate with different types of beta subunit. Each alpha subunit comprises six hydrophobic TM domains with a P-domain between the fifth and sixth, which partially resides in the membrane. The fourth TM domain has positively charged residues at every third residue and acts as a voltage sensor, which triggers the conformational change that opens the channel pore in response to a displacement in membrane potential PUBMED:10712896. More recently, 4 new electrically-silent alpha subunits have been cloned: Kv5 (KCNF), Kv6 (KCNG), Kv8 and Kv9 (KCNS). These subunits do not themselves possess any functional activity, but appear to form heteromeric channels with Kv2 subunits, and thus modulate Shab channel activity PUBMED:9305895. When highly expressed, they inhibit channel activity, but at lower levels show more specific modulatory actions.

    \

    The Kv2 voltage-dependent potassium channels (also known as the Shab family) are responsible for much of the delayed rectifier current in the Drosophila melanogaster (Fruit fly) nervous system and muscle. In vertebrates, Kv2 channels have been shown to be involved in the delayed rectifier currents of the heart and skeletal muscles. They are also thought to be important in determining intrinsic neuronal excitability in both mammals and non-mammals PUBMED:15950285. Kv2 channels can be further divided into 2 subtypes, designated Kv2.1 and Kv2.2 PUBMED:.

    \

    This entry represents the voltage-dependent Kv2 potassium channels.

    \ ' '535' 'IPR014775' '\ The L27 domain is found in receptor targeting proteins Lin-2 and Lin-7, as well as some protein kinases and human MPP2 protein.\ ' '536' 'IPR003475' '\ This family of insect proteins are each about 100 amino acids long and have 6 conserved cysteine residues. They all have a predicted signal peptide and are probably excreted. The function of the proteins is unknown PUBMED:8568884.\ ' '537' 'IPR001279' '\ Apart from the beta-lactamases a number of other proteins contain this domain \ PUBMED:7588620. These proteins include thiolesterases, members of the glyoxalase II family,\ that catalyse the hydrolysis of S-D-lactoyl-glutathione to form glutathione and \ D-lactic acid and a competence protein that is essential for natural transformation in \ Neisseria gonorrhoeae and could be a transporter involved in DNA uptake. Except for the \ competence protein these proteins bind two zinc ions per molecule as cofactor.\ ' '538' 'IPR003804' '\ L-lactate permease is an integral membrane protein probably involved in L-lactate transport.\ ' '539' 'IPR006634' '\

    TLC is a protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p\ are essential for\ acyl-CoA-dependent ceramide synthesis PUBMED:11694577, TRAM is a subunit\ of the translocon and the CLN8\ gene is mutated in Northern epilepsy syndrome. Proteins containing this domain may possess\ multiple functions such as\ lipid trafficking, metabolism, or sensing. Trh homologues possess additional\ homeobox domains PUBMED:9872981.

    \ ' '540' 'IPR004860' '\ This is a family of site-specific DNA endonucleases encoded by DNA mobile elements. Similar to the homing endonuclease LAGLIDADG/HNH domain (), the members of this family are also LAGLIDADG endonucleases. \ ' '541' 'IPR007000' '\

    Phospholipase B (PLB) catalyses the hydrolytic cleavage of both acylester bonds of glycerophospholipids. This family of PLB enzymes has been identified in mammals, flies and nematodes but not in yeast PUBMED:8892229. In Drosophila this protein was named LAMA for laminin ancestor since it is expressed in the neuronal and glial precursors that surround the lamina PUBMED:15193148.

    \ ' '542' 'IPR007174' '\ Las1 is an essential nuclear protein involved in cell morphogenesis and cell surface growth PUBMED:8582632.\ ' '543' 'IPR002172' '\

    Low density lipoprotein (LDL) is the major cholesterol-carrying lipoprotein of plasma. The LDL receptor binds LDL and transports it into cells by endocytosis. In order to be internalised, the receptor-ligand complex must first cluster into clathrin-coated pits. Seven successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal region of this multi-domain membrane protein. The LDL receptor is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins PUBMED:17457719.

    \

    The LDL receptor class A domain contains 6 disulphide-bonded cysteines and a highly conserved cluster of negatively charged amino acids, of which many are clustered on one face of the module PUBMED:7603991. In LDL receptors, the class A domains form the binding site for LDL and calcium. The acidic residues between the fourth and sixth cysteines are important for high-affinity binding of positively charged sequences in LDLR\'s ligands. The repeat consists of a beta-hairpin structure followed by a series of beta turns. In the absence of calcium, LDL-A domains are unstructured; the bound calcium ion imparts structural integrity. Following these repeats is a 350 residue domain that resembles part of the epidermal growth factor (EGF) precursor. Numerous familial hypercholesterolaemia mutations of the LDL receptor alter the calcium-coordinating residue of LDL-A domains or other crucial scaffolding residues.\

    \ ' '544' 'IPR004238' '\ Different types of late embryogenesis abundant (LEA) proteins are expressed at different stages of late embryogenesis in higher plant seed embryos and\ under conditions of dehydration stress. They may be induced by abscisic acid. This domain may be repeated several times in these proteins whose function is unknown.\ ' '546' 'IPR007149' '\ Members of this family are part of the Paf1/RNA polymerase II complex PUBMED:11927560, PUBMED:11884586. The Paf1 complex probably functions during the elongation phase of transcription PUBMED:11927560.\ ' '547' 'IPR001320' '\ The ability of synapses to modify their synaptic strength in response to activity is a fundamental property of the nervous system and may be an essential component of learning and memory. There are three classes of ionotropic glutamate receptor, namely NMDA (N-methyl-D-aspartate), AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazole-4-propionic\ acid) and kainate receptors. They are believed to play critical roles in synaptic plasticity. At many synapses in the brain, transient activation of NMDA receptors leads to a persistent modification in the strength of synaptic transmission mediated by AMPA receptors and kainate receptors can act as the induction trigger for long-term changes in synaptic transmission PUBMED:10580501.\ ' '548' 'IPR004183' '\

    Dioxygenases catalyse the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms. Cleavage of aromatic rings is one of the most important functions of dioxygenases, which play key roles in the degradation of aromatic compounds. The substrates of ring-cleavage dioxygenases can be classified into two groups according to the mode of scission of the aromatic ring. Intradiol enzymes () use a non-haem Fe(III) to cleave the aromatic ring between two hydroxyl groups (ortho-cleavage), whereas extradiol enzymes use a non-haem Fe(II) to cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon (meta-cleavage) PUBMED:10730195, PUBMED:15264822. These two subfamilies differ in sequence, structural fold, iron ligands, and the orientation of second sphere active site amino acid residues. Extradiol dioxygenases are usually homo-multimeric, bind one atom of ferrous ion per subunit and have a subunit size of about 33 kDa. Extradiol dioxygenases can be divided into three classes. Class I and II enzymes () show sequence similarity, with the two-domain class II enzymes having evolved from a class I enzyme through gene duplication. Class III enzymes are different in sequence and structure, but they do share several common active-site characteristics with the class II enzymes, in particular the coordination sphere and the disposition of the putative catalytic base are very similar. Class III enzymes usually have two subunits, designated A and B. This entry represents the extradiol dioxygenase class III enzymes, subunit B.

    \

    Enzymes that belong to the extradiol class III family include Protocatechuate 4,5-dioxygenase (4,5-PCD; LigAB) () PUBMED:10467151, of which LigB is represented by this entry; and 2\'-aminobiphenyl-2,3-diol 1,2-dioxygenase (CarBaBb) PUBMED:12728990, of which CarBb is represented by this entry.

    \ \ ' '549' 'IPR005592' '\

    This N-terminal region is found in a family of mono- and diacylglycerol lipases.

    \ ' '550' 'IPR002921' '\

    Triglyceride lipases are lipolytic enzymes that hydrolyse ester linkages of\ triglycerides PUBMED:3147715. Lipases are widely distributed in animals, plants and prokaryotes. This family of lipases have been called Class 3 as they are not closely related to other lipase families.

    \ ' '551' 'IPR007651' '\ Mutations in the lipin gene lead to fatty liver dystrophy in mice. The protein has been shown to be phosphorylated by the TOR Ser/Thr protein kinases in response to insulin stimulation. The conserved region is found at the N terminus of the member proteins PUBMED:11138012, PUBMED:11792863.\ ' '552' 'IPR004872' '\

    This family of bacterial lipoproteins contains several antigenic members, that may be involved in bacterial virulence. Their precise function is unknown. However they are probably distantly related to which are solute binding proteins.

    \ \ ' '553' 'IPR003111' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This signature defines the N-terminal domain of the archaeal, bacterial and eukaryotic lon proteases, which are ATP-dependent serine peptidases belonging to the MEROPS peptidase family S16 (lon protease family, clan SF). In the eukaryotes the majority of the proteins are located in the mitochondrial matrix PUBMED:8248235, PUBMED:9620272. In yeast, Pim1 is located in the mitochondrial matrix, is required for mitochondrial function, and is constitutively expressed, with expression increasing after thermal stress, suggesting that Pim1 may play a role in the heat shock response PUBMED:8276800.

    \ ' '554' 'IPR001611' '\

    Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively PUBMED:10357231). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.

    \

    Reaction of amidotransferase domain:

    \ \ \

    Reactions of FMN-binding domain:

    \ \ \ ' '555' 'IPR000483' '\

    Leucine-rich repeats (LRR, see ) consist of 2-45 motifs of 20-30 amino acids in length that generally fold into an arc or horseshoe shape PUBMED:14747988. LRRs occur in proteins ranging from viruses to eukaryotes, and appear to provide a structural framework for the formation of protein-protein interactions PUBMED:11751054. Proteins containing LRRs include tyrosine kinase receptors, cell-adhesion molecules, virulence factors, and extracellular matrix-binding glycoproteins, and are involved in a variety of biological processes, including signal transduction, cell adhesion, DNA repair, recombination, transcription, RNA processing, disease resistance, apoptosis, and the immune response.

    \ \

    LRRs are often flanked by cysteine-rich domains: an N-terminal LRR domain () and a C-terminal LRR domain. This entry represents the C-terminal LRR domain.

    \ \ ' '556' 'IPR007307' '\

    The low-temperature viability protein LTV1 was identified in Saccharomyces cerevisiae; the exact function of this protein is unknown.

    \ ' '557' 'IPR001795' '\

    RNA-directed RNA polymerase (RdRp) () is an essential protein encoded in the genomes of all RNA containing viruses with no DNA stage PUBMED:2759231, PUBMED:8709232. It catalyses synthesis of the RNA strand complementary to a given RNA template, but the precise molecular mechanism remains unclear.\ The postulated RNA replication process is a two-step mechanism. First, the initiation step of RNA synthesis begins at or near the 3\' end of the RNA template by means of a primer-independent (de novo) mechanism. The de novo initiation consists of the addition of a nucleoside triphosphate (NTP) to the 3\'-OH of the first initiating NTP. During the following so-called elongation phase, this nucleotidyl transfer reaction is repeated with subsequent NTPs to generate the complementary RNA product PUBMED:11531403.

    \

    All the RNA-directed RNA polymerases, and many DNA-directed polymerases, employ a fold whose organisation has been likened to the shape of a right hand with three subdomains termed fingers, palm and thumb PUBMED:9309225. Only the palm subdomain, composed of a four-stranded antiparallel beta-sheet with two alpha-helices, is well conserved among all of these enzymes. In RdRp, the palm subdomain comprises three well conserved motifs (A, B and C). Motif A (D-x(4,5)-D) and motif C (GDD) are spatially juxtaposed; the Asp residues of these motifs are implicated in the binding of Mg2+ and/or Mn2+. The Asn residue of motif B is involved in selection of ribonucleoside triphosphates over dNTPs and thus determines whether RNA is synthesised rather than DNA PUBMED:10827187.\ The domain organisation PUBMED:9878607 and the 3D structure of the catalytic centre of a wide range of RdRps, even those with a low overall sequence homology, are conserved. The catalytic centre is formed by several motifs containing a number of conserved amino acid residues.

    \

    There are 4 superfamilies of viruses that cover all RNA containing viruses with no DNA stage:

    \ The RNA-directed RNA polymerases in the first of the above superfamilies can be divided into the following three subgroups:\

    \ \

    The nucleotide sequence for the RNA of Potato leafroll virus (PLrV) has been determined PUBMED:2732710, PUBMED:2466700. The sequence contains six large open reading frames (ORFs). The 5\' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5\' block is translated by frameshift read through near the end of the 70K protein, yielding a 118K polypeptide PUBMED:2732710. The C-terminal part of the 118K protein contains a consensus sequence for RNA-dependent RNA-polymerases PUBMED:2732710.

    \

    The genomic RNA sequence of Southern bean mosaic virus (SBMV) has been determined PUBMED:2823471. The genome contains four ORFs. The largest ORF encodes the two largest proteins translated in cell-free extracts from full-length virion RNA PUBMED:2823471. Segments of the predicted amino acid sequence of this ORF resemble those of known viral RNA-polymerases, ATP-binding proteins and viral genome-linked proteins PUBMED:2823471.

    \

    The genome sequence of Pea enation mosaic virus (PEMV) RNA 1 shows strong organisational relationships and sequence similarities to the Beet western yellows virus (BWYV) and PLrV PUBMED:1875194. Sequence analysis reveals five predominant ORFs. The third ORF is characterised by a number of RNA-polymerase motifs and a helicase-like motif typical of RNA-dependent RNA-polymerases PUBMED:1875194. It overlaps (out of frame) the ORF 2 product and is proposed to be expressed by a frameshift fusion of ORF 2 and ORF 3 PUBMED:1875194.

    \

    The PLrV sequence shows some similarities to the putative polymerase of SBMV PUBMED:2823471, and more extensive similarities to the corresponding BWYV polypeptide PUBMED:3194229.

    \ ' '558' 'IPR018392' '\ This domain is about 40 residues long and is found in a variety\ of enzymes involved in bacterial cell wall degradation PUBMED:1352512. This\ domain may have a general peptidoglycan binding function.\ ' '559' 'IPR005119' '\ The structure of this domain is known and is similar to the periplasmic binding proteins PUBMED:9309218. This domain is found in members of the LysR family of prokaryotic transcriptional regulatory proteins which share sequence similarities over approximately 280\ residues including a putative helix-turn-helix DNA-binding motif at their N terminus.\ ' '560' 'IPR004474' '\ This entry describes a domain of unknown function that is found in the predicted extracellular domain of a number of putative membrane-bound proteins. One of these is protein psr, described as a penicillin binding protein 5 (PBP-5) synthesis repressor. Another is Bacillus subtilis LytR, described as a transcriptional attenuator of itself and the LytABC operon, where LytC is N-acetylmuramoyl-L-alanine amidase. A third is CpsA, a putative regulatory protein involved in exocellular polysaccharide biosynthesis. These proteins share the property of having a short putative N-terminal cytoplasmic domain and transmembrane domain forming a signal-anchor.\ ' '561' 'IPR003891' '\

    This entry represents the MI domain (after MA-3 and eIF4G), a protein-protein interaction module of ~130 amino acids PUBMED:10958635, PUBMED:10973054, PUBMED:15082783. It appears in several translation factors and is found in:

    \

    \

    The MI domain consists of seven alpha-helices, which pack into a globular form. The packing arrangement consists of repeating pairs of antiparallel helices packed one upon the other such that a superhelical axis is generated perpendicular to the alpha-helical axes PUBMED:17060447.

    \ \

    The MI domain has also been named MA3 domain.

    \ ' '562' 'IPR002190' '\

    The first mammalian members of the MAGE (melanoma-associated antigen) gene\ family were originally described as completely silent in normal adult tissues,\ with the exception of male germ cells and, for some of them, placenta. By\ contrast, these genes were expressed in various kinds of tumors. However, other\ members of the family were recently found to be expressed in normal cells,\ indicating that the family is larger and more disparate than initially\ expected. MAGE-like genes have also been identified in non-mammalian species, including Drosophila melanogaster (Fruit fly) and Danio rerio (Zebrafish). Although no MAGE homologous\ sequences have been identified in Caenorhabditis elegans, Saccharomyces cerevisiae (Baker\'s yeast) or Schizosaccharomyces pombe (Fission yeast), MAGE sequences have been found in\ several vegetal species, including Arabidopsis thaliana (Mouse-ear cress) PUBMED:11454705.

    \

    \ The only region of homology shared by all of the members of the family is a\ stretch of about 200 amino acids which has been named the MAGE conserved\ domain. The MAGE conserved domain is usually located close to the C-terminal,\ although it can also be found in a more central position in some proteins. The\ MAGE conserved domain is generally present as a single copy but it is\ duplicated in some proteins. It has been proposed that the MAGE conserved\ domain of MAGE-D proteins might interact with p75 neurotrophin or related\ receptors PUBMED:11454705.

    \ ' '563' 'IPR012301' '\

    Malic enzymes (malate oxidoreductases) catalyse the oxidative decarboxylation of malate to form pyruvate PUBMED:, a reaction important in a number of metabolic pathways - e.g. carbon dioxide released from the reaction may be used in sugar production during the Calvin cycle of photosynthesis PUBMED:8300616. There are 3 forms of the enzyme PUBMED:1993674: an NAD-dependent form that decarboxylates oxaloacetate; an NAD-dependent form that does not decarboxylate oxaloacetate; and an NADPH-dependent form PUBMED:8300616. Other proteins known to be similar to malic enzymes are the Escherichia coli scfA protein; an enzyme from Zea mays (Maize), formerly thought to be cinnamyl-alcohol dehydrogenase PUBMED:2103472; and the hypothetical Saccharomyces cerevisiae protein YKL029c.

    \

    Studies on the duck liver malic enzyme reveal that it can be alkylated by bromopyruvate, resulting in the loss of oxidative decarboxylation and the subsequent enhancement of pyruvate reductase activity PUBMED:1911848. The alkylated form is able to bind NADPH but not L-malate, indicating impaired substrate- or divalent metal ion-binding in the active site PUBMED:1911848. Sequence analysis has highlighted a cysteine residue as the point of alkylation, suggesting that it may play an important role in the activity of the enzyme PUBMED:1911848, although it is absent in the sequences from some species.

    \

    There are three well conserved regions in the enzyme sequences. Two of them seem to be involved in binding NAD or NADP. The significance of the third one, located in the central part of the enzymes, is not yet known.

    \ ' '564' 'IPR012302' '\

    Malic enzymes (malate oxidoreductases) catalyse the oxidative decarboxylation of malate to form pyruvate PUBMED:, a reaction important in a number of metabolic pathways - e.g. carbon dioxide released from the reaction may be used in sugar production during the Calvin cycle of photosynthesis PUBMED:8300616. There are 3 forms of the enzyme PUBMED:1993674: an NAD-dependent form that decarboxylates oxaloacetate; an NAD-dependent form that does not decarboxylate oxaloacetate; and an NADPH-dependent form PUBMED:8300616. Other proteins known to be similar to malic enzymes are the Escherichia coli scfA protein; an enzyme from Zea mays (Maize), formerly thought to be cinnamyl-alcohol dehydrogenase PUBMED:2103472; and the hypothetical Saccharomyces cerevisiae protein YKL029c.

    \

    Studies on the duck liver malic enzyme reveal that it can be alkylated by bromopyruvate, resulting in the loss of oxidative decarboxylation and the subsequent enhancement of pyruvate reductase activity PUBMED:1911848. The alkylated form is able to bind NADPH but not L-malate, indicating impaired substrate- or divalent metal ion-binding in the active site PUBMED:1911848. Sequence analysis has highlighted a cysteine residue as the point of alkylation, suggesting that it may play an important role in the activity of the enzyme PUBMED:1911848, although it is absent in the sequences from some species.

    \

    There are three well conserved regions in the enzyme sequences. Two of them seem to be involved in binding NAD or NADP. The significance of the third one, located in the central part of the enzymes, is not yet known.

    \ ' '565' 'IPR000998' '\ A 170 amino acid domain, the so-called MAM domain, has been recognised in the extracellular region of \ functionally diverse proteins PUBMED:8387703. These proteins have a modular, receptor-like architecture \ comprising a signal peptide, an N-terminal extracellular domain, a single transmembrane domain and an \ intracellular domain. Such proteins include meprin (a cell surface glycoprotein) PUBMED:1374387; A5 \ antigen (a developmentally-regulated cell surface protein) PUBMED:1908252; and receptor-like tyrosine \ protein phosphatase PUBMED:1655529. The MAM domain is thought to have an adhesive function. It contains \ 4 conserved cysteine residues, which probably form disulphide bridges.\ ' '566' 'IPR007145' '\ This is a family of microtubule associated proteins. One of its members is the yeast anaphase spindle elongation protein.\ ' '567' 'IPR002083' '\

    Although apparently functionally unrelated, intracellular TRAFs and extracellular meprins share a conserved region of about 180 residues, the meprin and TRAF homology (MATH) domain PUBMED:12387856. Meprins are mammalian tissue-specific metalloendopeptidases of the astacin family implicated in developmental, normal and pathological processes by hydrolysing a variety of proteins. Various growth factors, cytokines, and extracellular matrix proteins are substrates for meprins. They are composed of five structural domains: an N-terminal endopeptidase domain, a MAM domain (see\ ), a MATH domain, an EGF-like domain (see ) and a C-terminal transmembrane region. Meprin A and B form membrane-bound homotetramers, whereas homooligomers of meprin A are secreted. A proteolytic site adjacent to the MATH domain, only present in meprin A, allows the release of the protein from the membrane PUBMED:7890660.

    \ \

    TRAF proteins were first isolated by their ability to interact with TNF receptors PUBMED:8069916. They promote cell survival by the activation of downstream protein kinases and, finally, transcription factors of the NF-kB and AP-1 family. The TRAF proteins are composed of 3 structural domains: a RING finger (see ) in the N-terminal part of the protein, one to seven TRAF zinc fingers (see ) in the middle and the MATH domain in the C-terminal part PUBMED:12387856. The MATH domain is necessary and sufficient for self-association and receptor interaction. From the structural analysis, two consensus sequences recognised by the TRAF domain have been defined: a major one, [PSAT]x[QE]E and a minor one, PxQxxD PUBMED:10518213.

    \ \

    The structure of the TRAF2 protein reveals a trimeric self-association of the MATH domain PUBMED:10206649. The domain forms a new, eight-stranded antiparallel beta sandwich structure. A coiled-coil region adjacent to the MATH domain is also important for the trimerisation. The oligomerisation is essential for establishing appropriate connections to form signalling complexes with TNF receptor-1. The ligand binding surface of TRAF proteins is located in beta-strands 6 and 7 PUBMED:10518213.

    \ ' '568' 'IPR001739' '\

    Methylation at CpG dinucleotides, the most common DNA modification in\ eukaryotes, has been correlated with gene silencing associated with various\ phenomena such as genomic imprinting, transposon and chromosome X inactivation, differentiation, and cancer. Effects of DNA methylation are mediated through proteins which bind to symmetrically methylated CpGs. Such proteins contain a specific domain of ~70 residues, the methyl-CpG-binding domain (MBD), which is linked to additional domains associated with chromatin, such as the bromodomain, the AT hook motif, the SET domain, or the PHD finger. MBD-containing proteins appear to act as structural proteins, which recruit a variety of histone deacetylase (HDAC) complexes and chromatin remodelling factors, leading to chromatin compaction and, consequently, to transcriptional repression. The MBD of MeCP2, MBD1, MBD2, MBD4 and BAZ2 mediates binding to DNA, in the case of MeCP2, MBD1 and MBD2 preferentially to methylated CpG. In the case of human MBD3 and SETDB1, the MBD has been shown to mediate protein-protein interactions PUBMED:12529184, PUBMED:12787239.

    \ \

    The MBD folds into an alpha/beta sandwich structure comprising a layer of\ twisted beta sheet, backed by another layer formed by the alpha1 helix and a\ hairpin loop at the C terminus. These layers are both amphipathic, with the alpha1 helix and the beta sheet lying parallel and the hydrophobic faces tightly packed against each other. The beta sheet is composed of two long inner strands (beta2 and beta3) sandwiched by two shorter outer strands (beta1 and beta4) PUBMED:11371345.

    \ ' '569' 'IPR004299' '\ The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase\ enzymes. A conserved histidine has been suggested to be the active site residue PUBMED:10694878.\ ' '570' 'IPR004092' '\

    The function of the malignant brain tumor (MBT) repeat is unknown, but it is found in a number of nuclear proteins involved in transcriptional repression. The repeat contains a completely\ conserved glutamate at its amino terminus that may be important for function.

    The crystal structure of the two MBT repeats of human SCM-like 2 protein has been reported. Each repeat consists of an extended "arm" and a globular core. The arm of the first repeat packs against the core of the second repeat and vice versa. The structure of the core-interacting part of each arm consists of an N-terminal alpha-helix and a turn of 3(10) helix connected by a short beta-strand. The core consists of an Src homology 3-like five-stranded beta-barrel followed by a C-terminal alpha-helix and another short beta-strand. Each arm interacts with its partner core in a similar way, with the orientation of the N-terminal helix relative to the barrel varying slightly. There are also extensive interactions between the two barrels PUBMED:12952983.

    \ ' '571' 'IPR003399' '\

    This domain is found in all 24 mce genes associated with the four mammalian cell entry (mce) operons of Mycobacterium tuberculosis and their homologs in other Actinomycetales PUBMED:12052567, PUBMED:14500535. The archetype (mce1A, Rv0169), was isolated as being necessary for colonisation of, and survival within, the macrophage PUBMED:8367727. The domain is also found in:

    \ \ ' '572' 'IPR007747' '\ MEN1, the gene responsible for multiple endocrine neoplasia type 1, is a tumour suppressor gene that encodes a protein called Menin which may be an atypical GTPase stimulated by nm23 PUBMED:12145286.\ ' '573' 'IPR004843' '\

    Protein phosphorylation plays a central role in the regulation of cell functions PUBMED:2827745, causing \ the activation or inhibition of many enzymes involved in various biochemical pathways PUBMED:2176161. Kinases and phosphatases are the enzymes responsible for this, and may themselves be subject to control through the action of hormones and growth factors PUBMED:2827745. Serine/threonine (S/T) phosphatases catalyse the dephosphorylation of phosphoserine and phosphothreonine residues. In \ mammalian tissues four different types of protein phosphatase (PP) have been identified and are known as PP1, PP2A, PP2B and \ PP2C. Except for PP2C, these enzymes are evolutionarily related. The catalytic regions of the proteins are well conserved and have a slow mutation rate, suggesting that major changes in these regions are highly detrimental PUBMED:2827745.

    \

    The metallo-phosphoesterase motif is found in a large number of proteins involved in phosphorylation. These include serine/threonine phosphatases, DNA polymerase, exonucleases, and other phosphatases.

    \ ' '574' 'IPR000055' '\

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    Type I restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. Type I enzymes have three different subunits - M (modification), S (specificity) and R (restriction) - that form multifunctional enzymes with restriction (), methylase () and ATPase activities PUBMED:15121719, PUBMED:12595133. The S subunit is required for both restriction and modification and is responsible for recognition of the DNA sequence specific for the system. The M subunit is necessary for modification, and the R subunit is required for restriction. These enzymes use S-Adenosyl-L-methionine (AdoMet) as the methyl group donor in the methylation reaction, and have a requirement for ATP. They recognise asymmetric DNA sequences split into two domains of specific sequence, one 3-4 bp long and another 4-5 bp long, separated by a nonspecific spacer 6-8 bp in length. Cleavage occurs a considerable distance from the recognition sites, rarely less than 400 bp away and up to 7000 bp away. Adenosyl residues are methylated, one on each strand of the recognition sequence. These enzymes are widespread in eubacteria and archaea. In enteric bacteria they have been subdivided into four families: types IA, IB, IC and ID.

    \

    This entry represents the S subunit of type I restriction endonucleases such as EcoBI and EcoKI (), which recognise the DNA sequences 5\'-TGAN(8)TGCT and 5\'-AACN(6)GTGC, respectively PUBMED:16040596, PUBMED:17439637. The M and S subunits together form a methyltransferase that methylates two adenine residues in complementary strands of a bipartite DNA recognition sequence. In the presence of the R subunit the complex can also act as an endonuclease, binding to the same target sequence but cutting the DNA some distance from this site. Whether the DNA is cut or modified depends on the methylation state of the target sequence: when the target site is unmodified, the DNA is cut; when the target site is hemi-methylated, the complex acts as a maintenance methyltransferase to modify the DNA, methylating both strands PUBMED:9837717. Most of the proteins in this family have two copies of the domain.

    \ ' '575' 'IPR002903' '\

    This is a family of S-adenosyl-L-methionine-dependent methyltransferases, which are found primarily, though not exclusively, in bacteria. The Escherichia coli protein is essential and has been linked to peptidoglycan biosynthesis PUBMED:10572301, PUBMED:4563986.

    \ ' '576' 'IPR007754' '\

    N-acetylglucosaminyltransferase II () is a Golgi resident enzyme that catalyzes an essential step in the biosynthetic pathway leading from high mannose to complex N-linked oligosaccharides PUBMED:7797505. Mutations in the MGAT2 gene lead to a congenital disorder of glycosylation (CDG IIa). CDG IIa patients have an increased bleeding tendency, unrelated to coagulation factors PUBMED:11596651.

    \

    Synonym(s): UDP-N-acetyl-D-glucosamine:alpha-6-D-mannoside beta-1,2-N- acetylglucosaminyltransferase II, GnT II/MGAT2.

    \ ' '577' 'IPR011607' '\

    This domain makes up the whole of methylglyoxal synthetase and is also found in carbamoyl phosphate synthetase (CPS), where it forms a regulatory domain that binds the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site PUBMED:10526357.

    \ ' '578' 'IPR006667' '\ This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. Related regions are found also in archaebacterial and eukaryotic proteins. All the archaebacterial and eukaryotic examples have two copies of the region. This suggests that the eubacterial examples may act as dimers. Members of this family probably transport Mg2+ or other divalent cations into the cell. The alignment contains two highly conserved aspartates that may be involved in cation binding.\ ' '579' 'IPR001003' '\

    Major Histocompatibility Complex (MHC) glycoproteins are heterodimeric cell surface receptors that function to present antigen peptide fragments to T cells responsible for cell-mediated immune responses. MHC molecules can be subdivided into two groups on the basis of structure and function: class I molecules present intracellular antigen peptide fragments (~10 amino acids) on the surface of the host cells to cytotoxic T cells; class II molecules present exogenously derived antigenic peptides (~15 amino acids) to helper T cells. MHC class I and II molecules are assembled and loaded with their peptide ligands via different mechanisms. However, both present peptide fragments rather than entire proteins to T cells, and are required to mount an immune response.

    \

    Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection) PUBMED:15120183.

    \

    MHC class II molecules are composed of two membrane-spanning chains, alpha and beta, of similar size. Both chains consist of two globular domains (N- and C-terminal), and a transmembrane segment to anchor them to the membrane PUBMED:7612235. A groove in the structure acts as the peptide-binding site.

    \ \

    This entry represents the N-terminal domain (also called alpha-1 domain) of the alpha chain.

    \


    \ ' '580' 'IPR000353' '\

    Major Histocompatibility Complex (MHC) glycoproteins are heterodimeric cell surface receptors that function to present antigen peptide fragments to T cells responsible for cell-mediated immune responses. MHC molecules can be subdivided into two groups on the basis of structure and function: class I molecules present intracellular antigen peptide fragments (~10 amino acids) on the surface of the host cells to cytotoxic T cells; class II molecules present exogenously derived antigenic peptides (~15 amino acids) to helper T cells. MHC class I and II molecules are assembled and loaded with their peptide ligands via different mechanisms. However, both present peptide fragments rather than entire proteins to T cells, and are required to mount an immune response.

    \

    Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection) PUBMED:15120183.

    \

    MHC class II molecules are composed of two membrane-spanning chains, alpha and beta, of similar size. Both chains consist of two globular domains (N- and C-terminal), and a transmembrane segment to anchor them to the membrane PUBMED:7612235. A groove in the structure acts as the peptide-binding site.

    \ \

    This entry represents the N-terminal domain (also called beta-1 domain) of the beta chain.

    \


    \ ' '581' 'IPR004166' '\ Proteins containing this domain consist of a novel group of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. Proteins include myosin heavy chain kinases PUBMED:7822274, PUBMED:9054368, elongation factor-2 kinase and a bifunctional ion channel PUBMED:11161216.\ ' '582' 'IPR003890' '\

    This entry represents an MIF4G-like domain. MIF4G domains share a common structure but can differ in sequence. This entry is designated "type 3", and is found in nuclear cap-binding proteins, eIF4G, and UPF2.

    \ \

    The MIF4G domain is a structural motif with an ARM (Armadillo) repeat-type fold, consisting of a 2-layer alpha/alpha right-handed superhelix. Proteins usually contain two or more structurally similar MIF4G domains connected by unstructured linkers. MIF4G domains are found in several proteins involved in RNA metabolism, including eIF4G (eukaryotic initiation factor 4-gamma), eIF-2b (translation initiation factor), UPF2 (regulator of nonsense transcripts 2), and nuclear cap-binding proteins (CBP80, CBC1, NCBP1), although the sequence identity between them may be low PUBMED:10958635.

    \ \

    The nuclear cap-binding complex (CBC) is a heterodimer. Human CBC consists of a large CBP80 subunit and a small CBP20 subunit, the latter being critical for cap binding. CBP80 contains three MIF4G domains connected with long linkers, while CBP20 has an RNP (ribonucleoprotein)-type domain that associates with domains 2 and 3 of CBP80 PUBMED:11545740. The complex binds to the 5\'-cap of eukaryotic RNA polymerase II transcripts, such as mRNA and U snRNA. The binding is important for several mRNA nuclear maturation steps and for nonsense-mediated decay. It is also essential for nuclear export of U snRNAs in metazoans PUBMED:16043498.

    \ \

    Eukaryotic translation initiation factor 4 gamma (eIF4G) plays a critical role in protein expression, and is at the centre of a complex regulatory network. Together with the cap-binding protein eIF4E, it recruits the small ribosomal subunit to the 5\'-end of mRNA and promotes the assembly of a functional translation initiation complex, which scans along the mRNA to the translation start codon. The activity of eIF4G in translation initiation could be regulated through intra- and inter-protein interactions involving the ARM repeats PUBMED:16156639. In eIF4G, the MIF4G domain binds eIF4A, eIF3, RNA and DNA.

    \ \

    Nonsense-mediated mRNA decay (NMD) in eukaryotes involves UPF1, UPF2 and UPF3 to accelerate the decay rate of two unique classes of transcripts: (1) nonsense mRNAs that arise through errors in gene expression, and (2) naturally occurring transcripts that lack coding errors but have built-in features that target them for accelerated decay (error-free mRNAs). NMD can trigger decay during any round of translation and can target CBC-bound or eIF-4E-bound transcripts PUBMED:16043493. UPF2 contains MIF4G domains, while UPF3 contains an RNP domain PUBMED:15004547.

    \ ' '583' 'IPR003608' '\

    The MIR domain is named after three of the proteins in which it occurs: protein Mannosyltransferase (), Inositol 1,4,5-trisphosphate receptor (IP3R) and Ryanodine receptor (RyR). MIR domains have also been found in eukaryotic stromal cell-derived factor 2 (SDF-2) and in Chlamydia trachomatis protein CT153. The MIR domain may have a ligand transferase function. This domain has a closed beta-barrel structure with a hairpin triplet, and has an internal pseudo-threefold symmetry. The MIR motifs that make up the MIR domain consist of ~50 residues and are often found in multiple copies.

    \

    Inositol 1,4,5-trisphosphate (InsP3) is an intracellular second messenger that transduces growth factor and neurotransmitter signals. InsP3 mediates the release of Ca2+ from intracellular stores by binding to specific Ca2+ channel-coupled receptors. Ryanodine receptors are involved in communication between transverse-tubules and the sarcoplasmic reticulum of cardiac and skeletal muscle. The proteins function as Ca2+-release channels following depolarisation of transverse-tubules PUBMED:1645727. The function is modulated by Ca2+, Mg2+, ATP and calmodulin. Deficiency in the ryanodine receptor may be the cause of malignant hyperthermia (MH) and of central core disease of muscle (CCD) PUBMED:7829078. Protein O-mannosyltransferases transfer mannose from DOL-P-mannose to Ser or Thr residues on proteins.

    \ ' '584' 'IPR007330' '\

    The MIT domain is found in vacuolar sorting proteins, spastin (probable ATPase involved in the assembly or function of nuclear protein complexes), and a sorting nexin, which may play a role in intracellular trafficking.

    \ ' '585' 'IPR018108' '\

    A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane or integral to the membrane of other eukaryotic organelles such as the peroxisome PUBMED:2158156, PUBMED:, PUBMED:8140286, PUBMED:8487299, PUBMED:8206158, PUBMED:8291088. Such proteins include: ADP, ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others. Structurally, these proteins can consist of up to three tandem repeats of a domain of approximately 100 residues, each domain containing two transmembrane regions.

    \ ' '586' 'IPR005301' '\

    Mob1 is an essential Saccharomyces cerevisiae (Baker\'s yeast) protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Mob1 contains no known structural motifs; however, MOB1 is a member of a conserved gene family and shares sequence similarity with a nonessential yeast gene, MOB2. Mob1 is a phosphoprotein in vivo and a substrate for the Mps1p kinase in vitro. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature PUBMED:9436989. This family also includes phocein, a rat protein that interacts with striatin in a yeast two-hybrid assay PUBMED:11251078.

    \ ' '587' 'IPR001453' '\ Eukaryotic and prokaryotic molybdoenzymes require a molybdopterin cofactor\ (MoCF) for their activity. The biosynthesis of this cofactor involves a\ complex multistep enzymatic pathway. One of the eukaryotic proteins involved\ in this pathway is the Drosophila protein cinnamon PUBMED:8088525 which is highly similar\ to gephyrin, a rat microtubule-associated protein which was thought to anchor\ the glycine receptor to subsynaptic microtubules.

    Cinnamon and gephyrin are\ evolutionarily related, in their N-terminal half, to the Escherichia coli MoCF\ biosynthesis proteins mog/chlG and moaB/chlA2 and, in their C-terminal half,\ to E. coli moeA/chlE.

    \ ' '588' 'IPR005111' '\

    The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism PUBMED:16784786, PUBMED:12114025.

    \ \

    In Escherichia coli, biosynthesis of MoCF is a three-stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg2+ ATP-dependent manner PUBMED:17198377. MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF PUBMED:12372836. Moreover, the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues of MoeA and MogA exist as fusion proteins: CNX1 () of Arabidopsis thaliana (Mouse-ear cress), mammalian gephyrin (e.g. ) and Drosophila melanogaster (Fruit fly) Cinnamon () PUBMED:8528286.

    \ \

    This domain is found in proteins involved in biosynthesis of molybdopterin cofactor; however,\ the exact molecular function of this domain is uncertain. The structure of this domain is\ known PUBMED:11525167 and forms an incomplete beta barrel.

    \ ' '589' 'IPR005110' '\

    The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism PUBMED:16784786, PUBMED:12114025.

    \ \

    In Escherichia coli, biosynthesis of MoCF is a three-stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg2+ ATP-dependent manner PUBMED:17198377. MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF PUBMED:12372836. Moreover, the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues of MoeA and MogA exist as fusion proteins: CNX1 () of Arabidopsis thaliana (Mouse-ear cress), mammalian gephyrin (e.g. ) and Drosophila melanogaster (Fruit fly) Cinnamon () PUBMED:8528286.

    \ \

    Proteins in this family contain two structural domains. One of these contains the conserved DGXA motif. This region is\ found in proteins involved in biosynthesis of molybdopterin cofactor; however, the exact molecular function of\ this region is uncertain.

    \ ' '591' 'IPR003409' '\ The MORN (Membrane Occupation and Recognition Nexus) motif is found in multiple copies in several proteins including junctophilins (PUBMED:10949023). The function of this motif is unknown.\ ' '592' 'IPR005303' '\

    This domain is found N-terminal to the MOSC domain (). The function of this domain is unknown; however, it is predicted to adopt a beta barrel fold.

    \ ' '593' 'IPR006737' '\

    Motilin is a gastrointestinal regulatory polypeptide produced by motilin cells in the duodenal epithelium. It is released into the general circulation at about 100-min intervals during the inter-digestive state and is the most important factor in controlling the inter-digestive migrating contractions. Motilin also stimulates endogenous release of the endocrine pancreas PUBMED:9210180.

    This domain is also found in ghrelin, a growth hormone secretagogue synthesised by endocrine cells in the stomach. Ghrelin stimulates growth hormone secretagogue receptors in the pituitary. These receptors are distinct from the growth hormone-releasing hormone receptors, and thus provide a means of controlling pituitary growth hormone release by the gastrointestinal system PUBMED:11306336.

    This domain represents a peptide sequence that lies C-terminal to motilin/ghrelin () on the respective precursor peptide. Its function is unknown.

    \ ' '594' 'IPR006738' '\

    Motilin is a gastrointestinal regulatory polypeptide produced by motilin cells in the duodenal epithelium. It is released into the general circulation at about 100-min intervals during the inter-digestive state and is the most important factor in controlling the inter-digestive migrating contractions. Motilin also stimulates endogenous release of the endocrine pancreas PUBMED:9210180.

    This domain is also found in ghrelin, a growth hormone secretagogue synthesised by endocrine cells in the stomach. Ghrelin stimulates growth hormone secretagogue receptors in the pituitary. These receptors are distinct from the growth hormone-releasing hormone receptors, and thus provide a means of controlling pituitary growth hormone release by the gastrointestinal system PUBMED:11306336.

    \ ' '595' 'IPR000555' '\

    Members of this family are found in proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors. This family is also known as the MPN domain PUBMED:9150866 and PAD-1-like domain PUBMED:9605331. It has been shown that this domain occurs in prokaryotes PUBMED:9605331.

    \ \

    Mov34 proteins act as the regulatory subunit of the 26S proteasome, which is involved in the ATP-dependent degradation of ubiquitinated proteins. The function of this domain is unclear, but it is found in the N-terminus of \ the proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors.

    \ \

    A number of the proteins associated with this family belong to MEROPS peptidase family M67 (clan M-). This includes the Poh1 peptidase of Saccharomyces cerevisiae (Baker\'s yeast) which is a component of the 19S proteasome regulatory particle.

    \ ' '596' 'IPR002717' '\

    Moz is a monocytic leukemia zinc finger protein, and the SAS protein from Saccharomyces cerevisiae (Baker\'s yeast) is involved in silencing the Hmr locus. These proteins were reported to be homologous to acetyltransferases PUBMED:8607265, but this similarity is not supported by standard sequence analysis.

    \ ' '597' 'IPR007248' '\

    The 22 kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore-forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information PUBMED:11590176. Mpv17 is a closely related peroxisomal protein involved in the development of early-onset glomerulosclerosis PUBMED:11327696.

    \

    A member of this family found in Saccharomyces cerevisiae (Baker\'s yeast) is an integral membrane protein of the inner mitochondrial membrane and has been suggested to play a role in mitochondrial function during heat shock PUBMED:15189984.

    \ ' '598' 'IPR005304' '\

    Members of this family are essential for 40S ribosomal biogenesis. They play a role in the methylation reaction of pre-rRNA processing. The structure of EMG1 has revealed that it is a novel member of the superfamily of alpha/beta knot fold methyltransferases PUBMED:18063569, PUBMED:11935223.

    \ ' '599' 'IPR003534' '\ The major royal jelly proteins (MRJPs) comprise 12.5% of the mass, and\ 82-90% of the protein content PUBMED:9791542, of honeybee (Apis mellifera) royal jelly. Royal jelly is a substance secreted by the cephalic glands of nurse bees PUBMED:10441680 and it is used to trigger development of a queen bee from a bee larva. The biological function of the MRJPs is unknown, but they are believed to play a major role in nutrition due to their high essential amino acid content PUBMED:10380654.\

    Two royal jelly proteins, MRJP3 and MRJP5, contain a tandem repeat that\ results from a high genetic variability. This polymorphism may be useful \ for genotyping individual bees PUBMED:10380654.

    \ ' '600' 'IPR006797' '\

    These proteins contain a conserved region found in the yeast YLR168C gene MSF1 product. The function of this protein is unknown, though it is thought to be involved in intra-mitochondrial protein sorting. GFP-tagged MSF1 localizes to mitochondria and is required for wild-type respiratory growth PUBMED:14562095. This region is also found in a number of other eukaryotic proteins. The PRELI/MSF1 domain is a eukaryotic protein module which occurs in stand-alone form in several proteins, including the human PRELI protein and the yeast MSF1 protein, and as an amino-terminal domain in an orthologous group of proteins typified by human SEC14L1, which is conserved in all animals. In this group of proteins, the PRELI/MSF1 domain co-occurs with the CRAL-TRIO (see ) and the GOLD domains (see ). The PRELI/MSF1 domain is approximately 170 residues long and is predicted to assume a globular alpha + beta fold with six beta strands and four alpha helices. It has been suggested that the PRELI/MSF1 domain may have a function associated with cellular membranes PUBMED:12049664.

    \ ' '601' 'IPR000535' '\

    Major sperm proteins (MSP) are central components in molecular interactions underlying sperm motility in Caenorhabditis elegans, whose sperm employ an amoeba-like crawling motion using an MSP-containing lamellipod, rather than the flagellar-based swimming motion associated with other sperm. These proteins oligomerise to form an extensive filament system that extends from sperm villipoda, along the leading edge of the pseudopod. About 30 MSP isoforms may exist in C. elegans.

    \

    MSPs form a fibrous network, whereby MSP dimers form helical subfilaments that coil around one another to produce filaments, which in turn form supercoils to produce bundles. The crystal structure of MSP from C. elegans reveals an immunoglobulin (Ig)-like seven-stranded beta sandwich fold PUBMED:12051923.

    \ ' '602' 'IPR004686' '\ The MTC family consists of a limited number of homologues, all from eukaryotes. One member of the family has been functionally characterised as a tricarboxylate carrier from rat liver mitochondria. The rat liver mitochondrial tricarboxylate carrier has been reported to transport citrate, cis-aconitate, threo-D-isocitrate, D- and L-tartrate, malate, succinate and phosphoenolpyruvate. It presumably functions by a proton symport mechanism. The rest of the characterised proteins appear to be sideroflexins involved in iron transport.\ ' '603' 'IPR014778' '\

    The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family that specifically recognise the sequence YAAC(G/T)G PUBMED:3185713, PUBMED:8882580. In myb, one of the most conserved regions consisting of three tandem repeats has been shown to be involved in DNA-binding PUBMED:2824190.

    \ ' '604' 'IPR004009' '\ This domain has an SH3-like fold. It is found at the N-terminus of many but not all myosins. The function of this domain is unknown.\ ' '605' 'IPR000857' '\

    The microtubule-based kinesin motors and actin-based myosin motors generate movements required for intracellular trafficking, cell division, and muscle contraction. In general, these proteins consist of a motor domain that generates movement and a tail region that varies widely from class to class and is thought to mediate many of the regulatory or cargo binding functions specific to each class of motor PUBMED:11212352. The Myosin Tail Homology 4 (MyTH4) domain has been identified as a conserved domain in the tail domains of several different unconventional myosins PUBMED:11401444 and a plant kinesin-like protein PUBMED:1074599, but has more recently been found in several non-motor proteins PUBMED:12062040. Although the function is not yet fully understood, there is evidence that the MyTH4 domain of Myosin-X (Myo10) binds to microtubules and thus could provide a link between an actin-based motor protein and the microtubule cytoskeleton PUBMED:15372037.

    \ \

    The MyTH4 domain is found in one or two copies associated\ with other domains, such as myosin head, kinesin motor, FERM, PH, SH3 and IQ. The domain is predicted to be largely alpha-helical, interrupted by three or\ four turns. The MyTH4 domain contains four highly conserved regions designated\ MGD (consensus sequence L(K/R)(F/Y)MGDhP), LRDE (consensus LRDEhYCQhhKQHxxxN),\ RGW (consensus RGWxLh), and ELEA (consensus RxxPPSxhELEA), where h indicates a\ hydrophobic residue and x is any residue PUBMED:11401444.

    \ \ ' '606' 'IPR005148' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases PUBMED:.

    \

    This domain, also called additional domain 1 (Add-1), is found at the N-terminus of arginyl-tRNA synthetase. It is about 140 residues long and has been suggested to be involved in tRNA recognition PUBMED:9736621.

    \ ' '607' 'IPR004837' '\ The sodium/calcium exchangers are a family of integral membrane proteins. This domain covers the integral membrane regions of these proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cell types, including cardiac myocytes, epithelial cells, neurons, retinal rod photoreceptors and smooth muscle cells PUBMED:1700476. Ca2+ is moved into or out of the cytosol depending on Na+ concentration PUBMED:1700476. In humans and rats there are three isoforms: NCX1, NCX2 and NCX3 PUBMED:8798769. \ ' '608' 'IPR006153' '\

    Sodium proton exchangers (NHEs) constitute a large family of integral membrane protein transporters that are responsible for the counter-transport of protons and sodium ions across lipid bilayers PUBMED:12027219, PUBMED:12502567. These proteins are found in organisms across all domains of life. In archaea, bacteria, yeast and plants, these exchangers provide increased salt tolerance by removing sodium in exchange for extracellular protons. In mammals they participate in the regulation of cell pH, volume, and intracellular sodium concentration, as well as in the reabsorption of NaCl across renal, intestinal, and other epithelia PUBMED:16734752, PUBMED:17071327, PUBMED:16513813, PUBMED:11187762. Human NHE is also involved in heart disease, cell growth and cell differentiation PUBMED:17218973. The removal of intracellular protons in exchange for extracellular sodium effectively eliminates excess acid from actively metabolising cells. In mammalian cells, NHE activity is found in both the plasma membrane and inner mitochondrial membrane. To date, nine mammalian isoforms have been identified (designated NHE1-NHE9) PUBMED:9278382, PUBMED:9507001. These exchangers are highly regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N-terminus and a large cytoplasmic region at the C-terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved, and are therefore thought to be involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. There is some evidence that the exchangers may exist in the cell membrane as homodimers, but little is currently known about the mechanism of their antiport PUBMED:9537504.

    \ \

    This entry represents a number of cation/proton exchangers, including Na+/H+ exchangers, K+/H+ exchangers and Na+(K+,Li+,Rb+)/H+ exchangers.

    \ ' '609' 'IPR003841' '\ This family includes the mammalian type II renal Na+/Pi-cotransporters and other proteins from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney these proteins may be involved in actively transporting phosphate into cells via Na+ cotransport in the renal brush border membrane PUBMED:8327470.\ ' '610' 'IPR003473' '\ Quinolinate synthetase catalyzes the second step of the de novo biosynthetic pathway of pyridine nucleotide formation. In particular, quinolinate synthetase is involved in the condensation of dihydroxyacetone phosphate and iminoaspartate to form quinolinic acid PUBMED:10648170. This synthesis requires two enzymes, an FAD-containing "B protein" and an "A protein".\ ' '611' 'IPR004041' '\

    The NAF domain is a 24 amino acid domain that is found in a plant-specific subgroup of serine-threonine protein kinases (CIPKs), that interact with calcineurin B-like calcium sensor proteins (CBLs). Whereas the N-terminal part of CIPKs comprises a conserved catalytic domain typical of Ser-Thr kinases, the much less conserved C-terminal domain appears to be unique to this subgroup of kinases. The only exception is the NAF domain that forms an \'island of conservation\' in this otherwise variable region. The NAF domain has been named after the prominent conserved amino acids Asn-Ala-Phe. It represents a minimum protein interaction module that is both necessary and sufficient to mediate the interaction with the CBL calcium sensor proteins PUBMED:11230129.

    \

    The secondary structure of the NAF domain is currently not known, but secondary structure computation of the C-terminal region of Arabidopsis thaliana CBL-interacting protein kinase 1 revealed a long helical structure PUBMED:11230129.

    \ ' '612' 'IPR007781' '\ Alpha-N-acetylglucosaminidase is a lysosomal enzyme required for the stepwise degradation of heparan sulphate PUBMED:10588735. Mutations in the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B), characterised by neurological dysfunction but relatively mild somatic manifestations PUBMED:12049639.\ ' '613' 'IPR003441' '\

    The NAC domain (for Petunia hybrida (Petunia) NAM and for Arabidopsis ATAF1, ATAF2, and CUC2) is an N-terminal module of ~160 amino acids, which is found in proteins of the NAC family of plant-specific transcriptional regulators (no apical meristem (NAM) proteins) PUBMED:9212461. NAC proteins are involved in developmental processes, including formation of the shoot apical meristem, floral organs and lateral shoots, as well as in plant hormonal control and defence. The NAC domain is accompanied by diverse C-terminal transcriptional activation domains. The NAC domain has been shown to be a DNA-binding domain (DBD) and a dimerization domain PUBMED:11114891,PUBMED:12175016.

    \

    The NAC domain can be subdivided into five subdomains (A-E). Each subdomain is distinguishable by blocks of heterogeneous amino acids or gaps. While the NAC\ domains are rich in basic amino acids (R, K and H) as a whole, the distribution of positive and negative amino acids differs between subdomains. Subdomains C and D are rich in basic amino acids but poor in acidic amino acids, while subdomain B contains a high proportion of acidic amino acids. Putative nuclear localization signals (NLS) have been detected in subdomains C and D PUBMED:10660065. The DBD is contained within a 60 amino acid region located within subdomains D and E PUBMED:12175016. The overall structure of the NAC domain monomer consists of a very twisted antiparallel beta-sheet, which packs against an N-terminal alpha-helix on one side and one shorter helix on the other side surrounded by a few helical elements. The structure suggests that the NAC domain mediates dimerization through conserved interactions including a salt bridge, and DNA binding through the NAC dimer face rich in positive charges PUBMED:15083810.

    \ ' '614' 'IPR015977' '\ Nicotinate phosphoribosyltransferase is the rate-limiting enzyme that catalyses the first reaction in the NAD salvage synthesis. This family also contains a number of closely related proteins for which a catalytic activity has not been experimentally demonstrated.\ ' '615' 'IPR007053' '\

    This domain is found in proteins from viruses, bacteria and eukaryotes. The domain contains a well-conserved NCEHF motif. The function of this domain is unknown.

    \ ' '616' 'IPR006988' '\ Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of eukaryotic (metazoa) transcription factors PUBMED:9418898. This region consists of the N-terminal NAB conserved region 1, which interacts with the EGR1 inhibitory domain (R1) PUBMED:9418898. It may also mediate multimerisation.\ \ ' '617' 'IPR006989' '\ Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of eukaryotic (metazoa) transcription factors PUBMED:9418898. This family consists of NAB conserved region 2, near the C terminus of the protein. It is necessary for transcriptional repression by the Nab proteins PUBMED:9418898. It is also required for transcription activation by Nab proteins at Nab-activated promoters PUBMED:10734128.\ ' '618' 'IPR004142' '\ This family consists of proteins from different gene families: Ndr1/RTP/Drg1, Ndr2, and Ndr3. Their similarity was previously noted PUBMED:10581191. The precise molecular and cellular function of members of this family is still unknown, yet they are known to be involved in cellular differentiation events. The Ndr1 group was the first to be discovered. Their expression is repressed by the proto-oncogenes N-myc and c-myc, and in line with this observation, Ndr1 protein expression is down-regulated in neoplastic cells, and is reactivated when differentiation is induced by chemicals such as retinoic acid. Ndr2 and Ndr3 expression is not under the control of N-myc or c-myc. Ndr1 expression is also activated by several chemicals: tunicamycin and homocysteine induce Ndr1 in human umbilical endothelial cells; nickel induces Ndr1 in several cell types. 
Members of this family are found in a wide variety of multicellular eukaryotes, including an Ndr1 type protein in Helianthus annuus (Common sunflower), known as Sf21. Interestingly, the highest scoring matches in the noise are all alpha/beta hydrolases, suggesting that this family may have an enzymatic function.\ ' '619' 'IPR013132' '\ NeuB is the prokaryotic N-acetylneuraminic acid synthase (Neu5Ac). It catalyses the direct formation of Neu5Ac (the most common sialic acid) by condensation of phosphoenolpyruvate (PEP) and N-acetylmannosamine (ManNAc). This reaction has only been observed in prokaryotes; eukaryotes synthesise the 9-phosphate form, Neu5Ac-9-P, and utilise ManNAc-6-P instead of ManNAc. Such eukaryotic enzymes\ are not present in this family PUBMED:10873658. This family also contains SpsE spore coat polysaccharide biosynthesis proteins.\ ' '620' 'IPR006202' '\

    Neurotransmitter ligand-gated ion channels are transmembrane receptor-ion channel complexes that open transiently upon binding of specific ligands, allowing rapid transmission of signals at chemical synapses PUBMED:1721053, PUBMED:1846404. Five of these ion channel receptor families have been shown to form a sequence-related superfamily:

    \

    \

    These receptors possess a pentameric structure (made up of varying subunits), surrounding a central pore. All known sequences of subunits from neurotransmitter-gated ion-channels are structurally related. They are composed of a large extracellular glycosylated N-terminal ligand-binding domain, followed by three hydrophobic transmembrane regions which form the ionic channel, followed by an intracellular region of variable length. A fourth hydrophobic region is found at the C-terminal of the sequence PUBMED:1721053, PUBMED:1846404.

    \ \

    This entry presents the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure.

    \ ' '621' 'IPR001258' '\

    The NHL repeat, named after NCL-1, HT2A and\ Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the \ copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides PUBMED:1894599.\ \ In many proteins it occurs in tandem arrays, for example in the RING finger, B-box, coiled-coil (RBCC) eukaryotic growth regulators PUBMED:9868369. The \'Brain Tumor\' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller PUBMED:14561773, PUBMED:11336677.

    \ \

    The NHL repeats are also found in serine/threonine protein kinases (STPKs) in a diverse range of pathogenic bacteria. These STPKs are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK PknD from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.

    \ ' '622' 'IPR004274' '\ The function of this domain is unclear. It is found in proteins of diverse function, including phosphatases, some of which may be active in ternary elongation complexes, and a number of NLI-interacting factors. In the phosphatases this domain is often present N-terminal to the BRCT domain.\ ' '623' 'IPR007007' '\ Ninjurin (nerve injury-induced protein) is involved in nerve regeneration and in the formation of some tissues PUBMED:8780658.\ ' '624' 'IPR003765' '\ The nitrate-reducing system, nitrate reductase, is stimulated by anaerobiosis, nitrate, and nitrite. The delta subunit is not part of the nitrate reductase enzyme but is most likely needed for assembly of the multisubunit enzyme complex. In the absence of the delta\ subunit the core alpha beta enzyme complex is unstable PUBMED:9738886. The delta subunit is essential for enzyme activity in vivo\ and in vitro.\ ' '625' 'IPR000064' '\

    The Escherichia coli NLPC/Listeria P60 domain occurs at the C terminus of a number of different bacterial and viral proteins. The viral proteins are either described as tail assembly proteins or Gp19. In bacteria, the proteins are variously described as being putative tail component of prophage, invasin, invasion associated protein, putative lipoprotein, cell wall hydrolase, or putative endopeptidase.

    \ \

    The E. coli NLPC/Listeria P60 domain is contained within the boundaries of the cysteine peptidase domain that defines the MEROPS peptidase family C40 (clan C-). Type examples are dipeptidyl-peptidase VI from Bacillus sphaericus and gamma-glutamyl-diamino acid-endopeptidase precursor from Lactococcus lactis. This group also contains proteins classified as non-peptidase homologues in that they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases in the C40 family.\

    \ ' '626' 'IPR007064' '\ The NMD3 protein is involved in nonsense-mediated mRNA decay. This N-terminal region contains four conserved CXXC motifs that could be metal binding. Export of the 60S ribosomal subunit is mediated by the adapter protein Nmd3p in a Crm1p-dependent pathway PUBMED:10022925.\ ' '627' 'IPR000940' '\

    Methyl transfer from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalysed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol, for regulatory purposes. The various aspects of the role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes, including gene regulation and differentiation, are well documented.

    \ \

    Three classes of DNA Mtases transfer the methyl group from AdoMet to the target base to form either N-6-methyladenine, or N-4-methylcytosine, or C-5-methylcytosine. In C-5-cytosine Mtases, ten conserved motifs are arranged in the same order PUBMED:8127644. Motif I (a glycine-rich or closely related consensus sequence; FAGxGG in M.HhaI PUBMED:8343957), shared by other AdoMet-Mtases PUBMED:2684970, is part of the cofactor binding site and motif IV (PCQ) is part of the catalytic site. In contrast, sequence comparison among N-6-adenine and N-4-cytosine Mtases identified two conserved segments PUBMED:2690010, although more conserved segments may be present. One of them corresponds to motif I in C-5-cytosine Mtases, and the other is named (D/N/S)PP(Y/F). Crystal structures are known for a number of Mtases PUBMED:7607476, PUBMED:8343957, PUBMED:8127644, PUBMED:7971991. The cofactor binding sites are almost identical and the essential catalytic amino acids coincide. The comparable protein folding and the existence of equivalent amino acids in similar secondary and tertiary positions indicate that many (if not all) AdoMet-Mtases have a common catalytic domain structure. This permits tertiary structure prediction of other DNA, RNA, protein, and small-molecule AdoMet-Mtases from their amino acid sequences PUBMED:7897657.

    \ \

    Several cytoplasmic vertebrate methyltransferases are evolutionarily related PUBMED:8182091, including\ nicotinamide N-methyltransferase (NNMT); phenylethanolamine N-methyltransferase \ (PNMT); and thioether S-methyltransferase \ (TEMT). NNMT catalyzes the \ N-methylation of nicotinamide and other pyridines to form pyridinium ions. This activity is important \ for the biotransformation of many drugs and xenobiotic compounds. PNMT catalyzes the last step in \ catecholamine biosynthesis, the conversion of noradrenalin to adrenalin; and TEMT catalyzes the\ methylation of dimethyl sulphide into trimethylsulphonium. These three enzymes use S-adenosyl-L-methionine \ as the methyl donor. They are proteins of 30 to 32 kDa.

    \ ' '628' 'IPR007276' '\ Emg1 and Nop14 are novel proteins whose interaction is required for the maturation of the 18S rRNA and for 40S ribosome production PUBMED:11694595.\ ' '629' 'IPR007282' '\ NOT1, NOT2, NOT3, NOT4 and NOT5 form a nuclear complex that negatively regulates the basal and activated transcription of many genes. This family includes NOT2, NOT3 and NOT5.\ ' '630' 'IPR007207' '\

    The Ccr4-Not complex (Not1, Not2, Not3, Not4 and Not5) is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID PUBMED:7926748. This domain is the N-terminal region of the Not proteins.

    \ ' '631' 'IPR004136' '\

    2-Nitropropane dioxygenase catalyses the oxidation of nitroalkanes into their corresponding carbonyl compounds and nitrite using either FAD or FMN as a cofactor PUBMED:15582992. This entry also includes fatty acid synthase subunit beta, which catalyses the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH. The beta subunit contains domains for: [acyl-carrier protein] acetyltransferase and malonyltransferase, S-acyl fatty acid synthase thioesterase, enoyl-[acyl-carrier protein] reductase, and 3-hydroxypalmitoyl-[acyl-carrier protein] dehydratase.

    \ ' '632' 'IPR007717' '\ The HRD4 gene is identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterised step in ER-associated degradation following ubiquitination of target proteins but preceding their recognition by the 26S proteasome PUBMED:11739805. Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing.\ ' '633' 'IPR002934' '\

    A small region that overlaps with a nuclear localization signal and binds to the RNA primer contains three aspartates that are essential for catalysis. Sequence and secondary structure comparisons of regions surrounding these aspartates with sequences of other polymerases revealed a significant homology to the palm structure of DNA polymerase beta, terminal deoxynucleotidyltransferase and DNA polymerase IV of Saccharomyces cerevisiae, all members of the family X of polymerases. This homology extends as far as cca: tRNA nucleotidyltransferase and streptomycin adenylyltransferase, an antibiotic resistance factor PUBMED:7482698, PUBMED:8665867.

    \

    \ Proteins containing this domain include kanamycin nucleotidyltransferase (KNTase) which is a plasmid-coded enzyme responsible for some types of bacterial resistance to aminoglycosides. KNTase inactivates antibiotics by catalysing the addition of a nucleotidyl group onto the drug. In experiments, Mn2+ strongly stimulated this reaction due to a 50-fold lower Ki for 8-azido-ATP in the presence of Mn2+. Mutations of the highly conserved\ Asp residues 113, 115, and 167, critical for metal binding in the catalytic domain of bovine poly(A) polymerase, led to a strong\ reduction of cross-linking efficiency, and Mn2+ no longer stimulated the reaction. Mutations in the region of the "helical turn motif"\ (a domain binding the triphosphate moiety of the nucleotide) and in the suspected nucleotide-binding helix of bovine poly(A) polymerase\ impaired ATP binding and catalysis. The results indicate that ATP is bound in part by the helical turn motif and in part by a region that\ may be a structural analogue of the fingers domain found in many polymerases.

    \ ' '634' 'IPR018933' '\

    The netrin (NTR) module is a domain of about 130 residues found in the C-terminal parts of netrins, complement proteins C3, C4, and C5, secreted frizzled-related proteins, and type I procollagen C-proteinase enhancer proteins (PCOLCEs), as well as in the N-terminal parts of tissue inhibitors of metalloproteinases (TIMPs). The proteins harboring the NTR domain fulfill diverse biological roles ranging from axon guidance and regulation of Wnt signalling to the control of the activity of metalloproteinases. The NTR domain can be found associated with other domains such as CUB, WAP, Kazal, Kunitz, Ig-like, laminin N-terminal, laminin-type EGF or frizzled. The NTR domain is implicated in inhibition of zinc metalloproteinases of the metzincin family PUBMED:10452607, PUBMED:11274388.

    \ \

    The NTR module is a basic domain containing six conserved cysteines, which are likely to form internal disulphide bonds, and several conserved blocks of hydrophobic residues (including a YLLLG-like motif). The NTR module consists of a beta-barrel with two terminal alpha-helices packed side by side against the face of the beta-barrel PUBMED:10452607.

    \

    This entry includes most netrin modules, but excludes those found in TIMPs.

    \ ' '635' 'IPR007271' '\

    This family of membrane proteins transports nucleotide sugars from the cytoplasm into Golgi vesicles. Individual members transport CMP-sialic acid, UDP-galactose or UDP-GlcNAc. This family has some but not complete overlap with the UDP-galactose transporter family.

    \ ' '637' 'IPR007230' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of autocatalytic serine endopeptidases belongs to MEROPS peptidase family S59 (clan SP).

    \

    The nuclear pore complex protein plays a role in bidirectional transport across the nucleoporin complex in nucleocytoplasmic transport. The mammalian nuclear pore complex (NPC) is comprised of approximately 50 unique proteins, collectively known as nucleoporins. A number of the peptides are synthesised as precursors and undergo self-catalyzed cleavage.

    \ \

    The proteolytic cleavage site of yeast Nup145p has been mapped upstream of an evolutionarily conserved serine residue. Cleavage occurs at the same site when a precursor is artificially expressed in Escherichia coli. A hydroxyl-containing residue is critical for the reaction, although a thiol-containing residue offers an acceptable replacement. In vitro kinetics experiments using a purified precursor molecule demonstrate that the cleavage is self-catalyzed and that the catalytic domain lies within the N-terminal moiety. Taken together, the data are consistent with a proteolytic mechanism involving an N>O acyl rearrangement and a subsequent ester intermediate uncovered in other self-processing proteins PUBMED:10542288.

    \ \

    Nup98 is a component of the nuclear pore that plays its primary role in the export of RNAs. Nup98 is expressed in two forms, derived from alternate mRNA splicing. Both forms are processed into two peptides through autoproteolysis mediated by the C-terminal domain of hNup98. The three-dimensional structure of the C-terminal domain reveals a novel protein fold, and thus a new class of autocatalytic proteases. The structure further reveals that the suggested nucleoporin RNA binding motif is unlikely to bind to RNA PUBMED:12191480.

    \ ' '638' 'IPR002259' '\

    Delayed-early response (DER) gene products include growth progression\ factors and several unknown products of novel cDNAs. Murine and human cDNAs\ from one novel DER gene (DER12) have been characterised to identify its\ product and to examine its role in the growth response PUBMED:7639753. Both sequences\ encode a hydrophobic 36kDa protein that is predicted to contain 8\ transmembrane (TM) domains. The protein has been localised to the nucleolus,\ where its concentration increases following mitogen stimulation PUBMED:7639753.

    \

    Although the function of the protein is unknown, its identification as a\ nucleolar gene transcriptionally activated by growth factors implicates it\ as participating in the proliferative response PUBMED:7639753. Sequence analysis\ reveals the protein to share a high degree of similarity with the C-terminal\ portion of equilibrative nucleoside transporters. These proteins are integral membrane proteins which enable the movement of hydrophilic nucleosides\ and nucleoside analogs down their concentration gradients across cell membranes. ENT family members have been identified in humans, mice, fish, tunicates, slime molds, and bacteria PUBMED:12446811.

    \ ' '639' 'IPR006964' '\

    This domain represents the C-terminal conserved region of NUDE proteins. Emericella nidulans (Aspergillus nidulans) NUDE acts in the cytoplasmic dynein/dynactin pathway and is required for distribution of nuclei PUBMED:11509576. It is a homologue of the nuclear distribution protein RO11 of Neurospora crassa. NUDE interacts with NUDF via an N-terminal coiled coil domain; this is the only domain which is absolutely required for NUDE function.

    \ ' '640' 'IPR000086' '\ MutT is a small bacterial protein (~12-15Kd) involved in the GO system PUBMED:1328155\ responsible for removing an oxidatively damaged form of guanine (8-hydroxy-\ guanine or 7,8-dihydro-8-oxoguanine) from DNA and the nucleotide pool.\ 8-oxo-dGTP is inserted opposite dA and dC residues of template DNA with near equal efficiency, leading to A.T to G.C transversions. MutT\ specifically degrades 8-oxo-dGTP to the monophosphate, with the concomitant\ release of pyrophosphate. A short conserved N-terminal region of mutT \ (designated the MutT domain) is also found in a variety of other\ prokaryotic, viral and eukaryotic proteins PUBMED:8233837, PUBMED:8170394, PUBMED:8226881, PUBMED:10373642.\ \

    The generic name \'NUDIX hydrolases\' (NUcleoside DIphosphate linked\ to some other moiety X) has been coined for this domain family PUBMED:8810257. The\ family can be divided into a number of subgroups, of which MutT anti-mutagenic activity represents only one type; most of the rest hydrolyse\ diverse nucleoside diphosphate derivatives (including ADP-ribose, GDP-mannose, TDP-glucose, NADH, UDP-sugars, dNTP and NTP).

    \ ' '641' 'IPR003654' '\

    This 14 amino acid motif has been identified within the C-terminal region of several Paired-like homeodomain (HD) containing proteins PUBMED:8944018, PUBMED:9466998. It was named OAR domain after the initials of otp, aristaless, and rax PUBMED:9096350. Although it has been proposed that this domain could be important for transactivation and be involved in protein-protein interactions or DNA binding PUBMED:9096350, PUBMED:9140395, its function is not yet known. Some proteins known to contain an OAR domain include human RIEG, defects in which are the cause of Rieger syndrome PUBMED:8944018; human OG12X and Mus musculus (Mouse) Og12x, whose function is not yet known PUBMED:9466998; vertebrate Rax, which plays a role in the proliferation and/or differentiation of retinal cells PUBMED:9096350; Drosophila DRX, which appears to be important in brain development PUBMED:9482887; and human SHOX, encoded by the short stature homeobox-containing gene. Defects in or lack of this protein are the cause of short stature associated with the Turner syndrome PUBMED:9140395.

    \ ' '642' 'IPR004156' '\

    This family consists of several eukaryotic Organic-Anion-Transporting Polypeptides (OATPs). Several have been identified mostly in human and rat. Different OATPs vary in tissue distribution and substrate specificity. Since the numbering of different OATPs in particular species was based originally on the order of discovery, similarly numbered OATPs in humans and rats did not necessarily correspond in function, tissue distribution and substrate specificity (in spite of the name, some OATPs also transport organic cations and neutral molecules) so a scheme of using digits for rat OATPs and letters for human ones was introduced PUBMED:10873595. Prostaglandin transporter (PGT) proteins are also considered to be OATP family members. In addition, the methotrexate transporter OATK is closely related to OATPs. This family also includes several predicted proteins from Caenorhabditis elegans and Drosophila melanogaster. This similarity was not previously noted. Note: Members of this family are described (in the UniProtKB/Swiss-Prot database) as belonging to the SLC21 family of transporters.

    \ ' '644' 'IPR007702' '\ This family is comprised of the Ocnus, Janus-A and Janus-B proteins. These proteins have been found to be testes specific in Drosophila melanogaster PUBMED:11319264.\ ' '645' 'IPR001754' '\

    Orotidine 5\'-phosphate decarboxylase (OMPdecase) PUBMED:2835631, PUBMED:1730672 catalyses the last step in the de novo biosynthesis of pyrimidines, the decarboxylation of OMP into UMP. In higher eukaryotes OMPdecase is part, with orotate phosphoribosyltransferase, of a bifunctional enzyme, while the prokaryotic and fungal OMPdecases are monofunctional proteins.

    \

    Some parts of the sequence of OMPdecase are well conserved across species. The best conserved region is located in the N-terminal half of OMPdecases and is centred around a lysine residue which is essential for the catalytic function of the enzyme.

    \ ' '647' 'IPR007220' '\

    All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in PUBMED:7867956. This entry is subunit 2, which binds the origin of replication. It plays a role in chromosome replication and mating type transcriptional silencing.

    \ ' '648' 'IPR005055' '\

    A class of small (14-20 kDa) water-soluble proteins, called odorant binding proteins (OBPs), first discovered in the insect sensillar lymph but also in the mucus of vertebrates, is postulated to mediate the solubilisation of hydrophobic odorant molecules, and thereby to facilitate their transport to the receptor neurons. The product of a gene expressed in the olfactory system of Drosophila melanogaster (Fruit fly), OS-D, shares features common to vertebrate\ odorant-binding proteins, but has a primary structure unlike odorant-binding proteins PUBMED:8206941. OS-D derivatives have subsequently been found in chemosensory organs of phylogenetically distinct insects, including cockroaches, phasmids and moths, suggesting that OS-D-like proteins are conserved in the insect phylum.

    \ ' '649' 'IPR006844' '\ The proteins in this family are a part of a complex of eight ER proteins that transfers core oligosaccharide from dolichol carrier to Asn-X-Ser/Thr motifs PUBMED:7622558. This family includes both OST3 and OST6, each of which contains four predicted transmembrane helices. Disruption of OST3 and OST6 leads to a defect in the assembly of the complex. Hence, the function of these genes seems to be essential for recruiting a fully active complex necessary for efficient N-glycosylation PUBMED:10358084.\ ' '650' 'IPR002128' '\ This domain represents a C-terminal extension of NADH-Ubiquinone/plastoquinone (complex I) chains (see ). Only NADH-Ubiquinone chain 5 from chloroplasts belongs to this family. Chain 5 is a component of complex I which catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane PUBMED:1470679.\

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '651' 'IPR002884' '\

    This domain, termed the P domain, is approximately 150 amino acids in length and lies C-terminal to a serine endopeptidase domain which belongs to MEROPS peptidase family S8 (clan SB), subfamily S8B (kexin). The domain is primarily associated with the calcium-dependent serine endopeptidases, kex2/subtilisin proprotein convertases (PCs), which have been identified in all eukaryotes PUBMED:9353231 and in the gammaproteobacteria, Nostoc (cyanobacteria) and in Streptomyces avermitilis.

    \ \

    The P domain appears necessary for folding and maintaining the endopeptidase catalytic domain and to regulate its calcium and acidic pH dependence. In addition, contained within the middle of the P domain in most PC family members is the cognate integrin binding RGD sequence PUBMED:10212221, which may be required for intracellular compartmentalization and maintenance of enzyme stability within the ER. The integrity of the RGD sequence of proprotein convertase PC1 is critical for its zymogen and C-terminal processing and for its cellular trafficking PUBMED:9307023, PUBMED:10212221. The carboxy-terminal tail provides uniqueness to each PC family member being the least conserved region of all convertases PUBMED:10842308.

    \ ' '652' 'IPR007204' '\

    The Arp2/3 complex is a seven-protein assembly that is critical for actin nucleation and branching in cells. Arp2/3 nucleates new actin filaments while bound to existing filaments, thus creating a branched network PUBMED:15040784. The complex consists of Arp2, Arp3, p41, p34, p21, p20 and p16. Subunits p34 and p20 constitute the core of the structure, with the remaining subunits located peripherally PUBMED:11741539. This entry describes the p21 subunit. Proteins such as WASp and Scar1 may mediate receptor signalling through interactions with p21-Arc, resulting in the activation of the Arp2/3 complex PUBMED:11162547.

    \ ' '653' 'IPR003429' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    The anti-apoptotic protein p35 from baculovirus is thought to prevent the suicidal response of\ infected insect cells by inhibiting caspases. Ectopic expression of p35 in a number of transgenic animals or cell lines is also anti-apoptotic, giving rise to the hypothesis that the protein is a general inhibitor of caspases.

    \ \

    This protein belongs to MEROPS proteinase inhibitor family I50, clan IQ. Purified recombinant p35 inhibits human caspase-1, -3, -6, -7, -8, and -10 but does not significantly inhibit unrelated serine or cysteine proteases, implying that p35 is a potent caspase-specific inhibitor. The interaction of p35 with caspase-3, as a model of the inhibitory mechanism, revealed classic slow-binding inhibition, with both active sites of the caspase-3 dimer acting equally and independently. Inhibition resulted from complex formation between the enzyme and inhibitor, which could be visualised under non-denaturing conditions, but was dissociated by SDS to give p35 cleaved at Asp87, the P1 residue of the inhibitor. Complex formation requires the substrate-binding cleft to be unoccupied PUBMED:9692966.

    \ \

    Infecting the insect cell line IPLB-Ld652Y with the baculovirus Autographa californica nuclear polyhedrosis virus (AcMNPV) results in global translation arrest, which correlates with the presence of the AcMNPV apoptotic suppressor, p35. However, the anti-apoptotic function of p35 in translation arrest is not solely due to caspase inactivation, but its activity enhances signalling to a separate translation arrest pathway, possibly by stimulating the late stages of the baculovirus infection cycle PUBMED:14980489.

    \ ' '654' 'IPR003137' '\ The PA (Protease associated) domain is found as an insert domain in diverse proteases, which include the MEROPS peptidase families A22B, M28, and S8A PUBMED:7674922. The PA domain is also found in a plant vacuolar sorting receptor and members of the RZF family, e.g. .\ ' '656' 'IPR001926' '\

    Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination PUBMED:8690703, PUBMED:7748903, PUBMED:15189147. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors PUBMED:17109392. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy PUBMED:16763894.

    \

    PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic PUBMED:15581583.

    \ \ \

    Pyridoxal-5\'-phosphate-dependent enzymes (B6 enzymes) catalyze manifold reactions in the metabolism of amino acids. Most of these enzymes can be assigned to one of three different families of homologous proteins, the alpha, beta and gamma families. The alpha and gamma family might be distantly related with one another, but are clearly not homologous with the beta family. The beta family includes L- and D-serine dehydratase, threonine dehydratase, the beta subunit of tryptophan synthase, threonine synthase and cysteine synthase. These enzymes catalyze beta-replacement or beta-elimination reactions PUBMED:8112347.

    \

    Comparison of sequences from eukaryotic, archebacterial, and eubacterial species indicates that the functional specialization of most B6 enzymes has occurred already in the universal ancestor cell. The cofactor pyridoxal-5-phosphate must have emerged very early in biological evolution; conceivably, organic cofactors and metal ions were the first biological catalysts PUBMED:10800595.

    \

    The 3D structure of the beta-subunit of tryptophan synthase has been solved. The subunit has two domains that are approximately the same size and similar to\ each other in folding pattern. Each has a core containing a four-stranded\ parallel beta-sheet with three helices on its inner side and one on the outer\ side. The cofactor is bound at the interface between the domains PUBMED:7748903.

    \ ' '657' 'IPR003014' '\

    It has been shown that the N-terminal N domains of members of the plasminogen/hepatocyte growth factor family, the apple domains of the plasma prekallikrein/coagulation factor XI family, and domains of various nematode proteins belong to the same module superfamily, the PAN module PUBMED:10561497. PAN contains a conserved core of three disulphide\ bridges. In some members of the family there is an additional\ fourth disulphide bridge that links the N and C termini of the\ domain.

    \

    PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions PUBMED:10561497. These domains contain a hairpin loop-like structure, similar to knottins, but the pattern of disulphide bonds differs.

    \ ' '658' 'IPR000326' '\

    This entry represents type 2 phosphatidic acid phosphatase (PAP2; ) enzymes, such as phosphatidylglycerophosphatase B from Escherichia coli. PAP2 enzymes have a core structure consisting of a 5-helical bundle, where the beginning of the third helix binds the cofactor PUBMED:10835340. PAP2 enzymes catalyse the dephosphorylation of phosphatidate, yielding diacylglycerol and inorganic phosphate PUBMED:17079146. In eukaryotic cells, PAP activity has a central role in the synthesis of phospholipids and triacylglycerol through its product diacylglycerol, and it also generates and/or degrades lipid-signalling molecules that are related to phosphatidate.

    \

    Other related enzymes have a similar core structure, including haloperoxidases such as bromoperoxidase (contains one core bundle, but forms a dimer), chloroperoxidases (contains two core bundles arranged as in other family dimers), bacitracin transport permease from Bacillus licheniformis, and glucose-6-phosphatase from rat. The vanadium-dependent haloperoxidases exclusively catalyse the oxidation of halides, and act as histidine phosphatases, using histidine for the nucleophilic attack in the first step of the reaction PUBMED:12447906. Amino acid residues involved in binding phosphate/vanadate are conserved between the two families, supporting a proposal that vanadium passes through a tetrahedral intermediate during the reaction mechanism.

    \ ' '659' 'IPR002058' '\

    These PAP/25A associated domains are found in uncharacterised eukaryotic proteins, a number of which are described as \'topoisomerase 1-related\' though they appear to have little or no homology to topoisomerase 1. The signatures that define this group of sequences often occur towards the C-terminus after the PAP/25A core domain .

    \ ' '660' 'IPR007012' '\

    In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. Poly(A) polymerase, the enzyme at the heart of the polyadenylation machinery, is a template-independent RNA polymerase which specifically incorporates ATP at the 3\' end of mRNA. The crystal structure of bovine poly(A) polymerase bound to an ATP analog at 2.5 A resolution has been determined PUBMED:10944102. The structure revealed expected and unexpected similarities to other proteins. As expected, the catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase.

    \ \

    The central domain of Poly(A) polymerase shares structural similarity with the allosteric activity domain of ribonucleotide reductase R1, which comprises a four-helix bundle and a three-stranded mixed beta-sheet. Even though the two enzymes bind ATP, the ATP-recognition motifs are different.

    \ ' '661' 'IPR007724' '\ Poly(ADP-ribose) glycohydrolase (PARG) is a ubiquitously expressed exo- and endoglycohydrolase which mediates oxidative and excitotoxic neuronal death PUBMED:11593040.\ ' '662' 'IPR002641' '\ This family consists of various patatin glycoproteins from the total soluble protein in potato tubers PUBMED:3371664. Patatin is a storage protein but it also has the enzymatic activity of lipid acyl hydrolase, catalysing the cleavage of fatty acids from membrane lipids PUBMED:3371664.\ ' '663' 'IPR003392' '\ The transmembrane protein, patched, is a receptor for the morphogen Sonic Hedgehog. In Drosophila melanogaster, this protein associates with the smoothened protein to transduce hedgehog signals, leading to the activation of wingless, decapentaplegic and patched itself. It participates in cell interactions that establish pattern within the segment and imaginal disks during development. The mouse homologue may play a role in epidermal development. The human Niemann-Pick C1 protein, defects in which cause Niemann-Pick type II disease, is also a member of this family. This protein is involved in the intracellular trafficking of cholesterol, and may play a role in vesicular trafficking in glia, a process that may be crucial for maintaining the structural and functional integrity of nerve terminals.\ ' '664' 'IPR003100' '\

    This domain is named after the proteins Piwi, Argonaute and Zwille. It is also found in the CAF protein from Arabidopsis thaliana. The function of the domain is unknown, but it has been found in the middle region of a number of members of the Argonaute protein family, which also contain the Piwi domain () in their C-terminal region PUBMED:12906857. Several members of this family have been implicated in the\ development and maintenance of stem cells through the RNA-mediated gene-quelling mechanisms\ associated with the protein DICER.

    \ \ ' '665' 'IPR000270' '\ The Phox and Bem1p (PB1) domain is present in many eukaryotic cytoplasmic signalling proteins. The domain adopts a beta-grasp fold, similar to\ that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region\ of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain\ heterodimers, although not all PB1 domain pairs associate. \ ' '666' 'IPR000095' '\ The molecular bases of the versatile functions of Rho-like GTPases are still unknown.\ This entry represents small domains that bind Cdc42p- and/or Rho-like small GTPases,\ also known as the Cdc42/Rac interactive binding (CRIB) region. The CRIB\ region has been shown to inhibit transcriptional activation\ and cell transformation mediated by the Ras-Rac pathway PUBMED:9119069. In fission yeast pak1+ encodes a protein kinase that interacts\ with Cdc42p and is involved in the control of cell polarity\ and mating PUBMED:8846783.\ ' '667' 'IPR008914' '\

    The PEBP (PhosphatidylEthanolamine-Binding Protein) family is a highly conserved group of proteins that have been identified in numerous tissues in a wide variety of organisms, including bacteria, yeast, nematodes, plants, drosophila and mammals. The various functions described for members of this family include lipid binding, neuronal development PUBMED:12492898, serine protease inhibition PUBMED:11034991, the control of the morphological switch between shoot growth and flower structures PUBMED:10764580, and the regulation of several signalling pathways such as the MAP kinase pathway PUBMED:12551925, and the NF-kappaB pathway PUBMED:11585904. The control of the latter two pathways involves the PEBP protein RKIP, which interacts with MEK and Raf-1 to inhibit the MAP kinase pathway, and with TAK1, NIK, IKKalpha and IKKbeta to inhibit the NF-kappaB pathway. Other PEBP-like proteins that show strong structural homology to PEBP include Escherichia coli YBHB and YBCL, the Rattus norvegicus (Rat) neuropeptide HCNP, and Antirrhinum majus (Garden snapdragon) protein centroradialis (CEN).

    \

    Structures have been determined for several members of the PEBP-like family, all of which show extensive fold conservation. The structure consists of a large central beta-sheet flanked by a smaller beta-sheet on one side, and an alpha helix on the other. Sequence alignments show two conserved central regions, CR1 and CR2, that form a consensus signature for the PEBP family. These two regions form part of the ligand-binding site, which can accommodate various anionic groups. The N- and C-terminal regions are the least conserved, and may be involved in interactions with different protein partners. The N-terminal residues 2-12 form the natural cleavage peptide HCNP involved in neuronal development. The C-terminal region is deleted in plant and bacterial PEBP homologues, and may help control accessibility to the active site.

    \ ' '668' 'IPR000717' '\ A homology domain of unclear function occurs in the C-terminal region of several\ regulatory components of the 26S proteasome as well as in other proteins. This domain\ has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15) PUBMED:9644972.\ Apparently, all of the characterised proteins containing PCI domains are parts of larger\ multi-protein complexes. Proteins with PCI domains include budding yeast proteasome\ regulatory components Rpn3(Sun2), Rpn5, Rpn6, Rpn7 and Rpn9 PUBMED:9584156; mammalian proteasome regulatory components p55, p58 and p44.5, and translation\ initiation factor 3 complex subunits p110 and INT6 PUBMED:8995409, PUBMED:9341143; Arabidopsis\ COP9 and FUS6/COP11 PUBMED:8689678; mammalian G-protein pathway suppressor GPS1, and several\ uncharacterised ORFs from plant, nematodes and mammals. The complete homology domain comprises\ approx. 200 residues; the highest conservation is found in the C-terminal half. Several of the\ proteins mentioned above have no detectable homology to the N-terminal half of the domain.\ ' '669' 'IPR005139' '\

    This domain is found in peptide chain release factors. Peptide chain release factors are important for protein synthesis since they direct the termination of translation in response to the peptide chain termination codons UAG and UAA. Bacteria contain RF1 and Eukaryotes contain RF2. These are structurally distinct but both contain the PCRF domain PUBMED:11779511.

    \ ' '670' 'IPR003099' '\

    Members of this family are prephenate dehydrogenases involved in tyrosine biosynthesis.

    \ ' '671' 'IPR005255' '\

    Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination PUBMED:8690703, PUBMED:7748903, PUBMED:15189147. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors PUBMED:17109392. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy PUBMED:16763894.

    \

    PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic PUBMED:15581583.

    \ \ \

    In Escherichia coli, the pdx genes involved in vitamin B6 biosynthesis have been characterised PUBMED:10225425, PUBMED:15242009, PUBMED:17344055. This entry represents 4-hydroxythreonine-4-phosphate dehydrogenase (PdxA, ). PdxA takes part in vitamin B6 biosynthesis, forming pyridoxine 5\'-phosphate from 4-(phosphohydroxy)-L-threonine and 1-deoxy-D-xylulose-5-phosphate.

    \ ' '672' 'IPR000084' '\ This family is named after a PE motif near to the amino\ terminus. The carboxyl termini of this family\ are variable and fall into several classes. The\ largest class of PE proteins is the highly repetitive\ PGRS class, which has a high glycine content.\ The function of these proteins is uncertain but it\ has been suggested that they may be related to\ antigenic variation of Mycobacterium tuberculosis PUBMED:9634230.\ ' '673' 'IPR000070' '\

    Pectinesterase (pectin methylesterase) catalyses the de-esterification of pectin into pectate and methanol. Pectin is one of the main components of the plant cell wall. In plants, pectinesterase plays an important role in cell wall metabolism during fruit ripening. In plant bacterial pathogens such as Erwinia carotovora and in fungal pathogens such as Aspergillus niger, pectinesterase is involved in maceration and soft-rotting of plant tissue. Plant pectinesterases are regulated by pectinesterase inhibitors, which are ineffective against microbial enzymes PUBMED:15722470.

    \

    Prokaryotic and eukaryotic pectinesterases share a few regions of sequence similarity. The crystal structure of pectinesterase from Erwinia chrysanthemi revealed a beta-helix structure similar to that found in pectinolytic enzymes, though it is different from most structures of esterases PUBMED:11162105. The putative catalytic residues are in a similar location to those of the active site and substrate-binding cleft of pectate lyase.

    \ ' '674' 'IPR006800' '\ Pellino is involved in Toll-like signalling pathways, and associates with the kinase domain of the Pelle Ser/Thr kinase PUBMED:10858658, PUBMED:11132151, PUBMED:10330490.\ ' '675' 'IPR003477' '\

    PemK is a growth inhibitor in Escherichia coli known to bind to the promoter region of the Pem operon, auto-regulating synthesis. It is responsible for mediating cell death through inhibiting protein synthesis through the cleavage of single-stranded RNA. PemK is part of the PemK-PemI system, where PemI is an antitoxin that inhibits the action of the PemK toxin PUBMED:15024022. PemK homologues have been found in a wide range of bacteria, which together form an endonuclease family that interfere with mRNA function. This family consists of the PemK protein in addition to ChpA, ChpB, Kid and MazF.

    \ \ \ ' '676' 'IPR002692' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The penicillin amidases or penicillin acylases are serine peptidases belonging to the MEROPS peptidase family S45 (clan PB(S)). The protein fold of the peptidase domain for members of this family resembles that of archaean proteasome subunit B, the type example of clan PB.

    \ \

    Penicillin amidase or penicillin acylase catalyses the hydrolysis of benzylpenicillin to phenylacetic acid and 6-aminopenicillanic acid (6-APA), a key intermediate in the synthesis of penicillins PUBMED:9292993.

    \ ' '677' 'IPR002989' '\ These repeats are found in many mycobacterial proteins. The repeats\ are most common in the PPE family of proteins , where they\ are found in the MPTR subfamily. The function of\ these repeats is unknown. The repeat can be approximately described as\ XNXGX, where X can be any amino acid. These repeats are similar to\ A(D/N)LXX repeats PUBMED:9655353, however it is not clear if these two families are\ structurally related.\ ' '678' 'IPR008279' '\ A number of enzymes that catalyze the transfer of a phosphoryl group from\ phosphoenolpyruvate (PEP) via a phospho-histidine intermediate have been shown\ to be structurally related PUBMED:7686067, PUBMED:8973315, PUBMED:2176881, PUBMED:1557039. All these enzymes share the same catalytic mechanism: they bind PEP and\ transfer the phosphoryl group from it to a histidine residue. This domain is a "swivelling" beta/beta/alpha domain which is thought to be mobile in all\ proteins known to contain it PUBMED:12083528. It is often found associated with the pyruvate phosphate dikinase, PEP/pyruvate-binding domain () at its N-terminus.\ ' '679' 'IPR007810' '\ This region is found in a number of proteins identified as being involved in Golgi function and vacuolar sorting. The molecular function of this region is unknown. Proteins containing this domain also contain a C-terminal ring finger domain.\ ' '680' 'IPR004134' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of proteins belongs to MEROPS peptidase family C1, sub-family C1B (bleomycin hydrolase, clan CA). This family contains prokaryotic and eukaryotic aminopeptidases and bleomycin hydrolases.

    \ ' '681' 'IPR000668' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologues. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues.

    \ \

    The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity PUBMED:7845226. Members of the papain family are widespread, found in baculovirus PUBMED:8439290, eubacteria, yeast, and practically all protozoa, plants and mammals PUBMED:7845226. The proteins are typically\ lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals PUBMED:3117099. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate PUBMED:12188906.\

    \ \

    The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the \'oxyanion hole\', and Asn-175, which orientates the imidazole ring of His-159.

    \ \ ' '682' 'IPR001096' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to the MEROPS peptidase family C13 (legumain family, clan CD). A type example is legumain from Canavalia ensiformis (Jack bean, Horse bean). The blood fluke parasite Schistosoma mansoni has two cysteine proteases in its digestive tract, one a cathepsin B-like protease, the other termed hemoglobinase PUBMED:7845226, PUBMED:3305515. The latter has been hard to purify free of cathepsin\ B, and forms expressed in Escherichia coli prove to be inactive, suggesting that hemoglobinase may act in association with cathepsin B PUBMED:7845226, PUBMED:8457210. Plant vacuolar processing enzyme and legumain from legumes PUBMED:7845226 have been shown to have\ sequence and functional similarity to hemoglobinase. The catalytic residues\ of the family are currently unknown, but sequence alignments reveal one\ totally conserved cysteine and two totally conserved histidines.

    \ ' '683' 'IPR001730' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) PUBMED:7845226, PUBMED:.

    \ \

    Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process PUBMED:7845226.

    \

    This peptidase is present in the nuclear inclusion protein of potyviruses.

    \ ' '684' 'IPR003653' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of proteins contains cysteine peptidases belonging to MEROPS peptidase family C48 (Ulp1 endopeptidase family, clan CE). The protein fold of the peptidase domain for members of this family resembles that of adenain, the type example for clan CE. This group of sequences also contains a number of hypothetical proteins, which have not yet been characterised, and non-peptidase homologues. These are proteins that have either been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of the peptidases in the family.

    \ \

    The Ulp1 endopeptidase family contains the deubiquitinating enzymes (DUB) that can de-conjugate ubiquitin or ubiquitin-like proteins from ubiquitin-conjugated proteins. They can be classified into 3 families according to sequence homology PUBMED:10603300, PUBMED:8982460: Ubiquitin carboxyl-terminal hydrolase (UCH) (see ), Ubiquitin-specific processing protease (UBP) (see ), and ubiquitin-like protease (ULP) specific for de-conjugating ubiquitin-like proteins. In contrast to the UBP pathway, which is very redundant (16 UBP enzymes in yeast), there are few ubiquitin-like proteases (only one in yeast, Ulp1).

    \

    Ulp1 catalyses two critical functions in the SUMO/Smt3 pathway via its\ cysteine protease activity. Ulp1 processes the Smt3 C-terminal sequence\ (-GGATY) to its mature form (-GG), and it de-conjugates Smt3 from the lysine\ epsilon-amino group of the target protein PUBMED:10094048.

    \

    The crystal structure of yeast Ulp1 bound to Smt3 PUBMED:10882122 revealed that the catalytic and interaction interface is situated in a shallow and narrow cleft where conserved residues recognise the Gly-Gly motif at the C-terminal extremity of the Smt3 protein. Ulp1 adopts a novel architecture despite some structural similarity with other cysteine proteases. The secondary structure is composed of seven alpha helices and seven beta strands. The catalytic domain includes the central alpha helix, beta-strands 4 to 6, and the catalytic triad (Cys-His-Asp). This profile is directed against the C-terminal part of ULP proteins that displays full proteolytic activity PUBMED:10882122.

    \ ' '685' 'IPR005078' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases constitutes MEROPS peptidase family C54 (Aut2 peptidase family, clan CA); the function of these proteins is unknown.

    \ ' '686' 'IPR014782' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M1 (clan MA(E)), the type example being aminopeptidase N from Homo sapiens (Human). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA.

    \ \ \ Membrane alanine aminopeptidase ()\ is part of the HEXXH+E\ group; it consists entirely of aminopeptidases, spread across a wide\ variety of species PUBMED:7674922. Functional studies show that CD13/APN catalyzes the removal of single amino acids from the amino terminus of small peptides and probably plays a role in their final digestion; one family member (leukotriene-A4 hydrolase) is known to hydrolyse the epoxide leukotriene-A4\ to form an inflammatory mediator PUBMED:7674922. This hydrolase has been shown to\ have aminopeptidase activity PUBMED:2244921, and the zinc ligands of the M1 family\ were identified by site-directed mutagenesis on this enzyme PUBMED:7674922. CD13 participates in trimming peptides bound to MHC class II molecules PUBMED:8691132 and cleaves MIP-1 chemokine, which alters target cell specificity from basophils to eosinophils PUBMED:8627182. CD13 acts as a receptor for specific strains of RNA viruses (coronaviruses) which cause a relatively large percentage of upper respiratory\ tract infections.

    \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigen nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).\ \

    \ ' '687' 'IPR018497' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M13 (neprilysin family, clan MA(E)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH PUBMED:7674922.

    \ \

    M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk PUBMED:11223883, PUBMED:7674922. The family includes eukaryotic and prokaryotic oligopeptidases, as well as some of the proteins responsible for the molecular basis of the blood group antigens e.g. Kell PUBMED:7674922.

    \ \

    Neprilysin () is another member of this group; it is variously known as common acute lymphoblastic leukemia antigen (CALLA), enkephalinase (gp100) and neutral endopeptidase metalloendopeptidase (NEP). It is a plasma membrane-bound mammalian enzyme that is able to digest biologically-active peptides, including enkephalins PUBMED:7674922. The zinc ligands of neprilysin are known and are analogous to those in thermolysin, a related peptidase PUBMED:7674922, PUBMED:8099556. Neprilysins, like thermolysin, are inhibited by phosphoramidon, which appears to selectively inhibit this family in mammals. The enzymes are all oligopeptidases, digesting oligo- and polypeptides, but not proteins PUBMED:7674922. Neprilysin consists of a short cytoplasmic domain, a membrane-spanning region and a large extracellular domain. The cytoplasmic domain contains a conformationally-restrained octapeptide, which is thought to act as a stop transfer sequence that prevents proteolysis and secretion PUBMED:7674922, PUBMED:3555489.

    \ \ \ ' '688' 'IPR007863' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    These metallopeptidases belong to MEROPS peptidase family M16 (clan ME). They include proteins classified as non-peptidase homologues, which either have been found experimentally to be without peptidase activity or lack amino acid residues that are believed to be essential for the catalytic activity.

    \ \

    The peptidases in this group of sequences include:

    \ \ \

    These proteins do not share many regions of sequence similarity; the most noticeable is in the N-terminal section. This region includes a conserved histidine followed, two residues later, by a glutamate and another histidine. In pitrilysin, it has been shown PUBMED:7990931 that this H-x-x-E-H motif is involved in enzymatic activity; the two histidines bind zinc and the glutamate is necessary for catalytic activity. The mitochondrial processing peptidase consists of two structurally related domains. One is the active peptidase whereas the other, the C-terminal region, is inactive. The two domains hold the substrate like a clamp PUBMED:11470436.

    \ \ ' '689' 'IPR008283' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine).

    \ \

    Aminopeptidases are exopeptidases involved in the processing and regular\ turnover of intracellular proteins, although their precise role in cellular\ metabolism is unclear PUBMED:1555602, PUBMED:2395881. Leucine aminopeptidases cleave leucine residues from the N-terminal of polypeptide chains, but substantial rates are evident for all amino acids PUBMED:2395881.

    \ \

    The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of\ one another PUBMED:2395881. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape PUBMED:2395881. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices PUBMED:2395881. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core PUBMED:2395881. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer PUBMED:2395881. The two zinc ions and the active site are entirely located in the C-terminal catalytic domain PUBMED:2395881.

    \ ' '690' 'IPR001948' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M18 (clan MH). The proteins have two catalytic zinc ions at the active site, bound by His/Asp, Asp, Glu, Asp/Glu and His. The catalysed reaction involves the release of an N-terminal amino acid, usually neutral or hydrophobic, from a polypeptide PUBMED:7674922.

    \ \

    The type example is aminopeptidase I from Saccharomyces cerevisiae (Baker\'s yeast), the sequence of which has been deduced, and the mature protein shown to consist\ of 469 amino acids PUBMED:2651436. A 45-residue presequence contains both\ positively- and negatively-charged and hydrophobic residues, which could be arranged\ in an N-terminal amphiphilic alpha-helix PUBMED:2651436. The presequence differs from\ signal sequences that direct proteins across bacterial plasma membranes and\ endoplasmic reticulum or into mitochondria. It is unclear how this unique\ presequence targets aminopeptidase I to yeast vacuoles, and how this\ sorting utilises classical protein secretory pathways PUBMED:2651436.

    \ ' '691' 'IPR002933' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of proteins contains the metallopeptidases and non-peptidase homologues that belong to the MEROPS peptidase family M20 (clan MH) PUBMED:7674922. The peptidases of this clan have two catalytic zinc ions at the active site, bound by His/Asp, Asp, Glu, Asp/Glu and His. The catalysed reaction involves the release of an N-terminal amino acid, usually neutral or hydrophobic, from a polypeptide PUBMED:7674922. The peptidase M20 family has four sub-families: \

    \ ' '692' 'IPR000905' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M22 (clan MK), the type example being O-sialoglycoprotein endopeptidase () from Pasteurella haemolytica (Mannheimia haemolytica).

    \ \

    O-Sialoglycoprotein endopeptidase is secreted by the bacterium P. haemolytica, and digests only proteins that are heavily sialylated, in particular those with sialylated serine and threonine residues PUBMED:7674959. Substrate proteins include glycophorin A and leukocyte surface antigens CD34, CD43, CD44 and CD45 PUBMED:7674922, PUBMED:7674959. Removal of glycosylation, by treatment with neuraminidase, completely negates susceptibility to O-sialoglycoprotein endopeptidase digestion PUBMED:7674922, PUBMED:7674959.

    \ \

    Sequence similarity searches have revealed other members of the M22 family,\ from yeast, Mycobacterium, Haemophilus influenzae and the cyanobacterium Synechocystis PUBMED:7674922. The zinc-binding and catalytic residues of this family have not been determined, although the motif HMEGH may be a zinc-binding region PUBMED:7674922.

    \ ' '693' 'IPR007484' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This domain is found in metallopeptidases belonging to the MEROPS peptidase family M28 (aminopeptidase Y, clan MH) PUBMED:7674922. They also contain a transferrin receptor-like dimerisation domain () and a protease-associated PA domain ().

    \ ' '694' 'IPR016047' '\ Members of this family are zinc metallopeptidases with a range of specificities. The peptidase family M23 is included in this family; these are Gly-Gly endopeptidases. Peptidase family M23 are also endopeptidases. This family also includes some bacterial lipoproteins such as Swiss:P33648 for which no proteolytic activity has been demonstrated. This family also includes leukocyte cell-derived chemotaxin 2 (LECT2) proteins. LECT2 is a liver-specific protein which is thought to be linked to hepatocyte growth although the exact function of this protein is unknown.\ ' '695' 'IPR000642' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M41 (FtsH endopeptidase family, clan MA(E)). The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH.

    \ \ \

    The peptidase M41 family belongs to a larger family of zinc metalloproteases. This family\ includes the cell division protein FtsH, and the yeast mitochondrial respiratory chain complexes\ assembly protein, which is a putative ATP-dependent protease required for assembly of the\ mitochondrial respiratory chain and ATPase complexes. FtsH is an integral membrane protein,\ which seems to act as an ATP-dependent zinc metallopeptidase that binds one zinc ion.

    \ ' '696' 'IPR008915' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This entry contains metallopeptidases belonging to MEROPS peptidase family M50 (S2P protease family, clan MM).

    \ \

    Members of the M50 metallopeptidase family include: mammalian sterol-regulatory element binding protein (SREBP) site 2 protease, Escherichia coli protease EcfE, stage IV sporulation protein FB and various hypothetical bacterial and eukaryotic homologues. A number of proteins are classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.

    \ ' '697' 'IPR004106' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This entry represents the beta-propeller domain found at the N-terminus of prolyl oligopeptidase, including acylamino-acid-releasing enzyme (also known as acylaminoacyl peptidase), which belong to the MEROPS peptidase family S9 (clan SC), subfamily S9A. The prolyl oligopeptidase family consists of a number of evolutionarily related peptidases whose catalytic activity seems to be provided by a charge relay system similar to that of the trypsin family of serine proteases, but which evolved by independent convergent evolution. The N-terminal domain of prolyl oligopeptidases forms an unusual 7-bladed beta-propeller consisting of seven 4-stranded beta-sheet motifs.

    \

    Prolyl oligopeptidase is a large cytosolic enzyme involved in the maturation and degradation of peptide hormones and neuropeptides, which relate to the induction of amnesia. The enzyme contains a peptidase domain, where its catalytic triad (Ser554, His680, Asp641) is covered by the central tunnel of the N-terminal beta-propeller domain. In this way, large structured peptides are excluded from the active site, thereby protecting larger peptides and proteins from proteolysis in the cytosol PUBMED:9695945. The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. Mammalian acylaminoacyl peptidase is an exopeptidase that is a member of the same prolyl oligopeptidase family of serine peptidases. This enzyme removes acylated amino acid residues from the N terminus of oligopeptides PUBMED:17350041.

    \ ' '698' 'IPR007217' '\ A member of this family has been implicated in protein processing in the endoplasmic reticulum PUBMED:10831844.\ ' '699' 'IPR006785' '\ This conserved region defines a group of peroxisomal membrane anchor proteins which bind the PTS1 (peroxisomal targeting signal) receptor and are required for the import of PTS1-containing proteins into peroxisomes. Loss of functional Pex14p results in defects in both the PTS1 and PTS2-dependent import pathways. Deletion analysis of this conserved region implicates it in selective peroxisome degradation. In the majority of members this region is situated at the N-terminus of the protein PUBMED:9094717, PUBMED:11564741.\ ' '700' 'IPR011611' '\

    This entry includes a variety of carbohydrate and pyrimidine kinases. The family includes phosphomethylpyrimidine kinase (). This enzyme is part of the thiamine pyrophosphate (TPP) synthesis pathway; TPP is an essential cofactor for many enzymes PUBMED:9519409.

    \ ' '701' 'IPR001849' '\

    The \'pleckstrin homology\' (PH) domain is a domain of about 100 residues that occurs in a wide range of proteins involved in intracellular signalling or as constituents of the cytoskeleton PUBMED:8500161, PUBMED:8497315, PUBMED:8236453, PUBMED:7985225, PUBMED:7531822, PUBMED:7890802, PUBMED:7583640.

    \

    The function of this domain is not clear; several putative functions have been suggested:

    \
  • binding to the beta/gamma subunit of heterotrimeric G proteins,
  • binding to lipids, e.g. phosphatidylinositol-4,5-bisphosphate,
  • binding to phosphorylated Ser/Thr residues,
  • attachment to membranes by an unknown mechanism.

    It is possible that different PH domains have totally different ligand requirements.

    \

    The 3D structure of several PH domains has been determined PUBMED:7634082. All known cases have a common structure consisting of two perpendicular anti-parallel beta sheets, followed by a C-terminal amphipathic helix. The loops connecting the beta-strands differ greatly in length, making the PH domain relatively difficult to detect. There are no totally invariant residues within the PH domain.

    \

    Proteins reported to contain one or more PH domains belong to the following families:

    \ \ ' '702' 'IPR006444' '\

    This family represents the major capsid protein component of the heads (capsids) of bacteriophage HK97, phi-105, P27, and related phage. This group represents one of several analogous families lacking detectable sequence similarity. The gene encoding this component is typically located in an operon encoding the small and large terminase subunits, the portal protein and the prohead or maturation protease.

    \ ' '703' 'IPR004107' '\

    Proteins containing this domain cleave DNA substrates by a series of staggered cuts, during which the protein becomes covalently linked to the DNA through a catalytic tyrosine residue at the carboxy end of the alignment PUBMED:9082984, PUBMED:9288963.

    \ \

    The phage integrase N-terminal SAM-like domain is almost always found with the signature that defines the phage integrase family (see ).

    \ ' '704' 'IPR006944' '\

    This is a family of bacteriophage and prophage portal proteins. Positioned at one of the twelve icosahedral vertices of the viral capsid is a dodecameric complex of the virus-encoded portal protein. This dodecameric complex, known as the portal or connector complex, forms the channel through which the viral DNA is packaged into the capsid, and through which it exits during infection. While the portal proteins from different phage show relatively little sequence homology and vary widely in molecular weight, portal complexes display significant morphological similarity as determined by electron microscopy. Morphologically, they present as disk-like structures approximately 150 Angstroms in diameter with radially arranged projections and a 30 Angstrom central channel. The packaging reaction is energy dependent and typically involves several components. ATP hydrolysis provides the driving force, and it is estimated that one ATP molecule is required for every base pair that is packaged. It appears that the portal motor may represent a new and extremely powerful class of motor which couples rotation to DNA translocation PUBMED:11839289.

    \ ' '705' 'IPR006448' '\

    This group of sequences describes the distinct family of phage (and integrated prophage) putative terminase small subunit sequences. Members tend to be encoded by the gene adjacent to the phage terminase large subunit gene.

    \ ' '706' 'IPR005021' '\ The majority of the members of this family are bacteriophage proteins, several of which are thought to be terminase large subunit proteins. There are also a number\ of bacterial proteins of unknown function.\ ' '707' 'IPR019787' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the PHD (homeodomain) zinc finger domain PUBMED:7701562,PUBMED:, which is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated transcriptional regulation. The PHD finger motif is reminiscent of, but distinct from the C3HC4 type RING finger.

    \

    The function of this domain is not yet known but in analogy with the LIM domain it could be involved in protein-protein interaction and be important for the assembly or activity of multicomponent complexes involved in transcriptional activation or repression. Alternatively, the interactions could be intra-molecular and be important in maintaining the structural integrity of the protein. In similarity to the RING finger and the LIM domain, the PHD finger is thought to bind two zinc ions.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '708' 'IPR001204' '\

    The PHO-4 family of transporters includes the phosphate-repressible phosphate permease (PHO-4) from Neurospora crassa which is probably a sodium-phosphate symporter PUBMED:7732001. This family also includes the human leukemia virus receptor.

    \ ' '709' 'IPR002591' '\

    This family consists of phosphodiesterases, including human\ plasma-cell membrane glycoprotein PC-1 / alkaline phosphodiesterase I\ / nucleotide pyrophosphatase (nppase). These enzymes catalyse the\ cleavage of phosphodiester and phosphosulphate bonds in NAD, \ deoxynucleotides and nucleotide sugars PUBMED:9344668. Another member of this family is\ ATX, an autotaxin, a tumor cell motility-stimulating protein which \ exhibits type I phosphodiesterase activity PUBMED:7982964.\ The alignment encompasses the active site PUBMED:7730366, PUBMED:7982964.\ Also present within this family is 60 kDa Ca2+-ATPase from \ Myroides odoratus PUBMED:8617788.

    This signature also hits a number of ethanolamine phosphate transferases involved in glycosylphosphatidylinositol-anchor biosynthesis.

    \ ' '710' 'IPR004013' '\ The PHP (Polymerase and Histidinol Phosphatase) domain is a putative phosphoesterase domain. This family is often associated with an N-terminal region .\ ' '712' 'IPR004228' '\

    Cryptophytes are unicellular photosynthetic algae that use a lumenally located light-harvesting system, which is distinct from the phycobilisome structure found in cyanobacteria and red algae. One of the key components of this system is water-soluble phycoerythrin (PE) 545 whose expression is enhanced by low light levels PUBMED:10430868. Phycoerythrin (PE) 545 is a heterodimer of alpha(1)alpha(2)betabeta subunits. Each alpha subunit carries a covalently linked 15,16-dihydrobiliverdin chromophore that probably acts as the final energy acceptor. The architecture of the heterodimer suggests that PE 545 may dock to an acceptor protein via a deep cleft and that energy may be transferred via this intermediary protein to the reaction centre PUBMED:10430868.

    \ ' '713' 'IPR007719' '\

    This entry represents plant phytochelatin synthases (also known as glutathione gamma-glutamylcysteinyltransferase; ), which are involved in the synthesis of phytochelatins (PC) and homophytochelatins (hPC), the heavy-metal-binding peptides of plants. This enzyme is required for detoxification of heavy metals such as cadmium and arsenate. The N-terminal region of phytochelatin synthase contains the active site, as well as four highly conserved cysteine residues that appear to play an important role in heavy-metal-induced phytochelatin catalysis. The C-terminal region is rich in cysteines, and may act as a metal sensor, whereby the Cys residues bind cadmium ions to bring them into closer proximity and transfer them to the activation site in the N-terminal catalytic domain PUBMED:18270423. The C-terminal region displays homology to the functional domains of metallothionein and metallochaperone.

    \ ' '714' 'IPR003719' '\

    Five genes, phzF, phzA, phzB, phzC and phzD, encode enzymes for phenazine biosynthesis in the biological control bacterium Pseudomonas chlororaphis (Pseudomonas aureofaciens). Protein PhzF is similar to 3-deoxy-D-arabino-heptulosonate-7-phosphate synthases of solanaceous plants. PhzC is responsible for the conversion of phenazine-1-carboxylic acid to 2-hydroxy-phenazine-1-carboxylic acid PUBMED:8586283.

    \ ' '715' 'IPR000403' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases PUBMED:15320712.

    \ \

    Phosphatidylinositol 3-kinase (PI3-kinase) () PUBMED:1322797 is an enzyme\ that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol\ ring. The three products of PI3-kinase - PI-3-P,\ PI-3,4-P(2) and PI-3,4,5-P(3) function as secondary messengers in cell signalling.\ Phosphatidylinositol 4-kinase (PI4-kinase) () PUBMED:8194527 is an enzyme\ that acts on phosphatidylinositol (PI) in the first committed step in the\ production of the secondary messenger inositol-1\'4\'5\'-trisphosphate. This domain is also present in a wide range of protein kinases, involved in diverse cellular functions, such as control of cell growth, regulation of cell cycle progression, a DNA damage checkpoint, recombination, and maintenance of telomere length. Despite significant homology to lipid kinases, no lipid kinase activity has been demonstrated for any of the PIK-related kinases PUBMED:12456783.

    The PI3- and PI4-kinases share a well conserved domain at their C-terminal\ section; this domain seems to be distantly related to the catalytic domain of\ protein kinases PUBMED:8387896, PUBMED:12151228. The catalytic domain of PI3K has the typical bilobal structure that is seen in other ATP-dependent\ kinases, with a small N-terminal lobe and a large C-terminal lobe. The core of this domain is the most conserved region of the PI3Ks.\ The ATP cofactor binds in the crevice formed by the N-and C-terminal lobes, a loop between two strands provides\ a hydrophobic pocket for binding of the adenine moiety, and a lysine residue interacts\ with the alpha-phosphate. In contrast to protein kinases, the PI3K loop which interacts with the\ phosphates of the ATP and is known as the glycine-rich or P-loop, contains no glycine residues.\ Instead, contact with the ATP -phosphate is maintained through the side chain of a conserved serine\ residue.

    \ \ ' '716' 'IPR002420' '\

    Phosphatidylinositol 3-kinase (PI3-kinase) () is an enzyme\ that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol\ ring. The usually N-terminal C2 domain interacts mainly with the scaffolding helical domain of the enzyme, and exhibits only minor\ interactions with the catalytic domain PUBMED:12151228. The domain consists of two four-stranded antiparallel beta-sheets\ that form a beta-sandwich. The isolated C2 domain binds multilamellar phospholipid\ vesicles, which suggests that this domain could play a role in membrane association. Membrane attachment by C2 domains is typically mediated by the loops connecting beta-strand regions\ that in other C2 domain-containing proteins are calcium-binding regions.

    \ ' '717' 'IPR006020' '\

    The PI domain has a similar structure to the insulin receptor substrate-1 \ PTB domain, a 7-stranded beta-sandwich, capped by a C-terminal helix.\ However, the PI domain contains an additional short N-terminal helix and a\ large insertion between strands 1 and 2, which forms a helix and 2 long\ connecting loops. The substrate peptide fits into a surface cleft formed\ from the C-terminal helix and strand 5 PUBMED:8599766.

    \ ' '718' 'IPR017852' '\

    This entry represents the C-terminal region of GPI ethanolamine phosphate transferase 1 enzymes, including the yeast enzyme MCD4 and the mammalian homologue PIG-N (also known as phosphatidylinositolglycan class N) PUBMED:10574991, PUBMED:10069808. These enzymes are multi-pass endoplasmic reticulum membrane proteins involved in glycosylphosphatidylinositol (GPI)-anchor biosynthesis. These enzymes transfer ethanolamine phosphate to the first alpha-1,4-linked mannose of the GPI precursor of the GPI-anchor. Ethanolamine phosphate on the alpha-1,4-linked mannose is essential for further mannosylation by GPI10 and is necessary for an efficient recognition of GPI lipids and GPI proteins by the GPI transamidase, for the efficient transport of GPI anchored proteins from endoplasmic reticulum to Golgi and for the physiological incorporation of ceramides into GPI anchors by lipid remodeling. MCD4 is also involved in non-mitochondrial ATP movements across the membrane and participates in Golgi and endoplasmic reticulum function, and is required for the incorporation of BGL2 into the cell wall.

    \ ' '719' 'IPR002716' '\

    The PilT protein, N-terminal domain (PIN) is a compact domain of about 100 amino acids. The domain has two nearly invariant aspartates and forms a coiled-coil with other monomer units to polymerise a pilus fibre PUBMED:10216854. The function of the PIN domain is unknown but a role in signalling appears likely given the presence of this domain in some bacterial plasmid stability proteins and Dis3 from yeast that is implicated in mitotic control PUBMED:8896453.

    \ ' '720' 'IPR006786' '\

    This conserved region is located adjacent and C-terminal to an N-terminal pinin/SDK domain . Members of this family have very varied localisations within the eukaryotic cell. Pinin is known to localise at the desmosomes and is implicated in anchoring intermediate filaments to the desmosomal plaque PUBMED:8922384. SDK2/3 is a dynamically localised nuclear protein thought to be involved in modulation of alternative pre-mRNA splicing PUBMED:9447706. MemA is a tumour marker preferentially expressed in human melanoma cell lines. A common feature of the members of this family is that they may all participate in regulating protein-protein interactions PUBMED:10095061.

    \ ' '721' 'IPR002498' '\ This entry represents a conserved region from the common kinase core found in the type I phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family as described in\ PUBMED:9535851. This region is found in I, II and III phosphatidylinositol-4-phosphate 5-kinases (PIP5K enzymes). \ PIP5K catalyses the formation of phosphoinositol-4,5-bisphosphate via the \ phosphorylation of phosphatidylinositol-4-phosphate, a precursor in the \ phosphoinositide signalling pathway.\ ' '722' 'IPR003165' '\

    This domain is found in the stem cell self-renewal protein Piwi and its relatives in Drosophila melanogaster PUBMED:9851978. It has been found in the C-terminal of a number of proteins which also contain the PAZ domain () in their central region, for example the Argonaute proteins. Several of these proteins have been implicated in the\ development and maintenance of stem cells through the RNA-mediated gene-quelling mechanisms\ associated with the protein DICER.

    \ ' '723' 'IPR000601' '\

    The PKD domain was first identified in the Polycystic kidney disease protein, polycystin-1 (PKD1 gene), and contains an Ig-like fold consisting of a beta-sandwich of seven strands in two sheets with a Greek key topology, although some members have additional strands PUBMED:9889186. Polycystin-1 is a large cell-surface glycoprotein involved in adhesive protein-protein and protein-carbohydrate interactions; however it is not clear if the PKD domain mediates any of these interactions.

    \

    PKD domains are also found in other proteins, usually in the extracellular parts of proteins involved in interactions with other proteins. For example, domains with a PKD-type fold are found in archaeal surface layer proteins that protect the cell from extreme environments PUBMED:12377130, and in the human VPS10 domain-containing receptor SorCS2 PUBMED:11499680.

    \ ' '724' 'IPR001024' '\

    Lipoxygenases () are a class of iron-containing dioxygenases which catalyse the hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. They are common in plants where they may be involved in a number of diverse aspects of plant physiology including growth and development, pest resistance, and senescence or responses to wounding PUBMED:. In mammals a number of lipoxygenase isozymes are involved in the metabolism of prostaglandins and leukotrienes PUBMED:3017195. Sequence data is available for the following lipoxygenases:

    \ \

    \ \

    The iron atom in lipoxygenases is bound by four ligands, three of which are\ histidine residues PUBMED:8502991. Six histidines are conserved in all lipoxygenase sequences; five of them are found clustered in a stretch of 40 amino acids. This region contains two of the three histidine iron ligands; the other histidines have been shown PUBMED:1567851 to be important for the activity of lipoxygenases.

    \

    This entry represents a domain found in lipoxygenases and other enzymes. Known as the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain, it is found in a variety of membrane- or lipid-associated proteins. Structurally, this domain forms a beta-sandwich composed of two sheets of four strands each PUBMED:10469604, PUBMED:11985859, PUBMED:11412104. The most highly conserved regions coincide with the beta-strands, with most of the highly conserved residues being buried within the protein. An exception to this is a lysine or arginine that occurs on the surface of the fifth beta-strand of the eukaryotic domains. In pancreatic lipase, the lysine in this position forms a salt bridge with the procolipase protein. The conservation of a charged surface residue may indicate the location of a conserved ligand-binding site. It is thought that this domain may mediate membrane attachment via other protein binding partners.

    \ ' '725' 'IPR001736' '\

    Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, and/or asparagine residues which may contribute to the active site aspartic acid. An Escherichia coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs PUBMED:8732763, PUBMED:8755242, PUBMED:8051126, PUBMED:9242915.

    \ ' '726' 'IPR002929' '\

    This family consists mainly of the Potato leafroll virus (PLrV) read through protein otherwise known as the minor capsid protein. This is generated via a readthrough of open reading frame 3, the coat protein, allowing translation of open reading frame 5 to give an extended coat protein with a large C-terminal addition or read through domain PUBMED:7513925.

    \ \

    The read through protein is essential for the circulative aphid transmission of PLrV PUBMED:7513925 and Beet western yellows virus PUBMED:7882968. The N-terminal region of the luteovirus readthrough domain determines virus binding to Buchnera GroEL and is essential for virus persistence in the aphid PUBMED:9311800.

    \ \ ' '727' 'IPR004343' '\

    The yeast Paf1 complex consists of Paf1, Rtf1, Cdc73, Ctr9, and Leo1. The complex regulates histone H2B ubiquitination, histone H3 methylation, RNA polymerase II carboxy-terminal domain (CTD) Ser2 phosphorylation, and RNA 3\' end processing. The conservation of Paf1 complex function in higher eukaryotes has been confirmed in human cells, Drosophila and Arabidopsis. The Plus3 domain spans the most conserved regions of the Rtf1 protein and is surrounded by regions of low complexity and coiled-coil propensity PUBMED:11014804. It contains only a limited number of highly conserved amino acids, among which are three positively charged residues that gave the Plus3 domain its name. The capacity to bind single-stranded DNA is at least one function of the Plus3 domain PUBMED:18184592.

    \ \

    The Plus3 domain is about 90 residues in length and is often found associated with the GYF domain (). The Plus3 domain structure consists of six alpha helices interrupted by a sequence of six beta strands in a mixed alpha/beta topology. Beta strands 1, 2, 5, and 6 compose a four-stranded antiparallel beta sheet with a beta-hairpin insertion formed by strands 3 and 4. The N-terminal helices alpha1-alpha3 and C-terminal helix alpha6 pack together to form an alpha subdomain, while the beta strands and the small 3(10) helix alpha4 form a beta subdomain. The two subdomains pack together to form a compact, globular protein PUBMED:18184592.

    \ ' '728' 'IPR006501' '\

    This entry represents a plant domain of about 200 amino acids, characterised by four conserved cysteine residues. This domain inhibits pectinesterase/pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex PUBMED:8521860. It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein PUBMED:8521860. This domain is also found at the N-termini of PMEs predicted from DNA sequences, suggesting that both PMEs and their inhibitors are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical in structure PUBMED:10880981.

    \ ' '729' 'IPR006588' '\

    The PAW domain of unknown function is found in peptide N glycanase (PNGase, ) and in a number of hypothetical proteins.

    \ ' '730' 'IPR015848' '\

    The PH (phosphorolytic) domain is responsible for 3\'-5\' exoribonuclease activity, although in some proteins this domain has lost its catalytic function. An active PH domain uses inorganic phosphate as a nucleophile, adding it across the phosphodiester bond between the end two nucleotides in order to release ribonucleoside 5\'-diphosphate (rNDP) from the 3\' end of the RNA substrate.

    \

    PH domains can be found in bacterial/organelle RNases and PNPases (polynucleotide phosphorylases) PUBMED:17084501, as well as in archaeal and eukaryotic RNA exosomes PUBMED:15951817, PUBMED:17174896, the latter acting as nano-compartments for the degradation or processing of RNA (including mRNA, rRNA, snRNA and snoRNA). Bacterial/organelle PNPases share a common barrel structure with RNA exosomes, consisting of a hexameric ring of PH domains that act as a degradation chamber, and an S1-domain/KH-domain containing cap that binds the RNA substrate (and sometimes accessory proteins) in order to regulate and restrict entry into the degradation chamber PUBMED:16285927. Unstructured RNA substrates feed in through the pore made by the S1 domains, are degraded by the PH domain ring, and exit as nucleotides via the PH pore at the opposite end of the barrel PUBMED:16713559, PUBMED:17380186.

    \ \

    This entry represents an RNA-binding phosphorolytic (PH) domain found in bacterial and organelle PNPases, but not in exosomes. It usually occurs in combination with PH domain 1 () and PH domain 2 (), both of which are found in PNPases and exosomes. The core structure of the RNA-binding PH domain consists of a DNA/RNA-binding 3-helical bundle.

    \

    More information about these proteins can be found at Protein of the Month: RNA Exosomes PUBMED:.

    \ ' '731' 'IPR007117' '\

    Expansins are unusual proteins that mediate cell wall extension in plants PUBMED:7568110. They are believed to act as a sort of chemical grease, allowing polymers to slide past one another by disrupting non-covalent hydrogen bonds that hold many wall polymers to one another. This process is not\ degradative and hence does not weaken the wall, which could otherwise rupture under internal pressure during growth.

    \

    Sequence comparisons indicate at least four distinct expansin cDNAs in rice and at least six in Arabidopsis. The proteins are highly conserved in\ size and sequence (75-95% amino acid sequence similarity between any pairwise comparison), and phylogenetic trees indicate that this multigene\ family formed before the evolutionary divergence of monocotyledons and dicotyledons PUBMED:7568110. Sequence and motif analyses show no similarities to known functional domains that might account for expansin action on wall extension. It is thought that several highly-conserved tryptophans may function in expansin binding to cellulose, or other glycans. The high conservation of the family indicates that the mechanism by which expansins promote wall extension tolerates little variation in protein structure.

    \

    Grass pollens, such as pollen from timothy grass, represent a major cause of type I allergy PUBMED:7930302. Interestingly, expansins share a high degree of\ sequence similarity with the Lol p I family of allergens. This entry represents the C-terminal domain.

    \ ' '732' 'IPR002509' '\ This domain is found in polysaccharide deacetylase. This family of\ polysaccharide deacetylases includes NodB (nodulation protein B from \ Rhizobium) which is a chitooligosaccharide deacetylase PUBMED:9163424.\ It also includes chitin deacetylase from yeast PUBMED:9133736,\ and endoxylanases which hydrolyse glycosidic bonds in xylan PUBMED:8170399.\ ' '733' 'IPR006916' '\

    The Popeye (POP) family of proteins is restricted to vertebrates and is preferentially expressed in developing and adult striated muscle. It is represented by a conserved region which includes three potential transmembrane domains PUBMED:10882522. The strong conservation of POP genes during evolution and their preferential expression in heart and skeletal muscle suggest that these novel proteins may have an important function in these tissues in vertebrates.

    \ ' '734' 'IPR019752' '\ This domain is found in prokaryotes. It includes a region of the large protein pyruvate-flavodoxin oxidoreductase and the whole pyruvate ferredoxin oxidoreductase gamma subunit protein. It is not known whether the\ gamma subunit has a catalytic or regulatory role. Pyruvate\ oxidoreductase (POR) catalyses the final step in the fermentation\ of carbohydrates in anaerobic microorganisms PUBMED:8550425. This involves the\ oxidative decarboxylation of pyruvate with the participation of\ thiamine followed by the transfer of an acetyl moiety to coenzyme\ A for the synthesis of acetyl-CoA PUBMED:8550425. The family also includes\ pyruvate flavodoxin oxidoreductase as encoded by the nifJ gene in\ cyanobacterium which is required for growth on molecular nitrogen\ when iron is limited PUBMED:8415612.\ ' '735' 'IPR002880' '\ This family includes the N-terminal region of the pyruvate ferredoxin oxidoreductase, corresponding to the first two structural domains. This region is involved in inter subunit contacts PUBMED:10048931. Pyruvate oxidoreductase (POR) catalyses the final step in the fermentation of carbohydrates in anaerobic microorganisms PUBMED:8550425. This involves the oxidative decarboxylation of pyruvate with the participation of thiamine followed by the transfer of an acetyl moiety to coenzyme A for the synthesis of acetyl-CoA PUBMED:8550425. The family also includes pyruvate flavodoxin oxidoreductase as encoded by the nifJ gene in cyanobacterium which is required for growth on molecular nitrogen when iron is limited PUBMED:8415612.\ ' '736' 'IPR007280' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \ This domain is normally found at the C terminus of secreted archaeal and bacterial peptidases, the majority of which belong to MEROPS peptidase families M4 (vibriolysin, ), M9A and M9B (microbial collagenase, ), M28 (aminopeptidase Ap1, ) and S8 (subtilisin family peptidases, ).\ ' '737' 'IPR002540' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The potyviridae are a family of positive strand RNA viruses, members of which include Zucchini yellow mosaic virus,\ and Turnip mosaic virus (strain Japanese) which cause considerable losses of crops worldwide.

    \ \

    This entry represents a C-terminal region from various plant\ potyvirus P1 proteins (found at the N terminus of the polyprotein).\ The C terminus of P1 is a serine peptidase belonging to MEROPS peptidase family S30 (clan PA(S)). It is the protease responsible for \ autocatalytic cleavage between P1 and the helper component protease, which is a cysteine peptidase belonging to MEROPS peptidase family C6 PUBMED:7844540, PUBMED:1529535. The P1 protein may be involved in virus-host interactions PUBMED:7844540.

    \ ' '738' 'IPR004971' '\ This is a family of viral mRNA capping enzymes. The enzyme catalyses the first two reactions in the mRNA cap formation pathway. It is a heterodimer consisting of a large and small subunit. This entry is the large subunit. \ ' '739' 'IPR014045' '\

    This domain is found in protein phosphatase 2C, as well as other proteins, e.g. [pyruvate dehydrogenase (lipoamide)]-phosphatase and adenylate cyclase .

    \

    Protein phosphatase 2C (PP2C) is one of the four major classes of mammalian\ serine/threonine specific protein phosphatases . PP2C PUBMED:1312947 is a\ monomeric enzyme of about 42 kDa which shows broad substrate specificity and\ is dependent on divalent cations (mainly manganese and magnesium) for its\ activity. Its exact physiological role is still unclear. Three isozymes are\ currently known in mammals: PP2C-alpha, -beta and -gamma. In yeast, there are\ at least four PP2C homologs: phosphatase PTC1 PUBMED:8395005 which has weak tyrosine\ phosphatase activity in addition to its activity on serines, phosphatases PTC2\ and PTC3, and hypothetical protein YBR125c. Isozymes of PP2C are also known\ from Arabidopsis thaliana (ABI1, PPH1), Caenorhabditis elegans (FEM-2,\ F42G9.1, T23F11.1), Leishmania chagasi and Paramecium tetraurelia.\ In A. thaliana, the kinase associated protein phosphatase (KAPP) PUBMED:7973632\ is an enzyme that dephosphorylates the Ser/Thr receptor-like kinase RLK5 and\ which contains a C-terminal PP2C domain.

    \

    PP2C does not seem to be evolutionarily related to the main family of serine/\ threonine phosphatases: PP1, PP2A and PP2B. However, it is significantly\ similar to the catalytic subunit of pyruvate dehydrogenase phosphatase\ (PDPC) PUBMED:8396421, which catalyzes dephosphorylation and concomitant\ reactivation of the alpha subunit of the E1 component of the pyruvate\ dehydrogenase complex. PDPC is a mitochondrial enzyme and, like PP2C, is\ magnesium-dependent.

    \ ' '740' 'IPR000030' '\ This mycobacterial family is named after a conserved amino-terminal region of about 180\ amino acids, the PPE motif. The carboxy termini of proteins belonging to the PPE family are variable, and on the basis of this region at least three groups can be distinguished. The MPTR subgroup is characterised by tandem copies of a motif NXGXGNXG. The second subgroup contains a conserved motif at about position 350.\ The third group shares only similarity in the amino terminal region.\ The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis PUBMED:9634230.\ ' '741' 'IPR002885' '\

    This entry represents the PPR repeat.

    \ \

    Pentatricopeptide repeat (PPR) proteins are characterised by tandem repeats of a degenerate 35 amino acid motif PUBMED:10664580. Most PPR proteins have roles in mitochondria or plastids PUBMED:15270678. PPR repeats were discovered while screening Arabidopsis proteins for those predicted to be targeted to mitochondria or chloroplasts PUBMED:10664580, PUBMED:15269332. Some of these proteins have been shown to play a role in post-transcriptional processes within organelles and they are thought to be sequence-specific RNA-binding proteins PUBMED:12782738, PUBMED:12832482, PUBMED:18031283. Plant genomes have between one hundred and five hundred PPR genes per genome, whereas non-plant genomes encode two to six PPR proteins.

    \ \

    Although no PPR structures are yet known, the motif is predicted to fold into a helix-turn-helix structure similar to those found in the tetratricopeptide repeat (TPR) family (see ) PUBMED:10664580.

    \ \

    The plant PPR protein family has been divided in two subfamilies on the basis of their motif content and organisation PUBMED:15269332, PUBMED:17560114.

    \ \

    Examples of PPR repeat-containing proteins include PET309 , which may be involved in RNA stabilisation PUBMED:7664742, and crp1, which is involved in RNA processing PUBMED:8039510. The repeat is associated with a predicted plant protein that has a domain organisation similar to the human BRCA1 protein.

    \ ' '742' 'IPR006603' '\

    This repeated motif of unknown function has been found between the transmembrane helices of cystinosin, yeast\ ERS1 and mannose-P-dolichol utilization defect\ 1. The positioning of this repeat suggests that it may be\ associated with the glycosylation machinery.

    \ ' '743' 'IPR007728' '\

    This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain.

    \ \

    Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception PUBMED:12123582, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain PUBMED:11691919, PUBMED:11893494. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities PUBMED:12540855.

    \ \

    The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils and stabilising the SET domain.

    \ \

    The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site PUBMED:12389037 when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site PUBMED:12887903. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity PUBMED:12372305.

    \ ' '744' 'IPR000836' '\

    The name PRT comes from phosphoribosyltransferase (PRTase) enzymes, which carry out phosphoryl transfer reactions on 5-phosphoribosyl-alpha1-pyrophosphate (PRPP), an activated form of ribose-5-phosphate. Members of the phosphoribosyltransferase (PRT) family are catalytic and regulatory proteins involved in nucleotide synthesis and salvage PUBMED:11751055. This includes a range of diverse phosphoribosyl transferase enzymes including adenine phosphoribosyltransferase (); hypoxanthine-guanine-xanthine phosphoribosyltransferase; \ hypoxanthine phosphoribosyltransferase (); ribose-phosphate pyrophosphokinase (); amidophosphoribosyltransferase (); orotate phosphoribosyltransferase (); uracil phosphoribosyltransferase (); and xanthine-guanine phosphoribosyltransferase \ ().

    \

    Not all PRT proteins are enzymes. For example, in some bacteria PRT proteins regulate the expression of purine and pyrimidine synthetic genes.

    \

    Members of PRT are defined by the protein fold and by a short 13-residue sequence motif. The motif consists of four hydrophobic amino acids, two acidic amino acids and seven amino acids of variable character, usually including glycine and threonine. The motif was predicted to be a PRPP-binding site in advance of structural information PUBMED:3009477, PUBMED:3527873. Apart from this motif, different PRT proteins have a low level of sequence identity, less than 15%. \ The PRT sequence motif is only found in PRTases from the nucleotide synthesis and salvage pathways. Other PRTases, from the tryptophan, histidine and nicotinamide synthetic and salvage pathways, lack the PRT sequence motif and appear to be unrelated to each other and unrelated to the PRT family.

    \ ' '745' 'IPR000817' '\

    Prion protein (PrP-c) PUBMED:2572197, PUBMED:1916104, PUBMED:2908696 is a small glycoprotein found in high \ quantity in the brain of animals infected with certain degenerative neurological diseases, such as \ sheep scrapie and bovine spongiform encephalopathy (BSE), and the human dementias Creutzfeldt-Jakob \ disease (CJD) and Gerstmann-Straussler syndrome (GSS). PrP-c is encoded in the host genome and is \ expressed both in normal and infected cells. During infection, however, the PrP-c molecule becomes \ altered (conformationally rather than at the amino acid level) to an abnormal isoform, PrP-sc. In detergent-treated brain extracts from infected individuals, fibrils\ composed of polymers of PrP-sc, namely scrapie-associated fibrils or prion rods, can be evidenced by electron microscopy. The precise function of the normal PrP isoform in healthy individuals remains unknown. Several results, mainly obtained in transgenic animals, indicate that PrP-c\ might play a role in long-term potentiation, in sleep physiology, in oxidative burst compensation (PrP can fix four Cu2+ through its octarepeat domain), in\ interactions with the extracellular matrix (PrP-c can bind to the precursor of the laminin receptor, LRP), in apoptosis and in signal transduction (costimulation of\ PrP-c induces a modulation of Fyn kinase phosphorylation) PUBMED:12354606.

    The normal isoform, PrP-c, is anchored at the cell membrane, in rafts, through a glycosyl phosphatidyl inositol (GPI); its half-life at the cell surface is 5 h, after which\ the protein is internalised through a caveolae-dependent mechanism and degraded in the endolysosome compartment. Conversion between PrP-c and PrP-sc\ likely occurs during the internalisation process.

    In humans, PrP is a 253 amino acid protein, which has a molecular weight of 35-36 kDa. It has two hexapeptides\ and repeated octapeptides at the N-terminus, a disulphide bond and is associated at the C-terminus with a GPI, which enables it to anchor to the external part of the\ cell membrane. The\ secondary structure of PrP-c is mainly composed of alpha-helices, whereas PrP-sc is mainly beta-sheets: transconformation of alpha-helices into beta-sheets has been\ proposed as the structural basis by which PrP acquires pathogenicity in TSEs. The three-dimensional structure shows the protein to be made of a globular domain which includes three alpha-helices and two small antiparallel beta-sheet\ structures, and a long flexible tail whose conformation depends on the biophysical parameters of the environment. Crystals of the globular domain of PrP\ have recently been obtained; their analysis suggests a possible dimerisation of the protein through the three-dimensional swapping of the C-terminal helix 3 and\ rearrangement of the disulphide bond.

    \ ' '746' 'IPR001353' '\

    The proteasome (or macropain) () PUBMED:7682410, PUBMED:2643381, PUBMED:1317508, PUBMED:7697118, PUBMED:8882582 is a eukaryotic and\ archaeal multicatalytic proteinase complex that seems to be involved in\ an ATP/ubiquitin-dependent nonlysosomal proteolytic pathway. In eukaryotes the\ proteasome is composed of about 28 distinct subunits which form a highly\ ordered ring-shaped structure (20S ring) of about 700 kDa.\ Most proteasome subunits can be classified, on the basis of sequence\ similarities, into two groups, alpha (A) and beta (B).

    \

    ATP-dependent protease complexes are present in all three kingdoms of life, where they rid the cell of misfolded or damaged proteins and control the level of certain regulatory proteins. They include the proteasome in Eukaryotes, Archaea, and Actinomycetales and the HslVU (ClpQY, clpXP) complex in other eubacteria. Genes homologous to eubacterial HslV (ClpQ) and HslU (ClpY, clpX) have also been demonstrated to be present in the genome of trypanosomatid protozoa PUBMED:12446803.

    \

    The prokaryotic ATP-dependent proteasome is coded for by the heat-shock locus VU (HslVU). It consists of HslV, the protease (MEROPS peptidase subfamily T1B), and HslU, , the ATPase and chaperone belonging to the AAA/Clp/Hsp100 family. The crystal structure of Thermotoga maritima HslV has been determined to 2.1-A resolution. The structure of the dodecameric enzyme is well conserved compared to those from Escherichia coli and Haemophilus influenzae PUBMED:12646382, PUBMED:12823960.

    \

    This entry contains threonine peptidases and non-peptidase homologs that belong to MEROPS peptidase family T1 (proteasome family, clan PB(T)). The family consists of the protease components of the archaeal and bacterial proteasomes and the alpha and beta subunits of the eukaryotic proteasome.

    \ ' '747' 'IPR005037' '\

    Members of this family are related to the pre mRNA splicing factor PRP38 from yeast PUBMED:1508195, therefore all the members of this family could be involved in splicing. This\ conserved region could be involved in RNA binding. The putative domain is about 180 amino acids in length. PRP38 is a unique component of the U4/U6.U5 tri-small\ nuclear ribonucleoprotein (snRNP) particle and is necessary for an essential step late in spliceosome maturation PUBMED:9582287.

    \ ' '748' 'IPR002683' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    In PSII, the oxygen-evolving complex (OEC) is responsible for catalysing the splitting of water to O(2) and 4H+. The OEC is composed of a cluster of manganese, calcium and chloride ions bound to extrinsic proteins. In cyanobacteria there are five extrinsic proteins in OEC (PsbO, PsbP-like, PsbQ-like, PsbU and PsbV), while in plants there are only three (PsbO, PsbP and PsbQ), PsbU and PsbV having been lost during the evolution of green plants PUBMED:15258264.

    \

    This family represents the PSII OEC protein PsbP. Both PsbP and PsbQ () are regulators that are necessary for the biogenesis of optically active PSII. PsbP increases the affinity of the water oxidation site for chloride ions and provides the conditions required for high affinity binding of calcium ions. The crystal structure of PsbP from Nicotiana tabacum (Common tobacco) revealed a two-domain structure, where domain 1 may play a role in the ion retention activity in PSII, the N-terminal residues being essential for calcium and chloride ion retention activity PUBMED:15031714. PsbP is encoded in the nuclear genome in plants.

    \ ' '749' 'IPR020097' '\

    Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are four distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an alpha+beta structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are PUBMED:10529181:

    \ \

    \ \

    This entry represents pseudouridine synthase I (TruA). TruA from Escherichia coli modifies positions uracil-38, U-39 and/or U-40 in tRNA PUBMED:10625422, PUBMED:17466622. TruA contains one atom of zinc essential for its native conformation and tRNA recognition and has a strictly conserved aspartic acid that is likely to be involved in catalysis PUBMED:9585540. These enzymes are dimeric proteins that contain two positively charged, RNA-binding clefts along their surface. Each cleft contains a highly conserved aspartic acid located at its centre. The structural domains have a topological similarity to those of other RNA-binding proteins, though the mode of interaction with tRNA appears to be unique.

    \ ' '750' 'IPR002165' '\ This is a cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in plexin () PUBMED:7605632. Two copies of the repeat are found in mahogany protein. A related Caenorhabditis elegans protein () contains four copies of the repeat, while the Met receptor contains a single copy of the repeat.\ ' '751' 'IPR006568' '\

    PSP is a proline-rich domain of unknown function found in spliceosome associated proteins.

    \ \ ' '752' 'IPR007557' '\ This region is present in both eukaryotes and eubacteria. The yeast PSP1 protein is involved in suppressing mutations in the DNA polymerase alpha subunit in yeast PUBMED:9529527.\ ' '753' 'IPR007168' '\ This domain is found in Phage shock protein C (PspC) that is thought to be a transcriptional regulator. The presumed domain is 60 amino acid residues in length.\ ' '754' 'IPR006970' '\

    This short repeat is composed of the tetrapeptide XPTX. This repeat is found in a variety of proteins; however, it is not clear if these repeats are homologous to each other.

    \ ' '755' 'IPR001533' '\

    DCoH is the dimerisation cofactor of hepatocyte nuclear factor 1 (HNF-1) that functions as both a transcriptional coactivator and a pterin dehydratase PUBMED:8897596. X-ray crystallographic studies have shown that the ligand binds at four sites per tetrameric enzyme, with little apparent conformational change in the protein.

    \ ' '756' 'IPR004327' '\

    Phosphotyrosyl phosphatase activator (PTPA) proteins stimulate the phosphotyrosyl phosphatase (PTPase) activity of the dimeric form of protein phosphatase 2A (PP2A). PTPase activity in PP2A (in vitro) is relatively low when compared to the better recognised phosphoserine/threonine protein phosphatase activity. The specific biological role of PTPA is unknown. Basal expression of PTPA depends on the activity of a ubiquitous transcription factor, Yin Yang 1 (YY1). The tumour suppressor protein p53 can inhibit PTPA expression through an unknown mechanism that negatively controls YY1 PUBMED:11171037.

    \ ' '757' 'IPR003501' '\ The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the\ regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family is one of four\ structurally and functionally distinct group IIB PTS system cytoplasmic enzymes. The fold of IIB cellobiose shows similar\ structure to mammalian tyrosine phosphatases. This signature is often found downstream of .\ ' '758' 'IPR002478' '\

    The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was named after the proteins in which it was first found PUBMED:10093218. PUA is a highly conserved RNA-binding motif found in a wide range of archaeal, bacterial and eukaryotic proteins, including enzymes that catalyse tRNA and rRNA post-transcriptional modifications, proteins involved in ribosome biogenesis and translation, as well as in enzymes involved in proline biosynthesis PUBMED:16793063, PUBMED:16407303. The structures of several PUA-RNA complexes reveal a common RNA recognition surface, but also some versatility in the way in which the motif binds to RNA PUBMED:17803682. PUA motifs are involved in dyskeratosis congenita and cancer, pointing to links between RNA metabolism and human diseases PUBMED:16943774.

    \ ' '759' 'IPR002483' '\

    The PWI domain, named after a highly conserved PWI tri-peptide located within its N-terminal region, is a ~80 amino acid module, which is found either at the N-terminus or at the C-terminus of eukaryotic proteins involved in pre-mRNA processing PUBMED:10322432. It is generally found in association with other domains such as RRM and RS. The PWI domain is an RNA/DNA-binding domain that has an equal preference for single- and double-stranded nucleic acids and is likely to have multiple important functions in pre-mRNA processing PUBMED:12600940. Proteins containing this domain include the SR-related nuclear matrix protein of 160kDa (SRm160) splicing and 3\'-end cleavage-stimulatory factor, and the mammalian splicing factor PRP3.

    \ \

    The PWI domain is a soluble, globular and independently folded domain which consists of a four-helix bundle, with structured N- and C-terminal elements PUBMED:12600940.

    \ ' '760' 'IPR001683' '\

    The PX (phox) domain PUBMED:8931154 occurs in a variety of eukaryotic proteins and has been implicated in highly diverse functions such as cell signalling, vesicular trafficking, protein sorting and lipid modification PUBMED:10782093, PUBMED:11736640, PUBMED:12461558. PX domains are important phosphoinositide-binding modules that have varying lipid-binding specificities PUBMED:11884510.\ The PX domain is approximately 120 residues long PUBMED:11373621,\ and folds into a three-stranded beta-sheet followed by three alpha-helices and a proline-rich region that immediately precedes a membrane-interaction loop and spans approximately eight hydrophobic and polar residues. \ The PX domain of p47phox binds to the SH3 domain in the same protein\ PUBMED:11373621. Phosphorylation of p47(phox), a cytoplasmic activator of the microbicidal phagocyte oxidase (phox), elicits interaction of p47(phox) with phosphoinositides. The protein phosphorylation-driven conformational change of p47(phox) enables its PX domain to bind to phosphoinositides, the interaction of which plays a crucial role in recruitment of p47(phox) from the cytoplasm to membranes and subsequent activation of the phagocyte oxidase. The lipid-binding activity of this protein is normally suppressed by intramolecular interaction of the PX domain with the C-terminal Src homology 3 (SH3) domain PUBMED:12356722.

    \ \

    The PX domain is conserved from yeast to human. A multiple alignment of representative PX domain sequences can be found in PUBMED:9687503; although PX domains show relatively little sequence conservation, their structure appears to be highly conserved. Although phosphatidylinositol-3-phosphate (PtdIns(3)P) is the primary target of PX domains, binding to phosphatidic acid, phosphatidylinositol-3,4-bisphosphate (PtdIns(3,4)P2), phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2), and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) has been reported as well. The PX domain is also a protein-protein interaction domain PUBMED:15263065.

    \ \ \ \ \ \ \ ' '761' 'IPR003114' '\ This domain is found associated with PX domains. The PX (phox) domain PUBMED:8931154 occurs in a variety of eukaryotic proteins associated with intracellular signalling pathways.\ ' '762' 'IPR001327' '\

    This entry describes a small NADH binding domain within a larger FAD binding domain described by . It is found in both class I and class II oxidoreductases.

    \

    FAD flavoproteins belonging to the family of pyridine nucleotide-disulphide \ oxidoreductases (glutathione reductase, trypanothione reductase, lipoamide dehydrogenase, mercuric reductase, thioredoxin reductase, alkyl hydroperoxide reductase) share sequence similarity with a number of other flavoprotein oxidoreductases, in particular with ferredoxin-NAD+ reductases involved in oxidative metabolism of a variety of hydrocarbons (rubredoxin reductase, putidaredoxin reductase, terpredoxin reductase, ferredoxin-NAD+ \ reductase components of benzene 1,2-dioxygenase, toluene 1,2-dioxygenase, chlorobenzene dioxygenase, biphenyl dioxygenase), NADH oxidase and NADH peroxidase PUBMED:2319593, PUBMED:1404382, PUBMED:2067578. Comparison of the crystal structures of human glutathione \ reductase and Escherichia coli thioredoxin reductase reveals different locations of their active sites, suggesting that the enzymes diverged from an ancestral FAD/NAD(P)H reductase and acquired their disulphide reductase activities independently PUBMED:2067578.

    \

    \ Despite functional similarities, oxidoreductases of this family show no sequence \ similarity with adrenodoxin reductases PUBMED:2924777 and flavoprotein pyridine nucleotide cytochrome reductases (FPNCR) PUBMED:1748631. Assuming that disulphide reductase activity \ emerged later, during divergent evolution, the family can be referred to as FAD-dependent pyridine nucleotide reductases, FADPNR.

    \

    To date, 3D structures of glutathione reductase PUBMED:3656429, thioredoxin reductase PUBMED:2067578, mercuric reductase PUBMED:2067577, lipoamide dehydrogenase PUBMED:1880807, \ trypanothione reductase PUBMED:1924336 and NADH peroxidase PUBMED:1942054 have been solved. The enzymes share similar tertiary structures based on a doubly-wound alpha/beta fold, but the relative orientations of their FAD- and NAD(P)H-binding domains may vary \ significantly. By contrast with the FPNCR family, the folds of the FAD- and \ NAD(P)H-binding domains are similar, suggesting that the domains evolved by gene \ duplication PUBMED:7411611.\

    \ ' '763' 'IPR004099' '\

    This entry represents a dimerisation domain that is usually found at the C-terminal of both class I and class II oxidoreductases, as well as in NADH oxidases and peroxidases PUBMED:7766608, PUBMED:11090282, PUBMED:12390015.

    \ ' '764' 'IPR002638' '\ Quinolinate phosphoribosyl transferase (QPRTase) or nicotinate-nucleotide pyrophosphorylase is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to give rise to nicotinic acid mononucleotide (NaMN), pyrophosphate and carbon dioxide PUBMED:9016724, PUBMED:8561507. Unlike , this domain also includes the molybdenum transport system protein ModD.\ ' '765' 'IPR002638' '\ Quinolinate phosphoribosyl transferase (QPRTase) or nicotinate-nucleotide pyrophosphorylase is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to give rise to nicotinic acid mononucleotide (NaMN), pyrophosphate and carbon dioxide PUBMED:9016724, PUBMED:8561507. Unlike , this domain also includes the molybdenum transport system protein ModD.\ ' '766' 'IPR001374' '\

    The R3H motif: a domain that binds single-stranded nucleic acids.

    \ \

    The most prominent feature of the R3H motif is the presence of an invariant arginine residue and a highly conserved histidine residue that are separated by three residues. The motif also displays a conserved pattern of hydrophobic residues, prolines and glycines. The R3H motif is present in proteins from a diverse range of organisms that includes Eubacteria, green plants, fungi and various groups of metazoans. Intriguingly, it has not yet been identified in Archaea and Escherichia coli.
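The spacing rule described above (an arginine and a histidine separated by three residues) can be sketched as a simple pattern scan. This is only an illustration: the function name and the test sequences are hypothetical, and the spacing alone is far too permissive to identify a genuine R3H domain, which also depends on the conserved hydrophobic, proline and glycine positions.

```python
import re
from typing import List

# Arg, any three residues, then His. The lookahead reports overlapping hits too.
R3H_CORE = re.compile(r"(?=R.{3}H)")

def find_r3h_cores(sequence: str) -> List[int]:
    """Return 0-based start positions of every R-x(3)-H candidate."""
    return [match.start() for match in R3H_CORE.finditer(sequence.upper())]

if __name__ == "__main__":
    # Hypothetical peptide, not taken from any real protein.
    print(find_r3h_cores("MKRAAAHLL"))
```

Any hit from such a scan would still need to be checked against a full domain model before being called an R3H domain.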

    \ \

    The sequences that contain the R3H domain, many of which are hypothetical proteins predicted from genome sequencing projects, can be grouped into eight families on the basis of similarities outside the R3H region. Three of the families contain ATPase domains either upstream (families II and VII) or downstream of the R3H domain (family VIII). The N-terminal part of members of family VII contains an SF1 helicase domain. The C-terminal part of family VIII contains an SF2 DEAH helicase domain. The ATPase domain in the members of family II is similar to the stage-III sporulation protein AA (S3AA_BACSU), the proteasome ATPase, bacterial transcription-termination factor rho and the mitochondrial F1-ATPase beta subunit (the F5 helicase family). Family VI contains Cys-rich repeats, as well as a ring-type zinc finger upstream of the R3H domain. JAG bacterial proteins (family I) contain a KH domain N-terminal to the R3H domain. The functions of other domains in R3H proteins support the notion that the R3H domain might be involved in interactions with single-stranded nucleic acids PUBMED:9787637.

    \ ' '767' 'IPR004579' '\

    All proteins in this family for which functions are known are components in a multiprotein endonuclease complex (usually made up of Rad1 and Rad10 homologs). This complex is used primarily for nucleotide excision repair but also for some aspects of recombination repair. In yeast, Rad10 works as a heterodimer with Rad1, and is involved in nucleotide excision repair of DNA damaged with UV light, bulky adducts or cross-linking agents. The complex forms an endonuclease which specifically degrades single-stranded DNA.

    \ \

    Ercc1 and XPF (xeroderma pigmentosum group F-complementing protein) are two structure-specific endonucleases of a class of seven containing an ERCC4 domain. Together they form an obligate complex that functions primarily in nucleotide excision repair (NER), a versatile pathway able to detect and remove a variety of DNA lesions induced by UV light and environmental carcinogens, and secondarily in DNA inter-strand cross-link repair and telomere maintenance. This domain in fact binds simultaneously to both XPF and single-stranded DNA; this ternary complex explains the important role of Ercc1 in targeting its catalytic XPF partner to the NER pre-incision complex PUBMED:17720715.

    \ ' '768' 'IPR007517' '\ The Mre11 complex (Mre11 Rad50 Nbs1) is central to chromosomal maintenance and functions in homologous recombination, telomere maintenance and sister chromatid association. The Rad50 coiled-coil region contains a dimer interface at the apex of the coiled coils in which pairs of conserved Cys-X-X-Cys motifs form interlocking hooks that bind one Zn ion. This alignment includes the zinc hook motif and a short stretch of coiled-coil on either side.\ ' '769' 'IPR007197' '\

    Radical SAM proteins catalyze diverse reactions, including unusual methylations, isomerization, sulphur insertion, ring formation, anaerobic oxidation and protein radical formation. Evidence exists that these proteins generate a radical species by reductive cleavage of S-adenosylmethionine (SAM) through an unusual Fe-S centre PUBMED:11222759, PUBMED:15317939.

    \ ' '770' 'IPR000331' '\

    The Rap/ran-GAP domain is found in the GTPase-activating protein (GAP) that stimulates the GTPase activity of the nuclear Ras-related regulatory proteins Rap1, Rsr1 and Ran in vitro, converting them to the putatively inactive GDP-bound state PUBMED:1904317, PUBMED:7799964. Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. RanGAP is a leucine-rich repeat (LRR) containing protein which forms a highly curved crescent. Each LRR forms a short beta-strand and a longer alpha-helix, resulting in a beta-alpha hairpin motif PUBMED:12019565.

    \ \

    The domain is also present in tuberin (a tuberous sclerosis homologue protein) that specifically stimulates the intrinsic GTPase activity of Ras-related protein Rap1A suggesting a possible mechanism for its role in the regulation of cellular growth.

    \ ' '771' 'IPR007216' '\

    Rcd1 (Required cell differentiation 1)-like proteins are found among a wide range of organisms PUBMED:16899141. Rcd1 was initially identified as an essential factor in nitrogen starvation-invoked differentiation in fission yeast. This results largely from a defect in nitrogen starvation-invoked induction of ste11+, a key transcriptional factor gene required for the onset of sexual development. It is one of the most conserved proteins in eukaryotes, and its mammalian homologue is expressed in a variety of differentiating tissues PUBMED:9447985, PUBMED:12356739. The mammalian Rcd1 is a novel transcriptional cofactor and is critical for retinoic acid-induced differentiation of F9 mouse teratocarcinoma cells, at least in part by forming complexes with retinoic acid receptor and activating transcription factor-2 (ATF-2) PUBMED:12356739. Two of the members in this family have been characterised as being involved in regulation of Ste11-regulated sex genes PUBMED:9671458, PUBMED:9447985.

    \ ' '772' 'IPR007850' '\ Proteins containing this region include Caenorhabditis elegans UNC-89. This region is found repeated in UNC-89 and shows conservation in prolines, lysines and glutamic acids. Proteins with RCSD are involved in muscle M-line assembly, but the function of the RCSD region is not clear. \ ' '773' 'IPR000494' '\ The type-1 insulin-like growth-factor receptor (IGF-1R) and insulin receptor (IR) are closely related members of the tyrosine-kinase receptor superfamily. IR is essential for glucose homeostasis, whereas IGF-1R is involved in both normal growth and development and malignant transformation. Homologues of these receptors are found in animals as simple as cnidarians. The epidermal growth-factor receptor (EGFR) family is closely related to the IR family and has significant sequence identity to the first three domains of the extracellular portion of IGF-1R (L1-Cys-rich-L2). \

    The L domains each consist of a single-stranded right-handed beta-helix. The Cys-rich region is composed of eight disulphide-bonded modules, seven of which form a rod-shaped domain with modules associated in an unusual manner. The three domains surround a central space of sufficient size to accommodate a ligand molecule. Although the fragment (residues 1-462) does not bind ligand, many of the determinants responsible for hormone binding and ligand specificity map to this central site. This structure therefore shows how the IR subfamily might interact with their ligands PUBMED:9690478.

    \

    A number of receptor systems have been implicated in the development and progression of many human cancers. The epidermal growth factor (EGF) receptor tyrosine kinase family has consistently been found to play a leading role in tumor progression PUBMED:10579913.

    \ ' '774' 'IPR002859' '\ Sequence similarity between a region of the autosomal dominant polycystic kidney disease (ADPKD) protein, polycystin-1 and a sea urchin sperm glycoprotein involved in fertilization, the receptor for egg jelly (suREJ) has been known for some time. The suREJ protein binds the glycoprotein coat of the egg (egg jelly), triggering the acrosome reaction, which transforms the sperm into a fusogenic cell. The sequence similarity and expression pattern suggests that the predicted human PKDREJ protein is a mammalian equivalent of the suREJ protein and therefore may have a central role in human fertilization PUBMED:9949214.\ ' '775' 'IPR005094' '\ Relaxases/mobilization proteins are required for the horizontal transfer of genetic information contained on plasmids that occurs during bacterial conjugation. The\ relaxase, in conjunction with several auxiliary proteins, forms the relaxation complex or relaxosome. Relaxases nick duplex DNA in a specific manner by catalysing\ trans-esterification PUBMED:9350859.\ ' '776' 'IPR004322' '\

    This is a family of bacterial plasmid DNA replication initiator proteins. These RepA proteins exist as monomers and dimers in equilibrium: monomers bind directly to repeated DNA sequences and thus activate replication; dimers repress repA transcription by binding an inversely repeated DNA operator. Dimer dissociation can occur spontaneously or may be mediated by Hsp70 chaperones. A similar RepA family of proteins found mainly in Escherichia coli is involved in plasmid replication (see ).

    \ ' '777' 'IPR004932' '\

    RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early Golgi compartment. The C terminus of yeast Rer1p interacts with a coatomer complex PUBMED:11238450.

    \ \ ' '778' 'IPR003388' '\

    Eukaryotic proteins of the reticulon (RTN) family all share an association with the endoplasmic reticulum (ER). Whereas amino-terminal regions are not related to one another, all reticulon proteins share a 200 amino acid residue region of sequence similarity at the C-terminal. This region contains two large hydrophobic regions separated by a 66-residue hydrophilic segment. The conserved hydrophobic C-terminal portion has been shown to play an essential role in the association of reticulons with the ER membrane. The hydrophobic portions are thought to be membrane-embedded and the 66-residue hydrophilic segment localized to the lumenal/extracellular face of the membrane. Most reticulons have a di-lysine ER retention motif at the C-terminal. Because of their likely association with the rough as well as the smooth ER, the reticulons might play some role in transport processes or in regulation of intracellular calcium levels. It has been suggested that the reticulons may serve as ER-associated channel-like complexes PUBMED:7844160, PUBMED:8833145, PUBMED:9693037, PUBMED:10667797.

    \ ' '779' 'IPR007614' '\

    This entry consists of a number of Drosophila proteins which share a conserved C-terminal region related to that of the fly Retinin protein.

    \ ' '780' 'IPR005162' '\

    Transposable elements (TEs) promote various chromosomal rearrangements more efficiently, and often more specifically, than other cellular processes. Retrotransposons are structurally similar to retroviruses and are bounded by long terminal repeats. This entry represents eukaryotic Gag or capsid-related retrotransposon-related proteins. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved.
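Because the central motif QGXXEXXXXXFXXLXXH is described as poorly conserved, an exact match would miss many family members. A minimal sketch of a mismatch-tolerant scan follows, assuming X stands for any residue; the function names and the test sequence are hypothetical and this is not the method used to build the entry.

```python
from typing import Optional, Tuple

MOTIF = "QGXXEXXXXXFXXLXXH"  # X is treated as a wildcard position

def score_window(window: str) -> int:
    """Count how many fixed (non-X) motif positions the window matches."""
    return sum(1 for m, r in zip(MOTIF, window) if m != "X" and m == r)

def best_hit(sequence: str) -> Optional[Tuple[int, int]]:
    """Return (start, score) of the best-scoring window, or None if too short."""
    sequence = sequence.upper()
    last_start = len(sequence) - len(MOTIF)
    if last_start < 0:
        return None
    # max() keeps the first window when several share the top score.
    start = max(range(last_start + 1),
                key=lambda i: score_window(sequence[i:i + len(MOTIF)]))
    return start, score_window(sequence[start:start + len(MOTIF)])

if __name__ == "__main__":
    # Hypothetical sequence with one fixed position (F) substituted.
    print(best_hit("AAQGAAEAAAAAYAALAAHKK"))
```

A score of 6 means all six fixed positions agree; how many mismatches to tolerate is a judgment call that the source text does not specify.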

    \ ' '781' 'IPR002610' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.
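The relationship between the linear order of the triad residues and clan membership can be sketched as a small lookup. The clan-to-order table comes from the paragraph above; the function names are hypothetical, and the residue positions in the usage example are illustrative inputs rather than an authoritative annotation.

```python
from typing import Dict, List

# Linear order of the catalytic residues for the three clans named in the text.
CLAN_TRIAD_ORDER: Dict[str, str] = {"PA": "HDS", "SB": "DHS", "SC": "SDH"}

def triad_order(positions: Dict[str, int]) -> str:
    """Given residue letter -> sequence position, return the letters in sequence order."""
    return "".join(sorted(positions, key=positions.get))

def clans_with_order(order: str) -> List[str]:
    """Return the clans (from the table above) whose triad has this linear order."""
    return sorted(clan for clan, o in CLAN_TRIAD_ORDER.items() if o == order)

if __name__ == "__main__":
    # Illustrative positions for a chymotrypsin-like enzyme.
    order = triad_order({"S": 195, "H": 57, "D": 102})
    print(order, clans_with_order(order))
```

A matching order is only consistent with a clan; it does not establish membership, since clans are defined by shared fold and ancestry.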

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of proteins contains serine peptidases belonging to the MEROPS peptidase family S54 (Rhomboid, clan S-). They are integral membrane proteins related to the Drosophila melanogaster (Fruit fly) rhomboid protein. Members of this family are found in archaea, bacteria and eukaryotes.

    \ \

    The D. melanogaster rhomboid protease cleaves type-1 transmembrane domains using a catalytic triad composed of serine, histidine and asparagine contributed by different transmembrane domains. It cleaves the transmembrane proteins Spitz, Gurken and Keren within their transmembrane domains to release a soluble TGFalpha-like growth factor. Cleavage occurs in the Golgi, following translocation of the substrates from the endoplasmic reticulum membrane by Star, another transmembrane protein. The growth factors are then able to activate the epidermal growth factor receptor PUBMED:2110920, PUBMED:11672525.

    \ \

    Few substrates of mammalian rhomboid homologues have been determined, but rhomboid-like protein 2 (MEROPS S54.002) has been shown to cleave ephrin B3 PUBMED:15047175. Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite.

    \ \

    In Saccharomyces cerevisiae (Baker\'s yeast) the Pcp1 (MDM37) protein (MEROPS S54.007) is a mitochondrial endopeptidase required for the activation of cytochrome c peroxidase and for the processing of the mitochondrial dynamin-like protein Mgm1 PUBMED:12417197, PUBMED:12707284. Mutations in Pcp1 result in cells that have fragmented mitochondria with very few short tubules PUBMED:11907266.

    \ \ \ \ \ \ \ ' '782' 'IPR007794' '\ The ribosome receptor is an integral endoplasmic reticulum protein that has been suggested to be involved in secretion. This highly conserved region is found towards the C terminus of the transmembrane domain PUBMED:11836413. The function is unclear.\ ' '783' 'IPR000999' '\

    Prokaryotic ribonuclease III () (gene rnc) PUBMED:3903434 is an enzyme that digests double-stranded RNA. It is involved in the processing of ribosomal RNA precursors and of some mRNAs. \ RNase III is evolutionarily related to a number of proteins including PUBMED:9241229:\

    \

    \ ' '784' 'IPR007676' '\ Ribophorin I is an essential subunit of oligosaccharyltransferase (OST), which is also known as dolichyl-diphosphooligosaccharide--protein glycosyltransferase, (). OST catalyses the transfer of an oligosaccharide from dolichol pyrophosphate to selected asparagine residues of nascent polypeptides as they are translocated into the lumen of the rough endoplasmic reticulum. Ribophorin I and OST48 are thought to be responsible for OST catalytic activity PUBMED:11443278. Both yeast and mammalian proteins are glycosylated but the sites are not conserved. Glycosylation may contribute towards general solubility but is unlikely to be involved in a specific biochemical function PUBMED:7720878. Most family members are predicted to have a transmembrane helix at the C terminus of this region.\ ' '785' 'IPR002670' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L18ae forms part of the 60S ribosomal subunit PUBMED:1840484. This family is found in eukaryotes. Rat ribosomal protein L18 is homologous to Xenopus laevis L14 PUBMED:3371159.

    \ ' '786' 'IPR000039' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Members of this family are large subunit ribosomal proteins which are found in the Eukaryota and Archaea. These proteins have 115 to 187 amino-acid residues. The family consists of:

    \

    \ ' '787' 'IPR001569' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of proteins of 56 to 96 amino-acid residues that share a highly conserved region located in the N-terminal part.

    \ ' '788' 'IPR020040' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 contains two domains with almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites PUBMED:8262035.

    \

    This entry represents the alpha-beta domain found duplicated in ribosomal L6 proteins. This domain consists of two beta-sheets and one alpha-helix packed around a single core PUBMED:8262035.

    \ ' '789' 'IPR003489' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family contains the sigma-54 modulation protein family and the S30Ae family of ribosomal proteins which includes the light-repressed protein (lrtA) PUBMED:8063707.

    \ ' '790' 'IPR001047' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaeal ribosomal proteins have been grouped\ based on sequence similarities PUBMED:7662106. One of these families, S8e, consists of a number of proteins with either about 220 amino acids (in eukaryotes) or about\ 125 amino acids (in archaea).

    \ ' '791' 'IPR000772' '\ Ricin is a legume lectin from the seeds of the castor bean plant, \ Ricinus communis. The seeds are poisonous to \ people, animals and insects and just one milligram of ricin can kill an adult. \ \

    Primary structure analysis has shown the presence of a similar domain in many carbohydrate-recognition proteins like plant and bacterial AB-toxins, glycosidases or proteases PUBMED:9603958, PUBMED:7664090, PUBMED:8844840. This domain, known as the ricin B lectin domain, can be present in one or more copies and has been shown in some instances to bind simple sugars, such as galactose or lactose.

    \

    The ricin B lectin domain is composed of three homologous subdomains of 40 amino acids (alpha, beta and gamma) and a linker peptide of around 15 residues (lambda). It has been proposed that the ricin B lectin domain arose by gene triplication from a primitive 40 residue galactoside-binding peptide PUBMED:3561502, PUBMED:1881882. The most characteristic, though not completely conserved, sequence feature is the presence of a Q-W pattern. Consequently, the ricin B lectin domain has also been referred to as the (QxW)3 domain and the three homologous regions as the QxW repeats PUBMED:7664090, PUBMED:8844840. A disulphide bond is also conserved in some of the QxW repeats PUBMED:7664090.

    \

    The 3D structure of the ricin B chain has shown that the three QxW repeats pack around a pseudo threefold axis that is stabilised by the lambda linker PUBMED:3561502. The ricin B lectin domain has no major segments of a helix or beta sheet but each of the QxW repeats contains an omega loop PUBMED:1881882. An idealized omega-loop is a compact, contiguous segment of polypeptide that traces a \'loop-shaped\' path in three-dimensional space; the main chain resembles a Greek omega.

    \ ' '792' 'IPR017941' '\

    There are multiple types of iron-sulphur clusters which are grouped into three main categories based on their atomic content: [2Fe-2S], [3Fe-4S], [4Fe-4S] (see ), and other hybrid or mixed metal types. Two general types of [2Fe-2S] clusters are known and they differ in their coordinating residues. The ferredoxin-type [2Fe-2S] clusters are coordinated to the protein by four cysteine residues (see ). The Rieske-type [2Fe-2S] cluster is coordinated to its protein by two cysteine residues and two histidine residues PUBMED:16168954, PUBMED:16271700.

    \ \ \

    The structure of several Rieske domains has been solved PUBMED:8736555. It contains three layers of antiparallel beta sheets forming two beta sandwiches. Both beta sandwiches share the central sheet 2. The metal-binding site is at the top of the beta sandwich formed by the sheets 2 and 3. The Fe1 iron of the Rieske cluster is coordinated by two cysteines while the other iron Fe2 is coordinated by two histidines. Two inorganic sulphide ions bridge the two iron ions forming a flat, rhombic cluster.

    \ \

    Rieske-type iron-sulphur clusters are common to electron transfer chains of mitochondria and chloroplasts and to non-haem iron oxygenase systems:

    \ \ ' '793' 'IPR000391' '\ The degradation of aromatic compounds by aerobic bacteria frequently begins with the dihydroxylation of the substrate by nonhaem iron-containing dioxygenases. These enzymes consist of two or three soluble proteins that interact to form an electron-transport chain that transfers electrons from reduced nucleotides (NADH) via flavin and [2Fe-2S] redox centres to a terminal dioxygenase PUBMED:1444257.\ Aromatic-ring-hydroxylating dioxygenases oxidise aromatic hydrocarbons and related compounds to cis-arene diols. These enzymes utilise a mononuclear non-haem iron centre to catalyse the addition of dioxygen to their respective substrates. \

    In naphthalene 1,2-dioxygenase (NDO) from Pseudomonas sp. NCIB9816-4, the domain structure and iron coordination of the Rieske domain are very similar to those of the cytochrome bc1 domain. The active-site iron centre of one of the alpha subunits is directly connected by hydrogen bonds through a single amino acid, Asp205, to the Rieske [2Fe-2S] centre in a neighbouring alpha subunit. This may be the main route for electron transfer PUBMED:9634695.

    \ ' '794' 'IPR007209' '\ This is a possible metal-binding domain in endoribonuclease RNase L inhibitor. It is found at the N-terminal end of RNase L inhibitor proteins, adjacent to the 4Fe-4S binding domain, fer4, . Also often found adjacent to in uncharacterised proteins. The RNase L system plays a major role in the anti-viral and anti-proliferative activities of interferons PUBMED:9524254, and could possibly play a more general role in the regulation of RNA stability in mammalian cells. Inhibitory activity requires concentration-dependent association of RLI with RNase L PUBMED:7539425.\ ' '795' 'IPR007528' '\

    This family includes RINT-1, a Rad50-interacting protein that participates in radiation-induced checkpoint control PUBMED:11096100 and interacts with Rad50 only during late S and G2/M phases. RINT-1 also functions in membrane trafficking from the endoplasmic reticulum (ER) to the Golgi complex in interphase cells PUBMED:11096100, PUBMED:16600870, PUBMED:16571679.\ The family also includes the TIP-1 protein, which is involved in retrograde transport from the Golgi to the ER PUBMED:8334998.

    \ \

    They share a similar domain organization with an N-terminal leucine heptad repeat rich coiled coil and an ~500-residue C-terminal RINT1/TIP20 domain, which might be a protein-protein interaction module necessary for the formation of functional complexes.

    \ ' '796' 'IPR018934' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \

    This entry represents the RIO kinases, which exhibit little sequence similarity with eukaryotic protein kinases (ePKs) and are classified as atypical protein kinases PUBMED:16183636. The conformation of ATP when bound to the RIO kinases is unique when compared with ePKs, such as serine/threonine kinases or the insulin receptor tyrosine kinase, suggesting that the detailed mechanism by which the catalytic aspartate of RIO kinases participates in phosphoryl transfer may not be identical to that employed in known serine/threonine ePKs. Representatives of the RIO family are present in organisms ranging from Archaea to humans, although the RIO3 proteins have only been identified in multicellular eukaryotes to date.

    \

    Yeast Rio1 and Rio2 proteins are required for proper cell cycle progression and chromosome maintenance, and are necessary for survival of the cells. These proteins are involved in the processing of 20 S pre-rRNA via late 18 S rRNA processing.

    \ ' '797' 'IPR011262' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \

    RNA polymerase (RNAP) II, which is responsible for all mRNA synthesis in eukaryotes, consists of 12 subunits. Subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the archaeal RNAP D/L heterodimer, and to the prokaryotic RNAP alpha (RpoA) subunit homodimer. In each case, they play a key role in RNAP assembly by forming a platform on which the catalytic subunits (eukaryotic Rpb1/Rpb2, and prokaryotic beta/beta\') can interact PUBMED:11453250.

    \

    The dimerisation domains differ between the different subunit families. In eukaryotic Rpb3, archaeal D and bacterial RpoA subunits (), the dimerisation domain is comprised of a central insert domain, which interrupts an Rpb11-like domain (), dividing it into two halves PUBMED:9657722. In eukaryotic Rpb11 and archaeal L subunits, the insert domain is lacking, leaving the Rpb11-like domain intact and contiguous.

    \ ' '798' 'IPR007080' '\

    RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand PUBMED:8910400, PUBMED:11313498.

    \ ' '799' 'IPR000722' '\

    RNA polymerases catalyse the DNA-dependent polymerisation of RNA, using the four ribonucleoside triphosphates as substrates. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Eukaryotic RNA polymerase I is essentially used to transcribe ribosomal RNA units, polymerase II is used for mRNA precursors, and III is used to transcribe 5S and tRNA genes. Each class of RNA polymerase is assembled from nine to fourteen different polypeptides. Members of the family include the largest subunit from eukaryotes; the gamma subunit from Cyanobacteria; the beta\'\ subunit from bacteria; the A\' subunit from archaea; and the B\'\' subunit from chloroplast RNA polymerases.

    \ ' '800' 'IPR007066' '\ RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 3, represents the pore domain. The 3\' end of RNA is positioned close to this domain. The pore delimited by this domain is thought to act as a channel through which nucleotides enter the active site and/or where the 3\' end of the RNA may be extruded during back-tracking PUBMED:8910400, PUBMED:11313498.\ ' '801' 'IPR007083' '\ RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This entry, domain 4, represents the funnel domain. The funnel domain contains the binding site for some elongation factors PUBMED:8910400, PUBMED:11313498.\ ' '802' 'IPR007081' '\ RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 5, represents the discontinuous cleft domain that is required to form the central cleft or channel where the DNA is bound PUBMED:8910400, PUBMED:11313498.\ ' '803' 'IPR007075' '\ RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 6, represents a mobile module of the RNA polymerase. Domain 6 forms part of the shelf module PUBMED:8910400, PUBMED:11313498. This family appears to be specific to the largest subunit of RNA polymerase II.\ ' '804' 'IPR000684' '\

    RNA polymerase II () PUBMED:1883205, PUBMED:1700503 is one of the three forms of RNA polymerase that exist in eukaryotic nuclei. The C-terminal region of the largest subunit of this oligomeric enzyme consists of the tandem repeat of a conserved heptapeptide PUBMED:2251729. The number of repeats varies according to the species (for example there are 17 in Plasmodium, 26 in yeast, 44 in Drosophila, and 52 in mammals). The region containing these repeats is essential for the function of polymerase II. This repeated heptapeptide\ (called CT7n or CTD) is rich in hydroxyl groups. It probably projects out of the globular catalytic domain and may interact with the acidic activator domains of transcriptional regulatory proteins. It is also known to bind by intercalation to DNA. RNA polymerase II is activated by phosphorylation. The serine and threonine residues in the CT7n repeats are the target of such phosphorylation.

    \ ' '805' 'IPR007645' '\ RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 3 is also known as the fork domain and is proximal to the catalytic site PUBMED:11313498.\ ' '806' 'IPR007647' '\ RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 5 is also known as the external 2 domain PUBMED:11313498.\ ' '807' 'IPR007120' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \

    RNA polymerases () catalyse the DNA dependent polymerisation of RNA.\ Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not\ including mitochondrial and chloroplast polymerases). This domain represents the\ hybrid-binding domain and the wall domain PUBMED:11313498. The\ hybrid-binding domain binds the nascent RNA strand/template DNA strand in the\ Pol II transcription elongation complex. This domain contains the important structural\ motifs, switch 3 and the flap loop and binds an active site metal ion PUBMED:11313498. This domain is also involved in binding to Rpb1 and Rpb3\ PUBMED:11313498. Many of the bacterial members contain large insertions\ within this domain, which are known as dispensable region 2 (DRII).

    \ ' '808' 'IPR007641' '\

    RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain comprises the anchor and clamp structural domains. The clamp region (C-terminal) contains a zinc-binding motif. The clamp region is named for its interaction with the clamp domain found in Rpb1. The domain also contains a region termed switch 4. The switches within the polymerase are thought to signal different stages of transcription PUBMED:11313498.

    \ ' '809' 'IPR005574' '\

    The eukaryotic RNA polymerase subunits RPB4 and RPB7 form a heterodimer that reversibly associates with the RNA polymerase II core. Archaeal cells contain a single RNAP made up of about 12 subunits, displaying considerable homology to the eukaryotic RNAPII subunits. The RPB4 and RPB7 homologs are called subunits F and E, respectively, and have been shown to form a stable heterodimer. While the RPB7 homologue is reasonably well conserved, the similarity between the eukaryotic RPB4 and the archaeal F subunit is barely detectable PUBMED:11741548.

    \ ' '810' 'IPR001352' '\

    Ribonuclease HII is involved in the degradation of the ribonucleotide moiety of RNA-DNA hybrid molecules, carrying out endonucleolytic cleavage to 5\'-phosphomonoester. Proteins which belong to this family have been found in bacteria, archaea, and yeasts. This family also includes Ribonuclease HIII.

    \ ' '811' 'IPR001247' '\

    The PH (phosphorolytic) domain is responsible for 3\'-5\' exoribonuclease activity, although in some proteins this domain has lost its catalytic function. An active PH domain uses inorganic phosphate as a nucleophile, adding it across the phosphodiester bond between the end two nucleotides in order to release ribonucleoside 5\'-diphosphate (rNDP) from the 3\' end of the RNA substrate.

    \

    PH domains can be found in bacterial/organelle RNases and PNPases (polynucleotide phosphorylases) PUBMED:17084501, as well as in archaeal and eukaryotic RNA exosomes PUBMED:15951817, PUBMED:17174896, the latter acting as nano-compartments for the degradation or processing of RNA (including mRNA, rRNA, snRNA and snoRNA). Bacterial/organelle PNPases share a common barrel structure with RNA exosomes, consisting of a hexameric ring of PH domains that act as a degradation chamber, and an S1-domain/KH-domain containing cap that binds the RNA substrate (and sometimes accessory proteins) in order to regulate and restrict entry into the degradation chamber PUBMED:16285927. Unstructured RNA substrates feed in through the pore made by the S1 domains, are degraded by the PH domain ring, and exit as nucleotides via the PH pore at the opposite end of the barrel PUBMED:16713559, PUBMED:17380186.

    \ \

    This entry represents the phosphorolytic (PH) domain 1, which has a core 2-layer alpha/beta structure with a left-handed crossover, similar to that found in ribosomal protein S5. This domain is found in bacterial/organelle PNPases and in archaeal/eukaryotic exosomes PUBMED:9390555.

    \

    More information about these proteins can be found at Protein of the Month: RNA Exosomes PUBMED:.

    \ ' '812' 'IPR015847' '\

    The PH (phosphorolytic) domain is responsible for 3\'-5\' exoribonuclease activity, although in some proteins this domain has lost its catalytic function. An active PH domain uses inorganic phosphate as a nucleophile, adding it across the phosphodiester bond between the end two nucleotides in order to release ribonucleoside 5\'-diphosphate (rNDP) from the 3\' end of the RNA substrate.

    \

    PH domains can be found in bacterial/organelle RNases and PNPases (polynucleotide phosphorylases) PUBMED:17084501, as well as in archaeal and eukaryotic RNA exosomes PUBMED:15951817, PUBMED:17174896, the latter acting as nano-compartments for the degradation or processing of RNA (including mRNA, rRNA, snRNA and snoRNA). Bacterial/organelle PNPases share a common barrel structure with RNA exosomes, consisting of a hexameric ring of PH domains that act as a degradation chamber, and an S1-domain/KH-domain containing cap that binds the RNA substrate (and sometimes accessory proteins) in order to regulate and restrict entry into the degradation chamber PUBMED:16285927. Unstructured RNA substrates feed in through the pore made by the S1 domains, are degraded by the PH domain ring, and exit as nucleotides via the PH pore at the opposite end of the barrel PUBMED:16713559, PUBMED:17380186.

    \ \

    This entry represents the phosphorolytic (PH) domain 2, which has a core 3-layer alpha/beta/alpha structure. This domain is found in bacterial/organelle PNPases and in archaeal/eukaryotic exosomes PUBMED:9390555.

    \

    More information about these proteins can be found at Protein of the Month: RNA Exosomes PUBMED:.

    \ ' '813' 'IPR004018' '\ The RPEL repeat is named after four conserved amino acids it contains. The function of the RPEL repeat is unknown; however, it might be a DNA-binding repeat, based on the observation that Q9VZY2 contains a SAP domain that is also implicated in DNA binding.\ ' '814' 'IPR007201' '\

    This RNA recognition motif 2 is found in Meiosis protein mei2. It is found C-terminal to the RNA-binding region RNP-1 ().

    \ ' '815' 'IPR000228' '\ RNA cyclases are a family of RNA-modifying enzymes that are conserved in\ eukaryotes, bacteria and archaea.\ RNA 3\'-terminal phosphate cyclase () PUBMED:9184239, PUBMED:2199762 catalyses the conversion\ of 3\'-phosphate to a 2\',3\'-cyclic phosphodiester at the end of RNA.\ \ These enzymes might be responsible for production of the cyclic phosphate RNA ends that are known to be required by many RNA ligases in both prokaryotes and eukaryotes.\

    RNA cyclase is a protein of 36 to 42 kDa. The best conserved region is a\ glycine-rich stretch of residues located in\ the central part of the sequence, which is reminiscent of various ATP, GTP\ or AMP glycine-rich loops.

    \

    The crystal structure of RNA 3\'-terminal phosphate cyclase shows that each molecule consists of two domains. The larger domain contains three repeats of a folding unit comprising two parallel alpha helices and a\ four-stranded beta sheet; this fold was previously identified in translation initiation factor 3 (IF3).\ The large domain is similar to one of the two domains of 5-enolpyruvylshikimate-3-phosphate\ synthase and UDP-N-acetylglucosamine enolpyruvyl transferase. The smaller domain uses a\ similar secondary structure element with different topology, observed in many other proteins such\ as thioredoxin PUBMED:10673421. Although the active site of this enzyme could not be\ unambiguously assigned, it can be mapped to a region surrounding His309, an adenylate\ acceptor, in which a number of amino acids are highly conserved in the enzyme from different\ sources PUBMED:10673421.

    \ ' '816' 'IPR013796' '\ RNA cyclases are a family of RNA-modifying enzymes that are conserved in\ eukaryotes, bacteria and archaea.\ RNA 3\'-terminal phosphate cyclase () PUBMED:9184239, PUBMED:2199762 catalyses the conversion\ of 3\'-phosphate to a 2\',3\'-cyclic phosphodiester at the end of RNA.\ \ These enzymes might be responsible for production of the cyclic phosphate RNA ends that are known to be required by many RNA ligases in both prokaryotes and eukaryotes.\

    RNA cyclase is a protein of 36 to 42 kDa. The best conserved region is a\ glycine-rich stretch of residues located in\ the central part of the sequence, which is reminiscent of various ATP, GTP\ or AMP glycine-rich loops.

    \

    The crystal structure of RNA 3\'-terminal phosphate cyclase shows that each molecule consists of two domains. The larger domain contains three repeats of a folding unit comprising two parallel alpha helices and a\ four-stranded beta sheet; this fold was previously identified in translation initiation factor 3 (IF3).\ The large domain is similar to one of the two domains of 5-enolpyruvylshikimate-3-phosphate\ synthase and UDP-N-acetylglucosamine enolpyruvyl transferase. The smaller domain uses a\ similar secondary structure element with different topology, observed in many other proteins such\ as thioredoxin PUBMED:10673421. Although the active site of this enzyme could not be\ unambiguously assigned, it can be mapped to a region surrounding His309, an adenylate\ acceptor, in which a number of amino acids are highly conserved in the enzyme from different\ sources PUBMED:10673421.

    \ \

    This entry contains the insert-domain of approximately 100 amino acids.

    \ ' '817' 'IPR004039' '\ Rubredoxin is a low molecular weight iron-containing bacterial protein involved in electron transfer PUBMED:2244884, PUBMED:1992166, sometimes\ replacing ferredoxin as an electron carrier PUBMED:7726577.\ \

    The 3-D structures of a number of rubredoxins have been solved PUBMED:1303768, PUBMED:3441010. The fold belongs to the alpha+beta class, with 2 alpha-helices and 2-3\ beta-strands. Its active site contains an iron ion which is co-ordinated by the sulphurs of four conserved cysteine residues forming an\ almost regular tetrahedron. The conserved cysteines reside on two loops, which are the most conserved regions of the protein. In addition, a ring of acidic residues in the proximity of the [Fe(Cys)4] centre is also well-conserved PUBMED:3441010. \

    \ ' '818' 'IPR004012' '\ This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could therefore play important roles in multiple Ras-like GTPase signalling pathways.\ ' '819' 'IPR013524' '\ The AML1 gene is rearranged by the t(8;21) translocation in acute myeloid\ leukemia PUBMED:7651838. The gene is highly similar to the Drosophila melanogaster segmentation \ gene runt and to the mouse transcription factor PEBP2 alpha subunit gene PUBMED:7651838.\ The region of shared similarity, known as the Runt domain, is responsible \ for DNA-binding and protein-protein interaction. \

    In addition to the highly-conserved Runt domain, the AML-1 gene product\ carries a putative ATP-binding site (GRSGRGKS), and has a C-terminal region\ rich in proline and serine residues. The protein (known as acute myeloid \ leukemia 1 protein, oncogene AML-1, core-binding factor (CBF), alpha-B \ subunit, etc.) binds to the core site, 5\'-pygpyggt-3\', of a number of\ enhancers and promoters.

    \

    The protein is a heterodimer of alpha- and beta-subunits. The alpha-subunit\ binds DNA as a monomer, and appears to have a role in the development of\ normal hematopoiesis. CBF is a nuclear protein expressed in numerous tissue\ types, except brain and heart; highest levels have been found to occur in \ thymus, bone marrow and peripheral blood.

    \

    This domain occurs towards the N-terminus of the proteins in this entry.

    \ ' '820' 'IPR001584' '\

    Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis PUBMED:10384242.

    \

    Retroviral integrase is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group.

    \

    HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS PUBMED:9161051. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity PUBMED:10384243.

    \ ' '821' 'IPR018061' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of aspartic peptidases belongs to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses).

    \

    Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.

    \ ' '822' 'IPR000477' '\ The use of an RNA template to produce DNA, for integration into the host genome and exploitation of a host cell, is a strategy employed in the replication of retroid elements, such as the retroviruses and bacterial retrons. The enzyme catalysing polymerisation is an RNA-directed DNA-polymerase, or reverse transcriptase (RT). Reverse transcriptase occurs in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.\

    Retroviral reverse transcriptase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The discovery of retroelements in the prokaryotes raises intriguing questions concerning their roles in bacteria, the origin and evolution of reverse transcriptases, and whether the bacterial reverse transcriptases are older than eukaryotic reverse transcriptases PUBMED:8828137.

    \ ' '823' 'IPR005326' '\

    This presumed domain is found at the N terminus of some isoforms of the cytoskeletal muscle protein plectin as well as the ribosomal S10 protein. This domain may be involved in RNA binding.

    \ ' '824' 'IPR002942' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The S4 domain is a small domain consisting of 60-65 amino acid residues\ that was detected in the bacterial ribosomal protein S4, eukaryotic\ ribosomal S9, two families of pseudouridine synthases, a novel family\ of predicted RNA methylases, a yeast protein containing a pseudouridine\ synthetase and a deaminase domain, bacterial tyrosyl-tRNA synthetases,\ and a number of uncharacterised, small proteins that may be involved in\ translation regulation PUBMED:10093218. The S4 domain probably mediates binding to\ RNA.

    \ ' '825' 'IPR007477' '\

    This presumed domain is found in proteins containing FERM domains. It binds to both spectrin and actin, hence the name SAB (Spectrin and Actin Binding) domain.

    \ ' '826' 'IPR005062' '\

    This large family includes diverse proteins involved in large complexes PUBMED:11042177, PUBMED:8918598, PUBMED:8621436. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3 PUBMED:12631707, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localisation of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. The 26S protease (or 26S proteasome) is responsible for degrading ubiquitin conjugates. It consists of 19S regulatory complexes associated with the ends of 20S proteasomes. The 19S regulatory complex is composed of about 20 different polypeptides and confers ATP-dependence and substrate specificity to the 26S enzyme. The conserved region occurs at the C-terminal of the Nin1-like regulatory subunit PUBMED:8143766, PUBMED:10733502, PUBMED:9712829. This family includes several eukaryotic translation initiation factor 3 subunit 11 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits PUBMED:10716708.

    \ ' '827' 'IPR005097' '\ This family is comprised of three structural domains that cannot be separated in the linear sequence. In some\ organisms this enzyme is found as a bifunctional polypeptide with lysine ketoglutarate reductase (PF). The\ saccharopine dehydrogenase can also function as a saccharopine reductase.\ ' '828' 'IPR001660' '\

    The sterile alpha motif (SAM) domain is a putative protein interaction module present in a wide variety of proteins PUBMED:9007998\ involved in many biological processes. The SAM domain that spreads over around 70 residues is found in diverse\ eukaryotic organisms PUBMED:9886291. SAM domains have been shown to homo- and hetero-oligomerise, forming multiple self-association architectures and also binding to various non-SAM\ domain-containing proteins PUBMED:9343432, nevertheless with a\ low affinity constant PUBMED:9933164. SAM domains also appear to possess the ability to bind RNA PUBMED:14659692. Smaug, a protein that helps to establish a morphogen gradient in Drosophila embryos by\ repressing the translation of nanos (nos) mRNA, binds to the 3\'\ untranslated region (UTR) of nos mRNA via two similar hairpin structures. The 3D crystal\ structure of the Smaug RNA-binding region shows a cluster of positively charged residues on the Smaug-SAM domain, which\ could be the RNA-binding surface. This electropositive potential is unique among all previously\ determined SAM-domain structures and is conserved among Smaug-SAM homologs. These results\ suggest that the SAM domain might have a primary role in RNA binding.

    \ \

    Structural analyses show that the SAM domain is arranged in a small five-helix bundle with two large interfaces PUBMED:9343432. In\ the case of the SAM domain of EphB2, each of these interfaces is able to form dimers. The presence of these two\ distinct intermonomer binding surfaces suggests that SAM could form extended polymeric structures PUBMED:9933164.

    \ \ \ ' '829' 'IPR003118' '\

    Transcription factors are protein molecules that bind to specific DNA\ sequences in the genome, resulting in the induction or inhibition of gene\ transcription PUBMED:2163347. The ets oncogene is such a factor, possessing a region \ of 85-90 amino acids known as the ETS (erythroblast transformation specific) domain PUBMED:2163347, PUBMED:2253872. This domain is rich in\ positively-charged and aromatic residues, and binds to purine-rich segments\ of DNA. The ETS domain has been identified in other transcription factors\ such as PU.1, human erg, human elf-1, human elk-1, GA binding protein, and\ a number of others PUBMED:2163347, PUBMED:2253872, PUBMED:8425553.\ It is generally localized at the C-terminus of the protein,\ with the exception of ELF-1, ELK-1, ELK-3, ELK-4 and ERF where it is found at\ the N-terminus.

    \ \

    This entry describes a subfamily of the SAM domain, a widespread domain in signalling and nuclear proteins, that occurs along with the ETS domain.

    \ ' '830' 'IPR000770' '\

    The SAND domain (named after Sp100, AIRE-1, NucP41/75, DEAF-1) is a conserved\ ~80 residue region found in a number of nuclear proteins, many of which\ function in chromatin-dependent transcriptional control. These include\ proteins linked to various human diseases, such as the Sp100 (Speckled protein\ 100 kDa), NUDR (Nuclear DEAF-1 related), GMEB (Glucocorticoid Modulatory\ Element Binding) proteins and AIRE-1 (Autoimmune regulator 1) proteins.

    \

    \ Proteins containing the SAND domain have a modular structure; the SAND domain\ can be associated with a number of other modules, including the bromodomain, the PHD finger and the MYND finger.\ Because no SAND domain has been found in yeast, it is thought that the SAND\ domain could be restricted to animal phyla. Many SAND domain-containing\ proteins, including NUDR, DEAF-1 (Deformed epidermal autoregulatory factor-1)\ and GMEB, have been shown to bind DNA sequences specifically. The SAND domain\ has been proposed to mediate the DNA binding activity of these proteins PUBMED:9697411, PUBMED:11427895.

    \

    \ The resolution of the 3D structure of the SAND domain from Sp100b has revealed\ that it consists of a novel alpha/beta fold. The SAND domain\ adopts a compact fold consisting of a strongly twisted, five-stranded\ antiparallel beta-sheet with four alpha-helices packing against one side of\ the beta-sheet. The opposite side of the beta-sheet is solvent exposed. The\ beta-sheet and alpha-helical parts of the structure form two distinct regions.\ Multiple hydrophobic residues pack between these regions to form a structural\ core. A conserved KDWK sequence motif is found within the alpha-helical,\ positively charged surface patch. The DNA binding surface has been mapped to\ the alpha-helical region encompassing the KDWK motif PUBMED:11427895.

    \ ' '831' 'IPR003034' '\

    The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA binding domain found in diverse nuclear proteins involved in chromosomal organization PUBMED:10694879, including in apoptosis PUBMED:10490026. In yeast, SAP is found in the most distal N-terminal region of E3 SUMO-protein ligase SIZ1, where it is involved in nuclear localization PUBMED:16109721.

    \ ' '832' 'IPR007856' '\

    Synonym(s): cerebroside sulphate activator, CSAct

    Saposin B is a small non-enzymatic glycoprotein required for the breakdown\ of cerebroside sulphates (sulphatides) in lysosomes. Saposin B contains three intramolecular disulphide bridges, exists as a dimer and is remarkably heat, protease, and pH stable. The crystal structure of human saposin B reveals an unusual shell-like dimer consisting of a monolayer of alpha-helices enclosing a large hydrophobic cavity. Although the secondary structure of saposin B is similar to that of the known monomeric members of the saposin-like superfamily, the helices are repacked into a different tertiary arrangement to form the homodimer. A comparison of the two forms of the saposin B dimer suggests that extraction of target lipids from membranes involves a conformational change that facilitates access to the inner cavity PUBMED:12518053.

    \ \ ' '833' 'IPR008138' '\ Saposins are small lysosomal proteins that serve as activators of various\ lysosomal lipid-degrading enzymes PUBMED:7595087. They probably act by isolating the\ lipid substrate from the membrane surroundings, thus making it more \ accessible to the soluble degradative enzymes. All mammalian saposins\ are synthesized as a single precursor molecule (prosaposin) which contains\ four Saposin-B domains, yielding the active saposins after proteolytic\ cleavage, and two Saposin-A domains that are removed in the activation\ reaction. \ The Saposin-B domains also occur in other \ proteins, many of them active in the lysis of membranes PUBMED:8003971, PUBMED:8868085.\ \ ' '834' 'IPR007587' '\

    This family includes a conserved region from a group of yeast proteins that associate with the SIT4 phosphatase. This association is required for SIT4\'s role in G1 cyclin transcription and for bud formation. This family also includes homologous regions from other eukaryotes.

    \ ' '835' 'IPR007146' '\

    This family contains Utp3 and LCP5 which are components of the U3 ribonucleoprotein complex PUBMED:12068309. It also includes the Homo sapiens (Human) C1D protein and Saccharomyces cerevisiae (Baker\'s yeast) YHR081W (rrp47), an exosome-associated protein required for the 3\' processing of stable RNAs PUBMED:9611201 and Sas10 which has been identified as a regulator of chromatin silencing PUBMED:12972615. This entry also includes the human protein Neuroguidin, an initiation factor 4E (eIF4E)-binding protein PUBMED:9611201.

    \ ' '836' 'IPR006059' '\

    Bacterial high affinity transport systems are involved in active transport of solutes across the cytoplasmic membrane. The protein components of these traffic systems include one or two transmembrane protein components, one or two membrane-associated ATP-binding proteins and a high affinity periplasmic solute-binding protein. In Gram-positive bacteria, which are surrounded by a single membrane and therefore have no periplasmic region, the equivalent proteins are bound to the membrane via an N-terminal lipid anchor. These homologue proteins do not play an integral role in the transport process per se, but probably serve as receptors to trigger or initiate translocation of the solute through the membrane by binding to external sites of the integral membrane proteins of the efflux system. In addition at least some solute-binding proteins function in the initiation of sensory transduction pathways.

    \

    On the basis of sequence similarities, the vast majority of these solute-binding proteins can be grouped PUBMED:8336670\ into eight families or clusters, which generally correlate with the nature of the solute bound. Family 1 currently \ includes the periplasmic proteins maltose/maltodextrin-binding proteins of Enterobacteriaceae (gene malE) PUBMED:7853407 \ and Streptococcus pneumoniae malX; multiple oligosaccharide binding protein of Streptococcus mutans (gene msmE); Escherichia coli \ glycerol-3-phosphate-binding protein; Serratia marcescens iron-binding protein (gene sfuA) and the homologous proteins \ (gene fbp) from Haemophilus influenzae and Neisseria; and the E. coli thiamine-binding protein (gene tbpA).

    \ ' '837' 'IPR001638' '\

    Bacterial high affinity transport systems are involved in active transport of solutes across the cytoplasmic membrane. The protein components of these traffic systems include one or two transmembrane protein components, one or two membrane-associated ATP-binding proteins (ABC transporters; see ) and a high affinity periplasmic solute-binding protein. The latter are thought to bind the substrate in the vicinity of the inner membrane, and to transfer it to a complex of inner membrane proteins for concentration into the cytoplasm.

    \

    In Gram-positive bacteria, which are surrounded by a single membrane and therefore have no periplasmic region, the equivalent proteins are bound to the membrane via an N-terminal lipid anchor. These homologue proteins do not play an integral role in the transport process per se, but probably serve as receptors to trigger or initiate translocation of the solute through the membrane by binding to external sites of the integral membrane proteins of the efflux system.

    \

    In addition, at least some solute-binding proteins function in the initiation of sensory transduction pathways.

    \

    On the basis of sequence similarities, the vast majority of these solute-binding proteins can be grouped PUBMED:8336670 into eight families or clusters, which generally correlate with the nature of the solute bound.

    \

    Family 3 groups together specific amino acid- and opine-binding periplasmic proteins and a periplasmic homologue with catalytic activity.

    \ ' '838' 'IPR007273' '\ In vertebrates, secretory carrier membrane proteins (SCAMPs) 1-3 constitute a family of putative membrane-trafficking proteins composed of cytoplasmic N-terminal sequences with NPF repeats, four central transmembrane regions (TMRs), and a cytoplasmic tail. SCAMPs probably function in endocytosis by recruiting EH-domain proteins to the N-terminal NPF repeats but may have additional functions mediated by their other sequences PUBMED:11050114.\ ' '839' 'IPR003309' '\ A number of C2H2-zinc finger proteins contain a highly conserved N-terminal motif termed the SCAN domain. The SCAN domain may play an important role in the assembly and function of this newly defined subclass of transcriptional regulators PUBMED:10567577.\ ' '840' 'IPR003452' '\ Stem cell factor (SCF) is a homodimer involved in hematopoiesis. SCF binds to and activates the SCF receptor (SCFR), a receptor tyrosine kinase. SCF stimulates the proliferation of mast cells and is able to augment the proliferation of both myeloid and lymphoid hematopoietic progenitors in bone marrow culture. It also mediates cell-cell adhesion and acts synergistically with other cytokines. SCF is a type I membrane protein, but is also found in a secretable, soluble form. The crystal structure of human SCF has been resolved and a potential receptor-binding site identified PUBMED:10884405.\ ' '841' 'IPR014044' '\

    This domain is found in eukaryotes as well as prokaryotes PUBMED:12625841. It has been proposed to be a Ca++ chelating serine protease PUBMED:9067611. The Ca++-chelating function would fit with the various signalling processes (e.g. the CRISP proteins) that members of this family are involved in PUBMED:12759345, and also the sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how blocks the Ca++ transporting ryanodine receptors.

    \ ' '842' 'IPR005552' '\ Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury PUBMED:11487015.\ ' '843' 'IPR000082' '\ SEA is an extracellular domain associated with\ O-glycosylation PUBMED:7670383.\ Proteins found to contain SEA-modules include agrin, enterokinase, 63 kDa Strongylocentrotus purpuratus (Purple sea urchin)\ sperm protein, perlecan (heparan sulphate proteoglycan core), mucin 1 and the cell surface antigen 114/A10, and two functionally uncharacterised,\ probably extracellular, Caenorhabditis elegans proteins. Despite the functional\ diversity of these adhesive proteins, a common denominator seems to be their\ existence in heavily glycosylated environments. In addition, the better characterised\ proteins all contain O-glycosidic-linked carbohydrates such as\ heparan sulphate that contribute considerably to their molecular masses. The common\ module might regulate or assist binding to neighbouring carbohydrate moieties.\

    Enterokinase, the initiator of intestinal digestion, is a\ mosaic protease composed of a distinctive assortment of\ domains PUBMED:8052624.

    \ ' '844' 'IPR007225' '\

    Sec15 is a component of the exocyst complex involved in the docking of exocystic vesicles with a fusion site on the plasma membrane. The exocyst complex is composed of Sec3, Sec5, Sec6, Sec8, Sec10, Sec15, Exo70 and Exo84.

    \ ' '845' 'IPR006900' '\

    COPII (coat protein complex II)-coated vesicles carry proteins from the endoplasmic reticulum (ER) to the Golgi complex PUBMED:11535824. COPII-coated vesicles form on the ER by the stepwise recruitment of three cytosolic components: Sar1-GTP to initiate coat formation, Sec23/24 heterodimer to select SNARE and cargo molecules, and Sec13/31 to induce coat polymerisation and membrane deformation PUBMED:12239560.

    \

    Sec23p and Sec24p are structurally related, folding into five distinct domains: a beta-barrel, a zinc-finger, an alpha/beta trunk domain, an all-helical region, and a C-terminal gelsolin-like domain. This entry describes the all-helical domain, which forms an approximately 105-residue segment with the C-terminal 30 residues. The linker between alpha-M and alpha-N contacts Sar1.

    \ \ ' '846' 'IPR006896' '\

    COPII (coat protein complex II)-coated vesicles carry proteins from the endoplasmic reticulum (ER) to the Golgi complex PUBMED:11535824. COPII-coated vesicles form on the ER by the stepwise recruitment of three cytosolic components: Sar1-GTP to initiate coat formation, Sec23/24 heterodimer to select SNARE and cargo molecules, and Sec13/31 to induce coat polymerisation and membrane deformation PUBMED:12239560.

    \

    Sec23p and Sec24p are structurally related, folding into five distinct domains: a beta-barrel, a zinc-finger, an alpha/beta trunk domain, an all-helical region, and a C-terminal gelsolin-like domain. This entry describes the Sec23/24 alpha/beta trunk domain, which is formed from a single, approximately 250-residue segment plugged into the beta-barrel between strands beta-1 and beta-19. The trunk has an alpha/beta fold with a vWA topology, and it forms the dimer interface, primarily involving strand beta-14 on Sec23 and Sec24; in addition, the trunk domain of Sec23 contacts Sar1.

    \ \ ' '847' 'IPR007265' '\

    Sec34 and Sec35 form a sub-complex in a seven-protein complex that includes Dor1. This complex is thought to be important for tethering vesicles to the Golgi PUBMED:11703943.

    \ ' '848' 'IPR004179' '\

    This domain (also known as the Brl domain) was named after the yeast Sec63 (or NPL1) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons PUBMED:16368690, PUBMED:11023840. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases.

    \ ' '849' 'IPR007191' '\

    Sec8 is a component of the exocyst complex involved in the docking of exocystic vesicles with a fusion site on the plasma membrane. The exocyst complex is composed of Sec3, Sec5, Sec6, Sec8, Sec10, Sec15, Exo70 and Exo84.

    \ ' '850' 'IPR001214' '\

    The SET domain generally appears as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions have been described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine that is strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control PUBMED:9537414. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.

    \ \

    Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception PUBMED:12123582, the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain PUBMED:11691919, PUBMED:11893494. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities PUBMED:12540855.

    \ \

    The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils.

    \ \

    The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site PUBMED:12887903. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity PUBMED:12372305, PUBMED:12389037.

    \ \ \ \ \ \ ' '851' 'IPR005224' '\

    The sugar fermentation stimulation protein is a probable regulatory factor involved in maltose metabolism. It contains a putative\ DNA-binding domain, and was isolated as a gene which enabled Escherichia coli W3110 (strain MK2001) to use maltose PUBMED:2013578.

    \ ' '852' 'IPR007699' '\ This domain was thought to be unique to the SGT1-like proteins, but is also found in calcyclin binding proteins. Sgt1p is a highly conserved eukaryotic protein that is required for both SCF (Skp1p/Cdc53p-Cullin-F-box)-mediated ubiquitination and kinetochore function in yeast and also plays a role in the cAMP pathway. Calcyclin (S100A6) is a member of the S100A family of calcium binding proteins and appears to play a role in cell proliferation PUBMED:12577318.\ ' '853' 'IPR007131' '\

    The SLA1 homology domain is found in the cytoskeleton assembly control protein SLA1, which is responsible for the correct formation of the actin cytoskeleton.

    \ ' '854' 'IPR007009' '\ This conserved region identifies a set of hypothetical protein sequences from the Metazoa and Ascomycota which include SHQ1 from Saccharomyces cerevisiae.\ ' '855' 'IPR003582' '\

    The ShK toxin domain is found in metridin, a toxin from Metridium senile (brown sea anemone) and in ShK, a structurally defined polypeptide from the sea anemone Stoichactis helianthus (Stichodactyla helianthus) (Caribbean sea anemone). ShK is a powerful inhibitor of T lymphocyte voltage-gated potassium channels, in particular Kv1.3 PUBMED:10545177. It has been proposed that structural analogues may have use as immunosuppressants for the prevention of graft rejection and for the treatment of autoimmune diseases PUBMED:9830012.

    \ \

    The ShK toxin domain is also found in one or more copies as a C-terminal domain in the metallopeptidases of Caenorhabditis elegans. These metallopeptidases belong to MEROPS peptidase families M10A, M12A and M14A, with the majority belonging to M12A, the astacin/adamalysin family of metallopeptidases.

    \ \ ' '856' 'IPR007627' '\

    The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes.

    With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into sub-regions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase; and sub-region 4.2, which seems to harbor a DNA-binding \'helix-turn-helix\' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors PUBMED:3092189, PUBMED:1597408. \

    \ Region 2 of sigma-70 is the most conserved region of the entire protein. All members of this class of sigma-factor contain region 2. The high conservation is due to region 2 containing both the -10 promoter recognition helix and the primary core RNA polymerase binding determinant. The core-binding helix interacts with the clamp domain of the largest polymerase subunit, beta prime PUBMED:11931761, PUBMED:8858155. The aromatic residues of the recognition helix, found at the C terminus of this domain, are thought to mediate strand separation, thereby allowing transcription initiation PUBMED:11931761, PUBMED:8858155.\ ' '857' 'IPR007630' '\

    The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes.

    With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into subregions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase; and sub-region 4.2, which seems to harbor a DNA-binding \'helix-turn-helix\' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors PUBMED:3092189, PUBMED:1597408. \

    \ Region 4 of sigma-70-like sigma-factors is involved in binding to the -35 promoter element via a helix-turn-helix motif PUBMED:11931761. Due to the way Pfam works, the threshold has been set artificially high to prevent overlaps with other helix-turn-helix families. Therefore there are many false negatives.\ ' '858' 'IPR018121' '\ The seven in absentia (sina) gene was first identified in Drosophila. The Drosophila Sina protein is essential for the determination of the R7 pathway in photoreceptor cell development: the loss of functional Sina results in the transformation of the R7 precursor cell to a non-neuronal cell type. The Sina protein contains an N-terminal C3HC4-type RING finger domain. Through this domain, Sina binds E2 ubiquitin-conjugating enzymes (UbcD1). Sina also interacts with Tramtrack (TTK88) via PHYL. Tramtrack is a transcriptional repressor that blocks photoreceptor determination, while PHYL down-regulates the activity of TTK88. In turn, the activity of PHYL requires the activation of the Sevenless receptor tyrosine kinase, a process essential for R7 determination. It is thought that Sina targets TTK88 for degradation, therefore promoting the R7 pathway. Murine and human homologues of Sina have also been identified. The human homologue Siah-1 PUBMED:9403064 also binds E2 enzymes (UbcH5) and through a series of physical interactions, targets beta-catenin for ubiquitin degradation. Siah-1 expression is enhanced by p53, itself promoted by DNA damage. Thus this pathway links DNA damage to beta-catenin degradation PUBMED:9267026, PUBMED:11389839. Sina proteins, therefore, physically interact with a variety of proteins. The N-terminal RING finger domain that binds ubiquitin conjugating enzymes is of the C3HC4 type and does not form part of the alignment for this family. The remaining C-terminal part is involved in interactions with other proteins, and is included in this alignment. 
In addition to the Drosophila protein and mammalian homologues, whose similarity was noted previously, this family also includes putative homologues from Caenorhabditis elegans and Arabidopsis thaliana.\ ' '859' 'IPR007022' '\ Survival motor neuron (SMN) interacting protein 1, SIP1, interacts with SMN protein and plays a crucial role in the biogenesis of spliceosomes. There is evidence that the protein is linked to spinal muscular atrophy and amyotrophic lateral sclerosis in humans PUBMED:11943600.\ ' '860' 'IPR011996' '\

    Potassium channels are the most diverse group of the ion channel family\ PUBMED:1772658, PUBMED:1879548. They are important in shaping the action potential, and in neuronal excitability and plasticity PUBMED:2451788. The potassium channel family is\ composed of several functionally distinct isoforms, which can be broadly\ separated into 2 groups PUBMED:2555158: the practically non-inactivating \'delayed\' group and the rapidly inactivating \'transient\' group.

    \

    These are all highly similar proteins, with only small amino acid\ changes causing the diversity of the voltage-dependent gating mechanism,\ channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or\ other second messengers PUBMED:2448635. In eukaryotic cells, K+ channels\ are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes PUBMED:1373731. In prokaryotic cells, they play a role in the\ maintenance of ionic homeostasis PUBMED:11178249.

    \

    All K+ channels discovered so far possess a core of \ alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has\ been termed the K+ selectivity sequence.\ In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane.\ However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned to one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains.\ The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK)\ PUBMED:11178249. The 2TM domain family comprises inward-rectifying K+ \ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.

    \

    Ca2+-activated K+ channels are a diverse group of channels that are activated by an increase in intracellular Ca2+ concentration. They are found in the majority of nerve cells, where they modulate cell excitability and action potential. Three types of Ca2+-activated K+ channel have been characterised, termed small-conductance (SK), intermediate conductance (IK) and large conductance (BK) respectively PUBMED:9687354.

    \

    SK channels are thought to play an important role in the functioning of all excitable tissues. To date, 3 subtypes (designated SK1-SK3) have been cloned, each of which possesses a different tissue expression profile: SK1 channels are expressed in the heart; SK2 channels are found in the adrenal gland; and SK3 channels are known to be present in skeletal muscle PUBMED:8781233. SK channels have a single-channel conductance of 2-20 pS and are activated by rises in cytosolic calcium with half maximal activation in the 400-800 nM range PUBMED:2432249, PUBMED:7993625. Unlike BK channels, they are voltage insensitive and unaffected by low concentrations of TEA, charybdotoxin, or iberiotoxin. However, they are potently blocked by the bee venom apamin PUBMED:6099412, PUBMED:2430185, tubocurarine, and quaternary salts of bicuculline PUBMED:9280156, PUBMED:10390643. A new series of compounds that block SK channels include dequalinium

    Synonym(s): SK Channel

    \

    This entry represents a conserved region found in proteins of the SK channel family.

    \ ' '861' 'IPR003380' '\ The c-ski proto-oncogene has been shown to influence proliferation, morphological transformation and myogenic differentiation PUBMED:7999783. It may play a role in terminal differentiation of skeletal muscle cells but not in the determination of cells to the myogenic lineage. Sno, a Ski proto-oncogene homologue, is expressed in two isoforms and plays a role in the response to proliferation stimuli.\ ' '862' 'IPR016072' '\

    SKP1 (together with SKP2) was identified as an essential component of the \ cyclin A-CDK2 S phase kinase complex PUBMED:10205047. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere-bound kinetochore complex PUBMED:8670864 and is also involved in the ubiquitin pathway PUBMED:9390558. In Dictyostelium discoideum (Slime mold), FP21 was shown to be glycosylated in the cytosol and has homology to SKP1 PUBMED:7852383.

    \

    This entry represents a dimerisation domain found at the C terminus of SKP1 proteins PUBMED:11099048, as well as in subunit D of the centromere DNA-binding protein complex Cbf3 PUBMED:12553912. This domain is multi-helical in structure, and consists of an interlocked heterodimer in F-box proteins.

    \ ' '863' 'IPR016073' '\

    SKP1 (together with SKP2) was identified as an essential component of the \ cyclin A-CDK2 S phase kinase complex PUBMED:10205047. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere-bound kinetochore complex PUBMED:8670864 and is also involved in the ubiquitin pathway PUBMED:9390558. In Dictyostelium discoideum (Slime mold), FP21 was shown to be glycosylated in the cytosol and has homology to SKP1 PUBMED:7852383.

    \

    This entry represents a POZ domain with a core structure consisting of beta(2)/alpha(2)/beta(2)/alpha(2) in two layers, alpha/beta. This domain is found at the N terminus of SKP1 proteins PUBMED:11099048 as well as in subunit D of the centromere DNA-binding protein complex Cbf3 PUBMED:12553912.\

    \ ' '864' 'IPR001119' '\

    S-layers are paracrystalline mono-layered assemblies of (glyco)proteins which coat the surface of bacteria PUBMED:10366863, PUBMED:10648507. Several S-layer proteins and some other cell wall proteins contain one or more copies of a domain of about 50-60 residues, which has been called SLH (for S-layer homology). Although it was originally proposed that SLH domains bind to peptidoglycan, it is now evident that pyruvylated secondary cell wall polymers (SCWPs), which are either teichoic acids, teichuronic acids, lipoteichoic acids or lipoglycans, serve as the anchoring structures for SLH motifs in the Gram-positive cell wall PUBMED:10049812, PUBMED:15758211. However, the study of S-layer protein SbpA of Bacillus sphaericus revealed that SLH motifs are not sufficient for specific binding to SCWPs. Thus, the molecular basis of SLH affinity and specificity of interaction with cell wall polymers is not completely elucidated PUBMED:16487313.

    \ ' '865' 'IPR001163' '\

    This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology.

    \

    Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function PUBMED:10801455, PUBMED:12438310. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B\', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) PUBMED:15130578. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker PUBMED:7744013. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing PUBMED:15526162. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins.

    \

    The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA PUBMED:15561140. Hfq adopts an Lsm-like fold; however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.

    \ \ ' '866' 'IPR003395' '\

    The SMC (structural maintenance of chromosomes)\ family of proteins exists in virtually all organisms including both bacteria and archaea. The SMC proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms and form three types of heterodimer (SMC1-SMC3, SMC2-SMC4, SMC5-SMC6), which are core components of large multiprotein complexes.\ The best known complexes are cohesin, which is responsible for\ sister-chromatid cohesion, and condensin, which is required for full\ chromosome condensation in mitosis.

    SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and share a five-domain structure, with globular N- and C-terminal domains separated by a long\ (circa 100 nm or 900 residues) coiled coil segment in the centre of which is a globular \'hinge\' domain, characterised by a set of four highly conserved glycine residues\ that are typical of flexible regions in a protein. The amino-terminal domain contains a \'Walker A\' nucleotide-binding domain (GxxGxGKS/T), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a \'Walker B\' motif (XXXXD, where X is any hydrophobic residue), and an LSGG motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases PUBMED:12360193.

    All\ SMC proteins appear to form dimers, either forming homodimers with themselves, as in the case of prokaryotic\ SMC proteins, or heterodimers between different but related SMC proteins. The\ dimers are arranged in an antiparallel alignment. This orientation brings the N- and C-terminal globular domains (from either different or\ identical protomers) together, which unites an ATP binding site (Walker A motif) within the N-terminal domain\ with a Walker B motif (DA box) within the C-terminal domain, to form a potentially functional ATPase. Protein interaction and microscopy data suggest that SMC\ dimers form a ring-like structure which might embrace DNA molecules. Non-SMC subunits\ associate with the SMC amino- and carboxy-terminal domains. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences.

    \

    SMCs share not only sequence similarity but also structural similarity with ABC proteins. SMC proteins function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression PUBMED:11983169.

    \

    This domain is found at the N terminus of SMC proteins.

    \ ' '867' 'IPR007287' '\ Sof1 is essential for cell growth and is a component of the nucleolar rRNA processing machinery PUBMED:8508778.\ ' '868' 'IPR001212' '\

    Somatomedin B, a serum factor of unknown function, is a small cysteine-rich peptide, derived proteolytically from the N terminus of the cell-substrate adhesion protein vitronectin PUBMED:2447940. Cys-rich\ somatomedin B-like domains are found in a number of proteins PUBMED:1710108, including plasma-cell membrane glycoprotein (which\ has nucleotide pyrophosphate and alkaline phosphodiesterase I activities) PUBMED:1647027 and placental protein 11 (which appears to possess amidolytic activity).

    \ \

    The SMB domain of vitronectin has been demonstrated to interact with both the urokinase receptor and the plasminogen activator inhibitor-1 (PAI-1), and the conserved cysteines of the NPP1 somatomedin B-like domain have been shown to mediate homodimerisation PUBMED:12533192.

    \ \

    The SMB domain contains eight Cys residues, arranged into four disulphide bonds. It has been suggested that the active SMB domain may be permitted considerable disulphide bond heterogeneity or variability, provided that the Cys25-Cys31 disulphide bond is preserved. The three dimensional structure of the SMB domain is extremely compact and the disulphide bonds are packed in the centre of the domain forming a covalently bonded core PUBMED:15157085. The structure of the SMB domain presents a new protein fold, with the only ordered secondary structure being a single-turn alpha-helix and a single-turn 3(10)-helix PUBMED:12808446.

    \ ' '869' 'IPR005329' '\

    SNXs are hydrophilic molecules that are localized in the cytoplasm\ and have the potential for membrane association either through their lipid-binding\ PX domains or through protein-protein interactions with membrane-associated\ protein complexes PUBMED:12461558. Indeed, several of the SNXs require several targeting motifs\ for their appropriate cellular localization. In almost every case studied,\ mammalian SNXs can be shown to have a role in protein sorting, with the\ most commonly used experimental model being plasma-membrane receptor\ endocytosis and sorting through the endosomal pathway. However, it is equally\ probable that SNXs sort vesicles that are not derived from the plasma\ membrane, and have a function in the accurate targeting of these vesicles and\ their cargo.

    The N-terminal domain appears to be specific to sorting nexins 1 and 2. SNX1 is both membrane-associated and cytosolic, where it probably exists as a\ tetramer in large protein complexes and may hetero-oligomerize with SNX2.

    \ ' '870' 'IPR007259' '\

    Members of this family are spindle pole body (SPB) components such as Spc97, Spc98 and gamma-tubulin. The SPB functions as the microtubule-organising centre in yeast, with the microtubule cytoskeleton playing an essential role in chromosome segregation, cellular organisation and vesicle trafficking in eukaryotic cells. In most cells, the centrosome is the primary microtubule-organising centre that nucleates and organises microtubules. Gamma-tubulin localises to centrosomes and is required for microtubule nucleation. In Saccharomyces cerevisiae, gamma-tubulin forms a stable complex with Spc97 and Spc98 PUBMED:11950928.

    \ ' '871' 'IPR006570' '\

    SPK is a domain of unknown function found in SET and PHD domain containing proteins and protein\ kinases.

    \ ' '872' 'IPR007159' '\ This domain is found in AbrB from Bacillus subtilis. The product of the abrB gene is an ambiactive repressor and activator of the transcription of genes expressed during the transition state between vegetative growth and the onset of stationary phase and sporulation PUBMED:2504584. AbrB is thought to interact directly with the transcription initiation regions of genes under its control PUBMED:8755877. AbrB contains a helix-turn-helix structure, but this domain ends before the helix-turn-helix begins PUBMED:1908787. The product of the B. subtilis gene spoVT is another member of this family and is also a transcriptional regulator PUBMED:8755877. DNA-binding activity in this AbrB homologue requires hexamerisation PUBMED:10978510. Another family member has been isolated from Sulfolobus solfataricus and has been identified as a homologue of bacterial repressor-like proteins. The Escherichia coli family member SohA or Prl1F appears to be bifunctional and is able to regulate its own expression as well as relieve the export block imposed by high-level synthesis of beta-galactosidase hybrid proteins PUBMED:2152898.\ ' '873' 'IPR003877' '\ The SPRY domain is of unknown function. Distant homologues are domains in\ butyrophilin/marenostrin/pyrin PUBMED:9204703.\ Ca2+-release from the sarcoplasmic or endoplasmic reticulum, the intracellular\ Ca2+ store, is mediated by the ryanodine receptor (RyR) and/or the inositol\ trisphosphate receptor (IP3R).\ ' '874' 'IPR004331' '\

    The SPX domain is named after SYG1/Pho81/XPR1 proteins. This 180-residue domain is found at the amino terminus of a variety of proteins. In the yeast protein SYG1, the N terminus directly binds to the G-protein beta subunit and inhibits transduction of the mating pheromone signal PUBMED:7592711, suggesting that all the members of this family are involved in G-protein associated signal transduction. The C terminus of these proteins often has an EXS domain PUBMED:9990033.

    \

    The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors PHO81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family PUBMED:8918192, PUBMED:11069666. NUC-2 contains several ankyrin repeats.

    \

    Several members of this family are the XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with Murine leukemia virus (MLV) PUBMED:9990033. The similarity between SYG1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, and Saccharomyces cerevisiae PUBMED:9990033, PUBMED:9927670. In addition, given the similarities between XPR1 and SYG1 and phosphate regulatory proteins, it has been proposed that XPR1 might be involved in G-protein associated signal transduction PUBMED:16905115, PUBMED:18315545, PUBMED:18055586 and may itself function as a phosphate sensor PUBMED:9990033.

    \ ' '875' 'IPR004151' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class e (Sre) from the Sra superfamily PUBMED:15618405.

    \ ' '876' 'IPR007222' '\

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5\' and 3\' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    \

    This entry represents the alpha subunit of the SR receptor.

    \

    In prokaryotes, the SR receptor is a monomer consisting of the loosely membrane-associated SR-alpha homologue FtsY, while the eukaryotic SR receptor is a heterodimer of SR-alpha (70 kDa) and SR-beta (25 kDa), both of which contain a GTP-binding domain PUBMED:12654246. SR-alpha regulates the targeting of SRP-ribosome-nascent polypeptide complexes to the translocon PUBMED:10859309. SR-alpha binds to the SRP54 subunit of the SRP complex. The SR-beta subunit is a transmembrane GTPase that anchors the SR-alpha subunit (a peripheral membrane GTPase) to the ER membrane PUBMED:7844142. SR-beta interacts with the N-terminal SRX-domain of SR-alpha, which is not present in the bacterial FtsY homologue. SR-beta also functions in recruiting the SRP-nascent polypeptide to the protein-conducting channel.

    \ ' '877' 'IPR000897' '\

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5\' and 3\' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    \

    This entry represents the GTPase domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity and poorly conserved between species. The GTPase domain is evolutionarily related to P-loop NTPase domains found in a variety of other proteins PUBMED:7518075.

    \

    These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homologue of ftsY; and bacterial flagellar biosynthesis protein flhF.

    \ ' '878' 'IPR013822' '\

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5\' and 3\' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    \

    This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity and poorly conserved between species.

    \

    These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homologue of ftsY; and bacterial flagellar biosynthesis protein flhF.

    \ ' '879' 'IPR004125' '\

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5\' and 3\' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    \

    This entry represents the M domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain, which binds both the 7S RNA and the signal sequence. The extreme C-terminal region is glycine-rich, of lower complexity, and poorly conserved between species.

    \

    These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homologue of ftsY; and bacterial flagellar biosynthesis protein flhF.

    \ ' '880' 'IPR000424' '\

    The Escherichia coli single-strand binding protein PUBMED:2087220 (gene ssb), also known as the helix-destabilising protein, is a protein of 177 amino acids. It binds tightly, as a homotetramer, to single-stranded DNA (ss-DNA) and plays an important role in DNA replication, recombination and repair. Closely related variants of SSB are encoded in the genomes of a variety of large self-transmissible plasmids. SSB has also been characterised in bacteria such as Proteus mirabilis and Serratia marcescens. Eukaryotic mitochondrial proteins that bind ss-DNA and are probably involved in mitochondrial DNA replication are structurally and evolutionarily related to prokaryotic SSB.

    \ ' '881' 'IPR007591' '\ This is a family of eukaryotic single-stranded DNA binding-proteins with specificity to a pyrimidine-rich element found in the promoter region of the alpha2(I) collagen gene.\ ' '882' 'IPR007726' '\ The SSXT or SS18 protein is involved in synovial sarcoma in humans. A SYT-SSX fusion gene resulting from the chromosomal translocation t(X;18) (p11;q11) is characteristic of synovial sarcomas. This translocation fuses the SSXT (SYT) gene from chromosome 18 to either of two homologous genes at Xp11, SSX1 or SSX2 PUBMED:12173050.\ ' '884' 'IPR002913' '\

    START (StAR-related lipid-transfer) is a lipid-binding domain in StAR, HD-ZIP and signalling proteins PUBMED:10322415. StAR (Steroidogenic Acute Regulatory protein) is a mitochondrial protein that is synthesised in response to luteinising hormone stimulation PUBMED:7961770.\ Expression of the protein in the absence of hormone stimulation is sufficient to induce\ steroid production, suggesting that this protein is required in the acute regulation of\ steroidogenesis. Representatives of the START domain family have\ been shown to bind different ligands such as sterols (StAR protein) and\ phosphatidylcholine (PC-TP). Ligand binding by the START domain can also\ regulate the activities of other domains that co-occur with the START domain\ in multidomain proteins such as Rho-gap, the homeodomain,\ and the thioesterase domain PUBMED:10322415, PUBMED:11276083.

    \

    \ The crystal structure of the START domain of human MLN64 shows an\ alpha/beta fold built around a U-shaped incomplete beta-barrel. Most\ importantly, the interior of the protein encompasses a 26 x 12 x 11 Angstrom\ hydrophobic tunnel that is apparently large enough to bind a single\ cholesterol molecule PUBMED:10802740. The START domain structure revealed an unexpected\ similarity to that of the birch pollen allergen Bet v 1 and to bacterial\ polyketide cyclases/aromatases PUBMED:11276083, PUBMED:10802740.

    \ ' '885' 'IPR002645' '\ The STAS (Sulphate Transporter and AntiSigma factor antagonist) domain is found in the C-terminal region of sulphate transporters and bacterial anti-sigma factor antagonists. It has been suggested that this domain may have a general NTP binding function. The establishment of differential gene expression in sporulating Bacillus subtilis involves four protein components, one of which is SpoIIAA (). The four components regulate the sporulation sigma factor F. Early in sporulation, SpoIIAA is in the phosphorylated state (SpoIIAA-P), as a result of the activity of the ATP-dependent protein kinase SpoIIAB (). The site at which this protein is phosphorylated is a conserved serine. SpoIIAB is an anti-sigma factor that in its free form inhibits F by binding to it. Competition by SpoIIAA (the anti-anti-sigma factor) for binding to SpoIIAB releases Sigma F activity PUBMED:9560229. The STAS domain is found in the anti-sigma factor antagonist SpoIIAA.\ ' '886' 'IPR004262' '\ This family represents the C-terminal region of the male sterility protein in a number of organisms. The Arabidopsis thaliana male sterility 2 (MS2) protein is involved in male\ gametogenesis. The MS2 protein shows sequence similarity to a jojoba protein (also a member of this group) that converts wax fatty acids to fatty alcohols. It has been suggested that the MS2 protein may function as a fatty acyl reductase in the formation\ of pollen wall substances PUBMED:9351246.\ ' '887' 'IPR001104' '\

    Synonym(s): Steroid 5-alpha-reductase

    \ \

    3-oxo-5-alpha-steroid 4-dehydrogenases catalyse the conversion of 3-oxo-5-alpha-steroid + acceptor to 3-oxo-delta(4)-steroid + reduced acceptor. The steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone, a hormone that promotes the differentiation of male external genitalia and the prostate during foetal development PUBMED:1686016. In humans, mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related steroid reductase enzyme, DET2, is found in plants such as Arabidopsis. Mutations in this enzyme cause defects in light-regulated development PUBMED:8602526. This domain is present in both type 1 and type 2 forms.

    \ ' '888' 'IPR004112' '\

    In bacteria, two distinct membrane-bound enzyme complexes are responsible for\ the interconversion of fumarate and succinate (): fumarate\ reductase (Frd) is used in anaerobic growth, and succinate dehydrogenase (Sdh)\ is used in aerobic growth. Both complexes consist of two main components: a\ membrane-extrinsic component composed of a FAD-binding flavoprotein and an\ iron-sulphur protein; and a hydrophobic component composed of a membrane\ anchor protein and/or a cytochrome B.

    \

    In eukaryotes, mitochondrial succinate dehydrogenase (ubiquinone) ()\ is an enzyme composed of two subunits: a FAD flavoprotein and an iron-sulphur\ protein.

    \

    The flavoprotein subunit is a protein of about 60 to 70 kDa in which FAD is\ covalently bound to a histidine residue located in the N-terminal\ section of the protein PUBMED:2668268. The sequence around that histidine is well\ conserved in Frd and Sdh from various bacterial and eukaryotic species PUBMED:1375942.

    \

    This family includes members that bind FAD such as the flavoprotein subunits from\ succinate and fumarate dehydrogenase, aspartate oxidase and the alpha subunit of adenylylsulphate\ reductase.

    \ ' '889' 'IPR000917' '\

    Sulphatases are enzymes that hydrolyse various sulphate esters. The sequences of different types of sulphatases are available and have been shown to be structurally related PUBMED:2303452, PUBMED:2122463, PUBMED:2476654. They include arylsulphatase A (ASA), a lysosomal enzyme which hydrolyses cerebroside sulphate; arylsulphatase B (ASB), which hydrolyses the sulphate ester group from N-acetylgalactosamine 4-sulphate residues of dermatan sulphate; arylsulphatase C (ASD) and E (ASE);\ steryl-sulphatase (STS), a membrane bound microsomal enzyme which hydrolyses 3-beta-hydroxy steroid sulphates; iduronate 2-sulphatase precursor (IDS), a lysosomal enzyme that hydrolyses the 2-sulphate groups from non-reducing-terminal iduronic acid residues in dermatan sulphate and heparan sulphate; N-acetylgalactosamine-6-sulphatase, which hydrolyses the 6-sulphate groups of the N-acetyl-D-galactosamine 6-sulphate units of chondroitin sulphate and the D-galactose 6-sulphate units of \ keratan sulphate; glucosamine-6-sulphatase (G6S), which hydrolyses the N-acetyl-D-glucosamine 6-sulphate units of heparan sulphate and keratan sulphate; N-sulphoglucosamine sulphohydrolase (sulphamidase), the lysosomal enzyme that catalyses the hydrolysis of N-sulpho-D-glucosamine into glucosamine and sulphate; sea urchin embryo arylsulphatase; green algae arylsulphatase, which plays an important role in the mineralisation of sulphates; and arylsulphatase from Escherichia coli (aslA), Klebsiella aerogenes (gene atsA) and Pseudomonas aeruginosa (gene atsA).

    \ ' '890' 'IPR000863' '\

    This family includes a range of sulphotransferase proteins including flavonyl 3-sulphotransferase, aryl sulphotransferase, alcohol sulphotransferase, oestrogen sulphotransferase and phenol-sulphating phenol sulphotransferase. These enzymes are responsible for the transfer of sulphate groups to specific compounds.

    \ ' '891' 'IPR005331' '\

    This entry consists of a number of carbohydrate sulphotransferases that transfer sulphate to carbohydrate groups in glycoproteins and glycolipids. These include:\ \

    \

    \ ' '892' 'IPR007019' '\

    The surfeit locus protein SURF-6 has been shown to be a component of the nucleolar matrix and has a strong binding capacity for nucleic acids PUBMED:9548374. SURF-6 is always found in the nucleolus regardless of the phase of the cell cycle\ suggesting that it is a structural protein constitutively present in nucleolar substructures. A role in rRNA processing has been proposed for this protein.

    \ ' '893' 'IPR000061' '\ SWAP is derived from the Suppressor-of-White-APricot splicing\ regulator from Drosophila melanogaster. The domain is found in regulators responsible for pervasive, nonsex-specific alternative pre-mRNA\ splicing characteristics and has been found in splicing regulatory proteins PUBMED:8206918. These ancient, conserved\ SWAP proteins share a colinearly arrayed series of novel\ sequence motifs PUBMED:7971282.\ ' '894' 'IPR003121' '\

    The SWI/SNF family of complexes, which are conserved from yeast to humans, are ATP-dependent chromatin-remodelling proteins that facilitate transcription activation PUBMED:11147808. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). The BAF60 family have at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. BAF60b is present in alternative forms of the SWI/SNF complex, including complex B (SWIB), which lacks BAF60a. The SWIB domain is a conserved region found within the BAF60b proteins PUBMED:12016060, and can be found fused to the C-terminus of DNA topoisomerase in Chlamydia.

    \ \

    MDM2 is an oncoprotein that acts as a cellular inhibitor of the p53 tumour suppressor by binding to the transactivation domain of p53 and suppressing its ability to activate transcription PUBMED:8875929. p53 acts in response to DNA damage, inducing cell cycle arrest and apoptosis. Inactivation of p53 is a common occurrence in neoplastic transformations. The core of MDM2 folds into an open bundle of four helices, which is capped by two small 3-stranded beta-sheets. It consists of a duplication of two structural repeats. MDM2 has a deep hydrophobic cleft on which the p53 alpha-helix binds; p53 residues involved in transactivation are buried deep within the cleft of MDM2, thereby concealing the p53 transactivation domain.

    \

    The SWIB and MDM2 domains are homologous and share a common fold.

    \ ' '895' 'IPR007527' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination PUBMED:16522193. SWIM domains are also found in the homologous recombination protein Sws1 PUBMED:16710300, as well as in several hypothetical proteins.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '896' 'IPR007526' '\

    The SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in eukaryotic chromosomal proteins. It is named after the proteins SWI3, RSC8 and MOIRA in which it was first recognised. This domain is predicted to mediate protein-protein interactions in the assembly of chromatin-protein complexes. The SWIRM domain can be linked to different domains, such as the ZZ-type zinc finger (), the Myb DNA-binding domain (), the HORMA domain (), the amino-oxidase domain, the chromo domain (), and the JAB1/PAD1 domain.

    \ ' '897' 'IPR006011' '\

    Syntaxins A and B are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane. Syntaxins are a family of\ receptors for intracellular transport vesicles. Each target membrane may be\ identified by a specific member of the syntaxin family PUBMED:7690687.\ Members of the syntaxin family PUBMED:8493722, PUBMED:8490959 have a size ranging from\ 30 Kd to 40 Kd; a C-terminal extremity which is highly hydrophobic and anchors the protein on the cytoplasmic surface of cellular membranes; a central, well\ conserved region, which seems to be in a coiled-coil conformation.\

    \ ' '898' 'IPR001699' '\

    Transcription factors of the T-box family are required both for early cell-fate decisions, such as those necessary for formation of the basic vertebrate body plan, and for differentiation and organogenesis PUBMED:12093383. The T-box is defined as the minimal region within the T-box protein that is both necessary and sufficient for sequence-specific\ DNA binding; all members of the family so far examined bind to the DNA consensus sequence TCACACCT. The T-box is a relatively large DNA-binding domain, generally comprising about a third of the entire protein (17-26 kDa).

    \

    These genes were uncovered on the basis of similarity to the DNA binding domain PUBMED:9504043 of the Mus musculus (Mouse) Brachyury (T) gene product; this similarity is the defining feature of the family. The Brachyury gene is named for its phenotype, which was identified 70 years ago as a mutant mouse strain with a short blunted tail. The gene and its paralogues have become a well-studied model for the family, and hence much of what is known about the T-box family is derived from the murine Brachyury gene.

    \

    Consistent with its nuclear location, Brachyury protein has a sequence-specific DNA-binding activity and can act as a transcriptional regulator PUBMED:9503012. Homozygous mutants for the gene undergo extensive developmental anomalies, thus rendering the mutation lethal PUBMED:9395282. The postulated role of Brachyury is as a transcription factor, regulating the specification and differentiation of posterior mesoderm during gastrulation in a dose-dependent manner PUBMED:9504043.

    \

    T-box proteins tend to be expressed in specific organs or cell types, especially during development, and they are generally required for the development of those tissues. For example, Brachyury is expressed in posterior mesoderm and in the developing notochord, and it is required for the formation of these cells in mice PUBMED:9196325.

    \ ' '899' 'IPR006809' '\ The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. The conserved region is found at the C terminus of most member proteins. The crystal structure of hTAFII28 with hTAFII18 shows that this region is involved in the binding of these two subunits. The conserved region contains four alpha helices and three loops arranged as in histone H3 PUBMED:7729427, PUBMED:9695952.\ ' '900' 'IPR007304' '\ The TOR signalling pathway activates a cell-growth program in response to nutrients PUBMED:10604478. TIP41 interacts with TAP42 and negatively regulates the TOR signalling pathway PUBMED:11741537.\ ' '901' 'IPR005637' '\

    This entry contains the NXF family of shuttling transport receptors for nuclear export of mRNA, which include:

    \ \ \ \

    Members of the NXF family have a modular structure. A nuclear localization sequence and a noncanonical RNA recognition motif (RRM) (see ) followed by four LRR repeats are located in the N-terminal half. The C-terminal half contains an NTF2 domain (see ) followed by a second domain, TAP-C. The TAP-C domain is important for binding to FG repeat-containing nuclear pore proteins (FG-nucleoporins) and is sufficient to mediate nuclear shuttling PUBMED:11875519, PUBMED:11256625.

    \ \

    The TAP-C domain is made of four alpha helices packed against each other. The arrangement of helices 1, 2 and 3 is similar to that seen in a UBA fold, and the domain is joined to the next module by a flexible 12-residue Pro-rich linker PUBMED:11256625, PUBMED:11875519.

    \ ' '902' 'IPR004345' '\

    This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein which in human is deleted in severe forms of familial adenomatous polyposis, an autosomal\ dominant oncological inherited disease.

    \

    The family also includes the plant protein of known similarity to TB2/DP1, the\ HVA22 abscisic acid-induced protein (e.g. Q07764), which is thought to be a regulatory protein.

    \ ' '903' 'IPR000814' '\

    The TATA-box binding protein (TBP) is required for the initiation of transcription by RNA polymerases I, II and III, from promoters with or without a TATA box PUBMED:12782648, PUBMED:10974559. TBP associates with a host of factors, including the general transcription factors TFIIA, -B, -D, -E, and -H, to form huge multi-subunit pre-initiation complexes on the core promoter. Through its association with different transcription factors, TBP can initiate transcription from different RNA polymerases. There are several related TBPs, including TBP-like (TBPL) proteins PUBMED:12878007.

    \

    The C-terminal core of TBP (~180 residues) is highly conserved and contains two 77-amino acid repeats that produce a saddle-shaped structure that straddles the DNA; this region binds to the TATA box and interacts with transcription factors and regulatory proteins PUBMED:1436073. By contrast, the N-terminal region varies in both length and sequence.

    \ \ ' '904' 'IPR000435' '\ Tektin heteropolymers form unique protofilaments of flagellar microtubules\ PUBMED:8609631. The proteins are predicted to form extended rods composed of 2 alpha-\ helical segments (~180 residues long) capable of forming coiled coils,\ interrupted by non-helical linkers PUBMED:8609631. The 2 segments are similar in \ sequence, indicating a gene duplication event. Along each tektin rod, \ cysteine residues occur with a periodicity of ~8nm, coincident with the\ axial repeat of tubulin dimers in microtubules PUBMED:8609631. It is proposed that\ the assembly of tektin heteropolymers produces filaments with repeats of\ 8, 16, 24, 32, 40, 48 and 96nm, generating the basis for the complex\ spatial arrangements of axonemal components PUBMED:8609631.\ ' '905' 'IPR001906' '\

    Sequences containing this domain belong to the terpene synthase family. It has been suggested that this gene family be designated tps (for terpene synthase). Sequence comparisons reveal similarities between the monoterpene (C10) synthases, sesquiterpene (C15) synthases and the diterpene (C20) synthases. It has been split into six subgroups on the basis of phylogeny, called Tpsa-Tpsf PUBMED:9268308.

    \ \ \ \

    In the fungus Phaeosphaeria sp. (strain L487) the synthesis of ent-kaurene from geranylgeranyl diphosphate is promoted by a single bifunctional protein PUBMED:9268298.

    \ ' '906' 'IPR001647' '\

    This entry represents a DNA-binding domain with a helix-turn-helix (HTH) structure that is found in several bacterial and archaeal transcriptional regulators, such as TetR, the tetracycline resistance repressor. Numerous other transcriptional regulatory proteins also contain HTH-type DNA-binding domains, and can be grouped into subfamilies based on sequence similarity. The domain represented by this entry is found in a subfamily of proteins that includes the transcriptional regulators TetR, TetC, AcrR, BetI, Bm3R1, EnvR, QacR, MtrR, TcmR, Ttk, YbiH, and YhgD PUBMED:7826010, PUBMED:8196548, PUBMED:8428974. Many of these proteins function as repressors that control the level of susceptibility to hydrophobic antibiotics and detergents. They all have similar molecular weights, ranging from 21 to 25 kDa. The helix-turn-helix motif is located in the initial third of the protein. The 3D structure of the homodimeric TetR protein complexed with 7-chloro-tetracycline-magnesium has been determined to 2.1 A resolution PUBMED:7707374. TetR folds into ten alpha-helices with connecting turns and loops. The three N-terminal alpha-helices of the repressor form the DNA-binding domain: this structural motif encompasses an HTH fold with an inverse orientation compared with that of other DNA-binding proteins.

    \ ' '907' 'IPR004600' '\ Members of this family are part of the TFIIH complex, which is involved in the initiation of transcription and nucleotide excision repair. The core-TFIIH basal transcription factor complex has six subunits; this is the p34 subunit.\ ' '908' 'IPR003195' '\

    This family includes the Spt3 yeast transcription factors and the 18 kDa subunit from human transcription initiation factor IID (TFIID-18). Determination of the crystal structure reveals an atypical histone fold PUBMED:9695952.

    \ ' '909' 'IPR003228' '\ Human transcription initiation factor TFIID is composed of the TATA-binding polypeptide (TBP) and at least 13 TBP-associated factors (TAFs) that collectively or individually are involved in activator-dependent transcription PUBMED:7667268.\ ' '910' 'IPR007582' '\ This region, possibly a domain, is found in subunits of transcription factor TFIID. The function of this region is unknown.\ ' '911' 'IPR007365' '\

    This entry represents the dimerisation domain found in the transferrin receptor, as well as in a number of other proteins including glutamate carboxypeptidase II and N-acetylated-alpha-linked acidic dipeptidase like protein.

    \

    The transferrin receptor (TfR) assists iron uptake into vertebrate cells through a cycle of endo- and exocytosis of the iron transport protein transferrin (Tf). TfR binds iron-loaded (diferric) Tf at the cell surface and carries it to the endosome, where the iron dissociates from Tf. The apo-Tf remains bound to TfR until it reaches the cell surface, where apo-Tf is replaced by diferric Tf from the serum to begin the cycle again. Human TfR is a homodimeric type II transmembrane protein. The crystal structure of a TfR monomer reveals a 3-domain structure: a protease-like domain that closely resembles carboxy- and amino-peptidases; an apical domain consisting of a beta-sandwich; and a helical dimerisation domain. The dimerisation domain consists of a 4-helical bundle that makes contact with each of the three domains in the dimer partner PUBMED:10531064.

    \ ' '912' 'IPR004095' '\

    The TGS domain is present in a number of enzymes, for example, in threonyl-tRNA synthetase (ThrRS), GTPase, and guanosine-3\',5\'-bis(diphosphate) 3\'-pyrophosphohydrolase (SpoT) PUBMED:10447505. The TGS domain is also present at the amino terminus of the uridine kinase from the spirochaete Treponema pallidum (but not any other organism, including the related spirochaete Borrelia burgdorferi).

    \

    TGS is a small domain that consists of ~50 amino acid residues and is predicted to possess a predominantly beta-sheet structure. There is no direct information\ on the functions of the TGS domain, but its presence in two types of\ regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests a ligand (most likely nucleotide)-binding, regulatory role PUBMED:10447505.

    \ ' '913' 'IPR000672' '\ Enzymes that participate in the transfer of one-carbon units require the coenzyme tetrahydrofolate (THF).\ Various reactions generate one-carbon derivatives of THF, which can be interconverted between different\ oxidation states by methylene-THF dehydrogenase (), methenyl-THF cyclohydrolase ()\ and formyl-THF synthetase () PUBMED:2541774, PUBMED:8485162. The dehydrogenase and cyclohydrolase\ activities are expressed by a variety of multifunctional enzymes, including the tri-functional eukaryotic\ C1-tetrahydrofolate synthase PUBMED:2541774; a bifunctional eukaryotic mitochondrial protein; and the\ bifunctional Escherichia coli folD protein PUBMED:2541774, PUBMED:8485162. Methylene-tetrahydrofolate dehydrogenase and\ methenyltetrahydrofolate cyclo-hydrolase share an overlapping active site PUBMED:2541774, and as such are\ usually located together in proteins, acting in tandem on the carbon-nitrogen bonds of substrates other\ than peptide bonds.\ ' '914' 'IPR000672' '\ Enzymes that participate in the transfer of one-carbon units require the coenzyme tetrahydrofolate (THF).\ Various reactions generate one-carbon derivatives of THF, which can be interconverted between different\ oxidation states by methylene-THF dehydrogenase (), methenyl-THF cyclohydrolase ()\ and formyl-THF synthetase () PUBMED:2541774, PUBMED:8485162. The dehydrogenase and cyclohydrolase\ activities are expressed by a variety of multifunctional enzymes, including the tri-functional eukaryotic\ C1-tetrahydrofolate synthase PUBMED:2541774; a bifunctional eukaryotic mitochondrial protein; and the\ bifunctional Escherichia coli folD protein PUBMED:2541774, PUBMED:8485162. 
Methylene-tetrahydrofolate dehydrogenase and\ methenyltetrahydrofolate cyclo-hydrolase share an overlapping active site PUBMED:2541774, and as such are\ usually located together in proteins, acting in tandem on the carbon-nitrogen bonds of substrates other\ than peptide bonds.\ ' '915' 'IPR000594' '\ Ubiquitin-activating enzyme (E1 enzyme) PUBMED:1647207, PUBMED:1656558 activates ubiquitin by first\ adenylating with ATP its C-terminal glycine residue and thereafter linking\ this residue to the side chain of a cysteine residue in E1, yielding an\ ubiquitin-E1 thiolester and free AMP. Later the ubiquitin moiety is\ transferred to a cysteine residue on one of the many forms of ubiquitin-\ conjugating enzymes (E2).\

    The family of ubiquitin-activating enzymes shares in its catalytic domain significant similarity with a large\ family of NAD/FAD-binding proteins. This domain is based on the common NAD/FAD-binding fold and\ finds members of several families, including UBA ubiquitin activating enzymes; the hesA/moeB/thiF family;\ NADH peroxidases; the LDH family; sarcosine oxidase; phytoene dehydrogenases; alanine dehydrogenases;\ hydroxyacyl-CoA dehydrogenases and many other NAD/FAD dependent dehydrogenases and oxidases.

    \ ' '916' 'IPR003749' '\ ThiS (thiaminS) is a 66 aa protein involved in sulphur transfer. ThiS is encoded in the thiCEFSGH operon in Escherichia coli. This family of proteins has two conserved glycines at the COOH terminus. Thiocarboxylate is formed at the last glycine in the activation process. Sulphur is transferred from ThiI to ThiS in a reaction catalysed by IscS PUBMED:10781607. MoaD, a protein involved in sulphur transfer during molybdopterin synthesis, is about the same length and shows limited sequence similarity to ThiS. Both have the conserved GG at the COOH end.\ ' '917' 'IPR002909' '\ This family consists of a domain that has an immunoglobulin-like fold. These domains are found in cell surface receptors such as Met and Ron as well as in intracellular transcription factors where they are involved in DNA binding.\ The Ron tyrosine kinase receptor shares with the members of its subfamily (Met and Sea) a unique functional feature: the control of cell dissociation, motility, and invasion of extracellular matrices (scattering) PUBMED:8816464.\ ' '918' 'IPR007379' '\

    Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane PUBMED:10430866. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region PUBMED:10430866.

    \ ' '919' 'IPR007303' '\ The TOR signalling pathway activates a cell-growth program in response to nutrients PUBMED:10604478. TIP41 interacts with TAP42 and negatively regulates the TOR signalling pathway PUBMED:11741537.\ ' '920' 'IPR000157' '\

    In Drosophila melanogaster the Toll protein is involved in establishment of dorso-ventral polarity in the embryo. In addition, members of the Toll family play a key role in innate antibacterial and antifungal immunity in insects as well as in mammals. These proteins are type-I transmembrane receptors that share an intracellular 200 residue domain with the interleukin-1 receptor (IL-1R), the Toll/IL-1R homologous region (TIR). The similarity between Toll-like receptors (TLRs) and IL-1R is not restricted to sequence homology since these proteins also share a similar signalling pathway. They both induce the activation of a Rel type transcription factor via an adaptor protein and a protein kinase PUBMED:8621445. Interestingly, MyD88, a cytoplasmic adaptor protein found in mammals, contains a TIR domain associated with a DEATH domain (see ) PUBMED:8621445, PUBMED:9374458, PUBMED:10679407. Besides the mammalian and Drosophila melanogaster proteins, a TIR domain is also found in a number of plant proteins implicated in host defence PUBMED:9868361. Like MyD88, these proteins are cytoplasmic.

    \

    Site-directed mutagenesis and deletion analysis have shown that the TIR domain is essential for Toll and IL-1R activities. Sequence analyses have revealed\ the presence of three highly conserved regions among the different members of\ the family: box 1 (FDAFISY), box 2 (GYKLC-RD-PG), and box 3 (a conserved W\ surrounded by basic residues). It has been proposed that boxes 1 and 2 are\ involved in the binding of proteins involved in signalling, whereas box 3 is\ primarily involved in directing localization of the receptor, perhaps through\ interactions with cytoskeletal elements PUBMED:10671496.

    \ ' '921' 'IPR005617' '\

    The N-terminal domain of the Groucho/TLE co-repressor proteins is involved in oligomerisation.

    \ ' '922' 'IPR007829' '\ This domain is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts.\ ' '923' 'IPR005016' '\

    This is a family of proteins which display differential expression in various tumour and cell lines. The function of these proteins is unknown.

    \ ' '924' 'IPR005116' '\

    The TOBE domain PUBMED:10829230 (Transport-associated OB) always occurs as a dimer as the C-terminal strand of each domain is supplied by the partner. It is probably involved in the recognition of small ligands such as molybdenum () and sulphate (), and is found in ABC transporters immediately after the ATPase domain.

    \ ' '925' 'IPR007373' '\ Thiamin pyrophosphokinase (TPK, ) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis PUBMED:11435118.\ ' '926' 'IPR007371' '\ Thiamin pyrophosphokinase (TPK, ) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis PUBMED:11435118.\ ' '927' 'IPR001440' '\

    The tetratrico peptide repeat (TPR) is a structural motif present in a wide range of proteins PUBMED:7667876, PUBMED:9482716, PUBMED:1882418. It mediates protein-protein interactions and the assembly of multiprotein complexes PUBMED:14659697. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding.

    \

    The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that\ the TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel\ fashion, resulting in a spiral of repeating anti-parallel alpha-helices PUBMED:14659697. The two helices are denoted\ helix A and helix B. The packing angle between helix A and helix B is ~24 degrees within a\ single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and\ with helix A\' of the next TPR. Two protein surfaces are generated: the inner concave surface is\ contributed mainly by residues on helices A, and the other surface presents residues from both\ helices A and B.

    \ ' '928' 'IPR002792' '\

    The TRAM (after TRM2 and miaB) domain is a 60-70-residue-long module that is found in:\ \

    \ \ The TRAM domain can be found alone or in association with other domains, such as the catalytic biotin/lipoate synthetase-like domain, the RNA methylase domain, the ribosomal S2 domain and the eIF2-beta domain. The TRAM domain is predicted to bind tRNA and deliver the RNA-modifying enzymatic domain to their targets PUBMED:11313137.\ \ Secondary structure prediction indicates that the TRAM domain adopts a simple beta-barrel fold. The conservation pattern of the TRAM domain consists primarily of small and hydrophobic residues that correspond to five beta-strands in the predicted secondary structure PUBMED:11313137.

    \ ' '929' 'IPR001867' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems are composed of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    This entry represents a domain that is almost always found associated with the response regulator receiver domain (see ). It may play a role in DNA binding PUBMED:9016718.

    \ ' '930' 'IPR003480' '\ This family includes a number of transferase enzymes. These include anthranilate N-hydroxycinnamoyl/benzoyltransferase that catalyzes the first committed reaction of phytoalexin biosynthesis PUBMED:9426598. Deacetylvindoline 4-O-acetyltransferase (), which catalyzes the last step in vindoline biosynthesis, is also a member of this family PUBMED:9681034. The motif HXXXD is probably part of the active site. The family also includes trichothecene 3-O-acetyltransferase.\ ' '931' 'IPR004365' '\

    The OB-fold (oligonucleotide/oligosaccharide-binding fold) is found in all three kingdoms and its common architecture presents a binding face that has adapted to bind different ligands. The OB-fold is a five/six-stranded closed beta-barrel formed by 70-80 amino acid residues. The strands are connected by loops of varying length which form the functional appendages of the protein. The majority of OB-fold proteins use the same face for ligand binding or as an active site. Different OB-fold proteins use this \'fold-related binding face\' to, variously, bind oligosaccharides, oligonucleotides, proteins, metal ions and catalytic substrates.

    \

    This entry contains OB-fold domains that bind to nucleic acids PUBMED:10829230. It includes the anti-codon binding domain of lysyl, aspartyl, and asparaginyl-tRNA synthetases (See ). Aminoacyl-tRNA synthetases catalyse the addition of an amino acid to the appropriate tRNA molecule. This domain is found in RecG helicase involved in DNA repair. Replication factor A is a heterotrimeric complex that contains a subunit in this family PUBMED:7760808, PUBMED:8990123. This domain is also found at the C terminus of bacterial DNA polymerase III alpha chain.

    \ ' '932' 'IPR002931' '\

    This domain is found in many proteins known to have transglutaminase activity, i.e. which cross-link proteins through an acyl-transfer reaction between the gamma-carboxamide group of peptide-bound glutamine and the epsilon-amino group of peptide-bound lysine, resulting in an epsilon-(gamma-glutamyl)lysine isopeptide bond. Transglutaminases have been found in a diverse range of species, from bacteria through to mammals. The enzymes require calcium binding and their activity leads to post-translational modification of proteins through acyl-transfer reactions, involving peptidyl glutamine residues as acyl donors and a variety of primary amines as acyl acceptors, with the generation of proteinase-resistant isopeptide bonds PUBMED:12366374.

    \ \

    Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterised transglutaminase, the human blood clotting factor XIIIa\' PUBMED:7913750. On the basis of the experimentally demonstrated activity of the Methanobacterium phage psiM2 pseudomurein endoisopeptidase PUBMED:9791169, it is proposed that many, if not all, microbial homologs of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease PUBMED:10452618.

    \ \

    The structure of the A subunit of plasma Factor XIII revealed that each Factor XIIIA subunit is composed of four domains (termed N-terminal beta-sandwich, core domain (containing the catalytic and the regulatory sites), and C-terminal beta-barrels 1 and 2) and that two monomers assemble into the native dimer through the surfaces in domains 1 and 2, in opposite orientation. This organization in four domains is highly conserved during evolution among transglutaminase isoforms PUBMED:12366374.

    \ ' '933' 'IPR008958' '\

    Synonym(s): Protein-glutamine gamma-glutamyltransferase, Fibrinoligase, TGase

    \

    Transglutaminases catalyse the post-translational modification of proteins at glutamine residues, with formation of isopeptide bonds. Members of the transglutaminase family usually have three domains: N-terminal (), middle () and C-terminal. The middle domain is usually well conserved, but family members can display major differences in their N- and C-terminal domains, although their overall structure is conserved PUBMED:10411627. This entry represents the C-terminal domain found in transglutaminases, which consists of an immunoglobulin-like beta-sandwich consisting of seven strands in two sheets with a Greek key topology.

    \

    The best known transglutaminase is blood coagulation factor XIII, a plasma tetrameric protein composed of two catalytic A subunits and two non-catalytic B subunits. Factor XIII is responsible for cross-linking fibrin chains, thus stabilising the fibrin clot. Protein-glutamine gamma-glutamyltransferases () are calcium-dependent enzymes that catalyse the cross-linking of proteins by promoting the formation of isopeptide bonds between the gamma-carboxyl group of a glutamine in one polypeptide chain and the epsilon-amino group of a lysine in a second polypeptide chain. TGases also catalyse the conjugation of polyamines to proteins PUBMED:1683845, PUBMED:1974250.

    \ ' '934' 'IPR005475' '\

    Transketolase (TK) catalyzes the reversible transfer of a\ two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as\ ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-\ phosphate. This enzyme, together with transaldolase, provides a link between\ the glycolytic and pentose-phosphate pathways.\ TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has\ been purified, it is a homodimer of approximately 70 kDa subunits. TK sequences\ from a variety of eukaryotic and prokaryotic sources PUBMED:1567394, PUBMED:1737042 show that the\ enzyme has been evolutionarily conserved.\ In the peroxisomes of methylotrophic yeast Pichia angusta (Yeast) (Hansenula polymorpha), there is a\ highly related enzyme, dihydroxy-acetone synthase (DHAS) (also\ known as formaldehyde transketolase), which exhibits a very unusual\ specificity by including formaldehyde amongst its substrates.

    \ 1-deoxyxylulose-5-phosphate synthase (DXP synthase) PUBMED:9371765 is an enzyme so far\ found in bacteria (gene dxs) and plants (gene CLA1) which catalyzes the\ thiamine pyrophosphate-dependent acyloin condensation reaction between carbon\ atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate to yield 1-deoxy-D-\ xylulose-5-phosphate (dxp), a precursor in the biosynthetic pathway to\ isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6). DXP synthase\ is evolutionarily related to TK. \ The N-terminal section contains a histidine residue which appears to function in\ proton transfer during catalysis PUBMED:1628611. In the central\ section there are conserved acidic residues that are part of the active cleft\ and may participate in substrate-binding PUBMED:1628611.\ This family includes transketolase enzymes \ and also partially matches 2-oxoisovalerate dehydrogenase\ beta subunit . Both these enzymes\ utilise thiamine pyrophosphate as a cofactor, suggesting\ there may be common aspects in their mechanism of catalysis.

    \ ' '935' 'IPR005476' '\

    Transketolase (TK) catalyzes the reversible transfer of a\ two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as\ ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-\ phosphate. This enzyme, together with transaldolase, provides a link between\ the glycolytic and pentose-phosphate pathways.\ TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has\ been purified, it is a homodimer of approximately 70 kDa subunits. TK sequences\ from a variety of eukaryotic and prokaryotic sources PUBMED:1567394, PUBMED:1737042 show that the\ enzyme has been evolutionarily conserved.\ In the peroxisomes of methylotrophic yeast Pichia angusta (Yeast) (Hansenula polymorpha), there is a\ highly related enzyme, dihydroxy-acetone synthase (DHAS) (also\ known as formaldehyde transketolase), which exhibits a very unusual\ specificity by including formaldehyde amongst its substrates.

    \ 1-deoxyxylulose-5-phosphate synthase (DXP synthase) PUBMED:9371765 is an enzyme so far\ found in bacteria (gene dxs) and plants (gene CLA1) which catalyzes the\ thiamine pyrophosphate-dependent acyloin condensation reaction between carbon\ atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate to yield 1-deoxy-D-\ xylulose-5-phosphate (dxp), a precursor in the biosynthetic pathway to\ isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6). DXP synthase\ is evolutionarily related to TK.\ The N-terminal section contains a histidine residue which appears to function in\ proton transfer during catalysis PUBMED:1628611. In the central\ section there are conserved acidic residues that are part of the active cleft\ and may participate in substrate-binding PUBMED:1628611.\ This family includes transketolase enzymes \ and also partially matches 2-oxoisovalerate dehydrogenase\ beta subunit . Both these enzymes\ utilise thiamine pyrophosphate as a cofactor, suggesting\ there may be common aspects in their mechanism of catalysis.

    \ ' '936' 'IPR018499' '\

    A number of eukaryotic CD antigens have been shown to be related\ PUBMED:1860863. CD9 (also called DRAP-27, MRP-1 or p24) upregulates HB-EGF activity as a receptor for diphtheria toxin as well as its juxtacrine activity. CD9 mAbs modulate cell adhesion and migration and trigger platelet activation that is blocked by mAbs directed to the platelet Fc receptor CD32. In mice, CD9 mAb KMC8.8 has been shown to inhibit the production of myeloid cells in vitro and has a costimulatory activity for T cells. CD9 is a type III membrane protein, with four putative transmembrane domains.

    \

    CD37 (or gp52-40) is involved in signal transduction and serves as a stable marker for malignancies derived from mature B cells, like B-CLL, HCL, and all types of B-NHL.

    \

    CD63 transfection reduced melanoma cell motility on fibronectin, collagen and laminin, and reduced the growth and metastasis of melanoma cells in nude mice PUBMED:9120293.\ CD63 has been used as a marker for late endosomes and for primary melanomas.\

    \ \

    These proteins are all type II membrane proteins: they contain an\ N-terminal transmembrane (TM) domain, which acts both as a signal sequence\ and a membrane anchor, and 3 additional TM regions (hence the name \'TM4\').\ The sequences contain a number of conserved cysteine residues.

    \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).\

    \ \ ' '937' 'IPR001460' '\

    This signature identifies a large group of proteins, which include:

    \ \

    The large number of penicillin binding proteins, which are represented in this group of sequences, are responsible for the final stages of peptidoglycan biosynthesis for cell wall formation. The proteins synthesise cross-linked peptidoglycan from lipid intermediates, and contain a penicillin-sensitive transpeptidase carboxy-terminal domain. The active site serine (residue 337 in ) is conserved in all members of this family PUBMED:8605631.

    \ \

    MecR1 and BlaR1 are metallopeptidases belonging to MEROPS peptidase family M56, clan M-. BlaR1 and MecR1 cleave their cognate transcriptional repressors BlaI and MecI, respectively, activating the synthesis of MecA.

    \ \

    MecR1 is present in Staphylococcus aureus and Staphylococcus sciuri, whereas BlaR1 (also known as BlaR, PenR1, or PenJ) has been found in Bacillus licheniformis, Staphylococcus epidermidis, Staphylococcus haemolyticus, and several S. aureus strains. These proteins are either plasmid-encoded, chromosomal, or transposon-mediated. MecR1/BlaR1 proteins are made up of homologous N-terminal 330-residue transmembrane metallopeptidase domains linked to extracellular 260-residue homologous PBP-like penicillin sensor moieties.

    \ ' '938' 'IPR002559' '\

    Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting\ the mobile element. Transposases have been grouped into various families PUBMED:8041625, PUBMED:1310791, PUBMED:1718819. This family includes the IS4 transposase.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '939' 'IPR003346' '\

    Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases for IS116, IS110 and IS902. It is often found with the transposase IS111A/IS1328/IS1533 family (see ) PUBMED:1348267, PUBMED:10217489.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '940' 'IPR002513' '\

    This family includes transposases of Tn3, Tn21, Tn1721, Tn2501, Tn3926 transposons from Escherichia coli. The specific binding of the Tn3 transposase to DNA has been demonstrated. Sequence analysis has suggested that the invariant triad of Asp689, Asp765, Glu895 (numbering as in Tn3) may correspond to the D-D-35-E motif previously implicated in the catalytic performance of numerous transposases PUBMED:8932514.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '941' 'IPR002525' '\

    Transposase proteins are necessary for efficient DNA transposition.\ This family includes an amino-terminal region of the pilin gene inverting\ protein (PIVML) and members of the IS111A/IS1328/IS1533 family of\ transposases PUBMED:10620670, PUBMED:10220167.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '942' 'IPR002305' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \ ' '943' 'IPR002319' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    Phenylalanyl-tRNA synthetase () is an alpha2/beta2 tetramer composed of 2 subunits that belongs to class IIc. In eubacteria, a small subunit (pheS gene) can be designated as beta (E. coli) or alpha subunit (nomenclature adopted in InterPro). Reciprocally the large subunit\ (pheT gene) can be designated as alpha (E. coli) or beta (see and ). In all other kingdoms the two subunits have equivalent length in eukaryota, and can be identified by specific signatures. The enzyme from Thermus thermophilus has an alpha 2 beta 2 type quaternary structure and is one of the most complicated members of the synthetase family. Identification of phenylalanyl-tRNA synthetase as a member of class II aaRSs was based only on sequence alignment of the small alpha-subunit with other synthetases PUBMED:8199244.

    \ ' '944' 'IPR002547' '\ This domain is found in prokaryotic methionyl-tRNA synthetases, \ prokaryotic phenylalanyl-tRNA synthetases, the yeast GU4 nucleic-binding \ protein (G4p1 or p42, ARC1) PUBMED:8895587, human tyrosyl-tRNA synthetase PUBMED:9162081,\ and endothelial-monocyte activating polypeptide II. \ G4p1 binds specifically to tRNA and forms a complex with methionyl-tRNA \ synthetases PUBMED:8895587. In human tyrosyl-tRNA synthetase this domain may direct\ tRNA to the active site of the enzyme PUBMED:8895587. This domain may perform a\ common function in tRNA aminoacylation PUBMED:9162081.\ ' '945' 'IPR004934' '\

    Actin filaments have an intrinsic polarity, each with a\ fast-growing (barbed) end and a slow-growing (pointed) end. To regulate the dynamics at these\ ends, capping proteins have evolved that specifically bind to either the barbed or the pointed ends\ of the filament, where they block the association and dissociation of monomers. Pointed ends, for which actin\ monomers have significantly lower association and dissociation rate-constants than for barbed, are capped by either\ the Arp2/3 complex or tropomodulins PUBMED:14573353.

    \ \ Tropomodulin is a novel tropomyosin regulatory protein that binds to the end of erythrocyte tropomyosin and blocks head-to-tail\ association of tropomyosin along actin filaments PUBMED:1370827. Limited proteolysis shows this protein is composed of two domains. The unstructured tropomyosin-binding\ region at the N-terminus has an actin pointed-end-capping activity that is dramatically up-regulated\ by tropomyosin coating of the actin filament PUBMED:11029591. The second region is found near the C-terminus. This tropomyosin-independent\ capping-domain caps pure actin.

    \ ' '946' 'IPR001978' '\ The troponin (Tn) complex regulates Ca2+ induced muscle contraction. Tn contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin PUBMED:3102969, PUBMED:7852318, PUBMED:7601340.\ ' '947' 'IPR003612' '\

    This domain is found in several proteins, including plant lipid transfer proteins PUBMED:15823028, seed storage proteins PUBMED:14636051 and trypsin-alpha amylase inhibitors PUBMED:9354618, PUBMED:10713515. The domain forms a four-helical bundle in a right-handed superhelix with a folded leaf topology, which is stabilised by disulphide bonds, and which has an internal cavity.

    \

    More information about this protein can be found at Protein of the Month: alpha-Amylase PUBMED:.

    \ \ ' '948' 'IPR001254' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases).

    \

    The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum PUBMED:7845208. The enzymes are inherently secreted, being synthesised with a signal peptide that\ targets them to the secretory pathway. Animal enzymes are either secreted\ directly, packaged into vesicles for regulated secretion, or are retained\ in leukocyte granules PUBMED:7845208.

    \

    The Hap family, \'Haemophilus adhesion and penetration\', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.

    \ ' '949' 'IPR012680' '\

    Laminins are large heterotrimeric glycoproteins involved in basement membrane function PUBMED:15037599. The laminin globular (G) domain can be found in one to several copies in various laminin family members, including a large number of extracellular proteins. The C-terminus of the laminin alpha chain contains a tandem repeat of five laminin G domains, which are critical for heparin-binding and cell attachment activity PUBMED:10747011. Laminin alpha4 is distributed in a variety of tissues including peripheral nerves, dorsal root ganglion, skeletal muscle and capillaries; in the neuromuscular junction, it is required for synaptic specialisation PUBMED:15823034. The structure of the laminin-G domain has been predicted to resemble that of pentraxin PUBMED:9480764.

    \ \

    Laminin G domains can vary in their function, and a variety of binding functions have been ascribed to different LamG modules. For example, the laminin alpha1 and alpha2 chains each have five C-terminal laminin G domains, where only domains LG4 and LG5 contain binding sites for heparin, sulphatides and the cell surface receptor dystroglycan PUBMED:10747011. Laminin G-containing proteins appear to have a wide variety of roles in cell adhesion, signalling, migration, assembly and differentiation. This entry represents one subtype of laminin G domains, which is sometimes found in association with thrombospondin-type laminin G domains ().

    \ ' '950' 'IPR004344' '\

    Tubulins and microtubules are subjected to several post-translational modifications of which the reversible\ detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This\ modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL) PUBMED:10685598. Tubulin-tyrosine ligase (TTL) catalyses the\ ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. The true\ physiological function of TTL has so far not been established. In\ normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated\ form frequently predominates, with a correlation to tumour aggressiveness PUBMED:11431336.

    \

    3-nitrotyrosine has\ been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not\ reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered\ microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis PUBMED:10339593.

    \ ' '951' 'IPR000007' '\

    Tubby, an autosomal recessive mutation, mapping to mouse chromosome 7, was recently found to be the result of a splicing defect in a novel gene with unknown function. This mutation maps to the tub gene PUBMED:8612280, PUBMED:8606774. The mouse tubby mutation is the cause of maturity-onset obesity, insulin resistance and sensory deficits. By contrast with the rapid juvenile-onset weight gain seen in diabetes (db) and obese (ob) mice, obesity in tubby mice develops gradually, and strongly resembles the late-onset obesity observed in the human population. Excessive deposition of adipose tissue culminates in a two-fold increase of body weight. Tubby mice also suffer retinal degeneration and neurosensory hearing loss. The tripartite character of the tubby phenotype is highly similar to human obesity syndromes, such as Alstrom and Bardet-Biedl. Although these phenotypes indicate a vital role for tubby proteins, no biochemical function has yet been ascribed to any family member PUBMED:10591637; it has been suggested, however, that the phenotypic features of tubby mice may be the result of cellular apoptosis triggered by expression of the mutated tub gene. TUB is the founding member of the tubby-like proteins, the TULPs. TULPs are found in multicellular organisms from both the plant and animal kingdoms. Ablation of members of this protein family causes disease phenotypes that are indicative of their importance in nervous-system function and development PUBMED:14708010.

    \

    Mammalian TUB is a hydrophilic protein of ~500 residues. The N-terminal () portion of the protein is conserved neither in length nor sequence, but, in TUB, contains the nuclear localisation signal and may have transcriptional-activation activity. The C-terminal 250 residues are highly conserved. The C-terminal extremity contains a cysteine residue that might play an important role in the normal functioning of these proteins. The crystal structure of the C-terminal core domain from mouse tubby has been determined to 1.9 Å resolution. This domain is arranged as a 12-stranded, all anti-parallel, closed beta-barrel that surrounds a central alpha helix (at the extreme carboxyl terminus of the protein), which forms most of the hydrophobic core. Structural analyses suggest that TULPs constitute a unique family of bipartite transcription factors PUBMED:10591637.

    \ ' '952' 'IPR018316' '\

    This domain is found in the tubulin alpha, beta and gamma chains, as\ well as the bacterial FtsZ family of proteins. These proteins\ are GTPases and are involved in polymer formation. Tubulin is the major component\ of microtubules, while FtsZ is the polymer-forming protein\ of bacterial cell division; it is part of a ring in the middle of the\ dividing cell that is required for constriction of the cell membrane and\ cell envelope to yield two daughter cells. \ FtsZ can polymerise into tubes, sheets, and rings in vitro and is\ ubiquitous in bacteria and archaea. This is the C-terminal domain.

    \ ' '953' 'IPR008191' '\ There are multiple copies of this domain in the Drosophila melanogaster tudor protein and it\ has been identified in several RNA-binding proteins PUBMED:9048482. Although the\ function of this domain is unknown, in Drosophila melanogaster the tudor protein is required\ during oogenesis for the formation of primordial germ cells and for normal\ abdominal segmentation PUBMED:9003410.\ ' '954' 'IPR006990' '\ None of the members of the tweety (tty) family have been functionally characterised. However, they are considered to be transmembrane proteins with five potential membrane-spanning regions. A number of potential functions have been suggested on the basis of similarity to the yeast FTR1 and FTH1 iron\ transporter proteins and the mammalian neurotensin receptors 1 and 2, in that they have similar hydrophobicity profiles,\ although there is no detectable sequence homology to the tweety-related proteins. It has been proposed that the tweety-related\ proteins could be involved in transport of iron or other divalent cations or alternatively that they may be\ membrane-bound receptors PUBMED:10950931.\ ' '955' 'IPR003613' '\

    Quality control of intracellular proteins is essential for cellular homeostasis. Molecular chaperones recognise and contribute to the refolding of misfolded or unfolded proteins, whereas the ubiquitin-proteasome system mediates the degradation of such abnormal proteins. Ubiquitin-protein ligases (E3s) determine the substrate specificity for ubiquitylation and have been classified into HECT and RING-finger families. More recently, however, U-box proteins, which contain a domain (the U box) of about 70 amino acids that is conserved from yeast to humans, have been identified as a new type of E3 PUBMED:12944364.

    \ \

    Members of the U-box family of proteins constitute a class of ubiquitin-protein ligases (E3s) distinct from the HECT-type and RING finger-containing E3\ families PUBMED:12944364. Using yeast two-hybrid technology, all mammalian U-box proteins have been reported to interact with molecular chaperones or co-chaperones, including Hsp90, Hsp70, DnaJc7, EKN1, CRN, and VCP. This suggests that the function of U box-type E3s is to mediate the degradation of unfolded or misfolded proteins in conjunction with molecular chaperones as receptors that recognise such abnormal proteins PUBMED:15115282, PUBMED:15189447.

    \ \

    Unlike the RING finger domain, , that is stabilised by Zn2+ ions coordinated by\ the cysteines and a histidine, the U-box scaffold is probably stabilised by a system of salt-bridges and hydrogen bonds. The charged and polar residues that participate in this network of bonds are more strongly conserved in the U-box proteins than in classic RING fingers, which supports their role in maintaining the stability of the U box. Thus, the U box appears to have evolved from a RING finger domain by appropriation of a new set of residues required to stabilise its structure, concomitant with the loss of the\ original, metal-chelating residues PUBMED:10704423.

    \ \ ' '956' 'IPR000449' '\

    UBA domains are a commonly occurring sequence motif of approximately 45 amino acid residues that are found in diverse proteins involved in the ubiquitin/proteasome pathway, DNA excision-repair, and cell signalling via protein kinases PUBMED:8871400. The human homologue of yeast Rad23A is one example of a nucleotide excision-repair protein that contains both an internal and a C-terminal UBA domain. The solution structure of human Rad23A UBA(2) showed that the domain forms a compact three-helix bundle PUBMED:9846873. Comparison of the structures of UBA(1) and UBA(2) reveals that both form very similar folds and have a conserved large hydrophobic surface patch which may be a common protein-interacting surface present in diverse UBA domains. Evidence that ubiquitin binds to UBA domains leads to the prediction that the hydrophobic surface patch of UBA domains interacts with the hydrophobic surface on the five-stranded beta-sheet of ubiquitin PUBMED:12079361.

    \

    This domain is similar in sequence to the N-terminal domain of translation elongation factor EF1B (or EF-Ts) from bacteria, mitochondria and chloroplasts.

    \

    More information about EF1B (EF-Ts) proteins can be found at Protein of the Month: Elongation Factors PUBMED:.

    \ ' '957' 'IPR000537' '\

    The COX10/ctaB/cyoE signature is found in prenyltransferases including bacterial 4-hydroxybenzoate octaprenyltransferase (gene ubiA); yeast mitochondrial para-hydroxybenzoate--polyprenyltransferase (gene COQ2); and protohaem IX farnesyltransferase (haem O synthase) from yeast and mammals (gene COX10), and from bacteria (genes cyoE or ctaB) PUBMED:8155731, PUBMED:7885224. These are integral membrane proteins, which probably contain seven transmembrane segments. The signature is also found in cytochrome C oxidase assembly factor. The complexity of cytochrome C oxidase requires assistance in building the complex, and this is carried out by the cytochrome C oxidase assembly factor.

    \ ' '958' 'IPR000626' '\

    Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1, ), a ubiquitin-conjugating enzyme (E2, ), and a ubiquitin ligase (E3, , ), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process PUBMED:12646216. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1 (), SUMO1 (), NEDD8, Rad23 (), Elongin B and Parkin (), the latter being involved in Parkinson\'s disease PUBMED:15564047.

    \

    Ubiquitin is a protein of 76 amino acid residues, found in all eukaryotic cells and whose sequence is extremely well conserved from protozoan to vertebrates. Ubiquitin acts through its post-translational attachment (ubiquitinylation) to other proteins, where these modifications alter the function, location or trafficking of the protein, or target it for destruction by the 26S proteasome PUBMED:15454246. The terminal glycine in the C-terminal 4-residue tail of ubiquitin can form an isopeptide bond with a lysine residue in the target protein, or with a lysine in another ubiquitin molecule to form a ubiquitin chain that attaches itself to a target protein. Ubiquitin has seven lysine residues, any one of which can be used to link ubiquitin molecules together, resulting in different structures that alter the target protein in different ways. It appears that Lys(11)-, Lys(29)- and Lys(48)-linked poly-ubiquitin chains target the protein to the proteasome for degradation, while mono-ubiquitinylated and Lys(6)- or Lys(63)-linked poly-ubiquitin chains signal reversible modifications in protein activity, location or trafficking PUBMED:14998368. For example, Lys(63)-linked poly-ubiquitinylation is known to be involved in DNA damage tolerance, inflammatory response, protein trafficking and signal transduction through kinase activation PUBMED:15556404. In addition, the length of the ubiquitin chain alters the fate of the target protein. Regulatory proteins such as transcription factors and histones are frequent targets of ubiquitinylation PUBMED:15525528.

    \ ' '959' 'IPR001012' '\ The UBX domain is found in ubiquitin-regulatory proteins, which are members of the ubiquitination pathway, as well as a number of other proteins including FAF-1 (FAS-associated factor 1), the human Rep-8 reproduction protein and several hypothetical proteins from yeast. The function of the UBX domain is not known although the fragment of avian FAF-1 containing the UBX domain causes apoptosis of transfected cells.\ ' '960' 'IPR001394' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to the MEROPS peptidase family C19 (ubiquitin-specific protease family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. Predicted active site residues for members of this family and family C1 occur in the same order in the sequence: N/Q, C, H. The type example is human ubiquitin-specific protease 14.

    \ \

    Ubiquitin is highly conserved, commonly found conjugated to proteins in\ eukaryotic cells, where it may act as a marker for rapid degradation, or\ it may have a chaperone function in protein assembly PUBMED:7845226. The ubiquitin is released by cleavage from the bound protein by a protease PUBMED:7845226. A number of\ deubiquitinising proteases are known: all are activated by thiol compounds\ PUBMED:7845226, PUBMED:3015923, and inhibited by thiol-blocking agents and ubiquitin aldehyde PUBMED:7845226, PUBMED:3031653, and as such have the properties of cysteine proteases PUBMED:7845226.

    \ \

    The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa, ,\ and 100-200 kDa) PUBMED:7845226: this family comprises the 100-200 kDa peptides, which include the Ubp1 ubiquitin peptidase from yeast. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad PUBMED:7845226.

    \ ' '961' 'IPR005113' '\ This region is always found associated with . It is predicted to form an all beta domain PUBMED:11563850.\ ' '962' 'IPR005122' '\

    This entry represents various uracil-DNA glycosylases and related DNA glycosylases (), such as uracil-DNA glycosylase PUBMED:7697717, thermophilic uracil-DNA glycosylase PUBMED:10339434, G:T/U mismatch-specific DNA glycosylase (Mug) PUBMED:9489705, and single-strand selective monofunctional uracil-DNA glycosylase (SMUG1) PUBMED:2820976. These proteins have a 3-layer alpha/beta/alpha structure. Uracil-DNA glycosylases are DNA repair enzymes that excise uracil residues from DNA by cleaving the N-glycosylic bond, initiating the base excision repair pathway. Uracil in DNA can arise either through the deamination of cytosine to form mutagenic U:G mispairs, or through the incorporation of dUMP by DNA polymerase to form U:A pairs PUBMED:17116429. These aberrant uracil residues are genotoxic PUBMED:16860315. The sequence of uracil-DNA glycosylase is extremely well conserved PUBMED:2555154 in bacteria and eukaryotes as well as in herpes viruses. More distantly related uracil-DNA glycosylases are also found in poxviruses PUBMED:8389453. In eukaryotic cells, UNG activity is found in both the nucleus and the mitochondria. Human UNG1 protein is transported to both the mitochondria and the nucleus PUBMED:8332455. The N-terminal 77 amino acids of UNG1 seem to be required for mitochondrial localization, but the presence of a mitochondrial transit peptide has not been directly demonstrated. The most N-terminal conserved region contains an aspartic acid residue which has been proposed, based on X-ray structures PUBMED:7845459 to act as a general base in the catalytic mechanism.

    \ ' '963' 'IPR003903' '\

    The Ubiquitin Interacting Motif (UIM), or \'LALAL-motif\', is a stretch of about 20 amino acid residues, which was first described in the 26S proteasome subunit PSD4/RPN-10 that is known to recognise ubiquitin PUBMED:9488668, PUBMED:11406394. In addition, the UIM is found, often in tandem or triplet arrays, in a variety of proteins either involved in ubiquitination and ubiquitin metabolism, or known to interact with ubiquitin-like modifiers. Among the UIM proteins are two different subgroups of the UBP (ubiquitin carboxy-terminal hydrolase) family of deubiquitinating enzymes, one F-box protein, one family of HECT-containing ubiquitin-ligases (E3s) from plants, and several proteins containing ubiquitin-associated UBA and/or UBX domains PUBMED:12062168. In most of these proteins, the UIM occurs in multiple copies and in association with other domains such as UBA (), UBX (), ENTH, EH (), VHS (), SH3 (), HECT (), VWFA (), EF-hand calcium-binding, WD-40 (), F-box (), LIM (), protein kinase (), ankyrin (), PX (), phosphatidylinositol 3- and 4-kinase (), C2 (), OTU (), dnaJ (), RING-finger () or FYVE-finger (). UIMs have been shown to bind ubiquitin and to serve as a specific targeting signal important for monoubiquitination. Thus, UIMs may have several functions in ubiquitin metabolism each of which may require different numbers of UIMs PUBMED:12121618, PUBMED:11919614, PUBMED:1919637.

    \ \

    The UIM is unlikely to form an independent folding domain. Instead, based on the spacing of the conserved residues, the motif probably forms a short alpha-helix that can be embedded into different protein folds PUBMED:11406394. Some proteins known to contain a UIM are listed below:

    \ \

    \ ' '964' 'IPR006214' '\

    This family of proteins of unknown function contains a subset of Bax inhibitor-1 proteins.

    \ ' '965' 'IPR005369' '\

    The function of this family is unknown; however, the proteins contain two cysteine clusters that may be iron-sulphur redox centres.

    \ ' '966' 'IPR005372' '\

    This family contains uncharacterised integral membrane proteins.

    \ ' '967' 'IPR005373' '\

    Members of this family are proteins of unknown function.

    \ ' '968' 'IPR005375' '\

    Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1, ), a ubiquitin-conjugating enzyme (E2, ), and a ubiquitin ligase (E3, , ), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process PUBMED:12646216. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1 (), SUMO1 (), NEDD8, Rad23 (), Elongin B and Parkin (), the latter being involved in Parkinson\'s disease PUBMED:15564047.

    \

    This entry contains Ufm1 (ubiquitin-fold modifier), which is a ubiquitin-like protein belonging to the UBL family, whose members have structural similarities to ubiquitin PUBMED:10884686, PUBMED:15071506. UBLs can be divided into two subclasses: type-1 UBLs, which ligate to target proteins in a manner similar, but not identical, to the ubiquitylation pathway, such as SUMO, NEDD8, and UCRP/ISG15, and type-2 UBLs (also called UDPs, ubiquitin-domain proteins), which contain a ubiquitin-like structure embedded in a variety of different classes of large proteins with apparently distinct functions, such as Rad23, Elongin B, Scythe, Parkin, and HOIL-1.

    \ \

    Ufm1 is one of a number of ubiquitin-like modifiers that conjugate to target proteins in cells through Uba5 (E1) and Ufc1 (E2). The Ufm1-system is conserved in metazoa and plants, suggesting it has a potential role in multicellular organisms PUBMED:16527251. Human Ufm1 is synthesized as a precursor consisting of 85 amino-acid residues. Prior to activation by Uba5, the extra amino acids at the C-terminal region of Ufm1 are removed to expose Gly, which is necessary for conjugation to target molecule(s). C-terminal processing of Ufm1 requires two specific cysteine peptidases (): UfSP1 and UfSP2; both peptidases are also able to release Ufm1 from Ufm1-conjugated cellular proteins. UfSP2 is present in most, if not all, multicellular organisms, including plants, nematodes, flies and mammals, whereas UfSP1 is not present in plants and nematodes PUBMED:17182609.

    \ \

    For further information on ubiquitin, please see Protein of the Month PUBMED:.

    \ \ ' '969' 'IPR000608' '\

    The post-translational attachment of ubiquitin () to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation PUBMED:15556404, PUBMED:15196553, PUBMED:15454246. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1, ), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3, , ), which work sequentially in a cascade PUBMED:14998368. The E1 enzyme mediates an ATP-dependent transfer of a thioester-linked ubiquitin molecule to a cysteine residue on the E2 enzyme. The E2 enzyme () then either transfers the ubiquitin moiety directly to a substrate, or to an E3 ligase, which can also ubiquitinylate a substrate.

    \

    There are several different E2 enzymes (over 30 in humans), which are broadly grouped into four classes, all of which have a core catalytic domain (containing the active site cysteine), and some of which have short N- and C-terminal amino acid extensions: class I enzymes consist of just the catalytic core domain (UBC), class II possess a UBC and a C-terminal extension, class III possess a UBC and an N-terminal extension, and class IV possess a UBC and both N- and C-terminal extensions. These extensions appear to be important for some subfamily function, including E2 localisation and protein-protein interactions PUBMED:15545318. In addition, there are proteins with an E2-like fold that are devoid of catalytic activity, but which appear to assist in poly-ubiquitin chain formation.

    \ \ ' '970' 'IPR004029' '\

    Urease and other nickel metalloenzymes are synthesised as precursors devoid of the metalloenzyme active site. These precursors then undergo a complex post-translational maturation process that requires a number of accessory proteins.

    \ \

    Members of this group are nickel-binding proteins required for urease metallocentre assembly PUBMED:8318889. They are believed to function as metallochaperones to deliver nickel to urease apoprotein PUBMED:12072968, PUBMED:10753863. It has been shown by yeast two-hybrid analysis that UreE forms a dimeric complex with UreG in Helicobacter pylori PUBMED:12388207. The UreDFG-apoenzyme complex has also been shown to exist PUBMED:11157956, PUBMED:7721685 and is believed to be, with the addition of UreE, the assembly system for active urease PUBMED:7721685. The complexes, rather than the individual proteins, presumably bind to UreB via UreE/H recognition sites.

    \ \

    The structure of Klebsiella aerogenes UreE reveals a unique two-domain architecture. The N-terminal domain is structurally related to a heat shock protein, while the C-terminal domain shows homology to the Atx1 copper metallochaperone PUBMED:11591723, PUBMED:11602602. Significantly, the metal-binding sites in UreE and Atx1 are distinct in location and types of residues despite the relationship between these proteins, and the mechanism for UreE activation of urease is proposed to be different from the thiol ligand exchange mechanism used by the copper metallochaperones.

    \ \

    The N-terminal domain is termed the peptide-binding domain. Deletion of this domain does not eliminate enzymatic activity, and the truncated protein can still activate urease PUBMED:15866948.

    \ ' '971' 'IPR007319' '\

    A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome PUBMED:12068309, PUBMED:15590835. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties:

    \ \

    There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA and rRNA binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5\' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble PUBMED:15489292. Electron microscopy suggests that the SSU processome may correspond to the terminal knobs visualized at the 5\' ends of nascent 18S rRNA.

    \

    Utp21 is a component of the SSU processome, which is required for pre-18S rRNA processing. It interacts with Utp18 PUBMED:15590835.

    \ ' '972' 'IPR001943' '\ During the process of Escherichia coli nucleotide excision repair, DNA damage\ recognition and processing are achieved by the action of the uvrA, uvrB,\ and uvrC gene products PUBMED:8466476. UvrB and UvrC share a common domain of around 35\ amino acids, the so-called UVR domain. This domain in UvrB can interact with\ the homologous domain in UvrC through a putative coiled-coil structure.\ This interaction is important for the incision of the damaged strand PUBMED:8530482.\ ' '973' 'IPR007705' '\

    A crucial step in membrane fusion is the formation of the SNARE complex in which conserved regions, called \'SNARE motifs\', from individual SNAREs associate and twist to form the core complex, which is an all-parallel coiled coil. The neuronal SNARE complex is a heterotrimer of vesicular (v-) SNARE\ VAMP-2 and the two target plasma membrane (t-) SNAREs\ syntaxin 1A and SNAP-25. It has been proposed that SNARE core complex formation proceeds like a zipper, beginning at the\ membrane-distal region and propagating toward the\ membrane-proximal end. SNARE complex formation is an\ energy-releasing process that may supply the required free\ energy for membrane fusion PUBMED:12740606.

    \

    This family includes the Golgi SNAP receptor (SNARE) complex protein, which is involved in transport from the endoplasmic reticulum to the Golgi apparatus and in intra-Golgi transport, and the vesicle transport v-SNARE protein, which mediates vesicle transport pathways through interaction with t-SNAREs on the target membrane.

    \ ' '974' 'IPR002014' '\

    The VHS domain is ~140 residues long; its name is derived\ from its occurrence in VPS-27, Hrs and STAM. Based on regions surrounding the domain, VHS-proteins can be divided into 4 groups PUBMED:11911875:

    The VHS domain is always found at the N-terminus of proteins, suggesting that such topology is important for function. The domain is considered to have a general membrane targeting/cargo recognition role in vesicular trafficking PUBMED:10985773.

    \ \

    Resolution of the crystal structure of the VHS domain of Drosophila Hrs and\ human Tom1 revealed that it consists of eight helices arranged in a double-layer superhelix\ PUBMED:10693761. The existence of conserved patches of residues on the domain surface suggests that VHS domains may be involved in protein-protein recognition and docking. Overall, sequence similarity is low (approx 25%) amongst domain family members.

    \ ' '975' 'IPR001807' '\

    Chloride channels (CLCs) constitute an evolutionarily well-conserved family of voltage-gated channels that are structurally unrelated to the other known voltage-gated channels. They are found in organisms ranging from bacteria to yeasts and plants, and also to animals. Their functions in higher animals likely include the regulation of cell volume, control of electrical excitability and trans-epithelial transport PUBMED:9046241.

    \ \

    The first member of the family (CLC-0) was expression-cloned from the electric organ of Torpedo marmorata PUBMED:2174129, and subsequently nine CLC-like proteins have been cloned from mammals. They are thought to function as multimers of two or more identical or homologous subunits, and they have varying tissue distributions and functional properties. To date, CLC-0, CLC-1, CLC-2, CLC-4 and CLC-5 have been demonstrated to form functional Cl- channels; whether the remaining isoforms do so is either contested or unproven. One possible explanation for the difficulty in expressing activatable Cl- channels is that some of the isoforms may function as Cl- channels of intracellular compartments, rather than of the plasma membrane. However, they are all thought to have a similar transmembrane (TM) topology, initial hydropathy analysis suggesting 13 hydrophobic stretches long enough to form putative TM domains PUBMED:2174129. Recently, the postulated TM topology has been revised, and it now seems likely that the CLCs have 10 (or possibly 12) TM domains, with both N- and C-termini residing in the cytoplasm PUBMED:9207144.

    \ \

    A number of human disease-causing mutations have been identified in the genes encoding CLCs. Mutations in CLCN1, the gene encoding CLC-1, the major skeletal muscle Cl- channel, lead to both recessively and dominantly-inherited forms of muscle stiffness or myotonia PUBMED:7581380. Similarly, mutations in CLCN5, which encodes CLC-5, a renal Cl- channel, lead to several forms of inherited kidney stone disease PUBMED:8559248. These mutations have been demonstrated to reduce or abolish CLC function.

    \ \ \ ' '976' 'IPR006925' '\ This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast PUBMED:11702788. The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport PUBMED:11422941. The role of VPS16 in this complex is not known.\ ' '977' 'IPR006926' '\ This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast PUBMED:11702788. The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport PUBMED:11422941. The role of VPS16 in this complex is not known.\ ' '978' 'IPR005378' '\

    The movement of lipid and protein components between intracellular organelles requires the regulated interactions of many\ molecules. Vacuolar protein sorting-associated protein (Vps)5 is a yeast protein that is a subunit of a large multimeric\ complex, termed the retromer complex, involved in retrograde transport of proteins from endosomes to the trans-Golgi network. Sorting nexin (SNX) 1 and SNX2 are its mammalian orthologs PUBMED:11102511.

    \ \

    To carry out its biological functions, Vps5 forms the retromer complex\ with at least four other proteins: Vps17, Vps26, Vps29, and Vps35. Vps35 contains a central region of weaker sequence similarity, thought to indicate the presence of at least three domains PUBMED:11102511.

    \ ' '980' 'IPR007258' '\ Vps52 complexes with Vps53 and Vps54 to form a multi-subunit complex involved in regulating membrane trafficking events PUBMED:10637310.\ ' '981' 'IPR007234' '\ Vps53 complexes with Vps52 and Vps54 to form a multi-subunit complex involved in regulating membrane trafficking events PUBMED:10637310.\ ' '982' 'IPR003123' '\ This domain is present in yeast vacuolar sorting protein 9 and other proteins.\ ' '983' 'IPR005127' '\ During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation which is determined\ by diversification of the parasite\'s major surface antigen, named VSP (variant surface protein).\ ' '984' 'IPR002035' '\ The von Willebrand factor is a large multimeric glycoprotein found in blood\ plasma. Mutant forms are involved in the aetiology of bleeding disorders \ PUBMED:8440408. In von Willebrand factor, the type A domain (vWF) is the prototype for\ a protein superfamily. The vWF domain is found in various plasma proteins:\ complement factors B, C2, CR3 and CR4; the integrins (I-domains); collagen \ types VI, VII, XII and XIV; and other extracellular proteins PUBMED:8412987, PUBMED:8145250, PUBMED:1864378. Although the majority of VWA-containing proteins are extracellular, the most ancient ones present in all eukaryotes are all intracellular proteins involved in functions such as transcription, DNA repair, ribosomal and membrane transport and the proteasome. A common feature appears to be involvement in multiprotein complexes. Proteins\ that incorporate vWF domains participate in numerous biological events\ (e.g. cell adhesion, migration, homing, pattern formation, and signal\ transduction), involving interaction with a large array of ligands PUBMED:8412987. A number of human diseases arise from mutations in VWA domains. Secondary structure prediction from 75 aligned vWF sequences has revealed a largely alternating sequence of alpha-helices and beta-strands PUBMED:8145250. 
Fold\ recognition algorithms were used to score sequence compatibility with a\ library of known structures: the vWF domain fold was predicted to be a\ doubly-wound, open, twisted beta-sheet flanked by alpha-helices PUBMED:7843416. \ 3D structures have been determined for the I-domains of integrins CD11b\ (with bound magnesium) PUBMED:7867070 and CD11a (with bound manganese) PUBMED:7479767. The domain \ adopts a classic alpha/beta Rossmann fold and contains an unusual metal \ ion coordination site at its surface. It has been suggested that this site\ represents a general metal ion-dependent adhesion site (MIDAS) for binding \ protein ligands PUBMED:7867070. The residues constituting the MIDAS motif in the CD11b\ and CD11a I-domains are completely conserved, but the manner in which the \ metal ion is coordinated differs slightly PUBMED:7479767.\ ' '985' 'IPR001846' '\

    A family of growth regulators (originally called cef10, connective tissue growth factor, fisp-12, cyr61, or, alternatively, beta IG-M1 and beta IG-M2), all belong to immediate-early genes expressed after induction by growth factors or certain oncogenes. Sequence analysis of this family revealed the presence of four distinct modules. Each module has homologues in other extracellular mosaic proteins such as Von Willebrand factor, slit, thrombospondins, fibrillar collagens, IGF-binding proteins and mucins. Classification and analysis of these modules suggests the location of binding regions and, by analogy to better characterised modules in other proteins, sheds some light onto the structure of this new family PUBMED:7687569.

    \

    The vWF domain is found in various plasma proteins:\ complement factors B, C2, CR3 and CR4; the integrins (I-domains); collagen \ types VI, VII, XII and XIV; and other extracellular proteins PUBMED:8412987, PUBMED:8145250, PUBMED:1864378. Although the majority of VWA-containing proteins are extracellular, the most ancient ones present in all eukaryotes are all intracellular proteins involved in functions such as transcription, DNA repair, ribosomal and membrane transport and the proteasome. A common feature appears to be involvement in multiprotein complexes. Proteins\ that incorporate vWF domains participate in numerous biological events\ (e.g. cell adhesion, migration, homing, pattern formation, and signal\ transduction), involving interaction with a large array of ligands PUBMED:8412987. A number of human diseases arise from mutations in VWA domains. Secondary structure prediction from 75 aligned vWF sequences has revealed a largely alternating sequence of alpha-helices and beta-strands PUBMED:8145250.

    \

    One of the functions of von Willebrand factor (vWF) is to serve as a carrier of clotting factor VIII (FVIII). The native conformation of the D\' domain of vWF is not only required for factor VIII (FVIII) binding but also for normal multimerisation and optimal secretion. The interaction between blood clotting factor VIII and VWF is necessary for normal survival of blood clotting factor VIII in blood circulation. The VWFD domain is a highly structured region, in which the first conserved Cys has been found to form a disulphide bridge with the second conserved one PUBMED:10807780.

    \ ' '986' 'IPR008197' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \ \

    Whey acidic protein (WAP) is a major component of the whey fraction of milk, which contains a significant number of different proteins. WAP proteins share limited sequence identity, except for their conserved cysteine-rich regions, known as 4-disulphide core (4-DSC) domains, and the positional conservation of specific residues PUBMED:12751894. Several non-milk proteins also contain 4-DSC patterns, such as certain serine protease inhibitors. WAP itself appears to have a protease-inhibitor function, as seen with its inhibitory effect on the progression of cancer cells PUBMED:17215074.

    \ ' '987' 'IPR019781' '\

    WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase PUBMED:11814058, PUBMED:10322433. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.

    \ ' '988' 'IPR000697' '\

    The EVH1 (WH1, RanBP1-WASP) domain is found in multi-domain proteins implicated in a diverse range of signalling, nuclear transport and cytoskeletal events. This domain of around 115 amino acids is present in species ranging from yeast to mammals. Many EVH1-containing proteins associate with actin-based structures and play a role in cytoskeletal organisation. EVH1 domains recognise and bind the proline-rich motif FPPPP with low affinity; further interactions then form between flanking residues PUBMED:11911879, PUBMED:9312002.

    \ \

    WASP family proteins contain an EVH1 (WH1) domain at their N-terminus, which binds proline-rich sequences in the WASP interacting protein. Proteins of the RanBP1 family contain a WH1 domain in their N-terminal region, which seems to bind a different sequence motif present in the C-terminal part of RanGTP protein PUBMED:9883880, PUBMED:7724562.

    \ \

    The tertiary structure of the WH1 domain of the Mena protein revealed structural similarities with the pleckstrin homology (PH) domain. The overall fold consists of a compact parallel beta-sandwich, closed along one edge by a long alpha-helix. A highly conserved cluster of three surface-exposed aromatic side-chains forms the recognition site for the target ligands of the molecule PUBMED:10338211.

    \ ' '989' 'IPR003124' '\

    The WH2 (WASP-Homology 2, or Wiskott-Aldrich homology 2) domain is an actin-binding motif of ~18 amino acids. This domain was first recognised as an essential element for the regulation of the cytoskeleton by the mammalian Wiskott-Aldrich syndrome protein (WASP) family. WH2 proteins occur in eukaryotes from yeast to mammals, in insect viruses, and in some bacteria. The WH2 domain is found as a modular part of larger proteins; it can be associated with the WH1 or EVH1 domain and with the CRIB domain, and the WH2 domain can occur as a tandem repeat. The WH2 domain binds actin monomers and can facilitate the assembly of actin monomers into newly forming actin filaments PUBMED:11434350, PUBMED:11911886.

    \ \ \ ' '990' 'IPR003482' '\ WhiB is a putative transcription factor in Actinobacteria, required for differentiation and sporulation. The process of mycelium formation in Streptomyces, which occurs in response to nutrient limitation, is controlled by a number of whi genes, named for the white colour of aerial hyphae when mutations occur in these genes. The normal colour is grey. The exact role of WhiB is not clear, but a mutation in the gene results in white, tightly coiled aerial hyphae.\ ' '991' 'IPR003306' '\

    Wnt proteins constitute a large family of secreted molecules that are\ involved in intercellular signalling during development. The name derives\ from the first 2 members of the family to be discovered: int-1 (mouse) and\ wingless (Drosophila) PUBMED:9891778. It is now recognised that Wnt signalling controls many cell fate decisions in a variety of different organisms, including mammals PUBMED:10508601. Wnt signalling has been implicated in tumourigenesis, early mesodermal patterning of the embryo, morphogenesis of the brain and kidneys, regulation of mammary gland proliferation and Alzheimer\'s disease PUBMED:10967351, PUBMED:9192851.

    \ \

    Wnt-mediated signalling is believed to proceed initially through binding to\ cell surface receptors of the frizzled family; the signal is subsequently\ transduced through several cytoplasmic components to beta-catenin, which enters\ the nucleus and activates the transcription of several genes important in\ development PUBMED:10733430. More recently, however, several non-canonical Wnt signalling pathways have been elucidated that act independently of beta-catenin. Members of the Wnt gene family are defined by their sequence similarity to mouse Wnt-1 and Wingless in Drosophila. They encode proteins of ~350-400 residues in length, with orthologues identified in several,\ mostly vertebrate, species. Very little is known about the structure of \ Wnts as they are notoriously insoluble, but they share the following features characteristic of secretory proteins: a signal peptide, several potential N-glycosylation sites and 22 conserved cysteines PUBMED:9891778 that are probably involved in disulphide bonds. The Wnt proteins seem to adhere to the plasma membrane of the secreting cells and are therefore likely to signal over only a few cell diameters. Fifteen major Wnt gene families have been \ identified in vertebrates, with multiple subtypes within some classes.

    \ \ This entry represents the WIF domain, which is found in the RYK tyrosine kinase receptors and in WIF (Wnt inhibitory factor). The domain is extracellular and contains two conserved cysteines that may form a disulphide bridge. This domain binds Wnt in WIF, and it has been suggested that RYK may also bind to Wnt PUBMED:10637605.\ ' '992' 'IPR003657' '\

    The WRKY domain is a 60 amino acid region that is defined by the conserved\ amino acid sequence WRKYGQK at its N-terminal end, together with a novel\ zinc-finger-like motif. The WRKY domain is found in one or two copies in a\ superfamily of plant transcription factors involved in the regulation of\ various physiological programs that are unique to plants, including pathogen\ defence, senescence, trichome development and the biosynthesis of secondary\ metabolites. The WRKY domain binds specifically to the DNA sequence motif\ (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W\ box is essential for function and WRKY binding PUBMED:10785665. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene.

    \ \

    Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure PUBMED:15705956. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.

    \ ' '993' 'IPR003125' '\ This domain has no known function and is found in Caenorhabditis elegans proteins normally at the N-terminal.\ ' '994' 'IPR001202' '\

    Synonym(s): Rsp5 or WWP domain

    \

    The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple-stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins PUBMED:7846762, PUBMED:7802651, PUBMED:7828727,\ PUBMED:7641887. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs PUBMED:7644498, PUBMED:11911877. It is frequently associated with other domains typical for proteins in signal transduction processes.

    \ \

    A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker\'s yeast) RSP5, similar to NEDD-4 in its molecular organization; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; and Nicotiana tabacum (Common tobacco) DB10 protein, amongst others.

    \ ' '995' 'IPR004170' '\

    The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP-ribose conjugation systems. This domain is found as a tandem repeat at the N-terminal of Deltex, a cytosolic effector of Notch signalling thought to bind the N-terminal of the Notch receptor PUBMED:16271883. It is also found as an interaction module in protein ubiquitination and ADP-ribosylation proteins PUBMED:11343911.

    \ ' '996' 'IPR003856' '\

    A number of related proteins are involved in the synthesis of lipopolysaccharide, O-antigen polysaccharide, capsule polysaccharide and exopolysaccharides. Chain length determinant protein (or Wzz protein) is involved in lipopolysaccharide (LPS) biosynthesis, conferring a modal distribution of chain length on the O-antigen component of LPS PUBMED:9573151. It gives rise to a reduced number of short chain molecules and an increased number of longer molecules, with a modal value of 20. The MPA/MPA2 proteins function in CPS and EPS polymerisation and export PUBMED:10658645.

    \ ' '997' 'IPR004938' '\ Plant cell walls are crucial for development, signal transduction, and disease resistance in plants. Cell walls are made of cellulose,\ hemicelluloses, and pectins. Xyloglucan (XG), the principal load-bearing hemicellulose of dicotyledonous plants, has a terminal fucosyl\ residue. This fucosyltransferase adds this residue PUBMED:10373113. \ ' '998' 'IPR000465' '\ Xeroderma pigmentosum (XP) PUBMED:8160271 is a human autosomal recessive disease,\ characterised by a high incidence of sunlight-induced skin cancer. Skin cells of individuals with this condition are hypersensitive to ultraviolet light, due\ to defects in the incision step of DNA excision repair. There are a minimum of\ seven genetic complementation groups involved in this pathway: XP-A to XP-G.\ XP-A is the most severe form of the disease and is due to defects in a 30 kDa\ nuclear protein called XPA (or XPAC) PUBMED:1918083.\ The sequence of the XPA protein is conserved from higher eukaryotes PUBMED:1764072 to\ yeast (gene RAD14) PUBMED:1741034. XPA is a hydrophilic protein of 247 to 296 amino-acid\ residues which has a C4-type zinc finger motif in its central section.\ \ ' '999' 'IPR000465' '\ Xeroderma pigmentosum (XP) PUBMED:8160271 is a human autosomal recessive disease,\ characterised by a high incidence of sunlight-induced skin cancer. Skin cells of individuals with this condition are hypersensitive to ultraviolet light, due\ to defects in the incision step of DNA excision repair. There are a minimum of\ seven genetic complementation groups involved in this pathway: XP-A to XP-G.\ XP-A is the most severe form of the disease and is due to defects in a 30 kDa\ nuclear protein called XPA (or XPAC) PUBMED:1918083.\ The sequence of the XPA protein is conserved from higher eukaryotes PUBMED:1764072 to\ yeast (gene RAD14) PUBMED:1741034. 
XPA is a hydrophilic protein of 247 to 296 amino-acid\ residues which has a C4-type zinc finger motif in its central section.\ \ ' '1000' 'IPR000242' '\

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation PUBMED:9818190, PUBMED:14625689. The PTP superfamily can be divided into four subfamilies PUBMED:12678841:

    \

    \

    Based on their cellular localisation, PTPases are also classified as:

    \

    \

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif PUBMED:9646865. Functional diversity between PTPases is endowed by regulatory domains and subunits.

    \ \

    This entry represents several receptor and non-receptor protein-tyrosine phosphatases.

    \

    Structurally, all known receptor PTPases are made up of a variable length\ extracellular domain, followed by a transmembrane region and a C-terminal\ catalytic cytoplasmic domain. Some of the receptor PTPases contain fibronectin\ type III (FN-III) repeats, immunoglobulin-like domains, MAM domains or\ carbonic anhydrase-like domains in their extracellular region. The cytoplasmic\ region generally contains two copies of the PTPase domain. The first seems to\ have enzymatic activity, while the second is inactive. The inactive domains of tandem phosphatases can be divided into two classes. Those which bind phosphorylated tyrosine residues may recruit multi-phosphorylated substrates for the adjacent active domains and are more conserved, while the other class has accumulated several variable amino acid substitutions and has completely lost its tyrosine binding capability. The second class shows a release of evolutionary constraint for the sites around the catalytic centre, which emphasises a difference in function from the first group. There is a region of higher conservation common to both classes, suggesting a new regulatory centre PUBMED:14739250. PTPase domains consist of about 300 amino acids. There are two conserved cysteines; the second has been shown to be absolutely required for activity. Furthermore, a number of conserved residues in its immediate vicinity have also been shown to be important.

    \ ' '1001' 'IPR006977' '\

    The Yip1 integral membrane domain contains four transmembrane alpha helices. The domain is characterised by the motifs DLYGP and GY. The Yip1 protein is a golgi protein involved in vesicular transport that interacts with GTPases PUBMED:9724632.

    \ ' '1002' 'IPR004910' '\

    This entry represents the Yippee-like (YPEL) family of putative zinc-binding proteins which is highly conserved among eukaryotes. The first protein in this family to be characterised, the Yippee protein from Drosophila, was identified by yeast interaction trap screen as a protein that physically interacts with moth hemolin PUBMED:11240639. It was subsequently found to be a member of a highly conserved family of proteins found in diverse eukaryotes including plants, animals and fungi PUBMED:15556292. Mammals contain five members of this family, YPEL1 to YPEL5, while other organisms tend to contain only two or three members. The mammalian proteins all appear to localise in the nucleus. YPEL1-4 are found in an unknown structure located on or close to the mitotic apparatus in the mitotic phase, whereas in the interphase they are located in the nuclei and nucleoli. In contrast, YPEL5 is localised to the centrosome and nucleus during interphase and at the mitotic spindle during mitosis, suggesting a function distinct from that of YPEL1-4. The localisation of the YPEL proteins suggests a novel, though still unknown, function involved in cell division.

    \ ' '1003' 'IPR004019' '\

    The YLP motif is found in one or several copies in various Drosophila proteins. Its function is unknown; however, the presence of completely conserved tyrosine residues and its occurrence in the human ErbB-2 and ErbB-4 receptor protein-tyrosine kinases suggest that it could be a substrate for tyrosine kinases. ErbBs (1-4) are single-pass transmembrane proteins that activate a wide variety of signalling pathways, including those involved in proliferation, migration, differentiation, survival, and apoptosis; they are frequently misregulated in cancer PUBMED:18755183. ErbB-2 is an essential component of a neuregulin-receptor complex, although neuregulins do not interact with it alone. ErbB-4 specifically binds and is activated by neuregulins, NRG-2, NRG-3, heparin-binding EGF-like growth factor, betacellulin and NTAK PUBMED:18721752.

    \ ' '1004' 'IPR007275' '\ This family of poorly characterised proteins contains YT521-B, a putative splicing factor from rat. YT521-B is a tyrosine-phosphorylated nuclear protein, that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68 kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner PUBMED:10564280.\ ' '1005' 'IPR000315' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents B-box-type zinc finger domains, which are around 40 residues in length. B-box zinc fingers can be divided into two groups, where types 1 and 2 B-box domains differ in their consensus sequence and in the spacing of the 7-8 zinc-binding residues. Several proteins contain both types 1 and 2 B-boxes, suggesting some level of cooperativity between these two domains. B-box domains are found in over 1500 proteins from a variety of organisms. They are found in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). TRIM proteins contain a type 2 B-box domain, and may also contain a type 1 B-box. In proteins that do not contain RING or coiled-coil domains, the B-box domain is primarily type 2. Many type 2 B-box proteins are involved in ubiquitinylation. Proteins containing a B-box zinc finger domain include transcription factors, ribonucleoproteins and proto-oncoproteins; for example, MID1, MID2, TRIM9, TNL, TRIM36, TRIM63, TRIFIC, NCL1 and CONSTANS-like proteins PUBMED:16434393.

    \

    The microtubule-associated E3 ligase MID1 contains a type 1 B-box zinc finger domain. MID1 specifically binds Alpha-4, which in turn recruits the catalytic subunit of phosphatase 2A (PP2Ac). This complex is required for targeting of PP2Ac for proteasome-mediated degradation. The MID1 B-box coordinates two zinc ions and adopts a beta/beta/alpha cross-brace structure similar to that of ZZ, PHD, RING and FYVE zinc fingers PUBMED:17428496, PUBMED:16529770.

    \ \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '1006' 'IPR003656' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents predicted BED-type zinc finger domains. The BED finger, which was named after the Drosophila proteins BEAF and DREF, is found in one or more copies in cellular regulatory factors and transposases from plants, animals and fungi. The BED finger is a domain of about 50 to 60 amino acid residues that contains a characteristic motif with two highly conserved aromatic positions, as well as a shared pattern of cysteines and histidines that is predicted to form a zinc finger. As diverse BED fingers are able to bind DNA, it has been suggested that DNA-binding is the general function of this domain PUBMED:10973053. Some proteins known to contain a BED domain include animal, plant and fungal AC1 and Hobo-like transposases; Caenorhabditis elegans Dpy-20 protein, a predicted cuticular gene transcriptional regulator; Drosophila BEAF (boundary element-associated factor), thought to be involved in chromatin insulation; Drosophila DREF, a transcriptional regulator for S-phase genes; and tobacco 3AF1 and tomato E4/E8-BP1, light- and ethylene-regulated DNA binding proteins that contain two BED fingers.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '1007' 'IPR002515' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the CysCysHisCys (C2HC) type zinc finger domain found in eukaryotes. Proteins containing these domains include:

    \

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '1008' 'IPR018957' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    The C3HC4 type zinc-finger (RING finger) is a cysteine-rich domain of 40 to 60 residues that coordinates two zinc ions, and has the consensus sequence: C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C where X is any amino acid PUBMED:8804826. Many proteins containing a RING finger play a key role in the ubiquitination pathway PUBMED:10500182.
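The consensus notation above maps directly onto a regular expression. The following is a minimal Python sketch and not part of this entry: the function name consensus_to_regex and the test sequence are hypothetical, and a plain pattern match of this kind is only a rough filter that does not replace the signatures used to define the entry.

```python
import re

# Tokens of the consensus notation: X(9-39), X2, or a single residue letter.
TOKEN = re.compile(r"X\((\d+)-(\d+)\)|X(\d+)|([A-Z])")

def consensus_to_regex(consensus):
    """Translate a pattern such as C-X2-C-X(9-39)-C into a regular expression."""
    parts = []
    for low, high, count, residue in TOKEN.findall(consensus):
        if low:                      # X(9-39): gap of variable length
            parts.append(".{%s,%s}" % (low, high))
        elif count:                  # X2: gap of fixed length
            parts.append(".{%s}" % count)
        elif residue == "X":         # bare X: exactly one residue of any type
            parts.append(".")
        else:                        # literal residue such as C or H
            parts.append(residue)
    return "".join(parts)

RING = "C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C"
ring_regex = consensus_to_regex(RING)

# Synthetic sequence (alanine filler) built to satisfy the shortest allowed gaps.
synthetic = "C" + "AA" + "C" + "A" * 9 + "C" + "A" + "H" + "AA" + "C" + "AA" + "C" + "AAAA" + "C" + "AA" + "C"
print(ring_regex)
print(re.search(ring_regex, synthetic) is not None)
```

Because the gaps X(9-39) and X(4-48) are wide, the expression can match cysteine-rich stretches that are not RING fingers, so hits are best treated as candidates only.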

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1009' 'IPR000571' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents C-x8-C-x5-C-x3-H (CCCH) type zinc finger (Znf) domains. Proteins containing CCCH Znf domains include Znf proteins from eukaryotes involved in cell cycle or growth phase-related regulation, e.g. human TIS11B (butyrate response factor 1), a probable regulatory protein involved in regulating the response to growth factors, and the mouse TTP growth factor-inducible nuclear protein, which has the same function. The mouse TTP protein is induced by growth factors. Another protein containing this domain is the human splicing factor U2AF 35 kDa subunit, which plays a critical role in both constitutive and enhancer-dependent splicing by mediating essential protein-protein interactions and protein-RNA interactions required for 3\' splice site selection. It has been shown that different CCCH-type Znf proteins interact with the 3\'-untranslated region of various mRNAs PUBMED:9703499, PUBMED:10330172. This type of Znf is very often present in two copies.
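Since the CCCH spacing is fixed and the finger is often present in two copies, a simple scan can report how many copies a sequence carries and where they lie. This is a sketch under stated assumptions: the function name find_ccch and the sequence are hypothetical, and the expression encodes only the C-x8-C-x5-C-x3-H spacing given above.

```python
import re

# C-x8-C-x5-C-x3-H written as a regular expression.
CCCH = re.compile(r"C.{8}C.{5}C.{3}H")

def find_ccch(sequence):
    """Return 1-based (start, end) coordinates of non-overlapping CCCH matches."""
    return [(m.start() + 1, m.end()) for m in CCCH.finditer(sequence)]

# Synthetic example: two fingers (alanine filler) separated by a short linker.
finger = "C" + "A" * 8 + "C" + "A" * 5 + "C" + "A" * 3 + "H"
sequence = "MS" + finger + "GGGG" + finger + "K"

hits = find_ccch(sequence)
print(len(hits), hits)
```

Each match spans 20 residues, so two hits in close succession are consistent with the tandem arrangement described for this family.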

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1010' 'IPR001878' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents CysCysHisCys (CCHC) type zinc finger domains, which have the sequence:

    \
    \
    C-X2-C-X4-H-X4-C\
    
    \

    where X can be any amino acid and the number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process PUBMED:17416621, PUBMED:17202191, PUBMED:17029416. They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding PUBMED:15937226.
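The pattern C-X2-C-X4-H-X4-C can be checked with a short Python sketch that also reports where the four conserved Cys and His residues fall. The function name conserved_positions and the sequence are hypothetical and serve only as an illustration. Note that the pattern as written covers 14 positions, so the sketch locates the conserved core rather than the boundaries of the full domain.

```python
import re

# One capture group per conserved residue: C, C, H, C.
CCHC = re.compile(r"(C).{2}(C).{4}(H).{4}(C)")

def conserved_positions(sequence):
    """Return, for each match, the 1-based positions of the C, C, H and C residues."""
    return [tuple(m.start(g) + 1 for g in range(1, 5))
            for m in CCHC.finditer(sequence)]

# Synthetic sequence with alanine filler between the conserved residues.
sequence = "GS" + "CAACAAAAHAAAAC" + "G"
print(conserved_positions(sequence))
```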

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1011' 'IPR002857' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This domain contains eight conserved cysteine residues that bind to two zinc ions. The CXXC domain is found in a variety of chromatin-associated proteins. This domain binds to non-methylated CpG dinucleotides. The domain is characterised by two CGXCXXC repeats. The RecQ helicase has a single repeat that also binds to zinc, but this has not been included in this family. The DNA binding interface has been identified by NMR PUBMED:9207790.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1012' 'IPR001594' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the DHHC-type zinc finger domain, which is also known as NEW1 PUBMED:10231582. The DHHC Zn-finger was first isolated in the Drosophila putative transcription factor DNZ1 PUBMED:10231582. The function of this domain is unknown, but it has been predicted to be involved in protein-protein or protein-DNA interactions PUBMED:1892474.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1013' 'IPR007529' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the HIT-type zinc finger, which contains seven conserved cysteines and one histidine that can potentially coordinate two zinc atoms. It is named after the protein that originally defined the domain: the yeast HIT1 protein () PUBMED:1325386. The HIT-type zinc finger displays some sequence similarities to the MYND-type zinc finger. The function of this domain is unknown, but it is mainly found in nuclear proteins involved in gene regulation and chromatin remodeling. This domain is also found in the thyroid receptor interacting protein 3 (TRIP-3), which specifically interacts with the ligand binding domain of the thyroid receptor.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1014' 'IPR004181' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents MIZ-type zinc finger domains. Miz1 (Msx-interacting-zinc finger) is a zinc finger-containing protein with homology to the yeast protein, Nfi-1. Miz1 is a sequence specific DNA binding protein that can function as a positive-acting transcription factor. Miz1 binds to the homeobox protein Msx2, enhancing the specific DNA-binding ability of Msx2 PUBMED:9256341. Other proteins containing this domain include the human PIAS (protein inhibitor of activated STAT) family.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1015' 'IPR002893' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents MYND-type zinc finger domains. The MYND domain (myeloid, Nervy, and DEAF-1) is present in a large group of proteins that includes RP-8 (PDCD2), Nervy, and predicted proteins from Drosophila, mammals, Caenorhabditis elegans, yeast, and plants PUBMED:7498738, PUBMED:8617243, PUBMED:2072913. The MYND domain consists of a cluster of cysteine and histidine residues, arranged with an invariant spacing to form a potential zinc-binding motif PUBMED:8617243. Mutating conserved cysteine residues in the DEAF-1 MYND domain does not abolish DNA binding, which suggests that the MYND domain might be involved in protein-protein interactions PUBMED:8617243. Indeed, the MYND domain of ETO/MTG8 interacts directly with the N-CoR and SMRT co-repressors PUBMED:9584201, PUBMED:9819404. Aberrant recruitment of co-repressor complexes and inappropriate transcriptional\ repression is believed to be a general mechanism of leukemogenesis caused by the t(8;21) translocations that fuse ETO with the acute myelogenous leukemia 1 (AML1) protein. ETO has been shown to be a co-repressor recruited by the promyelocytic leukemia zinc finger (PLZF) protein PUBMED:10688654. A\ divergent MYND domain present in the adenovirus E1A binding protein BS69 was also shown to interact with N-CoR and mediate transcriptional repression PUBMED:10734313. The current evidence suggests that the MYND motif in mammalian proteins constitutes a protein-protein interaction domain that functions as a co-repressor-recruiting interface.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1016' 'IPR007716' '\ The HRD4 gene is identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterised step in ER-associated degradation after ubiquitination of target proteins but before their recognition by the 26S proteasome PUBMED:11739805. This region of the protein possibly contains two zinc-binding motifs. Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing.\ ' '1017' 'IPR006895' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    COPII (coat protein complex II)-coated vesicles carry proteins from the endoplasmic reticulum (ER) to the Golgi complex PUBMED:11535824. COPII-coated vesicles form on the ER by the stepwise recruitment of three cytosolic components: Sar1-GTP to initiate coat formation, Sec23/24 heterodimer to select SNARE and cargo molecules, and Sec13/31 to induce coat polymerisation and membrane deformation PUBMED:12239560.

    \

    Sec23p and Sec24p are structurally related, folding into five distinct domains: a beta-barrel, a zinc-finger, an alpha/beta trunk domain (), an all-helical region (), and a C-terminal gelsolin-like domain (). This entry describes an approximately 55-residue Sec23/24 zinc-binding domain, which lies against the beta-barrel at the periphery of the complex.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1018' 'IPR001607' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents UBP-type zinc finger domains, which display some similarity with the Zn-binding domain of the insulinase family. The UBP-type zinc finger domain is found only in a small subfamily of ubiquitin C-terminal hydrolases (deubiquitinases or UBP) PUBMED:9759494, PUBMED:9409543. All members of this subfamily are isopeptidase-T enzymes, which are known to cleave isopeptide bonds between ubiquitin moieties.

    \

    Some of the proteins containing a UBP zinc finger include:

    \

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1019' 'IPR006794' '\ Zfx and Zfy are transcription factors implicated in mammalian sex determination. This region is found N-terminal to multiple copies of a C2H2 Zinc finger (). This region has been shown to activate transcription when fused to a GAL4 DNA binding domain PUBMED:2105457.\ ' '1020' 'IPR003689' '\ These ZIP zinc transporter proteins define a family of metal ion transporters that are found in plants, protozoa, fungi, invertebrates, and vertebrates, making it now possible to address questions of metal ion accumulation and homeostasis in diverse organisms PUBMED:9618566.\ ' '1021' 'IPR000834' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.
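The basic HEXXH motif is simple enough to locate with a short scan. The Python sketch below is illustrative only: the function name hexxh_sites and the sequences are hypothetical, and the stricter ten-residue definition is not encoded because it depends on how the uncharged and hydrophobic residue classes are chosen.

```python
import re

# A lookahead makes the scan report overlapping occurrences as well.
HEXXH = re.compile(r"(?=(HE..H))")

def hexxh_sites(sequence):
    """Return (1-based start, matched residues) for every HEXXH occurrence."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(sequence)]

# Synthetic sequences; the second contains two overlapping occurrences.
print(hexxh_sites("AAVAAHELGHAA"))
print(hexxh_sites("HEHEHAH"))
```

As the motif is relatively common, a hit on its own is weak evidence for a metalloprotease and is best combined with other information.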

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of sequences contains a diverse range of gene families, which include metallopeptidases belonging to MEROPS peptidase family M14 (carboxypeptidase A, clan MC), subfamilies M14A and M14B.

    \ \ \

    The carboxypeptidase A family can be divided into two subfamilies:\ carboxypeptidase H (regulatory) and carboxypeptidase A (digestive) PUBMED:7674922. Members of the H family have longer C-termini than those of family A PUBMED:1449602, and carboxypeptidase M (a member of the H family) is bound to the membrane by a glycosylphosphatidylinositol anchor, unlike the majority of the M14 family, which are soluble PUBMED:7674922.

    \ \

    The zinc ligands have been determined as two histidines and a glutamate,\ and the catalytic residue has been identified as a C-terminal glutamate,\ but these do not form the characteristic metalloprotease HEXXH motif PUBMED:7674922, PUBMED:6887246.\ Members of the carboxypeptidase A family are synthesised as inactive\ molecules with propeptides that must be cleaved to activate the enzyme.\ Structural studies of carboxypeptidases A and B reveal the propeptide to\ exist as a globular domain, followed by an extended alpha-helix; this\ shields the catalytic site, without specifically binding to it, while the\ substrate-binding site is blocked by making specific contacts PUBMED:7674922, PUBMED:1548696.

    \ \

    Other examples of protein families in this entry include:

    \ ' '1022' 'IPR001507' '\

    A large domain, containing around 260 amino acids, has been recognised in a variety of receptor-like eukaryotic glycoproteins PUBMED:1313375. All of these proteins are mosaic proteins composed of various domains, and all have a large extracellular region followed by either a transmembrane region and a very short cytoplasmic region or by a GPI-anchor. The domain common to all these proteins is located in the C-terminal portion of the extracellular region, and contains 8 conserved Cys residues, which are probably involved in disulphide bond formation.

    \ \

    CD105 (also called endoglin) is the regulatory component of the TGF-beta receptor complex. It is a modulator of cellular responses to TGF-beta 1.

    \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).\

    \ ' '1023' 'IPR007698' '\

    Alanine dehydrogenases () and pyridine nucleotide transhydrogenase () have been\ shown to share regions of similarity PUBMED:8439307. Alanine dehydrogenase catalyzes the NAD-dependent\ reversible reductive amination of pyruvate into alanine. Pyridine nucleotide transhydrogenase catalyzes\ the reduction of NADP+ to NADPH with the concomitant oxidation of NADH to NAD+. This enzyme is located\ in the plasma membrane of prokaryotes and in the inner membrane of the mitochondria of eukaryotes. The\ transhydrogenation between NADH and NADP is coupled with the translocation of a proton across the\ membrane. In prokaryotes the enzyme is composed of two different subunits, an alpha chain (gene pntA)\ and a beta chain (gene pntB), while in eukaryotes it is a single chain protein. The sequences of alanine\ dehydrogenase from several bacterial species are related to those of the alpha subunit of bacterial\ pyridine nucleotide transhydrogenase and of the N-terminal half of the eukaryotic enzyme. The two most\ conserved regions correspond respectively to the N-terminal extremity of these proteins and to a central\ glycine-rich region which is part of the NAD(H)-binding site.

    \

    This is a C-terminal domain of alanine dehydrogenases (). This domain is also found in the lysine 2-oxoglutarate reductases.

    \ ' '1024' 'IPR007886' '\

    Alanine dehydrogenases () and pyridine nucleotide transhydrogenase () have been\ shown to share regions of similarity PUBMED:8439307. Alanine dehydrogenase catalyzes the NAD-dependent\ reversible reductive amination of pyruvate into alanine. Pyridine nucleotide transhydrogenase catalyzes\ the reduction of NADP+ to NADPH with the concomitant oxidation of NADH to NAD+. This enzyme is located\ in the plasma membrane of prokaryotes and in the inner membrane of the mitochondria of eukaryotes. The\ transhydrogenation between NADH and NADP is coupled with the translocation of a proton across the\ membrane. In prokaryotes the enzyme is composed of two different subunits, an alpha chain (gene pntA)\ and a beta chain (gene pntB), while in eukaryotes it is a single chain protein. The sequences of alanine\ dehydrogenase from several bacterial species are related to those of the alpha subunit of bacterial\ pyridine nucleotide transhydrogenase and of the N-terminal half of the eukaryotic enzyme. The two most\ conserved regions correspond respectively to the N-terminal extremity of these proteins, represented in this entry, and to a central\ glycine-rich region which is part of the NAD(H)-binding site.

    \ ' '1025' 'IPR007883' '\ This family contains proteins of unknown function from Caenorhabditis elegans.\ ' '1026' 'IPR007875' '\

    Sprouty (Spry) and Spred (Sprouty related EVH1 domain) proteins have been identified as inhibitors of the Ras/mitogen-activated protein kinase (MAPK) cascade, a pathway crucial for developmental processes initiated by activation of various receptor tyrosine kinases [1,2]. These proteins share a conserved, C-terminal cysteine-rich region, the SPR domain. This domain has been defined as a novel cytosol to membrane translocation domain PUBMED:15683364, PUBMED:12391162, PUBMED:10887178, PUBMED:11493923. It has been found to be a PtdIns(4,5)P2-binding domain that targets the proteins to a cellular localization that maximizes their inhibitory potential PUBMED:12402043, PUBMED:12391162. It also mediates homodimer formation of these proteins PUBMED:15683364, PUBMED:12402043.

    \ \

    The SPR domain can occur in association with the WH1 domain (see ) (located in the N-terminal part of the proteins) in the Spred proteins.

    \ \ \ \ ' '1027' 'IPR007881' '\ This family contains several eukaryotic transmembrane proteins which are related to the Caenorhabditis elegans protein UNC-50 . A mammalian homologue, UNCL, is a novel inner nuclear membrane protein that associates with RNA and is involved in the cell-surface expression of neuronal nicotinic receptors. UNCL plays a broader role, as UNCL homologues are present in two yeast species and a plant species, none of which express nicotinic receptors, and it is also found in tissues that lack nicotinic receptors.\ ' '1028' 'IPR005834' '\

    This group of hydrolase enzymes is structurally different from the alpha/beta hydrolase family (abhydrolase). This group includes L-2-haloacid dehalogenase, epoxide hydrolases and phosphatases. The structure consists of two domains. One is an inserted four-helix bundle, which is the least well conserved region of the alignment, between residues 16 and 96 of HAD1_PSESP. The rest of the fold is composed of the core alpha/beta domain.

    \ ' '1029' 'IPR007342' '\ Indigoidine is a blue pigment synthesised by Erwinia chrysanthemi implicated in pathogenicity and protection from oxidative stress. IdgA is involved in indigoidine biosynthesis, but its specific function is unknown PUBMED:11790734.\ ' '1030' 'IPR020573' '\ UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase () catalyses an early step in lipid A biosynthesis PUBMED:8366125: \ This enzyme contains several hexapeptide repeats and forms part of the wider bacterial transferase hexapeptide repeat family.\ This entry represents the non-repeating region of the enzyme.\ ' '1031' 'IPR000308' '\

    The 14-3-3 proteins are a large family of approximately 30kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells PUBMED:1671102, PUBMED:11911880. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.\

    \ \

    14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to effect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein; by regulating interactions between the bound protein and other molecules in the cell, through sequestration or modification; or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
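As an illustration of the RSxpSxP consensus described above, the following is a minimal sketch of how candidate sites might be located in a protein sequence. The function name and the example sequence are hypothetical and not part of this entry. Phosphorylation cannot be read from the sequence itself, so the sketch only searches for the sequence-level pattern R-S-x-S-x-P and reports the position of the serine that would need to be phosphorylated.

```python
import re

# Sequence-level form of RSxpSxP. The "p" marks phosphorylation of the
# following serine, which is not visible in a plain amino-acid sequence.
# A lookahead is used so that overlapping candidates are all reported.
MOTIF = re.compile(r"(?=(RS.S.P))")


def find_candidate_sites(sequence):
    """Return (start, matched_text, phosphoserine_index) for each candidate site."""
    sites = []
    for match in MOTIF.finditer(sequence.upper()):
        start = match.start(1)
        # The serine that would carry the phosphate is the fourth residue.
        sites.append((start, match.group(1), start + 3))
    return sites


# Hypothetical sequence used only to show the call.
print(find_candidate_sites("AARSASAPGG"))
```

A hit from this search is only a candidate: whether 14-3-3 actually binds depends on the phosphorylation state of the site and, as noted above, exceptions to the consensus exist.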

    \ ' '1032' 'IPR006140' '\

    \ A number of NAD-dependent 2-hydroxyacid dehydrogenases which seem to be specific for the D-isomer of their substrate have been shown to be functionally and structurally related. All contain a glycine-rich region located in the central section of these enzymes; this region corresponds to the NAD-binding domain. The catalytic domain is described in

    \ ' '1033' 'IPR014051' '\

    This entry represents a domain found in a number of known and predicted phosphoesterases. These include bacterial and archaeal 2\',5\' RNA ligases, and a family of predicted phosphoesterases known as the YjcG family. The 2\',5\' RNA ligases perform a reversible, ATP-independent 2\'-5\'-ligation of what is presumably a non-physiological substrate: half-tRNA splice intermediates from an intron-containing yeast tRNA PUBMED:8940112. The physiological substrate(s) in prokaryotes may include small 2\',5\'-link-containing oligonucleotides, perhaps with regulatory or biosynthetic roles. This domain contains a conserved HXTX motif which is thought to be important for catalytic activity, as it is in the related enzyme cyclic nucleotide phosphodiesterase (CPDase) PUBMED:12466548. In 2\',5\' RNA ligase this domain is duplicated, with the two conserved motifs forming the proposed active site, which is analogous to that of CPDase PUBMED:12798681.

    \ \ ' '1034' 'IPR005163' '\

    This small triple-helical domain has been predicted to assume a topology similar to helix-turn-helix domains. These domains are found at the C-terminus of proteins related to the YiiM protein from Escherichia coli.

    \ ' '1035' 'IPR002562' '\

    This domain is responsible for the 3\'-5\' exonuclease proofreading activity of Escherichia coli DNA polymerase I (polI) and other enzymes; it catalyses the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli polI; it is also found in the Werner syndrome helicase (WRN), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D) PUBMED:9697700.

    \ ' '1036' 'IPR000603' '\ The 3A protein is found in Bromovirus and Cucumovirus, whose genomes contain 3 RNA segments. The third segment (RNA 3) encodes two proteins, the coat protein and the 3A protein. The function of the 3A protein is uncertain, but it has been shown to be involved in movement of the virus from the initially infected cells to adjacent cells PUBMED:9356336.\ ' '1037' 'IPR004173' '\

    The 3H domain is named after its three highly conserved histidine residues. The 3H domain appears to be a small molecule-binding domain, based on its occurrence with other domains PUBMED:11292341. Several proteins carrying this domain are transcriptional regulators from the biotin repressor family. The transcription regulator TM1602 from Thermotoga maritima is a DNA-binding protein thought to belong to a family of de novo NAD synthesis pathway regulators. TM1602 has an N-terminal DNA-binding domain and a C-terminal 3H regulatory domain. The N-terminal domain appears to bind to the NAD promoter region and repress the de novo NAD biosynthesis operon, while the C-terminal 3H domain may bind to nicotinamide, nicotinic acid, or other substrate/products PUBMED:17256761. The 3H domain has a 2-layer alpha/beta sandwich fold.

    \ ' '1038' 'IPR006108' '\

    3-hydroxyacyl-CoA dehydrogenase (HCDH) PUBMED:3479790 is an enzyme involved in fatty acid metabolism; it catalyzes the oxidation of 3-hydroxyacyl-CoA to 3-oxoacyl-CoA. Most eukaryotic cells have 2 fatty-acid beta-oxidation systems, one located in mitochondria and the other in peroxisomes. In peroxisomes 3-hydroxyacyl-CoA dehydrogenase forms, with enoyl-CoA hydratase (ECH) and 3,2-trans-enoyl-CoA isomerase (ECI), a multifunctional enzyme where the N-terminal domain bears the hydratase/isomerase activities and the C-terminal domain the dehydrogenase activity. There are two mitochondrial enzymes: one which is monofunctional and the other which is, like its peroxisomal counterpart, multifunctional.

    \

    In Escherichia coli (gene fadB) and Pseudomonas fragi (gene faoA) HCDH is part of a multifunctional enzyme which also contains an ECH/ECI domain as well as a 3-hydroxybutyryl-CoA epimerase domain PUBMED:2204034.

    \

    There are two major regions of similarity in the sequences of proteins of the HCDH family: the first, located in the N-terminal region, corresponds to the NAD-binding site; the second is located in the centre of the sequence. This represents the C-terminal domain, which is also found in lambda crystallin. Some proteins include two copies of this domain.

    \ ' '1039' 'IPR006176' '\

    3-hydroxyacyl-CoA dehydrogenase (HCDH) PUBMED:3479790 is an enzyme involved in fatty acid metabolism; it catalyzes the oxidation of 3-hydroxyacyl-CoA to 3-oxoacyl-CoA. Most eukaryotic cells have 2 fatty-acid beta-oxidation systems, one located in mitochondria and the other in peroxisomes. In peroxisomes 3-hydroxyacyl-CoA dehydrogenase forms, with enoyl-CoA hydratase (ECH) and 3,2-trans-enoyl-CoA isomerase (ECI), a multifunctional enzyme where the N-terminal domain bears the hydratase/isomerase activities and the C-terminal domain the dehydrogenase activity. There are two mitochondrial enzymes: one which is monofunctional and the other which is, like its peroxisomal counterpart, multifunctional.

    \

    In Escherichia coli (gene fadB) and Pseudomonas fragi (gene faoA) HCDH is part of a multifunctional enzyme which also contains an ECH/ECI domain as well as a 3-hydroxybutyryl-CoA epimerase domain PUBMED:2204034.

    \

    There are two major regions of similarity in the sequences of proteins of the HCDH family: the first, located in the N-terminal region, corresponds to the NAD-binding site; the second is located in the centre of the sequence. This represents the C-terminal domain, which is also found in lambda crystallin. Some proteins include two copies of this domain.

    \ ' '1040' 'IPR003385' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ The enzymes in this entry belong to the glycoside hydrolase family 77, and transfer a segment of a (1,4)-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or (1,4)-alpha-D-glucan PUBMED:7678257. They belong to the disproportionating family of enzymes.\ ' '1041' 'IPR002698' '\ 5-formyltetrahydrofolate cyclo-ligase or methenyl-THF synthetase catalyses the conversion of 5-formyltetrahydrofolate (5-FTHF) to 5,10-methenyltetrahydrofolate; this requires ATP and Mg2+ PUBMED:8522195. 5-FTHF is used in chemotherapy where it is clinically known as Leucovorin PUBMED:8034591.\ ' '1042' 'IPR020046' '\

    The N-terminal and internal 5\'3\'-exonuclease domains are commonly found together, and are most often associated with 5\' to 3\' nuclease activities. The XPG protein signatures () are never found outside the \'53EXO\' domains. The latter are found in more diverse proteins PUBMED:7926735, PUBMED:10322433, PUBMED:8464724. The number of amino acids that separate the two 53EXO domains, and the presence of accompanying motifs allow the diagnosis of several protein families.

    In the eubacterial type A DNA-polymerases, the N-terminal and internal domains are separated by a few amino acids, usually four. The pattern DNA_POLYMERASE_A is always present towards the C-terminus. Several eukaryotic structure-dependent endonucleases and exonucleases have the 53EXO domains separated by 24 to 27 amino acids, and the XPG protein signatures are always present. In several proteins from herpesviridae, the two 53EXO domains are separated by 50 to 120 amino acids. These proteins are implicated in the inhibition of the expression of the host genes. Eukaryotic DNA repair proteins with 600 to 700 amino acids between the 53EXO domains all carry the XPG protein signatures.

    \

    This entry represents the N-terminal resolvase-like domain, which has a 3-layer alpha/beta/alpha core structure and contains an alpha-helical arch PUBMED:8717047, PUBMED:8657312.

    \ ' '1043' 'IPR020047' '\

    The N-terminal and internal 5\'3\'-exonuclease domains are commonly found together, and are most often associated with 5\' to 3\' nuclease activities. The XPG protein signatures () are never found outside the \'53EXO\' domains. The latter are found in more diverse proteins PUBMED:7926735, PUBMED:10322433, PUBMED:8464724. The number of amino acids that separate the two 53EXO domains, and the presence of accompanying motifs allow the diagnosis of several protein families.

    In the eubacterial type A DNA-polymerases, the N-terminal and internal domains are separated by a few amino acids, usually four. The pattern DNA_POLYMERASE_A is always present towards the C-terminus. Several eukaryotic structure-dependent endonucleases and exonucleases have the 53EXO domains separated by 24 to 27 amino acids, and the XPG protein signatures are always present. In several proteins from herpesviridae, the two 53EXO domains are separated by 50 to 120 amino acids. These proteins are implicated in the inhibition of the expression of the host genes. Eukaryotic DNA repair proteins with 600 to 700 amino acids between the 53EXO domains all carry the XPG protein signatures.

    \

    This entry represents the SAM-fold domain found in 5\'-3\' exonucleases. This domain consists of 4-5 helices in a bundle of two orthogonally packed alpha-hairpins PUBMED:8717047, PUBMED:8657312.

    \ ' '1044' 'IPR013086' '\

    Neurotransmitter transport systems are integral to the release, re-uptake and recycling of neurotransmitters at synapses. High affinity transport proteins found in the plasma membrane of presynaptic nerve terminals and glial cells are responsible for the removal from the extracellular space of released-transmitters, thereby terminating their actions PUBMED:15336049. Plasma membrane neurotransmitter transporters fall into two structurally and mechanistically distinct families. The majority of the transporters constitute an extensive family of homologous proteins that derive energy from the co-transport of Na+ and Cl-, in order to transport neurotransmitter molecules into the cell against their concentration gradient. The family has a common structure of 12 presumed transmembrane helices and includes carriers for gamma-aminobutyric acid (GABA), noradrenaline/adrenaline, dopamine, serotonin, proline, glycine, choline, betaine and taurine. They are structurally distinct from the second more-restricted family of plasma membrane transporters, which are responsible for excitatory amino acid transport. The latter couple glutamate and aspartate uptake to the cotransport of Na+ and the counter-transport of K+, with no apparent dependence on Cl- PUBMED:8811182. In addition, both of these transporter families are distinct from the vesicular neurotransmitter transporters PUBMED:8103691, PUBMED:7823024.

    Sequence analysis of the Na+/Cl- neurotransmitter superfamily reveals that it can be divided into four subfamilies, these being transporters for monoamines, the amino acids proline and glycine, GABA, and a group of orphan transporters PUBMED:9779464.

    \

    The serotonin (5-HT) neurotransmitter transporter is known to be expressed in the brain and also in the periphery: on platelet, placental and pulmonary cell membranes. The brain 5-HT transporter is thought to be the principal site of action of therapeutic anti-depressants (which inhibit this transporter), and it may also mediate the behavioural effects of cocaine and amphetamines PUBMED:7681602. The human form (630 amino acids) is 92% identical to the rat brain 5-HT transporter, and shares the same predicted topology and conserved sites for post-translational modification.

    \ \

    This domain is found in the N-terminal region of some 5-HT neurotransmitter transporters.

    \ ' '1045' 'IPR001708' '\

    This family of proteins is required for the insertion of integral membrane proteins into cellular membranes. Many of these integral membrane proteins are associated with respiratory chain complexes, for example a large number of members of this family play an essential role in the activity and assembly of cytochrome c oxidase.

    \ Stage III sporulation protein J (SP3J) is a probable lipoprotein, rich in basic and hydrophobic amino acids. Mutations in the protein abolish the transcription of prespore-specific genes transcribed by the sigma G form of RNA polymerase PUBMED:1487728. SP3J could be involved in a signal transduction pathway coupling gene expression in the prespore to events in the mother cell, or it may be necessary for essential metabolic interactions between the two cells PUBMED:1487728. The protein shows a high degree of similarity to Bacillus subtilis YQJG, to yeast OXA1 and also to bacterial 60 kDa inner-membrane proteins PUBMED:7686882, PUBMED:7542800, PUBMED:1552862, PUBMED:8071197.

    \ ' '1046' 'IPR001813' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The 60S acidic ribosomal protein plays an important role in the elongation step of protein synthesis. This family includes archaebacterial L12, eukaryotic P0, P1 and P2 PUBMED:8722011.

    \

    Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.
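The designation rule described above can be sketched in a few lines. This is only an illustration: the function name is hypothetical, the species names in the usage lines (Alternaria alternata, Cladosporium herbarum) are supplied here as examples rather than taken from this entry, and the rule for disambiguating identical designations by adding letters is not implemented.

```python
def allergen_designation(genus, species, number):
    """Build a designation such as "Alt a 6" from genus, species and number."""
    # First three letters of the genus, a space, the first letter of the
    # species name, a space, and the arabic number.
    return f"{genus[:3].capitalize()} {species[0].lower()} {number}"


# Example species names are assumptions chosen for illustration.
print(allergen_designation("Alternaria", "alternata", 6))    # Alt a 6
print(allergen_designation("Cladosporium", "herbarum", 12))  # Cla h 12
```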

    \

    The allergens in this family include allergens with the following designations: Alt a 6, Alt a 12, Cla h 3, Cla h 4 and Cla h 12.

    \ ' '1047' 'IPR013079' '\

    6-Phosphofructo-2-kinase is a bifunctional enzyme that catalyses both the synthesis and the degradation of fructose-2,6-bisphosphate. The fructose-2,6-bisphosphatase reaction involves a phosphohistidine intermediate. The catalytic pathway is:\ \ \ The enzyme is important in the regulation of hepatic carbohydrate metabolism and is found in greatest quantities in the liver, kidney and heart. In mammals, several genes often encode different isoforms, each of which differs in its tissue distribution and enzymatic activity PUBMED:9652401. The family described here bears a resemblance to the ATP-driven phosphofructokinases; however, they share little sequence similarity, although a few residues seem key to their interaction with fructose 6-phosphate PUBMED:9753654.

    \ \

    This domain forms the N-terminal region of this enzyme, while forms the C-terminal domain.

    \ ' '1048' 'IPR006114' '\

    6-Phosphogluconate dehydrogenase (6PGD) is an oxidative decarboxylase that catalyses the oxidative decarboxylation of 6-phosphogluconate into ribulose 5-phosphate in the presence of NADP. This reaction is a component of the hexose mono-phosphate shunt and pentose phosphate pathways (PPP) PUBMED:2113917, PUBMED:6641716. Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose sequences are highly conserved PUBMED:1659648. The protein is a homodimer in which the monomers act independently PUBMED:6641716: each contains a large, mainly alpha-helical domain and a smaller beta-alpha-beta domain, containing a mixed parallel and anti-parallel 6-stranded beta sheet PUBMED:6641716. NADP is bound in a cleft in the small domain, with the substrate binding in an adjacent pocket PUBMED:6641716.

    This entry represents the C-terminal all-alpha domain of 6-phosphogluconate dehydrogenase. The domain contains two structural repeats of 5 helices each. The NAD-binding domain is described in .

    \ ' '1049' 'IPR003411' '\

    This family consists of a 7 kDa coat protein from Carlavirus and Potexvirus PUBMED:8010191.

    \ ' '1050' 'IPR003212' '\ This family contains members of the hyperthermophilic archaebacterium 7kDa DNA-binding/endoribonuclease P2 family. There are five 7kDa DNA-binding proteins, 7a-7e, found as monomers in the cell. Protein 7e shows the tightest DNA-binding ability.\ ' '1051' 'IPR000276' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The rhodopsin-like GPCRs themselves represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices PUBMED:2111655, PUBMED:2830256, PUBMED:8386361.

    \ ' '1052' 'IPR000832' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The secretin-like GPCRs include receptors for secretin PUBMED:1646711, calcitonin PUBMED:1658940, parathyroid hormone/parathyroid hormone-related peptides PUBMED:1658941 and vasoactive intestinal peptide PUBMED:1314625, all of which activate adenylyl cyclase and the phosphatidyl-inositol-calcium pathway. These receptors contain seven transmembrane regions, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins (however, there is no significant sequence identity between these families; the secretin-like receptors thus bear their own unique \'7TM\' signature). Their N-terminus is probably located on the extracellular side of the membrane and potentially glycosylated. This N-terminal region contains a long conserved region which allows the binding of large peptidic ligands such as glucagon, secretin, VIP and PACAP; this region contains five conserved cysteine residues which could be involved in disulphide bonds. The C-terminal region of these receptors is probably cytoplasmic. Every receptor gene in this family is encoded on multiple exons, and several of these genes are alternatively spliced to yield functionally distinct products.

    \ ' '1054' 'IPR004117' '\ All known members of this group are seven-transmembrane proteins that are candidate odorant receptors in Drosophila.\ ' '1055' 'IPR001599' '\

    This family contains serum complement C3 and C4 precursors and alpha-macroglobulins.

    \ \ \

    The alpha-macroglobulin (aM) family of proteins includes protease inhibitors PUBMED:2473064, typified by the human tetrameric a2-macroglobulin (a2M); they belong to the MEROPS proteinase inhibitor family I39, clan IL. These protease inhibitors share several defining properties, which include (i) the ability to inhibit proteases from all catalytic classes, (ii) the presence of a \'bait region\' and a thiol ester, (iii) a similar protease inhibitory mechanism and (iv) the inactivation of the inhibitory capacity by reaction of the thiol ester with small primary amines. aM protease inhibitors inhibit by steric hindrance PUBMED:2472396. The mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates. Two additional events occur as a consequence of bait region cleavage, namely (i) the beta-cysteinyl-gamma-glutamyl thiol ester becomes highly reactive and (ii) a major conformational change exposes a conserved COOH-terminal receptor binding domain (RBD) PUBMED:2469470. RBD exposure allows the aM-protease complex to bind to clearance receptors and be removed from circulation PUBMED:2430968. Tetrameric, dimeric, and, more recently, monomeric aM protease inhibitors have been identified PUBMED:9914899, PUBMED:10426429.

    \ \ ' '1056' 'IPR002890' '\

    The proteinase-binding alpha-macroglobulins (A2M) PUBMED:2473064 are large glycoproteins found in the plasma of vertebrates, in the hemolymph of some invertebrates and in reptilian and avian egg white. A2M-like proteins are able to inhibit all four classes of proteinases by a \'trapping\' mechanism. They have a peptide stretch, called the \'bait region\', which contains specific cleavage sites for different proteinases. When a proteinase cleaves the bait region, a conformational change is induced in the protein, thus trapping the proteinase. The entrapped enzyme remains active against low molecular weight substrates, whilst its activity toward larger substrates is greatly reduced, due to steric hindrance. Following cleavage in the bait region, a thiol ester bond, formed between the side chains of a cysteine and a glutamine, is cleaved and mediates the covalent binding of the A2M-like protein to the proteinase. This family includes the N-terminal region of the alpha-2-macroglobulin family.

    \ \

    The inhibitor domains belong to MEROPS inhibitor family I39.

    \ ' '1057' 'IPR000833' '\

    Alpha-amylase inhibitor inhibits mammalian alpha-amylases specifically, by forming a tight stoichiometric 1:1 complex with alpha-amylase. The inhibitor has no action on plant and microbial alpha amylases.

    \

    A crystal structure has been determined for tendamistat, the 74-amino acid inhibitor produced by Streptomyces tendae that targets a wide range of mammalian alpha-amylases PUBMED:14501112. The binding of tendamistat to alpha-amylase leads to the steric blockage of the active site of the enzyme. The crystal structure of tendamistat revealed an immunoglobulin-like fold that could potentially adopt multiple conformations. Such molecular flexibility could enable an induced-fit type of binding that would both optimise binding and allow broad target specificity.

    \

    More information about this protein can be found at Protein of the Month: alpha-Amylase.

    \ ' '1058' 'IPR002466' '\ Editases are enzymes that alter mRNA by catalyzing the site-selective deamination of adenosine residues into inosine residues. The editase domain contains the active site and binds three Zn atoms PUBMED:9159072. Several editases share a common global arrangement of domains, from N to C terminus: two \'double-stranded RNA-specific adenosine deaminase\' (DRADA) repeat domains, followed by three \'double-stranded RNA binding\' (DsRBD) domains, followed by the editase domain. Other editases have a simplified domain structure with no DRADA_REP and possibly fewer DSRBD domains. Editases that deaminate cytidine are not detected by this signature.\ ' '1059' 'IPR004841' '\

    Amino acid permeases are integral membrane proteins involved in the transport of amino acids into the cell. A number of such proteins have been found to be evolutionarily related PUBMED:3146645, PUBMED:2687114, PUBMED:8382989. These proteins seem to contain up to 12 transmembrane segments. The best conserved region in this family is located in the second transmembrane segment.

    \

    This domain is found in a wide variety of permeases, as well as several hypothetical proteins.

    \ ' '1060' 'IPR005128' '\ Alpha-acetolactate decarboxylase plays a dual role in the cell: (i) it catalyzes the second step of the acetoin pathway, and thus potentially influences the internal pH of cells, and (ii) it controls the pool of alpha-acetolactate during leucine and valine synthesis.\ ' '1061' 'IPR007689' '\ Mating-type protein A-alpha specifies the A-alpha-Y mating type. The A-alpha-Y protein binds to the AalphaZ protein of another mating type in Schizophyllum commune PUBMED:9286672 and may also regulate gene expression of the homokaryotic cell.\ ' '1062' 'IPR005079' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to MEROPS peptidase family C45 (clan PB(C)). The active site residue for members of this family and family T1 is C-terminal to the autolytic cleavage site. They represent a family of enzymes which catalyse the final step in penicillin biosynthesis PUBMED:2120195.

    \ ' '1063' 'IPR003496' '\ This is a family of plant proteins induced by water deficit stress (WDS) PUBMED:9426600, or abscisic acid (ABA) stress and ripening PUBMED:7630961. The Ip3 cDNA clone is expressed at high levels in the roots, and is induced by ABA under WDS.\ ' '1064' 'IPR001626' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attacking water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \ A number of bacterial transport systems have been found to contain integral\ membrane components that have similar sequences PUBMED:1303751: these systems fit the\ characteristics of ATP-binding cassette transporters PUBMED:1659649. The\ proteins form homo- or hetero-oligomeric channels, allowing ATP-mediated \ transport. Hydropathy analysis of the proteins has revealed the presence\ of 6 possible transmembrane regions. These proteins belong to family 3 of ABC transporters.\ ' '1065' 'IPR013525' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attacking water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \ A number of bacterial transport systems have been found to contain integral\ membrane components that have similar sequences PUBMED:1303751: these systems fit the\ characteristics of ATP-binding cassette transporters PUBMED:1659649. The\ proteins form homo- or hetero-oligomeric channels, allowing ATP-mediated \ transport. Hydropathy analysis of the proteins has revealed the presence\ of 6 possible transmembrane regions. These proteins belong to family 2 of ABC transporters.\ ' '1066' 'IPR001140' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attacking water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \

    A variety of ATP-binding transport proteins have a six transmembrane\ helical region. They are all integral membrane proteins\ involved in a variety of transport systems. Members of this family include the\ cystic fibrosis transmembrane conductance regulator (CFTR), bacterial leukotoxin\ secretion ATP-binding protein, multidrug resistance proteins, the yeast leptomycin B\ resistance protein, the mammalian sulphonylurea receptor and antigen peptide\ transporter 2. Many of these proteins have two such regions.

    \ ' '1067' 'IPR003439' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attacking water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \

    On the basis of sequence similarities a family of related ATP-binding proteins has been characterised PUBMED:2229036, PUBMED:3288195, PUBMED:3762694, PUBMED:3762695, PUBMED:1977073.

    \ \ \

    The proteins belonging to this family also contain one or two copies of the \'A\' consensus sequence PUBMED:6329717 or the \'P-loop\' PUBMED:2126155 (see ).

    \ ' '1068' 'IPR000073' '\

    The alpha/beta hydrolase fold PUBMED:1409539 is common to a number of hydrolytic enzymes of widely differing phylogenetic origin and catalytic function. The core of each enzyme is an alpha/beta-sheet (rather than a barrel), containing 8 strands connected by helices PUBMED:1409539. The enzymes are believed to have diverged from a common ancestor, preserving the arrangement of the catalytic residues. All have a catalytic triad, the elements of which are borne on loops, which are the best conserved structural features of the fold. Esterase (EST) from Pseudomonas putida is a member of the alpha/beta hydrolase fold superfamily of enzymes PUBMED:16321951.

    \ \

    In most of the family members the beta-strands are parallel, but some members have an inversion of the first strands, which gives them an antiparallel orientation. The catalytic triad residues are presented on loops. One of these is the nucleophile elbow and is the most conserved feature of the fold. Some other members lack one or all of the catalytic residues. Some members are therefore inactive but others are involved in surface recognition. The ESTHER database PUBMED: gathers and annotates all the published information related to gene and protein sequences of this superfamily PUBMED:14681380.

    \

    This entry represents fold-1 of alpha/beta hydrolase.

    \ ' '1069' 'IPR004697' '\

    The p-aminobenzoyl-glutamate transporter family includes two putative transporters, the AbgT protein of Escherichia coli and MtrF of Neisseria gonorrhoeae. AbgT expression is apparently cryptic in wild type cells, but when present on a high copy number plasmid, or when expressed at higher levels due to mutation, it allows utilization of p-aminobenzoyl-glutamate as a source of p-aminobenzoate for p-aminobenzoate auxotrophs PUBMED:9829935. p-Aminobenzoate is a constituent of, and a precursor for, the biosynthesis of folic acid. It is not currently known if AbgT is naturally involved in transporting p-aminobenzoyl-glutamate, or if it only becomes involved when under altered regulation. MtrF is an inner membrane protein which, together with the MtrCDE efflux pump, is required for high-level resistance to hydrophobic antimicrobial agents in N. gonorrhoeae PUBMED:12493784. Its role in this process is not known, but it has been suggested that it may be a component of the efflux pump which is dispensable for basal activity, but required for high-level activity PUBMED:15901695.

    \ ' '1070' 'IPR003140' '\ This family consists of both phospholipases PUBMED:9644627 and carboxylesterases with broad substrate specificity, and is structurally related to alpha/beta hydrolases PUBMED:9438866.\ ' '1071' 'IPR003675' '\

    This family consists of various hypothetical protein sequences for which the function is unknown. One of the proteins is an abortive infection protein that confers resistance to the Lactococcus phage 712 PUBMED:8795193. AbiG is an abortive infection (Abi) mechanism encoded by the conjugative plasmid pCI750 originally isolated from Lactococcus lactis subsp. cremoris (Streptococcus cremoris). The resistance mechanism acts at neither the phage adsorption nor the phage DNA restriction level PUBMED:8795193.

    \ \

    Also in this family is a series of bacteriocin-like peptides PlnP, PlnI, PlnT, PlnP and PlnU from Lactobacillus plantarum C11 that secretes a small cationic peptide, plantaricin A, that serves as an induction signal for bacteriocin production as well as transcription of plnABCD. The plnABCD operon encodes the plantaricin A precursor (PlnA) itself and determinants (PlnBCD) for a signal transducing pathway PUBMED:8755874.

    \ ' '1072' 'IPR007138' '\

    This domain is found in monooxygenases involved in the biosynthesis of several antibiotics by Streptomyces species, which can carry out oxygenation without the assistance of any of the prosthetic groups, metal ions or cofactors normally associated with activation of molecular oxygen. The structure of ActVA-Orf6 monooxygenase from Streptomyces coelicolor (), which is involved in actinorhodin biosynthesis, reveals a dimeric alpha+beta barrel topology PUBMED:12514126. There is also a conserved histidine that is likely to be an active site residue. In S. coelicolor SCO1909 () this domain occurs as a repeat.

    \ ' '1073' 'IPR000582' '\

    Acyl-CoA-binding protein (ACBP) is a small (10 kDa) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters PUBMED:1454809. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor PUBMED:1649940.

    \ \

    ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species PUBMED:16018771.

    \ \

    Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats () PUBMED:16018771.

    \ \

    The ACB domain consists of four alpha-helices arranged in a bowl shape with a\ highly exposed acyl-CoA-binding site. The ligand is bound\ through specific interactions with residues on the protein, most notably\ several conserved positive charges that interact with the phosphate group on\ the adenosine-3\'phosphate moiety, and the acyl chain is sandwiched between the\ hydrophobic surfaces of CoA and the protein PUBMED:11491287.

    \ \

    Other proteins containing an ACB domain include:\

    \

    \ ' '1074' 'IPR020582' '\

    This entry contains the alpha subunit of the acetyl coenzyme A carboxylase complex (). It catalyses the first step in the synthesis of long-chain fatty acids which involves the carboxylation of acetyl-CoA to malonyl-CoA. The acetyl-CoA carboxylase complex is a heterohexamer of biotin carboxyl carrier protein, biotin carboxylase and two non-identical carboxyl transferase subunits (alpha and beta) in a 2:2 association PUBMED:1355089. The reaction involves two steps:

    \ \ \ ' '1075' 'IPR000890' '\ Acetate kinase, which is predominantly found in micro-organisms, facilitates the production of \ acetyl-CoA by phosphorylating acetate in the presence of ATP and a divalent cation PUBMED:8226682, \ PUBMED:8396545. The enzyme is important in the process of glycolysis, enzyme levels being increased \ in the presence of excess glucose. The growth of a bacterial mutant lacking acetate kinase has \ been shown to be inhibited by glucose, suggesting that the enzyme is involved in excretion of excess \ carbohydrate PUBMED:8226682. A related enzyme, butyrate kinase, facilitates the formation of \ butyryl-CoA by phosphorylating butyrate in the presence of ATP to form butyryl phosphate PUBMED:8396545.\ ' '1077' 'IPR003702' '\ This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase from yeast catalyses the formation of acetate from acetyl-CoA, CoA transferase (CAT1)\ produces succinyl-CoA, and acetate-CoA transferase utilises acyl-CoA and acetate to form acetyl-CoA.\ ' '1078' 'IPR001447' '\

    Arylamine N-acetyltransferase (NAT) is a cytosolic enzyme of approximately 30 kDa. It facilitates the transfer of an acetyl\ group from acetyl coenzyme A on to a wide range of arylamines, N-hydroxyarylamines and hydrazines. Acetylation of\ these compounds generally results in inactivation. NAT is found in many species from Mycobacteria (Mycobacterium tuberculosis, Mycobacterium smegmatis etc) to Homo sapiens (Human). It was the first enzyme to be observed to have polymorphic activity amongst human individuals.\ NAT is responsible for the inactivation of isoniazid (a drug used to treat tuberculosis) in humans. The NAT protein has\ also been shown to be involved in the breakdown of folic acid. NAT catalyses the reaction:

    \ \ \

    NAT is the target of a common genetic polymorphism of clinical relevance in\ humans. The N-acetylation polymorphism is determined by low or high NAT\ activity in liver. NAT has been implicated in the action and toxicity \ of amine-containing drugs, and in the susceptibility to cancer and\ systemic lupus erythematosus. Two highly similar human genes for NAT, \ termed NAT1 and NAT2, encode genetically invariant and variant NAT proteins,\ respectively.

    \ ' '1079' 'IPR000560' '\

    Acid phosphatases () are a heterogeneous group of proteins that hydrolyse phosphate esters, optimally at low pH. It has been shown PUBMED:1989985 that a number of acid phosphatases, from both prokaryotes and eukaryotes, share two regions of sequence similarity, each centred around a conserved histidine residue. These two histidines seem to be involved in the enzymes\' catalytic mechanism PUBMED:8334986, PUBMED:1429631. The first histidine is located in the N-terminal section and forms a phosphohistidine intermediate while the second is located in the C-terminal section and possibly acts as proton donor. Enzymes belonging to this family are called \'histidine acid phosphatases\' and include:

    \

    \ ' '1080' 'IPR005519' '\ This family of class B acid phosphatases also contains a number of vegetative storage proteins (VPS25). The acid phosphatase activity of VPS has been experimentally demonstrated PUBMED:1639823.\ ' '1081' 'IPR001030' '\

    3-isopropylmalate dehydratase (or isopropylmalate isomerase; ) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate are related and are part of the larger aconitase family PUBMED:9020582. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively PUBMED:1400210, PUBMED:9813279. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydrogenases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis PUBMED:15522288. Homoaconitase has been identified in higher fungi (mitochondria) and several archaea and one thermophilic species of bacteria, Thermus thermophilus PUBMED:16524361.

    \ \ \ \ \

    Aconitase (aconitate hydratase; ) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop PUBMED:10087914, PUBMED:15877277. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal \'swivel\' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the \'swivel\' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3) PUBMED:9020582. Below is a description of some of the multi-functional activities associated with different aconitases.

    \ \

    \ \

    \ \

    \ \

    \ \

    This entry represents a region containing 3 domains, each with a 3-layer alpha/beta/alpha topology. This region represents the [4Fe-4S] cluster-binding region found at the N-terminal of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the C-terminal of bacterial AcnB. This domain is also found in the large subunit of isopropylmalate dehydratase (LeuC).

    \

    More information about these proteins can be found at Protein of the Month: Aconitase PUBMED:.

    \ ' '1082' 'IPR000573' '\

    3-isopropylmalate dehydratase (or isopropylmalate isomerase; ) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate are related and are part of the larger aconitase family PUBMED:9020582. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively PUBMED:1400210, PUBMED:9813279. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydrogenases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis PUBMED:15522288. Homoaconitase has been identified in higher fungi (mitochondria) and several archaea and one thermophilic species of bacteria, Thermus thermophilus PUBMED:16524361.

    \ \ \ \ \

    Aconitase (aconitate hydratase; ) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop PUBMED:10087914, PUBMED:15877277. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal \'swivel\' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the \'swivel\' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3) PUBMED:9020582. Below is a description of some of the multi-functional activities associated with different aconitases.

    \ \

    \ \

    \ \

    \ \

    \ \

    This entry represents the \'swivel\' domain found at the C-terminal of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA. This domain has a three-layer beta/beta/alpha structure, and in cytosolic Acn is known to rotate between the cAcn and IRP1 forms of the enzyme. This domain is also found in the small subunit of isopropylmalate dehydratase (LeuD).

    \

    More information about these proteins can be found at Protein of the Month: Aconitase.

    \ ' '1083' 'IPR002655' '\

    Acyl-CoA oxidase (ACO) acts on CoA derivatives of fatty acids with chain lengths from 8 to 18. It catalyses the first and rate-determining step of the peroxisomal beta-oxidation of fatty acids PUBMED:11872165.

    \ \

    Acyl-CoA oxidase is a homodimer and the polypeptide chain of the subunit is folded into the N-terminal alpha-domain, beta-domain, and C-terminal alpha-domain PUBMED:11872165. Functional differences between the peroxisomal acyl-CoA oxidases and the mitochondrial acyl-CoA dehydrogenases are attributed to structural differences in the FAD environments PUBMED:15581893.

    \ \

    Experimental data indicate that, in the pumpkin, the expression pattern of ACOX is very similar to that of the glyoxysomal enzyme 3-ketoacyl-CoA thiolase PUBMED:9525937. In humans, defects in ACOX1 are the cause of pseudoneonatal adrenoleukodystrophy (pseudo-NALD), also known as peroxisomal acyl-CoA oxidase deficiency. Pseudo-NALD is a peroxisomal single-enzyme disorder. Clinical features include mental retardation, leukodystrophy, seizures, mild hepatomegaly and hearing deficit. Pseudo-NALD is characterised by increased plasma levels of very-long-chain fatty acids due to a decrease in, or absence of, peroxisomal acyl-CoA oxidase activity, despite the peroxisomes being intact and functioning.

    \ \

    This entry represents the Acyl-CoA oxidase C-terminal.

    \ ' '1084' 'IPR001036' '\

    The Escherichia coli acrA and acrB genes encode a multi-drug efflux system that is believed to protect the bacterium against hydrophobic inhibitors PUBMED:8407802. The E. coli AcrB protein is a transporter that is energized by proton-motive force and that shows the widest substrate specificity among all known multidrug pumps, ranging from most of the currently used antibiotics, disinfectants, dyes, and detergents to simple solvents.

    \ \

    The structure of ligand-free AcrB shows that it is a homotrimer of 110 kDa per subunit. Each subunit contains 12 transmembrane helices and two large\ periplasmic domains (each exceeding 300 residues) between helices 1 and 2, and helices 7 and 8. X-ray analysis of the overexpressed AcrB protein demonstrated that the three periplasmic domains form, in the centre, a funnel-like structure and a connected narrow (or closed) pore. The pore is opened to the periplasm through three vestibules located at subunit interfaces. These vestibules were proposed to allow direct access of drugs from the periplasm as well as the outer leaflet of the cytoplasmic membrane. The three transmembrane domains of AcrB protomers form a large, 30 Å-wide central cavity that spans the cytoplasmic membrane and extends to the cytoplasm.

    \ \

    X-ray crystallographic structures of the trimeric AcrB pump from E. coli with four structurally diverse ligands demonstrated that three molecules of ligand bind simultaneously to the extremely large central cavity of 5000 cubic angstroms, primarily by hydrophobic, aromatic stacking and van der Waals interactions. Each ligand uses a slightly different subset of AcrB residues for binding. The bound ligand molecules often interact with each other, stabilising the binding.

    \ ' '1085' 'IPR007752' '\ The ActA family is found in Listeria and is associated with motility. ActA protein acts as a scaffold to assemble and activate host cell actin cytoskeletal factors at the bacterial surface, resulting in directional actin polymerisation and propulsion of the bacterium through the cytoplasm of the host cell PUBMED:11886549, PUBMED:11854187.\ ' '1086' 'IPR013531' '\

    Pro-opiomelanocortin is present in high levels in the pituitary and is processed into 3 major peptide families: adrenocorticotrophin (ACTH); alpha-, beta- and gamma-melanocyte-stimulating hormones (MSH); and beta-endorphin PUBMED:2266117. ACTH regulates the synthesis and release of glucocorticoids and, to some extent, aldosterone in the adrenal cortex. It is synthesised and released in response to corticotrophin-releasing factor at times of stress (e.g. heat, cold, infection), its release leading to increased metabolism. The action of MSH in man is poorly understood, but it may be involved in temperature regulation PUBMED:2266117. Full activity of ACTH resides in the first 20 N-terminal amino acids, the first 13 of which are identical to alpha-MSH PUBMED:2266117, PUBMED:2839146.

    \

    The function of this region is not known, though it is found near the centre of these proteins.

    \ ' '1087' 'IPR004000' '\

    Actin PUBMED:1388079, PUBMED:8448030 is a ubiquitous protein involved in the formation of filaments that are major components of the cytoskeleton. These filaments interact with myosin to produce a sliding effect, which is the basis of muscular contraction and many aspects of cell motility, including cytokinesis. Each actin protomer binds one molecule of ATP and has one high affinity site for either calcium or magnesium ions, as well as several low affinity sites. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Actin from many sources forms a tight complex with deoxyribonuclease (DNase I) although the significance of this is still unknown. The formation of this complex results in the inhibition of DNase I activity, and actin loses its ability to polymerise. It has been shown that an ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70\ proteins PUBMED:1828889, PUBMED:1323828.

    \

    In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. In plants there are many isoforms which are probably involved in a variety of functions such as cytoplasmic streaming, cell shape determination, tip growth, graviperception and cell wall deposition.

    \

    Recently some divergent actin-like proteins have been identified in several species. These proteins include centractin (actin-RPV) from mammals, fungi (yeast ACT5, Neurospora crassa ro-4) and Pneumocystis carinii, which seems to be a component of a multi-subunit centrosomal complex involved in microtubule-based vesicle motility (this subfamily is known as ARP1); the ARP2 subfamily, which includes chicken ACTL, Saccharomyces cerevisiae ACT2, Drosophila melanogaster 14D and Caenorhabditis elegans actC; the ARP3 subfamily, which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and Schizosaccharomyces pombe act2; and the ARP4 subfamily, which includes yeast ACT3 and Drosophila 13E.

    \ ' '1088' 'IPR002864' '\ This family consists of various acyl-acyl carrier protein (ACP) thioesterases (TE) which terminate fatty acyl group extension by hydrolysing an acyl group on a fatty acid PUBMED:7479856.\ ' '1089' 'IPR006091' '\

    Acyl-CoA dehydrogenases () are a family of flavoproteins that catalyse the alpha,beta-dehydrogenation of acyl-CoA thioesters to the corresponding trans 2,3-enoyl CoA-products with the concomitant reduction of enzyme-bound FAD. Different family members share a high sequence identity, catalytic mechanisms, and structural properties, but differ in the position of their catalytic bases and in their substrate binding specificity. Butyryl-CoA dehydrogenase PUBMED:11812788 prefers short chain substrates, medium chain- and long-chain acyl-CoA dehydrogenases prefer medium and long chain substrates, respectively, and Isovaleryl-CoA dehydrogenase PUBMED:9214289 prefers branched-chain substrates.

    \

    The monomeric enzyme is folded into three domains of approximately equal size, where the N-terminal domain is all-alpha, the middle domain is an open (5,8) barrel, and the C-terminal domain is a four-helical bundle. The constituent families differ in the numbers of C-terminal domains. This entry represents the middle beta-barrel domain found in medium chain acyl-CoA dehydrogenases, as well as in the related peroxisomal acyl-CoA oxidase-II enzymes. Acyl-CoA oxidase (ACO; ) catalyzes the first and rate-determining step of the peroxisomal beta-oxidation of fatty acids PUBMED:11872165.

    \ ' '1090' 'IPR006092' '\

    Mammalian CoA dehydrogenases () are enzymes that catalyse the first step in each cycle of beta-oxidation in mitochondria. Acyl-CoA dehydrogenases PUBMED:3326738, PUBMED:2777793, PUBMED:8034667 catalyse the alpha,beta-dehydrogenation of acyl-CoA thioesters to the corresponding trans 2,3-enoyl CoA-products with concomitant reduction of enzyme-bound FAD. Reoxidation of the flavin involves transfer of electrons to ETF (electron transferring flavoprotein). These enzymes are homodimers containing one molecule of FAD.

    The monomeric enzyme is folded into three domains of approximately equal size. The N-terminal and the C-terminal are mainly alpha-helices packed together, and the middle domain consists of two orthogonal beta-sheets. The flavin ring is buried in the crevice between two alpha-helical domains and the beta-sheet of one subunit, and the adenosine pyrophosphate moiety is stretched into the subunit junction with one formed by two C-terminal domains PUBMED:8356049.

    The N-terminal domain of Acyl-CoA dehydrogenase is an all-alpha domain; on dimerisation, the N-terminal of one molecule extends into the other dimer and lies on the surface of the molecule.

    \ ' '1091' 'IPR003703' '\

    Acyl-CoA thioesterases are a group of enzymes that catalyse the hydrolysis of acyl-CoAs to the free fatty acid and coenzyme A (CoASH). They consequently have the potential to regulate intracellular levels of acyl-CoAs, free fatty acids and CoASH. They may also be involved in the metabolic regulation of peroxisome proliferation.

    \ \

    Thioesters play a central role in cells as they participate in metabolism, membrane synthesis, signal transduction, and gene regulation. Thioesterases catalyse the hydrolysis of thioesters to the thiol and carboxylic acid components. Many thioesterases have a hot dog fold, including YciA from Escherichia coli and its close sequence homologue HI0827 from Haemophilus influenzae (HiYciA) PUBMED:18247525.

    \ \

    In Helicobacter pylori, YbgC also belongs to the hot-dog family of proteins, with an epsilon-gamma tetrameric arrangement PUBMED:18338382. YbgC proteins are bacterial acyl-CoA thioesterases associated with the Tol-Pal system. This system is important for cell envelope integrity and is part of the cell division machinery.

    \ \

    The E. coli thioesterase II, however, reveals a new tertiary fold: a \'double hot dog\'. It has an internal repeat with a basic unit that is structurally similar to the recently described beta-hydroxydecanoyl thiol ester dehydrase PUBMED:10876240.

    \ \ ' '1092' 'IPR014043' '\ Enzymes like bacterial malonyl CoA-acyl carrier protein transacylase () \ and eukaryotic fatty acid synthase () that are involved in fatty acid\ biosynthesis belong to this group. Also included are the polyketide synthases \ 6-methylsalicylic acid synthase (), a multifunctional enzyme that is involved\ in the biosynthesis of patulin, and conidial green pigment synthase ().\ ' '1093' 'IPR003157' '\

    This bacterial family of Acyl transferases (ACT or myristoyl-acp-specific thioesterases) catalyses the first step in the bioluminescent fatty acid reductase system, which is required for aldehyde biosynthesis.

    \ \

    This enzyme belongs to the LuxD family. Together with acyl-protein synthetase (LuxE) and reductase (LuxC), it belongs to a multienzyme complex. This complex channels activated fatty acids into the aldehyde substrate for the luciferase-catalyzed bacterial bioluminescence reaction PUBMED:11018714, PUBMED:8472957. The C-terminal region of LuxD interacts with LuxE to cause a conformational change PUBMED:11018714. LuxD has a calculated M(r) of 34,384 and comprises 305 aa residues PUBMED:8472957.

    \ \

    Induction of luminescence only occurs at high cell density. Some bacteria have N-acylhomoserine lactone autoinducers for luminescence PUBMED:10398554.

    \ ' '1094' 'IPR001792' '\

    Acylphosphatase () is an enzyme of approximately 98 amino acid residues that specifically catalyses the hydrolysis of the carboxyl-phosphate bond of acylphosphates PUBMED:1664426, its substrates including 1,3-diphosphoglycerate and carbamyl phosphate PUBMED:2538623. The enzyme has a mainly beta-sheet structure with 2 short alpha-helical segments. It is distributed in a tissue-specific manner in a wide variety of species, although its physiological role is as yet unknown PUBMED:2538623: it may, however, play a part in the regulation of the glycolytic pathway and pyrimidine biosynthesis PUBMED:2830253. There are two known isozymes. One seems to be specific to muscular tissues; the other, called \'organ-common type\', is found in many different tissues. There are also bacterial and archaebacterial hypothetical proteins that are highly similar to that enzyme and that probably possess the same activity.

    \

    These proteins include:\

    \

    An acylphosphatase-like domain is also found in some prokaryotic hydrogenase maturation HypF carbamoyltransferases PUBMED:9799289, PUBMED:12206761.

    \ ' '1095' 'IPR002123' '\

    This family contains acyltransferases involved in phospholipid biosynthesis and other proteins of unknown function PUBMED:9259571. This domain is found in tafazzins, defects in which are the cause of Barth syndrome, a severe inherited disorder which is often fatal in childhood and is characterised by cardiac and skeletal abnormalities. Phospholipid/glycerol acyltransferase is not found in the viruses or the archaea and is under-represented in the bacteria. Bacterial glycerol-phosphate acyltransferases are involved in membrane biogenesis since they use fatty acid chains to form the first membrane phospholipids PUBMED:18369234.

    \ ' '1096' 'IPR004026' '\

    The Escherichia coli Ada protein repairs O6-methylguanine residues and methyl phosphotriesters in DNA by direct transfer of the methyl group to a cysteine residue. This domain contains four conserved cysteines that form a zinc binding site PUBMED:1581309, PUBMED:8500619. One of these cysteines is a methyl group acceptor. The methylated domain can then specifically bind to the ada box on a DNA duplex PUBMED:8500619.

    \ ' '1097' 'IPR002553' '\

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer PUBMED:15261670.

    \

    Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors PUBMED:17449236. All AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). Each subunit has a specific function. Adaptin subunits recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal appendage domains. By contrast, GGAs are monomers composed of four domains, which have functions similar to AP subunits: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The GAE domain is similar to the AP gamma-adaptin ear domain, being responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis PUBMED:12858162.

    \

    While clathrin mediates endocytic protein transport from ER to Golgi, coatomers (COPI, COPII) primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins PUBMED:14690497. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and for budding from Golgi membranes PUBMED:17041781. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta\', gamma, delta, epsilon and zeta subunits.

    \

    This entry represents the N-terminal domain of various adaptins from different AP clathrin adaptor complexes (including AP1, AP2, AP3 and AP4), and from the beta and gamma subunits of various coatomer (COP) adaptors. This domain has a 2-layer alpha/alpha fold that forms a right-handed superhelix, and is a member of the ARM repeat superfamily PUBMED:12086608. The N-terminal regions of the various AP adaptor proteins share strong sequence identity; by contrast, the C-terminal domains of different adaptins share similar structural folds, but have little sequence identity PUBMED:2495531. It has been proposed that the N-terminal domain interacts with another uniform component of the coated vesicles.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin.

    \ ' '1098' 'IPR005019' '\

    This family of methyladenine glycosylases includes DNA-3-methyladenine glycosylase I () which acts as a base excision repair enzyme by severing the glycosylic bond\ of numerous damaged bases. The enzyme is constitutively expressed and is specific for the alkylated 3-methyladenine DNA.

    \ ' '1099' 'IPR003381' '\

    The late 100 kDa protein is a non-structural viral protein involved in the transport of hexon from the cytoplasm to the nucleus.

    \ ' '1100' 'IPR004292' '\

    The adenoviral protein 52K (named after the earliest known 52kDa members) is a DNA-binding protein PUBMED:8627769. The adenoviral 52K protein is encoded by a late gene PUBMED:9791022 in various adenoviruses including the avian adenovirus: chicken embryo lethal orphan (CELO) virus PUBMED:8627769.

    \ \

    This entry represents the adenoviral 52K-like protein and includes proteins involved in virion assembly.

    \ ' '1101' 'IPR003853' '\ This is a family of adenoviral early E1A proteins. The E1A protein is 32 kDa; it can, however, be cleaved to yield the 28 kDa protein. The E1A protein is responsible for the transcriptional activation of the early genes within the viral genome at the start of the infection process, as well as some cellular genes PUBMED:1835093.\ ' '1102' 'IPR002924' '\

    This family consists of adenovirus E1B 19 kDa protein or small t-antigen. The E1B 19 kDa protein inhibits E1A-induced apoptosis and hence prolongs the viability of the host cell PUBMED:8083992. It can also inhibit apoptosis mediated by tumour necrosis factor alpha and Fas antigen PUBMED:8083992. E1B 19 kDa blocks apoptosis by interacting with and inhibiting the p53-inducible and death-promoting Bax protein PUBMED:8600029. The E1B region of adenovirus encodes two proteins: E1B 19 kDa, the small t-antigen, as found in this family, and E1B 55 kDa, the\ large t-antigen, which is not found in this family; both\ of these proteins inhibit E1A-induced apoptosis PUBMED:8083992.

    \ ' '1103' 'IPR002612' '\

    This family consists of adenovirus E1B 55 kDa protein or large t-antigen. E1B 55 kDa binds the tumour suppressor protein p53, converting it from a transcriptional activator which responds to damaged DNA into an unregulated repressor of genes with a p53 binding site PUBMED:10207064. This protects the virus against p53-induced host antiviral responses and prevents apoptosis as induced by the adenovirus E1A protein PUBMED:10207064.\ The E1B region of adenovirus encodes two proteins: E1B 55 kDa, the large t-antigen, as found in this family, and E1B 19 kDa, the small t-antigen. Both of these proteins inhibit E1A-induced apoptosis.

    \ ' '1104' 'IPR006717' '\ This domain constitutes the N-terminal of E1B 55 kDa (). E1B 55K binds the tumour suppressor protein p53, converting it from a transcriptional activator which responds to damaged DNA into an unregulated repressor of genes with a p53 binding site PUBMED:10207064. This protects the virus against p53-induced host antiviral responses and prevents apoptosis as induced by the adenovirus E1A protein PUBMED:10207064. The role of the N-terminal in the function of E1B is not known.\ ' '1105' 'IPR008131' '\

    The E3B 14.5 kDa protein was first identified in human adenovirus type 5. It is an integral membrane protein oriented with its C terminus in the cytoplasm. It functions to down-regulate the epidermal growth factor receptor and prevent tumour necrosis factor cytolysis. It achieves this through interaction with the E3 10.4 kDa protein PUBMED:9488477, PUBMED:1531370.

    \ ' '1106' 'IPR004985' '\

    Adenoviruses have evolved multiple mechanisms to evade the host immune response. Several of the immunomodulatory proteins are encoded in early transcription unit 3 (E3) of human adenoviruses (Ads). These proteins appear to control viral interactions with the host PUBMED:8627757. This entry represents a 15.3 kDa protein from the E3 region, which may protect virus-infected cells from TNF-induced cytolysis.

    \ ' '1107' 'IPR003471' '\ Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host PUBMED:8627757. This region, called CR1 (conserved region 1) PUBMED:8627757, is found three times in the 49 kDa protein in the E3 region of Human adenovirus 19 (a subgroup D virus). CR1 is also found in the 20.1 kDa protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.\ ' '1108' 'IPR003470' '\ Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host PUBMED:8627757. This region, called CR1 (conserved region 1) PUBMED:8627757, is found three times in the 49 kDa protein in the E3 region of Human adenovirus 19 (a subgroup D adenovirus). CR1 is also found in the 20.1 kDa protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.\ ' '1109' 'IPR005041' '\

    Adenoviruses are medium-sized, non-enveloped viruses containing double-stranded DNA. They can cause a variety of diseases including pneumonia, cystitis, conjunctivitis and diarrhoea, all of which can be fatal to patients who are immunocompromised PUBMED:7704534. These viruses have many mechanisms to evade the host immune response, including several proteins which are expressed as part of the early transcription unit 3 (E3) PUBMED:7555057. One of the regions of E3, known as the E3B region, encodes three proteins known as 10.4K, 14.5K and 14.7K. Two of these proteins, 10.4K and 14.5K, form the RID complex (receptor internalisation and degradation) which protects the infected cell from host-induced lysis by clearing the TNF and Fas receptors from the cell surface PUBMED:9707602. Other receptors, such as the epidermal growth factor receptor, are also known to be cleared by RID PUBMED:2522818.

    \ \

    This entry represents the E3B region 10.4K protein, also known as the RID alpha subunit.

    \ \ \ ' '1110' 'IPR007615' '\

    Adenovirus E4 is essential for DNA replication and late protein synthesis PUBMED:18562516. The adenovirus early region 4 open reading frame 3 (E4 ORF3) protein is required for viral DNA replication during the interferon (IFN)-induced antiviral state PUBMED:18480450.

    \ \

    The E4 ORF3 protein reorganises the promyelocytic leukemia (PML) protein nuclear bodies. These normally punctate structures are reorganised by E4 ORF3 into tracks that eventually surround viral replication centres. PML rearrangement is an evolutionarily conserved function of E4 ORF3 PUBMED:17287283.

    \ \

    The product of adenovirus early region 4 (E4), open reading frame 6, is E4 34k. It modulates viral late gene expression, DNA replication, apoptosis, double strand break repair, and transformation through multiple interactions with components in infected and transformed cells PUBMED:15890904, PUBMED:10747932. Conservation of several cysteine and histidine residues among E4 34k sequences suggests the presence of a zinc binding domain, which is important for its function PUBMED:10747932.

    \ \

    This entry is a conserved region found in the Adenovirus E4 34 kDa protein.

    \ ' '1111' 'IPR000978' '\ Adenoviruses are responsible for diseases such as pneumonia, cystitis, conjunctivitis and diarrhoea, all \ of which can be fatal to patients who are immunocompromised PUBMED:7704534. Viral infection commences with \ recognition of host cell receptors by means of specialised proteins on viral surfaces. Specific attachment \ of adenovirus is achieved through interactions between host-cell receptors and the adenovirus fiber protein \ and is mediated by the globular carboxy-terminal domain of the adenovirus fiber protein, termed the \ carboxy-terminal knob domain.\ ' '1112' 'IPR000939' '\ Adenoviruses are responsible for diseases such as pneumonia, cystitis, conjunctivitis and diarrhoea, all \ of which can be fatal to patients who are immunocompromised PUBMED:7704534. Viral infection commences with \ recognition of host cell receptors by means of specialised proteins on viral surfaces. Specific attachment \ of adenovirus is achieved through interactions between host-cell receptors and the adenovirus fiber protein \ and is mediated by the globular carboxy-terminal domain of the adenovirus fiber protein, rather than the \ \'shaft\' region represented by this family. The alignment of this family contains two copies of a fifteen\ residue repeat found in the \'shaft\' region of adenoviral fiber proteins.\ ' '1113' 'IPR006965' '\ This 19 kDa glycoprotein binds the major histocompatibility (MHC) class I antigens in the endoplasmic reticulum (ER). The ER retention signal at the C terminus of GP19K causes retention of the complex in the ER, preventing lysis of the cell by cytotoxic T-lymphocytes PUBMED:8249282.\ ' '1114' 'IPR016107' '\

    Hexon is a major coat protein found in various species-specific Adenoviruses, which are type II dsDNA viruses. Hexon coat proteins are synthesised during late infection and form homo-trimers. The 240 copies of the hexon trimer that are produced are organised so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. The penton complex, formed by the peripentonal hexons and base hexon (holding in place a fibre), lies at each of the 12 vertices PUBMED:7932702. The hexon coat protein is a duplication consisting of two domains with a similar fold packed together like the nucleoplasmin subunits. Within a hexon trimer, the domains are arranged around a pseudo 6-fold axis. The domains have a beta-sandwich structure consisting of 8 strands in two sheets with a jelly-roll topology; each domain is heavily decorated with many insertions PUBMED:12915569.

    \

    This entry represents the N-terminal domain of hexon coat proteins.

    \ ' '1115' 'IPR016108' '\

    Hexon is a major coat protein found in various species-specific Adenoviruses, which are type II dsDNA viruses. Hexon coat proteins are synthesised during late infection and form homo-trimers. The 240 copies of the hexon trimer that are produced are organised so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. The penton complex, formed by the peripentonal hexons and base hexon (holding in place a fibre), lies at each of the 12 vertices PUBMED:7932702. The hexon coat protein is a duplication consisting of two domains with a similar fold packed together like the nucleoplasmin subunits. Within a hexon trimer, the domains are arranged around a pseudo 6-fold axis. The domains have a beta-sandwich structure consisting of 8 strands in two sheets with a jelly-roll topology; each domain is heavily decorated with many insertions PUBMED:12915569.

    \

    This entry represents the C-terminal domain of hexon coat proteins.

    \ ' '1116' 'IPR003389' '\ IVa2 protein can interact with the adenoviral packaging signal and this interaction involves DNA sequences that have previously been demonstrated to be required for packaging PUBMED:10684284. During the course of lytic infection, the adenovirus major late promoter (MLP) is induced to high levels after replication of viral DNA has started. IVa2 is a transcriptional activator of the major late promoter PUBMED:8207818.\ ' '1117' 'IPR002605' '\ This family consists of various adenovirus penton base\ proteins, from both the mastadenoviridae having mammalian hosts\ and the aviadenoviridae having avian hosts. The penton base is a \ major structural protein forming part of the penton, which consists\ of a base and a fiber; the pentons hold a morphologically prominent \ position at the vertex capsomer in the adenovirus particle PUBMED:1316685. \ In mammalian adenovirus there is only one tail on each base, whereas \ in avian adenovirus there are two PUBMED:1316685.\ ' '1118' 'IPR005641' '\

    Hexon () is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer, 240 copies of which are present in the capsid, organised so that 12 lie on each of the 20 facets of this structure. The central 9 hexons in a facet are cemented together by 12 copies of protein IX PUBMED:8334984. Protein IX is not necessarily required for viral replication, but has been shown to affect several processes including DNA-packaging capacity, thermostability, and the transcriptional activity of several promoters. For more information see PUBMED:15914835.

    \ ' '1119' 'IPR005608' '\

    The nucleoprotein core of adenoviruses consists of a double-stranded DNA molecule, with a covalently linked terminal protein at each end, in association with the highly basic protein VII and the mu peptide, and the more loosely associated protein V PUBMED:11038369. Protein V is also associated with the capsid and appears to provide a bridge between these two structures. After infection of the host cell, the capsid proteins become disassociated and the core is rapidly transported, in a process involving the cellular microtubules, to the nucleus where transcription of viral genes is initiated.

    \ \

    This entry represents protein V of adenoviruses. It has been suggested, though not proven, that this protein is involved in transport of the core virus particle to the nucleus through interaction with the cellular p32 protein PUBMED:9680131. p32 is a protein that is postulated, though again not proven, to be part of a rapid transport system between mitochondria and the nucleus. Protein V is not essential for viral reproduction, but its deletion leads to reduced thermostability and infectivity suggesting that it may play some role in viral assembly PUBMED:17208253.

    \ ' '1120' 'IPR000646' '\ This family includes hexon-associated proteins from adenoviruses. Adenoviruses are responsible for diseases such as pneumonia, cystitis, conjunctivitis and diarrhoea, all \ of which can be fatal to patients who are immunocompromised PUBMED:7704534.\ ' '1121' 'IPR003391' '\

    The genome of adenovirus contains a protein covalently bound\ to the 5\' end of each strand of the linear DNA molecule PUBMED:433158. Since adenovirus DNA replication is initiated at the termini of\ the DNA molecule it has been proposed that the\ terminal protein serves as the primer for initiation of replication.\ \ However, the priming function now appears to reside\ in the precursor form of the terminal protein (pTP) found on the\ 5\' ends of nascent DNA strands replicated in vitro PUBMED:6942401, PUBMED:6933548 and as\ a component of DNA-protein complexes isolated from virions\ of the protease-deficient adenovirus serotype 2 (Ad2) mutant\ tsl. The pTP is encoded by the leftward-transcribed strand of the viral genome and comprises part of a transcription unit that also encodes the single-strand\ DNA binding protein PUBMED:7471210.

    \ ' '1122' 'IPR004912' '\ The function of this protein is unknown. It has a conserved amino terminus of 50 residues followed by a positively charged tail,\ suggesting it may interact with nucleic acid. \ ' '1123' 'IPR007530' '\

    Also known as aminoglycoside 6-adenylyltransferase (), this protein confers resistance to aminoglycoside antibiotics.

    \ ' '1124' 'IPR008172' '\

    These sequences are functionally identified as members of the adenylate cyclase family, which catalyses the conversion of ATP to 3\',5\'-cyclic AMP and pyrophosphate.

    \ \

    The protein CyaB from Aeromonas hydrophila is a second adenylyl cyclase from that species, as demonstrated by complementation in Escherichia coli and by assay of the enzymatic properties of purified recombinant protein PUBMED:9642185. It has no detectable homology to any other protein of known function, and has several unusual properties, including an optimal temperature of 65 degrees and an optimal pH of 9.5. A cluster of uncharacterised archaeal homologs may be orthologous and serve (under certain circumstances) to produce the regulatory metabolite cyclic AMP (cAMP).

    \ ' '1125' 'IPR000274' '\ Adenylate cyclase is the enzyme responsible for the synthesis of cAMP from ATP. From sequence data, it has been proposed that there are three different classes of adenylate cyclases PUBMED:7863008. Class I cyclases are found in enterobacteria and related Gram-negative bacteria. They are proteins of about 850 residues that consist of two functional domains: a N-terminal catalytic domain and a C-terminal regulatory domain.\ There are two highly conserved regions, the first one is located in the catalytic domain and the second one in the regulatory domain. The second signature includes a conserved histidine which could be phosphorylated by a PTS system IIA enzyme, thus leading to the activation of the cyclase.\ ' '1126' 'IPR001114' '\

    Adenylosuccinate synthetase () plays an important role in purine biosynthesis, by catalysing the GTP-dependent conversion of IMP and aspartic acid to AMP. Adenylosuccinate synthetase has been characterised from various sources ranging from Escherichia coli (gene purA) to vertebrate tissues. In vertebrates, two isozymes are present: one involved in purine biosynthesis and the other in the purine nucleotide cycle.

    \ \

    The crystal structure of adenylosuccinate synthetase from E. coli reveals that the dominant structural element of each monomer of the homodimer is a central beta-sheet of 10 strands. The first nine strands of the sheet are mutually parallel with right-handed crossover connections between the strands. The 10th strand is antiparallel with respect to the first nine strands. In addition, the enzyme has two antiparallel beta-sheets, composed of two strands and three strands respectively, 11 alpha-helices and two short 3/10-helices. Further, it has been suggested that the similarities in the GTP-binding domains of the synthetase and the p21ras protein are an example of convergent evolution of two distinct families of GTP-binding proteins PUBMED:8244965. Structures of adenylosuccinate synthetase from Triticum aestivum and Arabidopsis thaliana, when compared with the known structures from E. coli, reveal that the overall fold is very similar to that of the E. coli protein PUBMED:10669609.

    \ ' '1127' 'IPR002198' '\ The short-chain dehydrogenases/reductases family (SDR) PUBMED:7742302 is a very large family of enzymes, most of which are known to be NAD- or NADP-dependent oxidoreductases. As the first member of this family to be characterised was Drosophila alcohol dehydrogenase, this family used to be called PUBMED:2707261, PUBMED:1889416, PUBMED:1740120 \'insect-type\', or \'short-chain\' alcohol dehydrogenases. Most members of this family are proteins of about 250 to 300 amino acid residues. Most dehydrogenases possess at least 2 domains PUBMED:6789320, the first binding the coenzyme, often NAD, and the second binding the substrate. This latter domain determines the substrate specificity and contains amino acids involved in catalysis. Little sequence similarity has been found in the coenzyme binding domain although there is a large degree of structural similarity, and it has therefore been suggested that the structure of dehydrogenases has arisen through gene fusion of a common ancestral coenzyme nucleotide sequence with various substrate specific domains PUBMED:6789320.\ ' '1128' 'IPR006713' '\ This family of adhesins binds to the Dr blood group antigen component of decay-accelerating factor. This mediates adherence of uropathogenic Escherichia coli to the urinary tract. This family contains both fimbriated and afimbriated adherence structures PUBMED:9169726. This protein also confers the phenotype of mannose-resistant hemagglutination, which can be inhibited by chloramphenicol. The N-terminal portion of the protein is thought to be responsible for chloramphenicol sensitivity PUBMED:1670929.\ ' '1129' 'IPR004940' '\ This family corresponds to a short 100 residue region found in adhesins and hypothetical adhesin-like proteins from Mycoplasmas.\ \ ' '1130' 'IPR000850' '\ Adenylate kinases (ADK) are phosphotransferases that catalyse the reversible reaction \ \ an essential reaction for many processes in living cells. 
Two ADK isozymes \ have been identified in mammalian cells. These specifically bind AMP and favour binding to ATP over \ other nucleotide triphosphates (AK1 is cytosolic and AK2 is located in the mitochondria). A third ADK \ has been identified in bovine heart and human cells PUBMED:6088234, this is a mitochondrial GTP:AMP \ phosphotransferase, also specific for the phosphorylation of AMP, but can only use GTP or ITP as a\ substrate PUBMED:218813. ADK has also been identified in different bacterial species and in yeast \ PUBMED:1587477. Two further enzymes are known to be related to the ADK family, i.e. yeast uridine \ monophosphokinase and slime mold UMP-CMP kinase. Within the ADK family there are several conserved \ regions, including the ATP-binding domains. One of the most conserved areas includes an Arg residue, \ whose modification inactivates the enzyme, together with an Asp that resides in the catalytic cleft \ of the enzyme and participates in a salt bridge.\ ' '1131' 'IPR007862' '\

    Adenylate kinases (ADK; ) are phosphotransferases that catalyse the Mg-dependent reversible conversion of ATP and AMP to two molecules of ADP, an essential reaction for many processes in living cells. In large variants of adenylate kinase, the AMP and ATP substrates are buried in a domain that undergoes conformational changes from an open to a closed state when bound to substrate; the ligand is then contained within a highly specific environment required for catalysis. Adenylate kinase is a 3-domain protein consisting of a large central CORE domain flanked by a LID domain on one side and the AMP-binding NMPbind domain on the other PUBMED:17299745. The LID domain binds ATP and covers the phosphates at the active site. The substrates first bind the CORE domain, followed by closure of the active site by the LID and NMPbind domains.

    \ \

    Comparisons of adenylate kinases have revealed a particular divergence in the active site lid. In some organisms, particularly the Gram-positive bacteria, residues in the lid domain have been mutated to cysteines and these cysteine residues (two CX(n)C motifs) are responsible for the binding of a zinc ion. The bound zinc ion in the lid domain is clearly structurally homologous to zinc-finger domains. However, it is unclear whether the adenylate kinase lid is a novel zinc-finger DNA/RNA binding domain, or whether the lid-bound zinc serves a purely structural function PUBMED:9715904.

    \ ' '1132' 'IPR000043' '\

    S-adenosyl-L-homocysteine hydrolase () (AdoHcyase) is an enzyme of the activated methyl cycle, responsible for the reversible hydration of S-adenosyl-L-homocysteine into adenosine and homocysteine. AdoHcyase is a ubiquitous enzyme which binds and requires NAD+ as a cofactor. AdoHcyase is a highly conserved protein PUBMED:1631127 of about 430 to 470 amino acids. The family contains a glycine-rich region in the central part of AdoHcyase, which is thought to be involved in NAD-binding.

    \ ' '1133' 'IPR015878' '\

    S-adenosyl-L-homocysteine hydrolase () (AdoHcyase) is an enzyme of the activated methyl cycle, responsible for the reversible hydration of S-adenosyl-L-homocysteine into adenosine and homocysteine. AdoHcyase is a ubiquitous enzyme which binds and requires NAD+ as a cofactor.\ AdoHcyase is a highly conserved protein PUBMED:1631127 of about 430 to 470 amino acids.

    This entry represents the glycine-rich region in the central part of AdoHcyase, which is thought to be involved in NAD-binding.

    \ ' '1134' 'IPR005502' '\ This family includes enzymes that perform ADP-ribosylations, such as ADP-ribosylarginine hydrolase which cleaves ADP-ribose-L-arginine PUBMED:8349667. The family also includes dinitrogenase reductase activating glycohydrolase PUBMED:2506427, and most surprisingly jellyfish crystallins PUBMED:2506427, although these proteins appear to have lost the presumed active site residues.\ ' '1135' 'IPR007666' '\

    Although ATP is the most common phosphoryl group donor for kinases, certain\ hyperthermophilic archaea, such as Thermococcus litoralis and Pyrococcus furiosus, utilise unusual ADP-dependent glucokinases (ADPGKs) and\ phosphofructokinases (ADPPFKs) in their glycolytic pathways PUBMED:11286887, PUBMED:12237466, PUBMED:12909015. ADPGKs and\ ADPPFKs exhibit significant similarity, and form an ADP-dependent kinase\ (ADPK) family, which was tentatively named the PFKC family PUBMED:11778837. A ~460-residue\ ADPK domain is also found in a bifunctional ADP-dependent gluco/phosphofructo-\ kinase (ADP-GK/PFK) from Methanocaldococcus jannaschii (Methanococcus jannaschii) as well as in homologous\ hypothetical proteins present in several eukaryotes PUBMED:11717273.

    \ \

    The whole structure of the ADPK domain can be divided into large and small\ alpha/beta subdomains. The larger subdomain, which carries\ the ADP binding site, consists of a twisted 12-stranded beta sheet flanked on\ both faces by 13 alpha helices and three 3(10) helices, forming an alpha/beta\ 3-layer sandwich. The smaller subdomain, which covers the active site, forms\ an alpha/beta two-layer structure containing 5 beta strands and four alpha\ helices. The ADP molecule is buried in a shallow pocket in the large\ subdomain. The binding of substrate sugar induces a structural change, the\ small domain closing to form a complete substrate sugar binding site PUBMED:11286887, PUBMED:12237466, PUBMED:12909015.

    \ ' '1137' 'IPR005830' '\

    This family represents the pore forming lobe of aerolysin, and the related toxins haemolysin and the leukocidin S subunit.

    \ \

    Aerolysin PUBMED:3584074 is a cytolytic toxin exported by Aeromonas hydrophila, a Gram-negative bacterium associated with diarrhoeal diseases and deep wound infections PUBMED:7510043. The mature toxin binds to eukaryotic cells and aggregates to form holes (approximately 3 nm in diameter) leading to the destruction of the membrane permeability barrier and osmotic lysis. The structure of proaerolysin has been determined to 2.8A resolution and shows the protoxin to adopt a novel fold PUBMED:7510043. Images of an aerolysin oligomer derived from electron microscopy have helped to construct a model of the protein and to outline a mechanism by which it might insert into lipid bilayers to form ion channels PUBMED:7510043.

    \ ' '1138' 'IPR007797' '\ This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to Homo sapiens diseases such as acute lymphoblastic leukemia and mental retardation PUBMED:11171403. The family also contains a Drosophila AF4 protein homologue Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila PUBMED:11171404.\ ' '1139' 'IPR005654' '\

    ATPase family gene 1 (AFG1) ATPase is a 377 amino acid putative protein with an ATPase motif typical of the protein family including SEC18p, PAS1, CDC48-VCP and TBP. AFG1 also has substantial homology to these proteins outside the ATPase domain PUBMED:1441755. This family of proteins contains a P-loop motif.

    \ ' '1140' 'IPR001203' '\

    Enzymes of the aldehyde ferredoxin oxidoreductase (AOR) family PUBMED:9242907 contain a tungsten cofactor and a 4Fe4S cluster and catalyse the interconversion of aldehydes to carboxylates PUBMED:8672295. This family includes AOR, formaldehyde ferredoxin oxidoreductase (FOR), glyceraldehyde-3-phosphate ferredoxin oxidoreductase (GAPOR), all isolated from hyperthermophilic archaea PUBMED:9242907; carboxylic acid reductase found in clostridia PUBMED:2550230; and hydroxycarboxylate viologen oxidoreductase from Proteus vulgaris, the sole member of the AOR family containing molybdenum PUBMED:8026480. GAPOR may be involved in glycolysis PUBMED:7721730, but the functions of the other proteins are not yet clear. AOR has been proposed to be the primary enzyme responsible for oxidising the aldehydes that are produced by the 2-keto acid oxidoreductases PUBMED:9275170.

    \

    This entry represents the C-terminal region of these enzymes, containing the alpha-helical structural domains 2 and 3 PUBMED:10024458, PUBMED:7878465.

    \ ' '1141' 'IPR013983' '\

    Enzymes of the aldehyde ferredoxin oxidoreductase (AOR) family PUBMED:9242907 contain a tungsten cofactor and a 4Fe4S cluster and catalyse the interconversion of aldehydes to carboxylates PUBMED:8672295. This family includes AOR, formaldehyde ferredoxin oxidoreductase (FOR), glyceraldehyde-3-phosphate ferredoxin oxidoreductase (GAPOR), all isolated from hyperthermophilic archaea PUBMED:9242907; carboxylic acid reductase found in clostridia PUBMED:2550230; and hydroxycarboxylate viologen oxidoreductase from Proteus vulgaris, the sole member of the AOR family containing molybdenum PUBMED:8026480. GAPOR may be involved in glycolysis PUBMED:7721730, but the functions of the other proteins are not yet clear. AOR has been proposed to be the primary enzyme responsible for oxidising the aldehydes that are produced by the 2-keto acid oxidoreductases PUBMED:9275170.

    \

    This entry represents the N-terminal domain of these enzymes. This domain has been shown to interact with the tungsten cofactor PUBMED:7878465.

    \ ' '1142' 'IPR003460' '\

    Antifreeze proteins (AFPs) are a class of proteins that are able to bind to and inhibit the growth of macromolecular ice, thereby permitting an organism to survive subzero temperatures by decreasing the probability of ice nucleation in their bodies PUBMED:15291806. These proteins have been characterised from a variety of organisms, including fish, plants, bacteria, fungi and arthropods. This entry represents insect AFPs of the type found in Tenebrio molitor (Yellow mealworm) and in Dendroides canadensis (Pyrochroid beetle).

    \

    The structure of these AFPs consists of a right-handed beta-helix with 12 residues per coil. Each 12 residue-repeat contains two cys residues that form a disulphide bridge. The beta-helices of insect AFPs present a highly rigid array of threonine residues and bound water molecules that can effectively mimic the ice lattice. As such, beta-helical AFPs provide a more effective coverage of the ice surface compared to the alpha-helical fish AFPs PUBMED:10917536.

    \

    A second insect antifreeze from Choristoneura fumiferana (Spruce budworm) () also consists of beta-helices; however, in these proteins the helices form a left-handed twist. These proteins show no sequence homology to the current entry, but may act by a similar mechanism. The beta-helix motif may be used as an AFP structural motif in non-homologous proteins from other (non-fish) organisms as well.

    \ ' '1143' 'IPR005509' '\ This is a key enzyme for A-factor (2-isocapryloyl-3R-hydroxymethyl-gamma-butyrolactone) biosynthesis. A-factor is a diffusible bioregulator that is essential for streptomycin production, streptomycin resistance, and spore formation.\ ' '1144' 'IPR006763' '\

    To date many different Plasmodium antigens recognised by hyperimmune human sera have been cloned, sequenced and characterised. The majority contain tandemly repeated amino acid sequences which make up a considerable portion of the protein sequence. It has been suggested that these repeat-containing antigens may provide an immunological smokescreen to the parasite in order to evade the human immune system. This repeat is found exclusively in the Plasmodium falciparum Ag332 protein and occupies most of its length PUBMED:7628570.

    \ ' '1145' 'IPR001480' '\ Members of this domain are plant lectins. Curculin is a sweet-tasting and taste-modifying protein from the fruits of Curculigo latifolia (Lumbah). The three mannose-binding sites are devoid of mannose-binding activity PUBMED:9132060. Other members of this domain are mannose specific and have diverse functions. The lectin of the saffron crocus (Crocus sativus) (Saffron) specifically interacts with a yeast mannan and is a major corm protein specifically expressed in this organ PUBMED:10691656. \

    The actin-binding\ and vesicle-associated protein comitin exhibits a mannose-specific\ lectin activity and may have a role in cell motility. It binds to vesicle membranes via mannose residues and, by way of its interaction with actin, links these membranes to the cytoskeleton.

    \ ' '1146' 'IPR007733' '\ The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein agouti signal protein (ASIP) is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation PUBMED:11837451, PUBMED:11833005.\ ' '1147' 'IPR006741' '\

    The accessory gene regulator (agr) of Staphylococcus aureus is the central regulatory system that controls the gene expression for a large set of virulence factors. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. At low cell density, the agr genes are continuously expressed at basal levels. A signal molecule, autoinducing peptide\ (AIP), produced and secreted by the bacteria, accumulates outside of the cells. When the cell density increases and the AIP concentration reaches a\ threshold, it activates the agr response, i.e. activation of secreted protein gene expression and subsequent repression of cell wall-associated protein genes. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein PUBMED:11195102 involved in the proteolytic processing of AgrD, and may have both proteolytic and transporter activities, facilitating the export of\ the processed AgrD peptide PUBMED:12122003.

    \ ' '1148' 'IPR006819' '\

    The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterised products PUBMED:3658701. This family represents the VirD5 protein.

    \ ' '1149' 'IPR000866' '\

    Peroxiredoxins (Prxs) are a ubiquitous family of antioxidant enzymes that also control cytokine-induced peroxide levels which mediate signal transduction in mammalian cells. Prxs can be regulated by changes to phosphorylation, redox and possibly oligomerisation states. Prxs are divided into three classes: typical 2-Cys Prxs; atypical 2-Cys Prxs; and 1-Cys Prxs. All Prxs share the same basic catalytic mechanism, in which an active-site cysteine (the peroxidatic cysteine) is oxidised to a sulphenic acid by the peroxide substrate. The recycling of the sulphenic acid back to a thiol is what distinguishes the three enzyme classes. Using crystal structures, a detailed catalytic cycle has been derived for typical 2-Cys Prxs, including a model for the redox-regulated oligomeric state proposed to control enzyme activity PUBMED:12517450.

    \ \ \

    Alkyl hydroperoxide reductase (AhpC) is responsible for directly reducing organic hydroperoxides in its reduced dithiol form. Thiol specific antioxidant (TSA) is a physiologically important antioxidant which constitutes an enzymatic defence against sulphur-containing radicals. This family contains AhpC and TSA, as well as related proteins.

    \ \

    Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee, King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.

    \ \

    The allergens in this family include allergens with the following designations: Asp f 3, Mal f 2 and Mal f 3.

    \ ' '1150' 'IPR003778' '\

    Allophanate hydrolase catalyses the second reaction in an ATP-dependent, two-step degradation of urea to ammonia and CO2. This follows the action of the biotin-containing urea carboxylase. Saccharomyces cerevisiae can use urea as a sole nitrogen source via this degradation pathway PUBMED:6124544. In yeast, the fusion of allophanate hydrolase to urea carboxylase is called urea amidolyase.

    \ \

    In bacteria, the second step in the urea degradation pathway is also the ATP-dependent allophanate hydrolase. The gene encoding this enzyme is found adjacent to the urea carboxylase gene PUBMED:15796980. Allophanate hydrolase has strict substrate specificity, as analogues of allophanate are not hydrolysed by it PUBMED:15796980.

    \ \

    This domain represents subunit 2 of allophanate hydrolase (AHS2) which is found in urea carboxylase.

    \ ' '1151' 'IPR013982' '\ This entry represents a region associated with formylation activity which is found in a family of bifunctional enzymes catalysing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase (AICARFT), this enzyme catalyses the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate PUBMED:9332377. The last step is catalysed by IMP (Inosine monophosphate) cyclohydrolase (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP PUBMED:9332377.\ ' '1152' 'IPR006703' '\

    This entry represents Arabidopsis protein AIG1 which appears to be involved in plant resistance to bacteria. The Arabidopsis disease resistance gene RPS2 is involved in recognition of bacterial pathogens carrying the avirulence gene avrRpt2. AIG1 (avrRpt2-induced gene) exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae carrying avrRpt2 PUBMED:8742710.

    \

    The pattern also recognises a number of mammalian proteins, for example the rat immune-associated nucleotide 4 protein, suggesting that the family may have a wider function.

    \ ' '1153' 'IPR000031' '\

    Phosphoribosylaminoimidazole carboxylase is a fusion protein in plants and fungi, but consists of two non-interacting proteins in bacteria, PurK and PurE.\ PurK, N5-carboxyaminoimidazole ribonucleotide (N5-CAIR) synthetase, catalyzes the conversion of 5-aminoimidazole ribonucleotide (AIR), ATP, and bicarbonate to N5-CAIR, ADP, and Pi. PurE converts N5-CAIR to CAIR, the sixth step of de novo purine biosynthesis. In the presence of high concentrations of bicarbonate, PurE is reported to be able to convert AIR to CAIR directly and without ATP. Some members of this family contain two copies of this domain PUBMED:10074353. The crystal structure of PurE indicates a unique quaternary structure that confirms the octameric nature of the enzyme PUBMED:10574791.

    \ ' '1154' 'IPR000728' '\ This family includes Hydrogen expression/formation protein, HypE, which may be involved in\ the maturation of NifE hydrogenase; AIR synthase and FGAM synthase, which are involved in\ de novo purine biosynthesis; and selenide, water dikinase, an enzyme which synthesizes\ selenophosphate from selenide and ATP.\ ' '1155' 'IPR010918' '\

    This entry includes Hydrogen expression/formation protein, HypE, which may be involved in the maturation of NifE hydrogenase; AIR synthase and FGAM synthase, which are involved in de novo purine biosynthesis; and selenide, water dikinase, an enzyme which synthesizes selenophosphate from selenide and ATP.

    \ ' '1156' 'IPR007071' '\ A-kinase (or PKA)-anchoring protein AKAP95 is implicated in mitotic chromosome condensation by acting as a targeting molecule for the condensin complex. The protein contains two zinc fingers which are thought to mediate the binding of AKAP95 to DNA PUBMED:11964380.\ ' '1157' 'IPR004236' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The alpha-lytic protease prodomain is associated with serine peptidases, specifically the alpha-lytic endopeptidases and streptogrisin A, B, C, D and E, which are bacterial enzymes and which belong to MEROPS peptidase subfamily S1A (). The protease precursor in Gram-negative bacterial proteases may be a general property of extracellular bacterial proteases PUBMED:3234766. The proteases are encoded with a large (166 amino acid) N-terminal pro region that is required transiently both in vivo and in vitro for the correct folding of the protease domain PUBMED:2507926, PUBMED:1552947. The pro region also acts as a potent inhibitor of the mature enzyme PUBMED:1579568.

    \ ' '1159' 'IPR001731' '\

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway PUBMED:16564539. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin PUBMED:17227226.

    \

    The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase (), or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase () to charge a tRNA with glutamate, glutamyl-tRNA reductase () to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase () to catalyse a transamination reaction to produce ALA.

    \

    The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase, ) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase, ) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase () to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.

    \

    Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase PUBMED:11215515.

    \ \ \

    This entry represents porphobilinogen (PBG) synthase (PBGS, or 5-aminolaevulinic acid dehydratase, or ALAD), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses a Knorr-type condensation reaction between two molecules of ALA to generate porphobilinogen, the pyrrolic building block used in later steps PUBMED:17311232. The structure of the enzyme is based on a TIM barrel topology made up of eight identical subunits, where each subunit binds to a metal ion that is essential for activity, usually zinc (in yeast, mammals and certain bacteria) or magnesium (in plants and other bacteria). A lysine has been implicated in the catalytic mechanism PUBMED:3092810. A deficiency of PBGS causes a rare porphyric disorder known as ALAD porphyria, which appears to involve conformational changes in the enzyme PUBMED:17236137.

    \ ' '1160' 'IPR015590' '\

    Aldehyde dehydrogenases are enzymes that oxidize a wide variety of aliphatic and aromatic aldehydes using NADP as a cofactor. In mammals at least four different forms of the enzyme are known PUBMED:2713359: class-1 (or Ald C), a tetrameric cytosolic enzyme; class-2 (or Ald M), a tetrameric mitochondrial enzyme; class-3 (or Ald D), a dimeric cytosolic enzyme; and class IV, a microsomal enzyme. Aldehyde dehydrogenases have also been sequenced from fungal and bacterial species. A number of enzymes are known to be evolutionarily related to aldehyde dehydrogenases. A glutamic acid and a cysteine residue have been implicated in the catalytic activity of mammalian aldehyde dehydrogenase. These residues are conserved in all the enzymes of this entry.

    \ \

    Some of the proteins in this entry are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.

    \

    The allergens in this family include allergens with the following designations: Alt a 10 and Cla h 3.

    \ ' '1161' 'IPR000887' '\ 4-Hydroxy-2-oxoglutarate aldolase (KHG-aldolase) catalyzes the reversible cleavage of 4-hydroxy-2-oxoglutarate into pyruvate and glyoxylate. Phospho-2-dehydro-3-deoxygluconate aldolase (KDPG-aldolase) catalyzes the reversible cleavage of 6-phospho-2-dehydro-3-deoxy-D-gluconate into pyruvate and glyceraldehyde 3-phosphate. These two enzymes are structurally and functionally related PUBMED:3136164. They are both homotrimeric proteins of approximately 220 amino-acid residues. They are class I aldolases whose catalytic mechanism involves the formation of a Schiff-base intermediate between the substrate and the epsilon-amino group of a lysine residue. In both enzymes, an arginine is required for catalytic activity.\ ' '1162' 'IPR001303' '\

    This entry represents the alpha/beta/alpha domain found in class II aldolases and adducin, usually at the N-terminus. These proteins form part of a family that includes: rhamnulose-1-phosphate aldolase; L-fuculose phosphate aldolase PUBMED:8515438, PUBMED:8676381, which is involved in the third step of fucose metabolism; L-ribulose-5-phosphate 4-epimerase, which is involved in the third step of L-arabinose catabolism; a probable sugar isomerase SgbE; hypothetical proteins; and the metazoan adducins, which have not been ascribed any enzymatic function but which play a role in cell membrane cytoskeleton organisation.

    \

    Adducins are members of the Ig superfamily and encode cell surface sialoglycoproteins expressed by cytokine-activated endothelium. This type I membrane protein mediates leukocyte-endothelial cell adhesion and signal transduction, and may play a role in the development of atherosclerosis and rheumatoid arthritis. Adducin is a cell-membrane skeletal protein that was first purified from human erythrocytes and subsequently isolated from bovine brain membranes. Isoforms of this protein have been detected in lung, kidney, testes and liver. Erythrocyte adducin is a 200-kDa heterodimer protein, composed of alpha and beta subunits, present at about 30,000 copies per cell. It binds with high affinity to Ca(2+)/calmodulin and is a substrate for protein kinases A and C. Both alpha-adducin and beta-adducin show alternative splicing. Thus, there may be several different heterodimeric or homodimeric forms of adducin, each with a different functional specificity. It is thought to play a role in assembly of the spectrin-actin lattice that underlies the plasma membrane PUBMED:102560. Missense mutations in both the alpha- and beta-adducin genes that alter amino acids that are normally phosphorylated have been associated with the regulation of blood pressure in the Milan hypertensive strain (MHS) of rats. Gamma adducin was isolated from human foetal brain PUBMED:8893809. It shows a high degree of similarity to the alpha and beta adducins.

    \ \ ' '1163' 'IPR005506' '\ This set of repeats is found in a small family of secreted proteins of no known function, which may be involved in signal transduction.\ ' '1164' 'IPR007873' '\ The formation of N-glycosidic linkages of glycoproteins involves the ordered assembly of the common Glc3Man9GlcNAc2 core-oligosaccharide on the lipid carrier dolichyl pyrophosphate. Whereas early mannosylation steps occur on the cytoplasmic side of the endoplasmic reticulum with GDP-Man as donor, the final reactions from Man5GlcNAc2-PP-Dol to Man9GlcNAc2-PP-Dol on the lumenal side use Dol-P-Man PUBMED:11308030. The ALG3 gene encodes the Dol-P-Man:Man5GlcNAc2-PP-Dol mannosyltransferase.\ ' '1165' 'IPR015908' '\

    Allantoicase (also known as allantoate amidinohydrolase) is involved in purine degradation, facilitating the utilization of purines as secondary nitrogen sources under nitrogen-limiting conditions. While purine degradation converges to uric acid in all vertebrates, its further degradation varies from species to species. Uric acid is excreted by birds, reptiles, and some mammals that do not have a functional uricase gene, whereas other mammals produce allantoin. Amphibians and microorganisms produce ammonia and carbon dioxide using the uricolytic pathway. Allantoicase performs the second step in this pathway, catalyzing the conversion of allantoate into ureidoglycolate and urea.

    \

    \ \

    The structure of allantoicase is best described as being composed of two repeats (the allantoicase repeats: AR1 and AR2), which are connected by a flexible linker. The crystal structure, solved at 2.4 Å resolution, reveals that AR1 has a very similar fold to AR2, both repeats being jelly-roll motifs, composed of four-stranded and five-stranded antiparallel beta-sheets PUBMED:15229895. Each jelly-roll motif has two conserved surface patches that probably constitute the active site PUBMED:15020593.

    \ ' '1166' 'IPR006948' '\ Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system PUBMED:12235163.\ ' '1167' 'IPR007173' '\ This domain is specific to D-arabinono-1,4-lactone oxidase , which is involved in the final step of the D-erythroascorbic acid biosynthesis pathway PUBMED:10094636.\ ' '1168' 'IPR006047' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Enzymes containing this domain, such as alpha-amylase, belong to family 13 of the glycosyl hydrolases. The maltogenic alpha-amylase is an enzyme which catalyses hydrolysis of (1-4)-alpha-D-glucosidic linkages in polysaccharides so as to remove successive alpha-maltose residues from the non-reducing ends of the chains in the conversion of starch to maltose. Other enzymes include neopullulanase, which hydrolyses pullulan to panose, and cyclomaltodextrinase, which hydrolyses cyclodextrins.

    \

    This entry represents the catalytic domain found in several protein members of this family. It has a structure consisting of an 8-stranded alpha/beta barrel that contains the active site, interrupted by a ~70 amino acid calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain PUBMED:16302977.

    \

    More information about this protein can be found at Protein of the Month: alpha-Amylase.

    \ ' '1169' 'IPR004185' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ Enzymes containing this domain belong to family 13 of the glycosyl hydrolases. The maltogenic alpha-amylase is an enzyme which catalyses hydrolysis of (1-4)-alpha-D-glucosidic linkages in polysaccharides so as to remove successive alpha-maltose residues from the non-reducing ends of the chains in the conversion of starch to maltose. Other enzymes include neopullulanase, which hydrolyses pullulan to panose, and cyclomaltodextrinase, which hydrolyses cyclodextrins.\ ' '1170' 'IPR003164' '\

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport PUBMED:15261670. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors PUBMED:17449236, PUBMED:11598180.

    \

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes PUBMED:15107467. AP2 associates with the plasma membrane and is responsible for endocytosis PUBMED:12952931. AP3 is responsible for protein trafficking to lysosomes and other related organelles PUBMED:16542748. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins PUBMED:11080148. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface PUBMED:17254016.

    \

    AP adaptor alpha-adaptin can be divided into a trunk domain and the C-terminal appendage domain (or ear domain), separated by a linker region. The C-terminal appendage domain regulates translocation of endocytic accessory proteins to the bud site PUBMED:12057195.

    \

    This entry represents a subdomain of the appendage (ear) domain of alpha-adaptin from AP clathrin adaptor complexes. This domain has a three-layer arrangement, alpha-beta-alpha, with a bifurcated antiparallel beta-sheet PUBMED:10430869.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin.

    \ ' '1171' 'IPR008152' '\

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport PUBMED:15261670. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors PUBMED:17449236, PUBMED:11598180.

    \

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes PUBMED:15107467. AP2 associates with the plasma membrane and is responsible for endocytosis PUBMED:12952931. AP3 is responsible for protein trafficking to lysosomes and other related organelles PUBMED:16542748. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins PUBMED:11080148. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface PUBMED:17254016.

    \

    GGAs (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) are a family of monomeric clathrin adaptor proteins that are conserved from yeasts to humans. GGAs regulate the clathrin-mediated transport of proteins (such as mannose 6-phosphate receptors) from the TGN to endosomes and lysosomes through interactions with TGN-sorting receptors, sometimes in conjunction with AP-1 PUBMED:14973137, PUBMED:14745135. GGAs bind cargo, membranes, clathrin and accessory factors. GGA1, GGA2 and GGA3 all contain a domain homologous to the ear domain of gamma-adaptin. GGAs are composed of a single polypeptide with four domains: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The VHS domain is responsible for endocytosis and signal transduction, recognising transmembrane cargo through the ACLL sequence in the cytoplasmic domains of sorting receptors PUBMED:11859376. The GAT domain (also found in Tom1 proteins) interacts with ARF (ADP-ribosylation factor) to regulate membrane trafficking PUBMED:16413283, and with ubiquitin for receptor sorting PUBMED:15966896. The hinge region contains a clathrin box for recognition and binding to clathrin, similar to that found in AP adaptins. The GAE domain is similar to the AP gamma-adaptin ear domain, and is responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis PUBMED:12858162.

    \

    This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of alpha-, beta- and gamma-adaptin from AP clathrin adaptor complexes, and the GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 7 or 8 strands in 2 beta-sheets in a Greek key topology PUBMED:12042876, PUBMED:12808037. Although these domains share a similar fold, there is little sequence identity between the alpha/beta-adaptins and gamma-adaptin/GAE.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin.

    \ ' '1172' 'IPR000930' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N-terminus of the p130 polyprotein of togaviruses PUBMED:7845208. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes PUBMED:7845208. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity PUBMED:7845208. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin PUBMED:7845208, PUBMED:1944569.

    \ ' '1173' 'IPR002548' '\

    Alphaviruses are enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Semliki Forest and Sindbis viruses PUBMED:15378043. Alphaviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope proteins P62 and E1 that associate as a heterodimer. The viral membrane-anchored surface glycoproteins are responsible for receptor recognition and entry into target cells through membrane fusion. The proteolytic maturation of P62 into E2 and E3 causes a change in the viral surface. Together the E1, E2, and sometimes E3, glycoprotein "spikes" form an E1/E2 dimer or an E1/E2/E3 trimer, where E2 extends from the centre to the vertices, E1 fills the space between the vertices, and E3, if present, is at the distal end of the spike PUBMED:8107141. Upon exposure of the virus to the acidity of the endosome, E1 dissociates from E2 to form an E1 homotrimer, which is necessary for the fusion step to drive the cellular and viral membranes together. The alphaviral glycoprotein E1 is a class II viral fusion protein, which is structurally different from the class I fusion proteins found in influenza virus and HIV. The Semliki Forest virus E1 structure is similar to that of flaviviral glycoprotein E, with three structural domains in the same primary sequence arrangement PUBMED:11301009. This entry represents all three domains of the alphaviral E1 glycoprotein.

    \ \ \ \ ' '1174' 'IPR000936' '\

    Alphaviruses are enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Semliki Forest and Sindbis viruses PUBMED:15378043. Alphaviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope proteins P62 and E1 that associate as a heterodimer. The viral membrane-anchored surface glycoproteins are responsible for receptor recognition and entry into target cells through membrane fusion. The proteolytic maturation of P62 into E2 and E3 causes a change in the viral surface. Together the E1, E2, and sometimes E3 glycoprotein "spikes" form an E1/E2 dimer or an E1/E2/E3 trimer, where E2 extends from the centre to the vertices, E1 fills the space between the vertices, and E3, if present, is at the distal end of the spike PUBMED:8107141, PUBMED:9445057. Upon exposure of the virus to the acidity of the endosome, E1 dissociates from E2 to form an E1 homotrimer, which is necessary for the fusion step to drive the cellular and viral membranes together PUBMED:11301009. This entry represents the alphaviral E2 glycoprotein. The E2 glycoprotein functions to interact with the nucleocapsid through its cytoplasmic domain, while its ectodomain is responsible for binding a cellular receptor.

    \ ' '1175' 'IPR002533' '\

    Alphaviruses are enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Semliki Forest and Sindbis viruses PUBMED:15378043. Alphaviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope proteins P62 and E1 that associate as a heterodimer. The viral membrane-anchored surface glycoproteins are responsible for receptor recognition and entry into target cells through membrane fusion. The proteolytic maturation of P62 into E2 and E3 causes a change in the viral surface. Together the E1, E2, and sometimes E3 glycoprotein "spikes" form an E1/E2 dimer or an E1/E2/E3 trimer, where E2 extends from the centre to the vertices, E1 fills the space between the vertices, and E3, if present, is at the distal end of the spike PUBMED:8107141, PUBMED:9445057. Upon exposure of the virus to the acidity of the endosome, E1 dissociates from E2 to form an E1 homotrimer, which is necessary for the fusion step to drive the cellular and viral membranes together PUBMED:11301009. This entry represents the alphaviral E3 glycoprotein. Most alphaviruses lose the peripheral protein E3, but in Semliki viruses it remains associated with the viral surface.

    \ ' '1176' 'IPR004913' '\

    The exact function of the herpesvirus glycoprotein J is unknown, but it appears to play a role in the inhibition of apoptosis of the host cell PUBMED:11090178.

    \ ' '1177' 'IPR000933' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Family 29 encompasses alpha-L-fucosidases PUBMED:2482732, lysosomal enzymes responsible for hydrolysing the alpha-1,6-linked fucose joined to the reducing-end N-acetylglucosamine of the carbohydrate moieties of glycoproteins.

    \ \

    Fucosylated glycoconjugates are involved in numerous biological events, making alpha-L-fucosidases, the enzymes responsible for their processing, critically important. Deficiency in alpha-L-fucosidase activity is associated with fucosidosis, a lysosomal storage disorder characterised by rapid neurodegeneration, resulting in severe mental and motor deterioration PUBMED:14715651. The enzyme is a hexamer and displays a two-domain fold, composed of a catalytic (beta/alpha)(8)-like domain and a C-terminal beta-sandwich domain PUBMED:14715651.

    \ \

    Drosophila melanogaster spermatozoa contain an alpha-L-fucosidase that might be involved in fertilisation by interacting with alpha-L-fucose residues on the micropyle of the eggshell PUBMED:18556148. In human sperm, membrane-associated alpha-L-fucosidase is stable for extended periods of time, which is made possible by membrane domains and compartmentalisation. These help preserve protein integrity PUBMED:18522672.\

    \ ' '1178' 'IPR003174' '\

    Alpha-TIF (VP16) from Herpes Simplex virus is an essential tegument protein involved in the transcriptional activation of viral immediate early (IE) promoters (alpha genes) during the lytic phase of viral infection. VP16 associates with cellular transcription factors to enhance transcription rates, including the general transcription factor TFIIB and the transcriptional coactivator PC4. The N-terminal residues of VP16 confer specificity for the IE genes, while the C-terminal residues are responsible for transcriptional activation. Within the C-terminal region are two activation regions that can independently and cooperatively activate transcription PUBMED:15654739. VP16 forms a transcriptional regulatory complex with two cellular proteins, the POU-domain transcription factor Oct-1 and the cell-proliferation factor HCF-1 PUBMED:12826401. VP16 is an alpha/beta protein with an unusual fold. Other transcription factors may have a similar topology.

    \ \ ' '1179' 'IPR003298' '\

    A novel antigen of Plasmodium falciparum has been cloned that contains a hydrophobic domain typical of an integral membrane protein. The antigen is designated apical membrane antigen 1 (AMA-1) by virtue of appearing to be located in the apical complex PUBMED:2701947. AMA-1 appears to be transported to the merozoite surface close to the time of schizont rupture.

    \

    The 66 kDa merozoite surface antigen (PK66) of Plasmodium knowlesi, a simian malaria parasite, possesses vaccine-related properties believed to originate from a receptor-like role in parasite invasion of erythrocytes PUBMED:2211675. The sequence of PK66 is conserved throughout Plasmodium, and shows high similarity to P. falciparum AMA-1. Following schizont rupture, the distribution of PK66 changes in a coordinated manner associated with merozoite invasion. Prior to rupture, the protein is concentrated at the apical end, following which it distributes itself entirely across the surface of the free merozoite. Immunofluorescence studies suggest that, during invasion, PK66 is excluded from the erythrocyte at, and behind, the invasion interface PUBMED:2211675.

    \ ' '1180' 'IPR005611' '\

    Amb V is an Ambrosia sp. (ragweed) pollen allergen. Amb t V has been shown to contain a C-terminal helix as the major T cell epitope. Free sulphydryl groups also play a major role in the T cell recognition of cross-reactive T cell epitopes within these related allergens PUBMED:7594515.

    \ ' '1181' 'IPR004116' '\

    Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth. They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins PUBMED:8118759.

    \

    \ Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units PUBMED:8454575: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region PUBMED:2598664. The beta-spiral offers a probable site for interactions with Ca2+ ions.

    \

    \ Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9 bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide PUBMED:7782077.

    \ ' '1182' 'IPR006799' '\ Anti-Mullerian hormone (AMH) is a signalling molecule involved in male and female sexual differentiation PUBMED:1782869. Defects in synthesis or action of AMH cause persistent Mullerian duct syndrome (PMDS), a rare form of male pseudohermaphroditism PUBMED:8162013. This family represents the N-terminal part of the protein, which is not thought to be essential for activity PUBMED:8162013. AMH contains a TGF-beta domain at the C-terminus.\ ' '1183' 'IPR000120' '\

    Amidase signature (AS) enzymes are a large group of hydrolytic enzymes that contain a conserved stretch of approximately 130 amino acids known as the AS sequence. They are widespread, being found in both prokaryotes and eukaryotes. AS enzymes catalyse the hydrolysis of amide bonds (CO-NH2), although the family has diverged widely with regard to substrate specificity and function. Nonetheless, these enzymes maintain a core alpha/beta/alpha structure, where the topologies of the N- and C-terminal halves are similar. AS enzymes characteristically have a highly conserved C-terminal region rich in serine and glycine residues, but devoid of aspartic acid and histidine residues; they therefore differ from classical serine hydrolases. These enzymes possess a unique, highly conserved Ser-Ser-Lys catalytic triad used for amide hydrolysis, although the catalytic mechanism for acyl-enzyme intermediate formation can differ between enzymes PUBMED:15595822.

    Examples of AS enzymes include:

    \ \ ' '1184' 'IPR002901' '\

    This domain is found in many different proteins including mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase (). It is also found in flagellar protein J (), which has been shown to\ hydrolyse peptidoglycan PUBMED:10049388.

    \ ' '1185' 'IPR003198' '\

    This family contains glycine and inosamine amidinotransferases, enzymes which are involved in creatine and streptomycin biosynthesis respectively. This family also includes arginine deiminases, which catalyse the reversible reaction:

    \

    The Streptococcus anti-tumour glycoprotein is also found in this family PUBMED:9218780.

    \ ' '1186' 'IPR006680' '\

    This group of enzymes represents a large metal dependent hydrolase superfamily PUBMED:8550522. The family includes adenine deaminase () that hydrolyses adenine to form hypoxanthine and ammonia. The adenine deaminase reaction is important for adenine utilization as a purine and also as a nitrogen source PUBMED:9144792. This family also includes dihydroorotase and N-acetylglucosamine-6-phosphate deacetylases (). These enzymes catalyse the reaction: This family includes dihydroorotase and urease which belong to MEROPS peptidase family M38 (beta-aspartyl dipeptidase, clan MJ), where they are classified as non-peptidase\ homologs.

    \ ' '1187' 'IPR006992' '\

    These proteins are related to the metal-dependent hydrolase superfamily PUBMED:9144792. The family includes 2-amino-3-carboxymuconate-6-semialdehyde decarboxylase (ACMSD), which converts alpha-amino-beta-carboxymuconate-epsilon-semialdehyde (ACMS) to alpha-aminomuconate semialdehyde (AMS). ACMS can be converted non-enzymatically to quinolinate, a potent endogenous excitotoxin of neuronal cells which is implicated in the pathogenesis of various neurodegenerative disorders. In the presence of ACMSD, ACMS is converted to AMS, a benign catabolite.

    \ \ ' '1188' 'IPR004839' '\ Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped PUBMED:1990006 into class I and class II. This entry includes proteins from both subfamilies.\ ' '1189' 'IPR005814' '\

    Aminotransferases share certain mechanistic features with other pyridoxalphosphate-dependent enzymes, such as the covalent binding of the pyridoxalphosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped PUBMED:1618757 into subfamilies. One of these, called class-III, includes acetylornithine aminotransferase (), which catalyzes the transfer of an amino group from acetylornithine to alpha-ketoglutarate, yielding N-acetyl-glutamic-5-semi-aldehyde and glutamic acid; ornithine aminotransferase (), which catalyzes the transfer of an amino group from ornithine to alpha-ketoglutarate, yielding glutamic-5-semi-aldehyde and glutamic acid; omega-amino acid--pyruvate aminotransferase (), which catalyzes transamination between a variety of omega-amino acids, mono- and diamines, and pyruvate; 4-aminobutyrate aminotransferase () (GABA transaminase), which catalyzes the transfer of an amino group from GABA to alpha-ketoglutarate, yielding succinate semialdehyde and glutamic acid; DAPA aminotransferase (), a bacterial enzyme (bioA), which catalyzes an intermediate step in the biosynthesis of biotin, the transamination of 7-keto-8-aminopelargonic acid to form 7,8-diaminopelargonic acid; 2,2-dialkylglycine decarboxylase (), a Burkholderia cepacia (Pseudomonas cepacia) enzyme (dgdA) that catalyzes the decarboxylating amino transfer of 2,2-dialkylglycine and pyruvate to dialkyl ketone, alanine and carbon dioxide; glutamate-1-semialdehyde \ aminotransferase () (GSA); Bacillus subtilis aminotransferases yhxA and yodT; Haemophilus influenzae aminotransferase HI0949; and Caenorhabditis elegans aminotransferase T01B11.2.

    \ ' '1190' 'IPR001544' '\

    Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped PUBMED:1644759 into subfamilies.

    \

    One of these, called class-IV, currently consists of proteins of about 270 to 415 amino-acid residues that share a few regions of sequence similarity. Surprisingly, the best conserved region does not include the lysine residue to which the pyridoxal-phosphate group is known to be attached in ilvE, but is located some 40 residues on the C-terminal side of the pyridoxal-phosphate-lysine. The D-amino acid transferases (D-AAT), which are among the members of this entry, are required by bacteria to catalyse the synthesis of D-glutamic acid and D-alanine, which are essential constituents of the bacterial cell wall and are the building blocks for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-AATs have strong similarity PUBMED:7626635, PUBMED:9163511.

    \ ' '1191' 'IPR000192' '\ Aminotransferases share certain mechanistic features with other pyridoxal-\ phosphate dependent enzymes, such as the covalent binding of the pyridoxal-\ phosphate group to a lysine residue. On the basis of sequence similarity,\ these various enzymes can be grouped PUBMED:8482384 into subfamilies. This entry represents the class V aminotransferases and the related, though functionally distinct, cysteine desulfurases.\ ' '1192' 'IPR003211' '\

    Helicobacter pylori is a Gram-negative, ureolytic bacterium that can colonise the human stomach. It prefers a neutral pH and does not survive in a medium with a pH less than 4.0 unless urea is present. Gastric juice urea is able to rapidly access intrabacterial urease when the periplasmic pH falls below approximately 6.2 owing to pH-gating of a urea channel, UreI. UreI is a six-transmembrane segment protein that is homologous to the amiS genes of the amidase gene cluster and to UreI of Helicobacter hepaticus and Streptococcus salivarius. UreI in H. pylori and H. hepaticus can transport urea only at acidic pH, whereas that of S. salivarius is open at both neutral and acidic pH PUBMED:12471160.

    \ \

    The amiS gene encodes an 18-kDa protein with a high content of hydrophobic residues. It has six transmembrane helices. AmiB and AmiS resemble two components of an ABC transporter system PUBMED:7642533.

    \ \

    This family includes UreI and proton gated urea channel as well as putative amide transporters PUBMED:10642549.

    \ ' '1193' 'IPR003393' '\

    Ammonia monooxygenase and the particulate methane monooxygenase are both integral membrane proteins, occurring in ammonia oxidisers and methanotrophs respectively, which are thought to be evolutionarily related PUBMED:7590173. These enzymes have a relatively wide substrate specificity and can catalyse the oxidation of a range of substrates including ammonia, methane, halogenated hydrocarbons and aromatic molecules PUBMED:12209257. These enzymes are composed of 3 subunits - A (), B () and C () - and contain various metal centres, including copper. Particulate methane monooxygenase from Methylococcus capsulatus str. Bath is an ABC homotrimer, which contains mononuclear and dinuclear copper metal centres, and a third metal centre containing a metal ion whose identity in vivo is not certain PUBMED:15674245.

    \

    The A subunit from Methylococcus capsulatus str. Bath resides primarily within the membrane and consists of 7 transmembrane helices and a beta-hairpin which interacts with the soluble region of the B subunit. A conserved glutamate residue is thought to contribute to a metal centre PUBMED:15674245.

    \ ' '1194' 'IPR001905' '\

    All functionally characterised members of the ammonium transporter family are ammonia or ammonium uptake transporters. Some, but not others, also transport methylammonium. The mechanism of energy coupling, if any, to methyl-NH2 or NH3 uptake by the AmtB protein of Escherichia coli is not entirely clear. NH4+ uniport driven by the proton motive force (pmf), energy independent NH3 facilitation, and NH4+/K+ antiport have been proposed as possible transport mechanisms. In Corynebacterium glutamicum (Brevibacterium flavum) and Arabidopsis thaliana (Mouse-ear cress), uptake via the Amt1 homologues of AmtB has been reported to be driven by the pmf.

    \ ' '1195' 'IPR007820' '\

    This family contains sequences annotated as ammonia monooxygenase. The AmoA gene product from Pseudomonas putida has been characterised as ammonia monooxygenase PUBMED:9732537.\ Ammonia monooxygenase catalyses the oxidation of NH(3) to NH(2)OH.

    \ ' '1196' 'IPR006980' '\

    Ammonia monooxygenase and the particulate methane monooxygenase are both integral membrane proteins, occurring in ammonia oxidisers and methanotrophs respectively, which are thought to be evolutionarily related PUBMED:7590173. These enzymes have a relatively wide substrate specificity and can catalyse the oxidation of a range of substrates including ammonia, methane, halogenated hydrocarbons and aromatic molecules PUBMED:12209257. These enzymes are composed of 3 subunits - A (), B () and C () - and contain various metal centres, including copper. Particulate methane monooxygenase from Methylococcus capsulatus str. Bath is an ABC homotrimer, which contains mononuclear and dinuclear copper metal centres, and a third metal centre containing a metal ion whose identity in vivo is not certain PUBMED:15674245.

    \

    The C subunit from Methylococcus capsulatus str. Bath resides primarily in the membrane and consists of five transmembrane helices. Several conserved residues contribute to a metal binding centre PUBMED:15674245.

    \ ' '1197' 'IPR005533' '\

    This domain may have a role in cell adhesion PUBMED:11893501. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges.

    \ \ ' '1198' 'IPR006828' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \

    This region is found in the beta subunit of the 5-AMP-activated protein kinase complex, and its yeast homologues Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex PUBMED:8621499. This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known PUBMED:7813428. The isoamylase domain () is sometimes found associated with proteins that contain this C-terminal domain.

    \ ' '1199' 'IPR001103' '\

    Steroid or nuclear hormone receptors (NRs) constitute an important super-family of transcription regulators that are involved in diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members include the steroid hormone receptors and receptors for thyroid hormone, retinoids and 1,25-dihydroxy-vitamin D3. The proteins function as dimeric molecules in the nucleus to regulate the transcription of target genes in a ligand-responsive manner PUBMED:7899080, PUBMED:8165128.

    \ \

    NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. Many do not yet have a defined ligand and are accordingly termed "orphan" receptors. More than 300 NRs have been described to date and a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.

    \ \

    The androgen receptor (AR) consists of 3 functional and structural domains: an N-terminal (modulatory) domain; a DNA binding domain () that mediates specific binding to target DNA sequences (ligand-responsive elements); and a hormone binding domain. The N-terminal domain (NTD) is unique to the androgen receptors and spans approximately the first 530 residues; the highly-conserved DNA-binding domain is smaller (around 65 residues) and occupies the central portion of the protein; and the hormone ligand binding domain (LBD) lies at the receptor C-terminus. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity.

    \ \

    The LBDs of steroid hormone receptors fold into 12 helices that form a ligand-binding pocket. When an agonist is bound, helix 12 folds over the pocket to enclose the ligand PUBMED:12089231. When an antagonist is bound, helix 12 is positioned away from the pocket in a way that interferes with the binding of coactivators to a groove in the hormone-binding domain formed after ligand binding. In AR, ligand binding that induces folding of helix 12 to overlie the pocket discloses a groove that binds a region of the NTD. Coactivator molecules can also bind to this groove, but the predominant site for coactivator binding to AR is in the NTD. AR ligand resides in a pocket and primarily contacts helices 4, 5, and 10. The DNA-binding region includes eight cysteine residues that form two coordination complexes, each composed of four cysteines and a Zn2+ ion. These two zinc fingers form the structure that binds to the major groove of DNA. The second zinc finger stabilises the binding complex by hydrophobic interactions with the first finger and contributes to specificity of receptor DNA binding.\ It is also necessary for receptor dimerisation that occurs during DNA binding.

    \ \

    Defects in the androgen receptor cause testicular feminisation syndrome, also known as\ androgen insensitivity syndrome (AIS) PUBMED:1307250, PUBMED:1569163. AIS may be complete (CAIS), where external genitalia are phenotypically female; partial (PAIS), where genitalia are substantively ambiguous; or mild (MAIS), where external genitalia are normal male, or nearly so. Defects in the receptor also cause X-linked spinal and bulbar muscular atrophy (also known as Kennedy\'s disease).

    \ \ ' '1200' 'IPR004349' '\

    The nitrogenase complex catalyses the conversion of molecular nitrogen to ammonia (nitrogen fixation). The complex is hexameric, consisting of 2 alpha, 2 beta, and 2 delta subunits.

    \ \

    This family represents the delta\ subunit of a group of nitrogenases that do not utilise molybdenum (Mo) as a cofactor, but instead use either vanadium (V\ nitrogenases), or iron (alternative nitrogenases).

    \ ' '1201' 'IPR000663' '\ Atrial natriuretic peptides (ANPs) are vertebrate hormones that play an important role in the control of\ cardiovascular homeostasis, and sodium and water balance in general PUBMED:1652921, PUBMED:2536732, PUBMED:.\ There are different NPs that vary in length but share a common core. All are processed from a single precursor.\ A disulphide bond resident in the C-terminal section is required for full activity of atriopeptins. The family\ of NPs includes structurally-related peptides that elicit similar pharmacological spectra. Amongst these are\ brain natriuretic peptide (BNP); C-type natriuretic peptide (CNP); ventricular natriuretic peptide (VNP)\ PUBMED:1828035; and green mamba natriuretic peptide (DNP) PUBMED:1352773.\ ' '1202' 'IPR005109' '\ The members of this family (Anp1, Van1 and Mnn9) are membrane proteins required for proper Golgi function. These proteins colocalize within the cis Golgi, where they are physically associated in two distinct complexes PUBMED:9430634.\ ' '1203' 'IPR005039' '\ Prophages P1 and P7 exist as unit copy DNA plasmids in the bacterial cell. Maintenance of the prophage state requires the\ continuous expression of two repressors: (i) C1 is a protein which negatively regulates the expression of lytic genes including\ the C1 inactivator gene coi, and (ii) C4 is an antisense RNA which specifically inhibits the synthesis of an anti-repressor Ant.\ ' '1204' 'IPR006805' '\ Anthranilate synthase catalyses the first step in the biosynthesis of tryptophan. Component I catalyses the formation of anthranilate using ammonia and chorismate. The catalytic site lies in the adjacent region, described in the chorismate binding enzyme family (). This region is involved in feedback inhibition by tryptophan PUBMED:11371633. This family also contains a region of Para-aminobenzoate synthase component I.\ ' '1205' 'IPR005165' '\

    Anthrax toxin is a plasmid-encoded toxin complex produced by the Gram-positive, spore-forming bacterium Bacillus anthracis. The toxin consists of three non-toxic proteins: the protective antigen (PA), the lethal factor (LF) and the edema factor (EF) PUBMED:14570563. These component proteins self-assemble at the surface of host cell receptors, yielding a series of toxic complexes that can produce shock-like symptoms and death. Anthrax toxin is one of a large group of Bacillus and Clostridium exotoxins referred to as binary toxins, forming independent enzymatic (A moiety) and binding (B moiety) components. The LF and EF proteins are the enzymes (A moiety) that act on cytosolic substrates, while PA is a multi-functional protein (B moiety) that binds to cell surface receptors, mediates the assembly and internalisation of the complexes, and delivers them to the host cell endosome PUBMED:17335404. Once PA is attached to the host receptor PUBMED:17381430, it must then be cleaved by a host cell surface (furin family) protease before it is able to bind EF and LF. The cleavage of the N-terminus of PA enables the C-terminal fragment to self-associate into a ring-shaped heptameric complex (prepore) that can bind LF or EF competitively. The PA-LF/EF complex is then internalised by endocytosis, and delivered to the endosome, where PA forms a pore in the endosomal membrane in order to translocate LF and EF to the cytosol. LF is a Zn-dependent metalloprotease that cleaves and inactivates mitogen-activated protein (MAP) kinases, kills macrophages, and causes death of the host by inhibiting cell proliferation PUBMED:14616089, PUBMED:11700563. EF is a calcium- and calmodulin-dependent adenylyl cyclase that can cause edema (fluid-filled swelling) when associated with PA. EF is not toxic by itself, and is required for the survival of germinated Bacillus spores within macrophages at the early stages of infection. EF dramatically elevates the level of host intracellular cAMP, a ubiquitous messenger that integrates many processes of the cell; increases in cAMP can interfere with host intracellular signalling PUBMED:15131111.

    \

    This entry represents a central domain in the edema factor adenylyl cyclase protein of anthrax toxin, as well as in adenylyl cyclases from other bacterial toxins.

    \ ' '1207' 'IPR003679' '\ This family consists of bacterial aminoglycoside 3-N-acetyltransferases () that catalyse the reaction PUBMED:1761222:\ The enzyme\ can use a range of antibiotics with 2-deoxystreptamine rings as acceptor for its acetyltransferase activity; this\ inactivates them and confers resistance to gentamicin, kanamycin, tobramycin, neomycin and apramycin amongst others. For the kanamycin group antibiotics, acetylation occurred at the 3"-amino group in arbekacin and amikacin, and at the 3-amino group in dibekacin as in the case of kanamycin, reflecting the effect of the (S)-4-amino-2-hydroxybutyryl side chain which is present in arbekacin and amikacin, but absent in dibekacin and kanamycin PUBMED:9766465.\ ' '1208' 'IPR004914' '\ This family includes various proteins that are involved in antirestriction. The ArdB protein efficiently inhibits restriction by\ members of the three known families of type I systems of Escherichia coli PUBMED:8393008.\ ' '1209' 'IPR003222' '\

    This entry consists of antitermination proteins found in bacteriophages, such as protein Q from phage lambda, and some bacterial homologues. Protein Q positively regulates expression of the phage late gene operon by binding to the bacterial host RNA polymerase (RNAP) and modifying it. The modified RNAP transcribes through termination sites that otherwise prevent expression of the regulated genes PUBMED:15150248.

    \ ' '1210' 'IPR002680' '\ The alternative oxidase is used as a second terminal oxidase in the mitochondria; electrons are transferred directly from reduced ubiquinol to oxygen, forming water PUBMED:8770590. This is not coupled to ATP synthesis and is not inhibited by cyanide; this pathway is a single-step process PUBMED:9426242. In Oryza sativa (Rice) the transcript levels of the alternative oxidase are increased by low temperature PUBMED:9426242. It has been predicted to contain a coupled diiron centre on the basis of a conserved sequence motif consisting of the proposed iron ligands, four Glu and two His residues PUBMED:11106766. The EPR study of Arabidopsis thaliana (Mouse-ear cress)\ alternative oxidase AOX1a shows that the enzyme contains a\ hydroxo-bridged mixed-valent Fe(II)/Fe(III) binuclear iron centre PUBMED:12215444. A catalytic cycle has been proposed that involves a diiron centre and at least one transient protein-derived radical, most probably an invariant Tyr residue PUBMED:11801238.\ ' '1211' 'IPR012307' '\

    This TIM alpha/beta barrel structure is found in xylose isomerase () and in endonuclease IV (, ). This domain is also found in the N termini of bacterial myo-inositol catabolism proteins. These are involved in the myo-inositol catabolism pathway, and are required for growth on myo-inositol in Rhizobium leguminosarum bv. viciae PUBMED:11497462.

    \ ' '1212' 'IPR002575' '\ This entry consists of bacterial antibiotic resistance proteins,\ which confer resistance to various aminoglycosides. They include\ aminoglycoside 3\'-phosphotransferase or kanamycin kinase / \ neomycin-kanamycin phosphotransferase and streptomycin 3\'\'-kinase\ or streptomycin 3\'\'-phosphotransferase. The aminoglycoside \ phosphotransferases inactivate aminoglycoside antibiotics via \ phosphorylation PUBMED:2167474.\ ' '1213' 'IPR013332' '\

    ApbA, the ketopantoate reductase enzyme of Salmonella typhimurium is required for the synthesis of thiamine via the alternative pyrimidine biosynthetic pathway PUBMED:. Precursors to the pyrimidine moiety of thiamine are synthesized de novo by the purine biosynthetic pathway or the alternative pyrimidine biosynthetic (APB) pathway. The ApbA protein catalyzes the NADPH-specific reduction of ketopantoic acid to pantoic acid. This activity had previously been associated with the pantothenate biosynthetic gene panE PUBMED:. ApbA and PanE are allelic PUBMED:.

    \ \ ' '1214' 'IPR004939' '\

    The anaphase-promoting complex (APC) is a multi-subunit E3 protein ubiquitin ligase that is responsible for the metaphase to anaphase transition and the exit from mitosis. Anaphase is initiated when the APC triggers the destruction of securin, thereby allowing the protease, separase, to disrupt sister-chromatid cohesion. Securin ubiquitination by the APC is inhibited by cyclin-dependent kinase 1 (Cdk1)-dependent phosphorylation PUBMED:18552837.

    \ \

    Forkhead Box M1 (FoxM1), which is a transcription factor that is over-expressed in many cancers, is degraded in late mitosis and early G1 phase by the APC/cyclosome (APC/C) E3 ubiquitin ligase PUBMED:18573889. The APC/C targets mitotic cyclins for destruction in mitosis and G1 phase and is then inactivated at S phase. It thereby generates alternating states of high and low cyclin-Cdk activity, which is required for the alternation of mitosis and DNA replication PUBMED:18559889.

    \ \

    APC from Schizosaccharomyces pombe and Saccharomyces cerevisiae was previously thought to have 11 subunits, but more sensitive techniques have identified 13 subunits in both yeasts PUBMED:12477395.

    \ \

    One of the subunits of the APC that is required for ubiquitination activity is APC10, a one-domain protein homologous to a sequence element, termed the DOC domain, found in several hypothetical proteins that may also mediate ubiquitination reactions, because they contain combinations of either RING finger (see ), cullin (see ) or HECT (see ) domains PUBMED:10318877, PUBMED:11524682, PUBMED:11884135.

    \ \

    The DOC domain consists of a beta-sandwich, in which a five-stranded antiparallel beta-sheet is packed on top of a three stranded antiparallel beta-sheet, exhibiting a \'jellyroll\' fold PUBMED:11524682, PUBMED:11884135.

    \ \

    Proteins known to contain a DOC domain include:\

    \

    \ ' '1215' 'IPR007242' '\ Macroautophagy is a bulk degradation process induced by starvation in eukaryotic cells. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. No molecule involved in autophagy has yet been identified in higher eukaryotes PUBMED:9852036. The pre-autophagosomal structure contains at least five Apg proteins: Apg1p, Apg2p, Apg5p, Aut7p/Apg8p and Apg16p. It is found in the vacuole PUBMED:11689437. The C-terminal glycine of Apg12p is conjugated to a lysine residue of Apg5p via an isopeptide bond. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. Autophagy protein 16 (Apg16) has been shown to bind to Apg5 and is required for the function of the Apg12p-Apg5p conjugate PUBMED:10406794. Autophagy protein 5 (Apg5) is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway PUBMED:10712513.\ This entry represents Apg12, which is covalently bound to Apg5 PUBMED:9852036.\ ' '1216' 'IPR007240' '\ Macroautophagy is a bulk degradation process induced by starvation in eukaryotic cells. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. No molecule involved in autophagy has yet been identified in higher eukaryotes PUBMED:9852036. The pre-autophagosomal structure contains at least five Apg proteins: Apg1p, Apg2p, Apg5p, Aut7p/Apg8p and Apg16p. It is found in the vacuole PUBMED:11689437. The C-terminal glycine of Apg12p is conjugated to a lysine residue of Apg5p via an isopeptide bond. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. Autophagy protein 16 (Apg16) has been shown to bind to Apg5 and is required for the function of the Apg12p-Apg5p conjugate PUBMED:10406794. Autophagy protein 5 (Apg5) is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway PUBMED:10712513.\

    Autophagy protein 17 (Apg17) is required for activating Apg1 protein kinases.

    \ ' '1217' 'IPR007243' '\ Macroautophagy is a bulk degradation process induced by starvation in eukaryotic cells. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. No molecule involved in autophagy has yet been identified in higher eukaryotes PUBMED:9852036. The pre-autophagosomal structure contains at least five Apg proteins: Apg1p, Apg2p, Apg5p, Aut7p/Apg8p and Apg16p. It is found in the vacuole PUBMED:11689437. The C-terminal glycine of Apg12p is conjugated to a lysine residue of Apg5p via an isopeptide bond. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. Autophagy protein 16 (Apg16) has been shown to bind to Apg5 and is required for the function of the Apg12p-Apg5p conjugate PUBMED:10406794. Autophagy protein 5 (Apg5) is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway PUBMED:10712513.\

    Apg6/Vps30p has two distinct functions in the autophagic process, either associated with the membrane or in a retrieval step of the carboxypeptidase Y sorting pathway PUBMED:9712845.

    \ ' '1218' 'IPR007241' '\ Macroautophagy is a bulk degradation process induced by starvation in eukaryotic cells. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. No molecule involved in autophagy has yet been identified in higher eukaryotes PUBMED:9852036. The pre-autophagosomal structure contains at least five Apg proteins: Apg1p, Apg2p, Apg5p, Aut7p/Apg8p and Apg16p. It is found in the vacuole PUBMED:11689437. The C-terminal glycine of Apg12p is conjugated to a lysine residue of Apg5p via an isopeptide bond. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. Autophagy protein 16 (Apg16) has been shown to bind to Apg5 and is required for the function of the Apg12p-Apg5p conjugate PUBMED:10406794. Autophagy protein 5 (Apg5) is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway PUBMED:10712513.\

    Apg9 plays a direct role in the formation of the cytoplasm to vacuole targeting and autophagic vesicles, possibly serving as a marker for a specialised compartment essential for these vesicle-mediated alternative targeting pathways PUBMED:10662773.

    \ ' '1219' 'IPR006748' '\

    The aminoglycosides are a large group of biologically active bacterial secondary metabolites, best known for their antibiotic properties PUBMED:9211644. Aminoglycoside phosphotransferases achieve inactivation of these compounds by phosphorylation, utilising ATP. Likewise, hydroxyurea is inactivated by phosphorylation of the hydroxy group in the hydroxylamine moiety.

    \ ' '1220' 'IPR004828' '\ These antibacterial peptides are found in bees. These heat-stable, non-helical peptides are active against a wide range of plant-associated bacteria and some human pathogens PUBMED:2676519. This family contains a conserved region including the propeptide and apidaecin sequence.\ ' '1221' 'IPR006801' '\ Apolipoprotein A-II (ApoA-II) is the second major apolipoprotein of high density lipoprotein in human plasma. Mature ApoA-II is present as a dimer of two 77-amino acid chains joined by a disulphide bridge PUBMED:12119188. ApoA-II regulates many steps in HDL metabolism, and its role in coronary heart disease is unclear PUBMED:12119188. In bovine serum, the ApoA-II homologue is present in almost free form. Bovine ApoA-II shows antimicrobial activity against Escherichia coli and yeasts in phosphate buffered saline (PBS) PUBMED:9538260.\ ' '1222' 'IPR006781' '\

    Exchangeable apolipoproteins are water-soluble protein components of lipoproteins that solubilise lipids and regulate their metabolism by binding to cell receptors or activating specific enzymes. Apolipoprotein C-I (ApoC-1) is the smallest exchangeable apolipoprotein and transfers among HDL (high density lipoprotein), VLDL (very low-density lipoprotein) and chylomicrons. ApoC-1 activates lecithin:cholesterol acyltransferase (LCAT), inhibits cholesteryl ester transfer protein, can inhibit hepatic lipase and phospholipase 2 and can stimulate cell growth. ApoC-1 delays the clearance of beta-VLDL by inhibiting its uptake via the LDL receptor-related pathway PUBMED:11580293. ApoC-1 has been implicated in hypertriglyceridemia PUBMED:11353333 and Alzheimer\'s disease PUBMED:11741391.

    ApoC-1 is believed to comprise two dynamic helices that are stabilised by interhelical interactions and are connected by a short linker region. The minimal folding unit in the lipid-free state of this and other exchangeable apolipoproteins comprises the helix-turn-helix motif formed of four 11-mer sequence repeats.

    \ ' '1223' 'IPR002325' '\ The cytochrome b6f integral membrane protein complex transfers electrons \ between the two reaction centre complexes of oxygenic photosynthetic \ membranes, and participates in formation of the transmembrane \ electrochemical proton gradient by also transferring protons from the \ stromal to the internal lumen compartment PUBMED:. The cytochrome b6f complex \ contains four polypeptides: cytochrome f (285 aa); cytochrome b6 (215 aa); \ Rieske iron-sulphur protein (179 aa); and subunit IV (160 aa) PUBMED:8027021. In its \ structure and functions, the cytochrome b6f complex bears extensive analogy\ to the cytochrome bc1 complex of mitochondria and photosynthetic purple \ bacteria; cytochrome f (cyt f) plays a role analogous to that of cytochrome\ c1, in spite of their different structures PUBMED:7631417. \

    The 3D structure of Brassica rapa (Turnip) cyt f has been determined PUBMED:8762139. The lumen-side \ segment of cyt f includes two structural domains: a small one above a \ larger one that, in turn, is on top of the attachment to the membrane \ domain. The large domain consists of an anti-parallel beta-sandwich and a \ short haem-binding peptide, which form a three-layer structure. The small \ domain is inserted between beta-strands F and G of the large domain and is \ an all-beta domain. The haem nestles between two short helices at the \ N-terminus of cyt f. Within the second helix is the sequence motif for the \ c-type cytochromes, CxxCH (residues 21-25), which is covalently attached to\ the haem through thioether bonds to Cys-21 and Cys-24. His-25 is the fifth \ haem iron ligand. The sixth haem iron ligand is the alpha-amino group of \ Tyr-1 in the first helix PUBMED:8762139. Cyt f has an internal network of water \ molecules that may function as a proton wire PUBMED:8762139. The water chain appears\ to be a conserved feature of cyt f.

    \ ' '1224' 'IPR000074' '\

    Exchangeable apolipoproteins (apoA, apoC and apoE) have the same genomic structure and are members of a multi-gene family that probably evolved from a common ancestral gene. This entry includes the ApoA1, ApoA4 and ApoE proteins. ApoA1 and ApoA4 are part of the APOA1/C3/A4/A5 gene cluster on chromosome 11 PUBMED:15108119. Apolipoproteins function in lipid transport as structural components of lipoprotein particles, cofactors for enzymes and ligands for cell-surface receptors. In particular, apoA1 is the major protein component of high-density lipoproteins; apoA4 is thought to act primarily in intestinal lipid absorption; and apoE is a blood plasma protein that mediates the transport and uptake of cholesterol and lipid by way of its high affinity interaction with different cellular receptors, including the low-density lipoprotein (LDL) receptor. Recent findings with apoA1 and apoE suggest that the tertiary structures of these two members of the human exchangeable apolipoprotein gene family are related PUBMED:15234552. The three-dimensional structure of the LDL receptor-binding domain of apoE indicates that the protein forms an unusually elongated four-helix bundle that may be stabilised by a tightly packed hydrophobic core that includes leucine zipper-type interactions and by numerous salt bridges on the mostly charged surface. Basic amino acids important for LDL receptor binding are clustered into a surface patch on one long helix PUBMED:2063194.

    \ ' '1225' 'IPR002891' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \ This domain contains an ATP binding P-loop motif PUBMED:9786849.\ ' '1227' 'IPR005138' '\

    This is the N-terminal domain of aerolysin and pertussis toxin which contains a type-C lectin like fold.

    \ \

    Aerolysin causes the pathogenicity of Aeromonas hydrophila, a bacterium associated with diarrhoeal diseases and deep wound infections. Like many other microbial toxins, the protein changes in a multistep process from a completely water-soluble form to produce a transmembrane channel that breaks the permeability barrier of cells PUBMED:7510043.

    \ \

    Pertussis toxin is a major virulence factor of Bordetella pertussis, which causes whooping cough. The protein is a hexamer containing a catalytic subunit (S1) that is tightly associated with a pentameric cell-binding component (B-oligomer). ATP, detergents and phospholipids assist in activating the holotoxin by destabilising the interaction between S1 and the B-oligomer PUBMED:8637000. Pertussis toxin is an exotoxin and is an essential component of acellular vaccines PUBMED:8075982, PUBMED:7634099. The catalytic A-subunit (S1) shares structural homology with other ADP-ribosylating bacterial toxins, although differences in the carboxy-terminal portion explain its unique activation mechanism PUBMED:8075982. The diverse biological activities of the toxin depend on its ability to recognise carbohydrate-containing receptors on a wide variety of eukaryotic cells.

    \ ' '1228' 'IPR003762' '\ The Escherichia coli araBAD operon consists of three genes encoding three enzymes that convert L-arabinose to D-xylulose-5-phosphate.\ L-arabinose isomerase (araA) catalyses the conversion of L-arabinose to L-ribulose as the first step in the pathway of L-arabinose utilization as a carbon source PUBMED:9084180.\ ' '1229' 'IPR003313' '\ This entry defines the arabinose-binding and dimerisation domain of the bacterial gene regulatory protein AraC. \ The crystal structure of the arabinose-binding and dimerization domain of the Escherichia coli gene regulatory protein AraC was determined in the presence and\ absence of L-arabinose. The arabinose-bound molecule shows that the protein adopts an unusual fold, binding sugar within a beta barrel and completely burying the arabinose with the amino-terminal arm of the protein. Dimer contacts in the presence of arabinose are mediated by an antiparallel coiled-coil. In the uncomplexed protein, the amino-terminal arm is disordered, uncovering the sugar-binding pocket and allowing it to serve as an oligomerization interface PUBMED:9103202.\ ' '1230' 'IPR005569' '\

    Arc repressor acts by the cooperative binding of two Arc repressor dimers to a 21-base-pair operator site. Each Arc dimer uses an antiparallel beta-sheet to recognise bases in the major groove PUBMED:8107872.

    \ ' '1231' 'IPR006752' '\

    Archaeal flagella are unique motility structures, and the absence of bacterial structural motility genes in the complete genome sequences\ of flagellated archaeal species has always suggested that archaeal flagellar biogenesis is likely mediated by novel components. FlaD and FlaE are present in the cell as\ membrane-associated proteins but are not major components of isolated flagellar filaments. Interestingly, flaD was found to encode\ two proteins, each translated from a separate ribosome binding site.

    \ \

    This group of sequences contains the archaeal flaD and flaE proteins. The conserved region that defines these sequences is found in the N-terminal region of flaE but towards the C-terminal region of flaD PUBMED:11717274.

    \ ' '1232' 'IPR002774' '\

    Archaeal motility occurs by the rotation of flagella that are different to bacterial flagella, but show similarity to bacterial type IV pili. These similarities include the multiflagellin nature of the flagellar filament, N-terminal sequence similarities, as well as the presence of homologous proteins in the two systems PUBMED:16983194, PUBMED:11250034. Also unlike bacterial flagellins but similar to type IV pilins, archaeal flagellins are initially synthesised with a short leader peptide that is cleaved by a membrane-located peptidase PUBMED:11250034, PUBMED:15170402. The enzyme responsible for the removal of this leader peptide is FlaK PUBMED:14622420.

    \ ' '1233' 'IPR011579' '\

    This domain has been found in a number of bacterial and archaeal proteins, all of which contain a conserved P-loop motif that is involved in binding ATP.

    \ ' '1234' 'IPR001535' '\

    Arenaviruses are single stranded RNA viruses. The arenavirus S RNAs that have been characterised include conserved terminal sequences, an ambisense arrangement of the coding regions for the precursor glycoprotein (GPC) and nucleocapsid (N) proteins and an intergenic region capable of forming a base-paired "hairpin" structure. The mature glycoproteins that result are G1 and G2 and the N protein PUBMED:2042397.

    \

    Tacaribe virus (TACV) is an arenavirus that is genetically and antigenically\ closely related to Junin arenavirus (JUNV), the aetiological agent of Argentine\ haemorrhagic fever (AHF). It is well established that TACV protects experimental animals fully against an otherwise lethal challenge with JUNV. It has been established that it is the heterologous glycoprotein that protects against JUNV challenge. A recombinant vaccinia virus that expresses JUNV glycoprotein precursor (VV-GJun) protected seventy-two percent of the animals inoculated with two doses of VV-GJun against the lethal JUNV challenge PUBMED:10769070.

    \ ' '1235' 'IPR000229' '\

    Arenaviruses are single stranded RNA viruses. The arenavirus S RNAs that have been characterised include conserved terminal sequences, an ambisense arrangement of the coding regions for the precursor glycoprotein (GPC) and nucleocapsid (N) proteins and an intergenic region capable of forming a base-paired "hairpin" structure. The mature glycoproteins that result are G1 and G2 and the N protein PUBMED:2042397.

    \

    This family represents the nucleocapsid protein that encapsulates the viral ssRNA PUBMED:8599223.

    \ ' '1236' 'IPR006689' '\

    The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic PUBMED:12429613. They are the founding members of a growing family that includes Arl (Arf-like), Arp\ (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.

    The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound\ Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 up, restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.

    \ ' '1237' 'IPR001669' '\

    The arginine dihydrolase (AD) pathway is found in many prokaryotes and some primitive eukaryotes, an example of the latter being Giardia lamblia (Giardia intestinalis) PUBMED:9504342. The three-enzyme anaerobic pathway breaks down L-arginine to form 1 mol of ATP, carbon dioxide and ammonia. In simpler bacteria, the first enzyme, arginine deiminase, can account for up to 10% of total cell protein PUBMED:9504342.

    \ \

    Most prokaryotic arginine deiminase pathways are under the control of a repressor gene, termed ArgR PUBMED:1583685. This is a negative regulator, and will only release the arginine deiminase operon for expression in the presence of arginine PUBMED:9851988. The crystal structure of apo-ArgR from Bacillus stearothermophilus has been determined to 2.5A by means of X-ray crystallography PUBMED:10331868. The protein exists as a hexamer of identical subunits, and is shown to have six DNA-binding domains, clustered around a central oligomeric core when bound to arginine. It predominantly interacts with A.T residues in ARG boxes. This hexameric protein binds DNA at its N terminus to repress arginine biosynthesis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbour-joining tree, some of these paralogous sequences show long branches and differ significantly from the well-conserved C-terminal region.

    \ ' '1238' 'IPR001669' '\

    The arginine dihydrolase (AD) pathway is found in many prokaryotes and some primitive eukaryotes, an example of the latter being Giardia lamblia (Giardia intestinalis) PUBMED:9504342. The three-enzyme anaerobic pathway breaks down L-arginine to form 1 mol of ATP, carbon dioxide and ammonia. In simpler bacteria, the first enzyme, arginine deiminase, can account for up to 10% of total cell protein PUBMED:9504342.

    \ \

    Most prokaryotic arginine deiminase pathways are under the control of a repressor gene, termed ArgR PUBMED:1583685. This is a negative regulator, and will only release the arginine deiminase operon for expression in the presence of arginine PUBMED:9851988. The crystal structure of apo-ArgR from Bacillus stearothermophilus has been determined to 2.5A by means of X-ray crystallography PUBMED:10331868. The protein exists as a hexamer of identical subunits, and is shown to have six DNA-binding domains, clustered around a central oligomeric core when bound to arginine. It predominantly interacts with A.T residues in ARG boxes. This hexameric protein binds DNA at its N terminus to repress arginine biosynthesis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbour-joining tree, some of these paralogous sequences show long branches and differ significantly from the well-conserved C-terminal region.

    \ ' '1239' 'IPR006035' '\

    The ureohydrolase superfamily includes arginase (), agmatinase (), formiminoglutamase () and proclavaminate amidinohydrolase () PUBMED:15355972. These enzymes share a 3-layer alpha-beta-alpha structure PUBMED:15355972, PUBMED:16141327, PUBMED:12020346, and play important roles in arginine/agmatine metabolism, the urea cycle, histidine degradation, and other pathways.

    \

    Arginase, which catalyses the conversion of arginine to urea and ornithine,\ is one of the five members of the urea cycle enzymes that convert ammonia\ to urea as the principal product of nitrogen excretion PUBMED:7916684. There are several arginase isozymes that differ in catalytic, molecular and immunological properties. Deficiency in the liver isozyme leads to argininemia, which is usually associated with hyperammonemia.

    \

    Agmatinase hydrolyses agmatine to putrescine, the precursor for the biosynthesis of higher polyamines, spermidine and spermine. In addition, agmatine may play an important regulatory role in mammals.

    \

    Formiminoglutamase catalyses the fourth step in histidine degradation, acting to hydrolyse N-formimidoyl-L-glutamate to L-glutamate and formamide.

    \

    Proclavaminate amidinohydrolase is involved in clavulanic acid biosynthesis. Clavulanic acid acts as an inhibitor of a wide range of beta-lactamase enzymes that are used by various microorganisms to resist beta-lactam antibiotics. As a result, clavulanic acid improves the effectiveness of beta-lactam antibiotics PUBMED:12020346.

    \ \ ' '1240' 'IPR001518' '\

    Argininosuccinate synthase () (AS) is a urea cycle enzyme that catalyzes the penultimate step in arginine biosynthesis: the ATP-dependent ligation of citrulline to aspartate to form argininosuccinate, AMP and pyrophosphate PUBMED:2123815, PUBMED:3133361.

    \

    In humans, a defect in the AS gene causes citrullinemia, a genetic disease\ characterised by severe vomiting spells and mental retardation.

    \

    AS is a homotetrameric enzyme of chains of about 400 amino-acid residues. An arginine seems to be important for the enzyme\'s catalytic mechanism. The sequences of AS from various prokaryotes, archaebacteria and eukaryotes show significant similarity.

    \ ' '1241' 'IPR002813' '\

    ArgJ is a bifunctional protein that catalyses the first and fifth steps in arginine biosynthesis PUBMED:8473852. The structure has been determined for glutamate N-acetyltransferase 2 (ornithine acetyltransferase; ), an ArgJ-like protein from Streptomyces clavuligerus PUBMED:15352873.

    \ ' '1242' 'IPR005129' '\

    Bacterial periplasmic transport systems require the function of a specific substrate-binding protein, located in the periplasm, and several\ cytoplasmic membrane transport components. In Escherichia coli, the arginine-ornithine transport system requires an\ arginine-ornithine-binding protein and the lysine-arginine-ornithine (LAO) transport system includes a LAO-binding protein. Both\ periplasmic proteins can be phosphorylated by a single kinase, ArgK PUBMED:2136858 resulting in reduced levels of transport activity of the periplasmic transport systems that\ include each of the binding proteins. The ArgK protein acts as an ATPase enzyme and as a kinase.

    \ ' '1244' 'IPR003348' '\

    This ATPase is involved in the removal of arsenite, antimonite, and arsenate from the cell.

    \

    In Escherichia coli an anion-translocating ATPase has been identified as the product of the arsenical resistance operon of resistance plasmid R773. This ATP-driven oxyanion pump catalyses extrusion of the oxyanions arsenite, antimonite and arsenate. Maintenance of a low intracellular concentration of oxyanion produces resistance to the toxic agents. The pump is composed of two polypeptides, the products of the arsA and arsB genes. This two-subunit enzyme produces resistance to arsenite and antimonite. A third gene, arsC, expands the substrate specificity to allow for arsenate pumping and resistance PUBMED:1704144.

    \

    The ArsA and ArsB proteins form a membrane-bound pump that functions as an oxyanion-translocating ATPase. The ArsC protein is an arsenate reductase that reduces arsenate to arsenite, which is subsequently pumped out of the cell PUBMED:7629056.

    \ ' '1245' 'IPR000802' '\

    Arsenic is a toxic metalloid whose trivalent and pentavalent ions inhibit\ a variety of biochemical processes. Operons that encode arsenic resistance\ have been found in multicopy plasmids from both Gram-positive and\ Gram-negative bacteria PUBMED:7721697. The resistance mechanism is encoded from a single\ operon, which houses an anion pump. The pump has two polypeptide components:\ a catalytic subunit (the ArsA protein), which functions as an\ oxyanion-stimulated ATPase; and an arsenite export component (the ArsB protein),\ which is associated with the inner membrane PUBMED:1688427. The ArsA and ArsB proteins\ are thought to form a membrane complex that functions as an\ anion-translocating ATPase.

    \

    The ArsB protein is distinguished by its overall hydrophobic character,\ in keeping with its role as a membrane-associated channel. Sequence\ analysis reveals the presence of 13 putative transmembrane (TM) regions.

    \ ' '1246' 'IPR006660' '\

    Several bacterial taxa have a chromosomal resistance system, encoded by the\ ars operon, for the detoxification of arsenate, arsenite, and antimonite PUBMED:7860609. This system transports arsenite and antimonite out of the cell. The pump is composed of two polypeptides, the products of the arsA and arsB genes. This two-subunit enzyme produces resistance to arsenite and antimonite. Arsenate, however, must first be reduced to arsenite before it is extruded. A third gene, arsC, expands the substrate specificity to allow for arsenate pumping and resistance. ArsC is an approximately 150-residue arsenate reductase that uses reduced glutathione (GSH) to convert arsenate to arsenite with a redox active cysteine residue in the active site. ArsC forms an active quaternary complex with GSH, arsenate, and glutaredoxin 1 (Grx1). The three ligands must be present simultaneously for reduction to occur PUBMED:9261111.

    \ \

    The arsC family also comprises the Spx proteins, which are Gram-positive bacterial transcription factors that regulate the transcription of multiple genes in response to disulphide stress PUBMED:15028674.

    \ \

    The arsC protein structure has been solved PUBMED:11709171. It belongs to the thioredoxin superfamily fold which is defined by a beta-sheet core surrounded by alpha-helices. The active cysteine residue of ArsC is located in the loop between the first beta-strand and the first helix, which is also conserved in the Spx protein and its homologues.

    \ ' '1247' 'IPR000768' '\ Mono-ADP-ribosylation is a post-translational modification of proteins in which the \ ADP-ribose moiety of NAD is transferred to proteins. This process is responsible for the toxicity\ of some bacterial toxins (e.g., cholera and pertussis toxins). A family of \ mono(ADP-ribosyl)transferases exists in vertebrates that transfer ADP-ribose to arginine PUBMED:8703012.\ \ At least five forms of the enzyme have been characterised to date, some of which are\ attached to the membrane via glycosylphosphatidylinositol (GPI) anchors, while others\ appear to be secreted. The enzymes contain ~250-300 residues, which encode putative\ signal sequences and carbohydrate attachment sites. In addition, the N- and C-termini are\ predominantly hydrophobic, a characteristic of GPI-anchored proteins PUBMED:7947688.\ ' '1248' 'IPR003412' '\

    Arteriviruses are small, enveloped, animal viruses with an icosahedral core containing a positive-sense RNA genome. The arteriviruses are highly species specific, but share many biological and molecular properties, including virion morphology, a unique set of structural proteins, genome organization and replication strategy. The membrane glycoprotein GP5 is found in the porcine reproductive and respiratory syndrome virus (PRRSV) PUBMED:18336924. This is a family of structural glycoproteins from Arteriviridae that corresponds to open reading frame 4 (ORF4) of the virus.

    \ ' '1249' 'IPR002484' '\

    Arteriviruses are ssRNA positive-strand viruses with no DNA stage in their replication cycle. This family contains the viral nucleocapsid protein, which encapsidates the viral ssRNA.

    \ \

    Porcine reproductive and respiratory syndrome virus (PRRSV) is the causative agent of both severe and persistent respiratory disease and reproductive failure in pigs worldwide. The PRRSV virion contains a core made of the 123 amino acid nucleocapsid (N or VP1) protein, a product of the ORF7 gene. The crystal structure of the capsid-forming domain of the nucleocapsid protein has been determined to 2.6 A resolution. The protein exists as a tight dimer forming a four-stranded beta sheet floor superposed by two long alpha helices and flanked by two N- and two C-terminal alpha helices. The structure represents a new class of viral capsid-forming domains, distinctly different from those of other known enveloped viruses, but reminiscent of the coat protein of bacteriophage MS2 PUBMED:14604534.

    \ ' '1250' 'IPR002556' '\ This family consists of viral envelope proteins from the\ Arteriviridae; this includes Porcine reproductive and respiratory syndrome virus (PRRSV) envelope protein GP3 and Lactate dehydrogenase-elevating virus (LDV) structural glycoprotein.\ Arteriviruses consist of positive ssRNA and do not have a DNA\ stage.\ ' '1251' 'IPR001332' '\ Arteriviruses encode four envelope proteins, GL, GS, M and N. GL envelope glycoprotein\ is heterogeneously glycosylated with N-acetyllactosamine in a cell-type-specific manner. \ The GL glycoprotein expresses the neutralization determinants PUBMED:8553578.\ ' '1252' 'IPR001542' '\

    Arthropod defensins are a family of insect and scorpion cysteine-rich antibacterial peptides, primarily active against Gram-positive bacteria PUBMED:2911573, PUBMED:2358464, PUBMED:8471044, PUBMED:1761552, PUBMED:1425705. All these peptides range in length from 38 to 51 amino acids. There are six conserved cysteines all involved in intrachain disulphide bonds.

    \

    A schematic representation of peptides from the arthropod defensin family is shown below.\

    \
             +----------------------------+\
             |                            |\
           xxCxxxxxxxxxxxxxxCxxxCxxxxxxxxxCxxxxxCxCxx\
                            |   |               | |\
                            +---|---------------+ |\
                                +-----------------+\
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
     \
    

    \

    Although low level sequence similarities have been reported PUBMED:2911573 between the arthropod defensins and mammalian defensins, the topological arrangement of the disulphide bonds as well as the tertiary structure PUBMED:2401368 are completely different in the two families.

    \ ' '1253' 'IPR007290' '\

    Arv1 is a transmembrane protein, with potential zinc-binding motifs, that mediates sterol homeostasis. Its action is important in lipid homeostasis, which prevents free sterol toxicity PUBMED:11063737. Arv1 contains a homology domain (AHD), which consists of an N-terminal cysteine-rich subdomain with a putative zinc-binding motif, followed by a C-terminal subdomain of 33 amino acids. The C-terminal subdomain of the AHD is critical for the protein\'s function PUBMED:16725371. In yeast, Arv1p is important for the delivery of an early glycosylphosphatidylinositol (GPI) intermediate, GlcN-acylPI, to the first mannosyltransferase of GPI synthesis in the ER lumen PUBMED:18287539. It is important for the traffic of sterol in yeast and in humans. In eukaryotic cells, it may function in the sphingolipid metabolic pathway as a transporter of ceramides between the ER and Golgi PUBMED:12145310.

    \ ' '1254' 'IPR002640' '\

    \ The serum paraoxonases/arylesterases are enzymes that catalyse the hydrolysis\ of the toxic metabolites of a variety of organophosphorus insecticides. The\ enzymes hydrolyse a broad spectrum of organophosphate substrates, including \ paraoxon and a number of aromatic carboxylic acid esters (e.g., phenyl\ acetate), and hence confer resistance to organophosphate toxicity PUBMED:8661009. \

    \

    \ Mammals have 3 distinct paraoxonase types, termed PON1-3 PUBMED:8661009, PUBMED:11038162. In mice and\ humans, the PON genes are found on the same chromosome in close proximity. \ PON activity has been found in a variety of tissues, with highest levels in \ liver and serum - the source of serum PON is thought to be the liver. Unlike mammals, fish and avian species lack paraoxonase activity. \

    \

    \ Human and rabbit PONs appear to have two distinct Ca2+ binding sites, one\ required for stability and one required for catalytic activity. The Ca2+\ dependency of PONs suggests a mechanism of hydrolysis where Ca2+ acts as the\ electrophilic catalyst, like that proposed for phospholipase A2. The\ paraoxonase enzymes, PON1 and PON3, are high density lipoprotein (HDL)-\ associated proteins capable of preventing oxidative modification of low\ density lipoproteins (LDL) PUBMED:11038162. Although PON2 has oxidative properties, the\ enzyme does not associate with HDL.\

    \

    \ Within a given species, PON1, PON2 and PON3 share ~60% amino acid sequence \ identity, whereas between mammalian species particular PONs (1,2 or 3) share\ 79-90% identity at the amino acid level. Human PON1 and PON3 share numerous \ conserved phosphorylation and N-glycosylation sites; however, it is not \ known whether the PON proteins are modified at these sites, or whether \ modification at these sites is required for activity in vivo PUBMED:11038162. \

    \ \ This family consists of arylesterases (also known as serum paraoxonases). These enzymes hydrolyse organophosphorus esters such as paraoxon and are found in the liver and blood. They confer resistance to organophosphate toxicity PUBMED:9032442. Human arylesterase (PON1) is associated with HDL and may protect against LDL oxidation PUBMED:8661009.\ ' '1255' 'IPR001873' '\

    The apical membrane of many tight epithelia contains sodium channels that\ are primarily characterised by their high affinity to the diuretic blocker\ amiloride PUBMED:8181670, PUBMED:8905643, PUBMED:7499195. These channels mediate the first step of active sodium\ reabsorption essential for the maintenance of body salt and water\ homeostasis PUBMED:8181670. In vertebrates, the channels control reabsorption of\ sodium in kidney, colon, lung and sweat glands; they also play a role in\ taste perception.

    \

    Members of the epithelial Na+ channel (ENaC) family fall into four\ subfamilies, termed alpha, beta, gamma and delta PUBMED:8905643. The proteins exhibit\ the same apparent topology, each with two transmembrane (TM) spanning\ segments, separated by a large extracellular loop. In most ENaC proteins\ studied to date, the extracellular domains are highly conserved and contain\ numerous cysteine residues, with flanking C-terminal amphipathic TM regions,\ postulated to contribute to the formation of the hydrophilic pores of the\ oligomeric channel protein complexes. It is thought that the well-conserved\ extracellular domains serve as receptors to control the activities of the\ channels.

    \

    Vertebrate ENaC proteins are similar to degenerins of Caenorhabditis elegans\ PUBMED:7929098: deg-1, del-1, mec-4, mec-10 and unc-8. These proteins can be mutated to cause neuronal degradation, and are also thought to form sodium channels.

    \

    Structurally, the proteins that belong to this family consist of about 510\ to 920 amino acid residues. They are made of an intracellular N-terminus\ region followed by a transmembrane domain, a large extracellular loop, a\ second transmembrane segment and a C-terminal intracellular tail PUBMED:7929098.

    \ ' '1256' 'IPR002595' '\ The multigene family 360 proteins are found within the \ African swine fever virus (ASFV) genome, which consists of\ dsDNA and has similar structural features to the \ poxviruses PUBMED:2325203. The biological function of this family is \ not known PUBMED:2325203, although is a major structural \ protein PUBMED:7856088.\ ' '1257' 'IPR007844' '\ The AsmA protein is involved in the assembly of outer membrane proteins in Escherichia coli PUBMED:8866482. AsmA mutations were isolated as extragenic suppressors of an OmpF assembly mutant PUBMED:7476172. AsmA may have a role in LPS biogenesis PUBMED:7476172.\ ' '1258' 'IPR001962' '\ This domain is always found associated with (). Family members that contain this domain catalyse the conversion of aspartate to asparagine. Asparagine synthetase B () catalyzes the assembly of asparagine from aspartate, Mg(2+)ATP, and glutamine.\ The three-dimensional architecture of the N-terminal domain of asparagine synthetase B is similar to that observed for glutamine phosphoribosylpyrophosphate amidotransferase, while the molecular motif of the C-domain is reminiscent of that observed for GMP synthetase PUBMED:10587437.\ ' '1259' 'IPR004618' '\

    Aspartate--ammonia ligase (asparagine synthetase) catalyses the conversion of L-aspartate to L-asparagine in the presence of ATP and ammonia. This family represents one of two non-homologous forms of aspartate--ammonia ligase found in Escherichia coli. This type is also found in Haemophilus influenzae, Treponema pallidum and Lactobacillus delbrueckii, but appears to have a very limited distribution. The fact that the protein from the H. influenzae is more than 70% identical to that from the spirochete T. pallidum, but less than 65% identical to that from the closely related E. coli, strongly suggests lateral transfer.

    \ ' '1260' 'IPR001461' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned to clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human)\ .

    \ \

    More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation.

    \ \

    Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1\' positions PUBMED:7674916. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site PUBMED:7674916. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event PUBMED:7674916.

    \

    This family does not include the retroviral or retrotransposon \ aspartic proteases, which are much smaller and appear to \ be homologous to the single domain aspartic proteases.

    \ ' '1261' 'IPR003190' '\ Decarboxylation of aspartate is the major route of beta-alanine production in bacteria, and is catalysed by the enzyme aspartate decarboxylase. The enzyme is translated as an inactive proenzyme of two chains, A and B. This family contains both chains of aspartate decarboxylase.\ ' '1262' 'IPR006034' '\ Asparaginase, which is found in various plant, animal and bacterial cells, catalyses the deamination of asparagine to yield aspartic acid and an ammonium ion, resulting in a depletion of free circulatory asparagine in plasma PUBMED:3026924. The enzyme is effective in the treatment of human malignant lymphomas, which have a diminished capacity to produce asparagine synthetase: in order to survive, such cells absorb asparagine from blood plasma PUBMED:2407723, PUBMED:3379033 - if Asn levels have been depleted by injection of asparaginase, the lymphoma cells die. Glutaminase, a similar enzyme, catalyses the deamination of glutamine to glutamic acid and an ammonium ion PUBMED:2407723. Both enzymes are homotetramers PUBMED:3026924: two threonine residues in the N-terminal half of the proteins are involved in the catalytic activity.\ ' '1263' 'IPR000246' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Threonine peptidases are characterised by a threonine nucleophile at the N terminus of the mature enzyme. The threonine peptidases belong to clan PB or are unassigned, clan T-. The type example for this clan is the archaean proteasome beta component of Thermoplasma acidophilum.

    \

    This group of sequences have a signature that places them in MEROPS peptidase family T2 (clan PB(T)). The glycosylasparaginases () are threonine peptidases. Also in this family is L-asparaginase (), which catalyses the following reaction:\

    \

    Glycosylasparaginase catalyses:\ \ cleaving the GlcNAc-Asn bond that links oligosaccharides to asparagine in N-linked glycoproteins. The enzyme is composed of two non-identical alpha/beta subunits joined by strong non-covalent forces and has one glycosylation site located in the alpha subunit PUBMED:8877373. It plays a major role in the degradation of glycoproteins.

    \ ' '1264' 'IPR007041' '\

    Arginine N-succinyltransferase catalyses the transfer of the succinyl group from succinyl-CoA to arginine to produce succinylarginine. This is the first step in arginine catabolism via the arginine succinyltransferase pathway. Six major L-arginine-degrading pathways have been described for prokaryotes PUBMED:18045455. Many bacteria have arginine succinyltransferase, which is the AstA protein of the arginine succinyltransferase (ast) pathway; the ast operon consists of five genes. In a few species, such as Pseudomonas aeruginosa, a tandem gene pair encodes alpha and beta subunits of a heterodimer that is designated arginine and ornithine succinyltransferase (AOST).

    \ \ \

    This entry represents the family of proteins that make up the beta subunit of the heterodimer of Ast and AOST.

    \ ' '1265' 'IPR007079' '\

    Succinylarginine dihydrolase (AstB) transforms N(2)-succinylglutamate into succinate and glutamate. This enzyme is the second in the five-step ammonia-producing arginine succinyltransferase pathway, the major pathway in Escherichia coli and in other related bacteria for arginine catabolism as a sole nitrogen source. AstB assumes a five-stranded alpha/beta propeller structure, placing it in the amidinotransferase (AT) superfamily of proteins, which are characterised by their Cys-His-Glu active site triad PUBMED:15703173.

    \ ' '1266' 'IPR007036' '\

    This family includes succinylglutamate desuccinylase, which catalyses the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase, which cleaves acylaspartate into a fatty acid and aspartate. Mutations in aspartoacylase lead to Canavan disease PUBMED:8252036.

    \ ' '1267' 'IPR004337' '\ The astrovirus genome is apparently organised with nonstructural proteins encoded at the 5\' end and structural proteins at the 3\' end PUBMED:8254779. \ Proteins in this family are encoded by astrovirus ORF2, one of the three astrovirus ORFs (1a, 1b, 2). The proteins contain a viral RNA-dependent RNA polymerase motif PUBMED:8254779. The 87kDa precursor polyprotein\ undergoes an intracellular cleavage to form a 79kDa protein. Subsequently, extracellular trypsin cleavage yields the three\ proteins forming the infectious virion PUBMED:10644354.\ ' '1268' 'IPR007472' '\

    Arginine-tRNA-protein transferase catalyses the post-translational conjugation of arginine to the N terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a destabilising amino acid to the N-terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N-terminal cysteine is sometimes modified PUBMED:9858543. In Saccharomyces cerevisiae, Cys20, 23, 94 and/or 95 are thought to be important for activity PUBMED:7495814. Of these, only Cys94 appears to be completely conserved in this family.

    \

    This entry represents the C-terminal region of the enzyme arginine-tRNA-protein transferase, found in both eukaryotic and prokaryotic enzymes.

    \ ' '1269' 'IPR007471' '\

    Arginine-tRNA-protein transferase catalyses the post-translational conjugation of arginine to the N terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a destabilising amino acid to the N-terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N-terminal cysteine is sometimes modified PUBMED:9858543. In Saccharomyces cerevisiae, Cys20, 23, 94 and/or 95 are thought to be important for activity PUBMED:7495814. Of these, only Cys94 appears to be completely conserved in this family.

    \

    This entry represents the N-terminal region of the enzyme arginine-tRNA-protein transferase, found in both eukaryotic and prokaryotic enzymes.

    \ ' '1270' 'IPR004312' '\ ATHILA is a group of Arabidopsis thaliana retrotransposons PUBMED:8534844 belonging to the Ty3/gypsy family of the long terminal\ repeat (LTR) class of eukaryotic retrotransposons PUBMED:9611185, PUBMED:10889217. The central region of ATHILA retrotransposons contains two or\ three open reading frames (ORFs). This family represents the ORF1 product. The function of ORF1 is unknown.\ ' '1271' 'IPR005144' '\

    The ATP-cone is an evolutionarily mobile, ATP-binding regulatory domain which is found in a variety of proteins including ribonucleotide reductases, phosphoglycerate kinases and transcriptional regulators PUBMED:10939243.

    \ \

    In ribonucleotide reductase protein R1 () from Escherichia coli this domain is located at the N-terminus, and is composed mostly of helices PUBMED:8052308. It forms part of the allosteric effector region and contains the general allosteric activity site in a cleft located at the tip of the N-terminal region PUBMED:9309223. This site binds either ATP (activating) or dATP (inhibitory), with the base bound in a hydrophobic pocket and the phosphates bound to basic residues. Substrate binding to this site is thought to affect enzyme activity by altering the relative positions of the two subunits of ribonucleotide reductase.

    \ \ ' '1272' 'IPR003135' '\

    The ATP-grasp domain has an unusual nucleotide-binding fold, also referred to as palmate, and is found in a superfamily of enzymes including D-alanine-D-alanine ligase, glutathione synthetase, biotin carboxylase, and carbamoyl phosphate synthetase, the ribosomal protein S6 modification enzyme (RimK), urea amidolyase, tubulin-tyrosine ligase, and three enzymes of purine biosynthesis. This family does not contain all known ATP-grasp domain members. All the enzymes of this family possess ATP-dependent carboxylate-amine ligase activity, and their catalytic mechanisms are likely to include acylphosphate intermediates.

    \ ' '1273' 'IPR000749' '\

    ATP:guanido phosphotransferases are a family of structurally and functionally related enzymes PUBMED:2324092, PUBMED:7819288 that reversibly catalyse the transfer of phosphate between ATP and various phosphogens. The enzymes belonging to this family include:

    \ \

    \ \

    Creatine kinase plays an important role in energy metabolism of vertebrates. There are at least four different, but very closely related, forms of CK. Two isozymes, M (muscle) and B (brain), are cytosolic, while the other two are mitochondrial. In sea urchins there is a flagellar isozyme, which consists of the triplication of a CK-domain. A cysteine residue is implicated in the catalytic activity of these enzymes and the region around this active site residue is highly conserved.

    \ ' '1274' 'IPR000749' '\

    ATP:guanido phosphotransferases are a family of structurally and functionally related enzymes PUBMED:2324092, PUBMED:7819288 that reversibly catalyse the transfer of phosphate between ATP and various phosphogens. The enzymes belonging to this family include:

    \ \

    \ \

    Creatine kinase plays an important role in energy metabolism of vertebrates. There are at least four different, but very closely related, forms of CK. Two isozymes, M (muscle) and B (brain), are cytosolic, while the other two are mitochondrial. In sea urchins there is a flagellar isozyme, which consists of the triplication of a CK-domain. A cysteine residue is implicated in the catalytic activity of these enzymes and the region around this active site residue is highly conserved.

    \ ' '1275' 'IPR002650' '\ This entry consists of ATP-sulfurylases (sulphate adenylyltransferases), some of which are part of a bifunctional polypeptide chain associated with adenosyl phosphosulphate (APS) kinase. Both enzymes are required for PAPS (phosphoadenosine-phosphosulphate) synthesis from inorganic sulphate PUBMED:8522184. ATP sulfurylase catalyses the synthesis of adenosine-phosphosulphate (APS) from ATP and inorganic sulphate PUBMED:9671738.\ ' '1276' 'IPR000131' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    The ATPase F1 complex gamma subunit forms the central shaft that connects the F0 rotary motor to the F1 catalytic core. The gamma subunit functions as a rotary motor inside the cylinder formed by the alpha(3)beta(3) subunits in the F1 complex PUBMED:16154570. The best-conserved region of the gamma subunit is its C-terminus, which seems to be essential for assembly and catalysis.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '1277' 'IPR007849' '\

    This entry represents the ATPase assembly factor ATP10 found in mitochondria, which is essential for the assembly of the mitochondrial F1-F0 complex. A yeast nuclear gene (ATP10) encodes a product that is essential for the assembly of a functional mitochondrial ATPase complex. Mutations in ATP10 induce a loss of rutamycin sensitivity in the mitochondrial ATPase, but do not affect the respiratory enzymes. ATP10 has an Mr of 30,293 and its primary structure is not related to any known subunit of the yeast or mammalian mitochondrial ATPase complexes. ATP10 is associated with the mitochondrial membrane. It is suggested that the ATP10 product is not a subunit of the ATPase complex but rather a protein required for the assembly of the F0 sector of the complex PUBMED:2141026.

    \ ' '1278' 'IPR001421' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit 8 found in the F0 complex of mitochondrial F-ATPases from Metazoa. This subunit appears to be an integral component of the stator stalk in yeast mitochondrial F-ATPases PUBMED:12626501. The stator stalk is anchored in the membrane, and acts to prevent futile rotation of the ATPase subunits relative to the rotor during coupled ATP synthesis/hydrolysis. This subunit may have an analogous function in Metazoa. Subunit 8 differs in sequence between Metazoa, plants and fungi.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '1279' 'IPR000568' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit A (or subunit 6) found in the F0 complex of F-ATPases. This subunit is a key component of the proton channel, and may play a direct role in the translocation of protons across the membrane. Catalysis in the F1 complex depends upon the rotation of the central stalk and F0 c-ring, which in turn is driven by the flux of protons through the membrane via the interface between the F0 c-ring and subunit A. The peripheral stalk links subunit A to the external surface of the F1 domain, and is thought to act as a stator to counter the tendency of subunit A and the F1 alpha(3)beta(3) catalytic portion to rotate with the central rotary element\ PUBMED:16045926.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '1280' 'IPR000194' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    This entry represents the alpha and beta subunits found in the F1, V1, and A1 complexes of F-, V- and A-ATPases, respectively (sometimes called the A and B subunits in V- and A-ATPases), as well as flagellar ATPase and the termination factor Rho. The F-ATPases (or F1F0-ATPases), V-ATPases (or V1V0-ATPases) and A-ATPases (or A1A0-ATPases) are composed of two linked complexes: the F1, V1 or A1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0, V0 or A0 complex that forms the membrane-spanning pore. The F-, V- and A-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis PUBMED:11309608, PUBMED:15629643.

    \

    In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP, which are induced by the rotation of the gamma subunit, itself driven by the movement of protons through the F0 complex C subunit PUBMED:12745923.

    \

    In V- and A-ATPases, the alpha/A and beta/B subunits of the V1 or A1 complex are homologous to the alpha and beta subunits in the F1 complex of F-ATPases, except that the alpha subunit is catalytic and the beta subunit is regulatory.

    \

    The alpha/A and beta/B subunits can each be divided into three regions, or domains, centred around the ATP-binding pocket, and based on structure and function. The central domain contains the nucleotide-binding residues that make direct contact with the ADP/ATP molecule PUBMED:12885621.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '1281' 'IPR002146' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunits B and B\' from the F0 complex in F-ATPases found in chloroplasts and in bacterial plasma membranes. The B subunits are part of the peripheral stalk that links the F1 and F0 complexes together, and which acts as a stator to prevent certain subunits from rotating with the central rotary element. The peripheral stalk differs in subunit composition between mitochondrial, chloroplast and bacterial F-ATPases. In bacterial and chloroplast F-ATPases, the peripheral stalk is composed of one copy of the delta subunit (homologous to OSCP in mitochondria), and two copies of subunit B in bacteria, or one copy each of subunits B and B\' in chloroplasts and photosynthetic bacteria PUBMED:16045926.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '1282' 'IPR002379' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    The F-ATPases (or F1F0-ATPases) and V-ATPases (or V1V0-ATPases) are each composed of two linked complexes: the F1 or V1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0 or V0 complex that forms the membrane-spanning pore. The F- and V-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis PUBMED:11309608, PUBMED:15629643.

    \

    This entry represents subunit C (also called subunit 9, or proteolipid in F-ATPases, or the 16 kDa proteolipid in V-ATPases) found in the F0 or V0 complex of F- and V-ATPases, respectively. In F-ATPases, ten C subunits form an oligomeric ring that makes up the F0 rotor. The flux of protons through the ATPase channel drives the rotation of the C subunit ring, which in turn is coupled to the rotation of the F1 complex gamma subunit rotor due to the permanent binding between the gamma and epsilon subunits of F1 and the C subunit ring of F0. The sequential protonation and deprotonation of Asp61 of subunit C is coupled to the stepwise movement of the rotor PUBMED:14630314.

    \

    In V-ATPases, there are three proteolipid subunits (c, c\' and c\'\') that form part of the proton-conducting pore, each containing a buried glutamic acid residue that is essential for proton transport, and together they form a hexameric ring spanning the membrane PUBMED:15951435, PUBMED:14635779.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '1283' 'IPR002699' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex that forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis PUBMED:11309608, PUBMED:15629643, PUBMED:15168615. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.

    \

    This entry represents the D subunit found in V1 and A1 complexes of V- and A-ATPases, respectively. Subunit D appears to be located in the central stalk, whereas subunits E and G form part of the peripheral stalk connecting V1 and V0. This subunit is the most likely homologue to the gamma subunit of the F1 complex in F-ATPases, which undergoes rotation during ATP hydrolysis and serves an essential function in rotary catalysis PUBMED:12220197, PUBMED:7831318.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '1284' 'IPR020547' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) () are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This family represents subunits called delta (in mitochondrial ATPase) or epsilon (in bacteria or chloroplast ATPase). The interaction site of subunit C of the F0 complex with the delta or epsilon subunit of the F1 complex may be important for connecting the rotor of F1 (gamma subunit) to the rotor of F0 (C subunit) PUBMED:12887009. In bacterial species, the delta subunit is the equivalent of the Oligomycin sensitive subunit (OSCP, ) in metazoans. The C-terminal domain of the epsilon subunit appears to act as an inhibitor of ATPase activity PUBMED:.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '1285' 'IPR020546' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) () are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This family represents subunits called delta (in mitochondrial ATPase) or epsilon (in bacteria or chloroplast ATPase). The interaction site of subunit C of the F0 complex with the delta or epsilon subunit of the F1 complex may be important for connecting the rotor of F1 (gamma subunit) to the rotor of F0 (C subunit) PUBMED:12887009. In bacterial species, the delta subunit is the equivalent of the Oligomycin sensitive subunit (OSCP, ) in metazoans. The C-terminal domain of the epsilon subunit appears to act as an inhibitor of ATPase activity PUBMED:.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '1286' 'IPR006721' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) () are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This family constitutes the mitochondrial ATP synthase epsilon subunit, which is distinct from the bacterial epsilon subunit (the latter being homologous to the mitochondrial delta subunit, ). The mitochondrial epsilon subunit is located in the stalk region of the F1 complex, and acts as an inhibitor of the ATPase catalytic core. The epsilon subunit can assume two conformations, contracted and extended, where the latter inhibits ATP hydrolysis. The conformation of the epsilon subunit is determined by the direction of rotation of the gamma subunit, and possibly by the presence of ADP. The epsilon subunit is thought to become extended in the presence of ADP, thereby acting as a safety lock to prevent wasteful ATP hydrolysis PUBMED:16154570.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '1287' 'IPR008218' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex that forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis PUBMED:11309608, PUBMED:15629643, PUBMED:15168615. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.

    \

    This entry represents subunit F found in the V1 complex of V-ATPases (both eukaryotic and bacterial), as well as in the A1 complex of A-ATPases. Subunit F is a 16 kDa protein that is required for the assembly and activity of V-ATPase, and has a potential role in the differential targeting and regulation of the enzyme for specific organelles. This subunit is not necessary for the rotation of the ATPase V1 rotor, but it does promote catalysis PUBMED:14963028.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '1288' 'IPR006995' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) () are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit J found in the F0 complex of F-ATPases from fungal mitochondria. This subunit does not appear to display sequence similarity with subunits of F-ATPases found in other organisms PUBMED:9867807.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '1289' 'IPR000272' '\

    The FXYD protein family contains at least seven members in mammals PUBMED:12538882. Two other family\ members that are not obvious orthologs of any identified mammalian FXYD protein\ exist in zebrafish. All these proteins share a signature sequence of six conserved\ amino acids comprising the FXYD motif in the NH2-terminus, and two glycines and\ one serine residue in the transmembrane domain. FXYD proteins are widely distributed in mammalian tissues with prominent expression\ in tissues that perform fluid and solute transport or that are electrically excitable.

    \

    Initial functional characterization suggested that FXYD proteins act as channels or as modulators of ion\ channels; however, studies have revealed that most FXYD proteins\ have another specific function and act as tissue-specific regulatory subunits of the\ Na,K-ATPase. Each of these auxiliary\ subunits produces a distinct functional effect on the transport characteristics of\ the Na,K-ATPase that is adjusted to the specific functional demands of the tissue in\ which the FXYD protein is expressed. FXYD proteins appear to preferentially\ associate with Na,K-ATPase alpha1-beta isozymes, and affect their function in a way that\ renders them operationally complementary or supplementary to coexisting isozymes.

    \ ' '1290' 'IPR005598' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) () are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit I found in the F0 complex of F-ATPases in bacterial plasma membranes from proteobacteria. A possible function for this subunit is to guide the assembly of the membrane sector of the ATPase enzyme complex.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '1291' 'IPR002951' '\

    Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins PUBMED:11264541, PUBMED:9647693. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington\'s disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity PUBMED:9647693.

    \ \

    This entry includes Atrophin-1 and related proteins.

    \ ' '1292' 'IPR005521' '\

    This domain is found in attacin, sarcotoxin and diptericin. All of these are insect antibacterial proteins which are induced by the fat body and subsequently secreted into the hemolymph, where they act synergistically to kill the invading microorganism PUBMED:7772280.

    \ ' '1293' 'IPR005143' '\

    This domain binds N-acyl homoserine lactones (AHLs), which are also known as autoinducers. These are small, diffusible molecules used as communication signals in a large variety of proteobacteria. It is almost always found in association with the DNA-binding LuxR domain (). The autoinducer binding domain forms the N-terminal region of the protein, while the DNA-binding domain forms the C-terminal region. In most cases, binding of AHL by this N-terminal domain leads to unmasking of the DNA-binding domain, allowing it to bind DNA and activate transcription PUBMED:11544353. In rare cases, some LuxR proteins, such as EsaR, act as repressors PUBMED:12067349. In these proteins, binding of AHL to this domain leads to inactivation of the protein as a transcriptional regulator. A large number of processes have been shown to be regulated by LuxR proteins, including bioluminescence, production of virulence factors in plant and animal pathogens, antibiotic production and plasmid transfer.

    \ \

    Structural studies of TraR from Agrobacterium tumefaciens PUBMED:12087407, PUBMED:12198141 show that the functional protein is a homodimer. Binding of the cognate AHL is required for protein folding, resistance to proteases and dimerisation. The autoinducer binding domain binds its cognate AHL in an alpha/beta/alpha sandwich and provides an extensive dimerisation surface, though residues from the C-terminal region also make some contribution to dimerisation. The autoinducer binding domain is also required for interaction with RpoA, allowing transcription to occur PUBMED:15237104.

    \ \

    There are some proteins which consist solely of the autoinducer binding domain. The function of these is not known, but TrlR from Agrobacterium has been shown to inhibit the activity of TraR by the formation of inactive heterodimers PUBMED:11309123.

    \ ' '1294' 'IPR001690' '\

    Bacterial species have many methods of controlling gene expression and cell\ growth. Regulation of gene expression in response to changes in cell density is termed quorum sensing PUBMED:10607620, PUBMED:9990077. Quorum-sensing bacteria produce, release and respond to hormone-like molecules (autoinducers) that accumulate in the external environment as the cell population grows. Once a threshold of these molecules is reached, a signal transduction cascade is triggered that ultimately leads to behavioural changes in the bacterium PUBMED:9990077. Autoinducers are thus clearly important mediators of molecular communication.

    \

    Conjugal transfer of Agrobacterium octopine-type Ti plasmids is activated \ by octopine, a metabolite released from plant tumours PUBMED:8188582. Octopine causes conjugal donors to secrete a pheromone, Agrobacterium autoinducer (AAI),\ and exogenous AAI further stimulates conjugation. The putative AAI synthase and an AAI-responsive transcriptional regulator have been found to be encoded by the Ti plasmid traI and traR genes, respectively. TraR and TraI are similar to the LuxR and LuxI regulatory proteins of Vibrio fischeri, and AAI is similar in structure to the diffusible V. fischeri autoinducer, the inducing ligand of LuxR. TraR activates target genes in the presence of AAI and also activates traR and traI themselves, creating two positive-feedback loops. TraR-AAI-mediated activation in wild-type Agrobacterium strains is enhanced by culturing on solid media, suggesting a possible role in cell density sensing PUBMED:8188582.

    \

    Production of light by the marine bacterium V. fischeri and by recombinant hosts containing cloned lux genes is controlled by the density\ of the culture PUBMED:3697093. Density-dependent regulation of lux gene expression has been shown to require a locus consisting of the luxR and luxI genes.

    \

    In these and other Gram-negative bacteria, N-(3-oxohexanoyl)-L-homoserine lactone (OHHL) acts as the autoinducer by binding to transcriptional regulatory proteins and activating them PUBMED:7968529. OHHL and related molecules, such as N-butanoyl- (BHL), N-hexanoyl- (HHL) and N-oxododecanoyl- (PAI) homoserine lactones, are produced by a family of proteins that share a high level of sequence similarity.

    \

    Proteins which are currently members of this family include:\

    \ ' '1295' 'IPR007135' '\

    Proteins in this entry belong to the Atg3 group of proteins and the Atg3 conjugation enzymes.

    \ \

    Autophagy is a degradative transport pathway that delivers cytosolic proteins to the lysosome (vacuole) PUBMED:11058089 and is induced by starvation PUBMED:9190802. Cytosolic proteins appear inside the vacuole enclosed in autophagic vesicles. Autophagy significantly differs from other transport pathways by using double membrane layered transport intermediates, called autophagosomes PUBMED:11675007, PUBMED:18472412. The breakdown of vesicular transport intermediates is a unique feature of autophagy PUBMED:11058089. Autophagy can also function in the elimination of invading bacteria and antigens PUBMED:18472412.

    \ \

    Atg3 is the E2 enzyme for the LC3 lipidation process PUBMED:18321988. It is essential for autophagocytosis. The super protein complex, the Atg16L complex, consists of multiple Atg12-Atg5 conjugates. Atg16L has an E3-like role in the LC3 lipidation reaction. The activated intermediate, LC3-Atg3 (E2), is recruited to the site where the lipidation takes place PUBMED:18398292.

    \ \

    Atg3 catalyses the conjugation of Atg8 and phosphatidylethanolamine (PE). Atg3 has an alpha/beta-fold, and its core region is topologically similar to canonical E2 enzymes. Atg3 has two regions inserted in the core region and another with a long alpha-helical structure that protrudes from the core region as far as 30 Å PUBMED:17227760. It interacts with atg8 through an intermediate thioester bond between Cys-288 and the C-terminal Gly of atg8. It also interacts with the C-terminal region of the E1-like atg7 enzyme.

    \ \

    Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The cysteine residue within the HPC motif is the putative active-site residue for recognition of the Apg5 subunit of the autophagosome complex PUBMED:12482611.

    \ ' '1296' 'IPR007134' '\

    Proteins in this entry belong to the Atg3 group of proteins and the Atg3 conjugation enzymes.

    \ \

    Autophagy is a degradative transport pathway that delivers cytosolic proteins to the lysosome (vacuole) PUBMED:11058089 and is induced by starvation PUBMED:9190802. Cytosolic proteins appear inside the vacuole enclosed in autophagic vesicles. Autophagy significantly differs from other transport pathways by using double membrane layered transport intermediates, called autophagosomes PUBMED:11675007, PUBMED:18472412. The breakdown of vesicular transport intermediates is a unique feature of autophagy PUBMED:11058089. Autophagy can also function in the elimination of invading bacteria and antigens PUBMED:18472412.

    \ \

    Atg3 is the E2 enzyme for the LC3 lipidation process PUBMED:18321988. It is essential for autophagocytosis. The super protein complex, the Atg16L complex, consists of multiple Atg12-Atg5 conjugates. Atg16L has an E3-like role in the LC3 lipidation reaction. The activated intermediate, LC3-Atg3 (E2), is recruited to the site where the lipidation takes place PUBMED:18398292.

    \ \

    Atg3 catalyses the conjugation of Atg8 and phosphatidylethanolamine (PE). Atg3 has an alpha/beta-fold, and its core region is topologically similar to canonical E2 enzymes. Atg3 has two regions inserted in the core region and another with a long alpha-helical structure that protrudes from the core region as far as 30 Å PUBMED:17227760. It interacts with atg8 through an intermediate thioester bond between Cys-288 and the C-terminal Gly of atg8. It also interacts with the C-terminal region of the E1-like atg7 enzyme.

    \ \

    Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the lysosome/vacuole. Atg3 is a ubiquitin-like modifier that is topologically similar to the canonical E2 enzyme PUBMED:11825910. It catalyses the conjugation of Atg8 and phosphatidylethanolamine PUBMED:9023185.

    This domain is the N-terminal of Atg3 while the C-terminal is represented by .

    \ ' '1297' 'IPR003311' '\

    The Aux/IAA family of genes are key regulators of auxin-modified gene expression PUBMED:12036262. The plant hormone auxin (indole-3-acetic acid, IAA) regulates diverse cellular and developmental responses in plants, including cell division, expansion, differentiation and patterning of embryo responses PUBMED:15061689. Auxin can regulate the gene expression of several families, including GH3 and SAUR, as well as Aux/IAA itself. The Aux/IAA proteins act as repressors of auxin-induced gene expression, possibly through modulating the activity of DNA-binding auxin response factors (ARFs) (). Aux/IAA and ARF are thought to interact through C-terminal protein-protein interaction domains found in both Aux/IAA and ARF.

    \

    Recent evidence suggests that Aux/IAA proteins can also mediate light responses PUBMED:11544131. Some members of the AUX/IAA family are longer and contain an N-terminal DNA binding domain PUBMED:9482737 and may have an early function in the establishment of vascular and body patterns in embryonic and post-embryonic development in some plants.

    \ \ \ \ ' '1298' 'IPR000526' '\ Auxin binding protein is located in the lumen of the endoplasmic reticulum (ER). The primary structure contains an N-terminal hydrophobic leader sequence of 30-40 amino acids, which could represent a signal for translocation of the protein to the ER PUBMED:2555179, PUBMED:1321684. The mature protein comprises around 165 residues, and contains a number of potential N-glycosylation sites. In vitro transport studies have demonstrated co-translational glycosylation PUBMED:1321684. Retention within the lumen of the ER correlates with\ an additional signal located at the C terminus, represented by the sequence Lys-Asp-Glu-Leu, known to be responsible for preventing secretion of proteins from the lumen of the ER in eukaryotic cells PUBMED:2555179, PUBMED:1321684.\ ' '1299' 'IPR003676' '\ This family consists of the protein products of a gene cluster that encodes a group of auxin-regulated RNAs (small auxin up RNAs, SAURs) PUBMED:2485235. Proteins from this ARG7 auxin responsive genes family have no identified functional role PUBMED:10524760.\ ' '1300' 'IPR005166' '\

    A family of avian-specific viral glycoproteins that form a receptor-binding gp85 polypeptide, which is linked through a disulphide bond to a membrane-spanning gp37 spike. Gp85 confers a high degree of subgroup specificity for interaction with distinct cell receptors PUBMED:3009025.

    \ ' '1301' 'IPR005468' '\ Avidin PUBMED:2388586 is a minor constituent of egg white in several groups of oviparous\ vertebrates. Avidin, which was discovered in the 1920s, takes its name from\ the avidity with which it binds biotin. These two molecules bind so strongly\ that it is extremely difficult to separate them. Streptavidin is a protein produced\ by Streptomyces avidinii which also binds biotin and whose sequence is\ evolutionarily related to that of avidin.\

    Avidin and streptavidin both form homotetrameric complexes of noncovalently\ associated chains. Each chain forms a very strong and specific non-covalent\ complex with one molecule of biotin.

    \ \

    The three-dimensional structures of both streptavidin PUBMED:2928324, PUBMED:8515446 and avidin PUBMED:2784773\ have been determined, revealing that they share a common fold: an eight\ stranded anti-parallel beta-barrel with a repeated +1 topology enclosing an\ internal ligand binding site.

    \

    Fibropellins I and III PUBMED:8500658 are proteins that form the apical lamina of the sea\ urchin embryo, a component of the extracellular matrix. These two proteins\ have a modular structure composed of a CUB domain (see), followed\ by a variable number of EGF repeats and a C-terminal avidin-like domain.

    \ ' '1302' 'IPR005042' '\

    The pathogenicity gene, pthA, of Xanthomonas axonopodis pv. citri is required to elicit symptoms of Asiatic citrus canker disease. Introduction of pthA into Xanthomonas strains that are mildly pathogenic or opportunistic on citrus confers the ability to induce cankers on citrus. Structurally, pthA is highly similar to avrBs3 and avrBsP from Xanthomonas euvesicatoria and to avrB4, avrb6, avrb7, avrBIn, avrB101, and avrB102 from Xanthomonas campestris pv. malvacearum PUBMED:1421509.

    \ ' '1303' 'IPR011606' '\

    Some proteins in this entry are encoded by a gene, which is a part of the azl operon. This operon is involved in branched-chain amino acid transport PUBMED:9287000. Overexpression of this gene results in resistance to a leucine analogue, 4-azaleucine. The protein has 5 potential transmembrane motifs.

    \ ' '1304' 'IPR003132' '\

    This entry represents the immunoglobulin-binding domain found in the Staphylococcus aureus virulence factor protein A (SpA). Protein A contains five highly homologous Ig-binding domains in tandem (designated domains E, D, A, B and C), which share a common structure consisting of three helices in a closed left-handed twist. Protein A can exist in both secreted and membrane-bound forms, and has two distinct Ig-binding activities: each domain can bind Fc-gamma (the constant region of IgG involved in effector functions) and Fab (the Ig fragment responsible for antigen recognition) PUBMED:10805799.

    \ ' '1305' 'IPR007309' '\

    Yeast transcription factor IIIC (TFIIIC) is a multisubunit protein complex that interacts with two control elements of class III promoters called the A and B blocks. This family represents the subunit within TFIIIC involved in B-block binding PUBMED:1279682. Although defined as a yeast protein, it is also found in a number of other organisms.

    \ ' '1307' 'IPR003147' '\ Protein L is a bacterial protein with immunoglobulin (Ig) light chain-binding properties. It contains a number of homologous b1 repeats towards the N-terminus. These repeats have been found to be responsible for the interaction of protein L with Ig light chains PUBMED:1618782.\ ' '1308' 'IPR005146' '\

    This entry represents the B3/B4 domain found in tRNA synthetase beta subunits as well as in some non-tRNA synthetase proteins. This domain has a 3-layer structure with a beta-sandwich fold of unusual topology, and contains a putative tRNA-binding structural motif PUBMED:7664121. In Thermus thermophilus, both the catalytic alpha- and the non-catalytic beta-subunits comprise the characteristic fold of the class II active-site domains. The presence of an RNA-binding domain, similar to that of the U1A spliceosomal protein, in the beta-subunit of tRNA synthetase indicates structural relationships among different families of RNA-binding proteins.

    \

    Aminoacyl-tRNA synthetases can catalyse editing reactions to correct errors produced during amino acid activation and tRNA esterification, in order to prevent the attachment of incorrect amino acids to tRNA. The B3/B4 domain of the beta subunit contains an editing site, which lies close to the active site on the alpha subunit PUBMED:15526031. Disruption of this site abolished tRNA editing, a process that is essential for faithful translation of the genetic code.

    \ ' '1309' 'IPR005147' '\

    Domain B5 is found in phenylalanine-tRNA synthetase beta subunits. This domain has been shown to bind DNA through a winged helix-turn-helix motif PUBMED:11152603. Phenylalanine-tRNA synthetase may influence common cellular processes via DNA binding, in addition to its aminoacylation function.

    \ \ \ ' '1310' 'IPR002554' '\ Protein phosphatase 2A (PP2A) is a major intracellular protein\ phosphatase that regulates multiple aspects of cell growth and metabolism.\ The ability of this widely distributed heterotrimeric enzyme to act on a\ diverse array of substrates is largely controlled by the nature of its\ regulatory B subunit. There are multiple families of B subunits; this family is called the B56 family PUBMED:7592815.\ ' '1311' 'IPR001470' '\ Chlorosomes, which are attached to the inner surface of the cytoplasmic\ membrane, consist of four polypeptides and associated pigments and lipids.\ The principal light-harvesting pigment of the green filamentous bacterium\ Chloroflexus aurantiacus is bacteriochlorophyll (Bchl) c. This pigment is\ either bound to, or constrained by, a small approximately 80-residue\ polypeptide designated Bchlc-binding protein. In C. aurantiacus, a C-terminal\ extension is believed to play a role in proper incorporation of the protein\ during chlorosome assembly PUBMED:2376566. The protein has a high degree of similarity\ to Bchlc-binding proteins of other photosynthetic bacteria.\ ' '1312' 'IPR000119' '\

    Bacteria synthesise a set of small, usually basic proteins of about 90\ residues that bind DNA and are known as histone-like proteins PUBMED:3118156, PUBMED:3047111. Examples include the HU protein, which in Escherichia coli is a dimer of closely related alpha and beta chains and in other bacteria can be a dimer of identical chains. HU-type proteins have been found in a variety of eubacteria, cyanobacteria and archaebacteria, and are also encoded in the chloroplast genome of some algae PUBMED:1961745. The integration host factor (IHF), a dimer of closely related chains that seems to function in genetic recombination as well as in translational and transcriptional control PUBMED:2972385, is found in enterobacteria. Viral members include the African swine fever virus protein A104R (or LMW5-AR) PUBMED:8464748.

    \ \

    The exact function of these proteins is not yet clear but they are capable of wrapping DNA and stabilising it from denaturation under extreme environmental conditions. The structure is known for one of these proteins PUBMED:6540370. The protein exists as a dimer and two "beta-arms" function as the non-specific binding site for bacterial DNA.

    \ ' '1313' 'IPR013317' '\

    This entry represents the central domain of bacterial DnaA proteins PUBMED:8110826, PUBMED:1779750, PUBMED:2558436 that play an important role in initiating and regulating chromosomal replication. DnaA is an ATP- and DNA-binding protein. It binds specifically to 9 bp nucleotide repeats known as dnaA boxes which are found in the chromosome origin of replication (oriC).

    \

    DnaA is a protein of about 50 kDa that contains two conserved regions: the first is located in the N-terminal half and corresponds to the ATP-binding domain, the second is located in the C-terminal half and could be involved in DNA-binding. The protein may also bind the RNA polymerase beta subunit, the dnaB and dnaZ proteins, and the groE gene products (chaperonins) PUBMED:2172087.

    \ ' '1314' 'IPR002010' '\

    Secretion of virulence factors in Gram-negative bacteria involves \ transportation of the protein across two membranes to reach the cell \ exterior PUBMED:8969244. There have been four secretion systems described in \ animal enteropathogens such as Salmonella and Yersinia, with further \ sequence similarities in plant pathogens like Ralstonia and Erwinia PUBMED:8969244.

    \ \

    The type III secretion system is of great interest, as it is used to \ transport virulence factors from the pathogen directly into the host cell \ PUBMED:10334981 and is only triggered when the bacterium comes into close contact with\ the host. The protein subunits of the system are very similar to those of \ bacterial flagellar biosynthesis PUBMED:10564516. However, while the latter forms a\ ring structure to allow secretion of flagellin and is an integral part of\ the flagellum itself PUBMED:10564516, type III subunits in the outer membrane\ translocate secreted proteins through a channel-like structure.

    \ \

    It is believed that the family of type III inner membrane proteins are \ used as structural moieties in a complex with several other subunits PUBMED:9618447. \ One such set of inner membrane proteins, labeled "R" here for nomenclature \ purposes, includes the Salmonella and Shigella SpaR, the Yersinia YscT, \ Rhizobium Y4YN, and the Erwinia HrcT genes PUBMED:9618447. The flagellar protein FliR \ also shares similarity, probably due to evolution of the type III secretion \ system from the flagellar biosynthetic pathway.

    \ \ ' '1315' 'IPR006135' '\

    Secretion of virulence factors in Gram-negative bacteria involves \ transportation of the protein across two membranes to reach the cell \ exterior. There have been four secretion systems described in \ animal enteropathogens such as Salmonella and Yersinia, with further \ sequence similarities in plant pathogens like Ralstonia and Erwinia PUBMED:8969244.

    \ \ The type III secretion system is of great interest, as it is used to \ transport virulence factors from the pathogen directly into the host cell \ PUBMED:10334981 and is only triggered when the bacterium comes into close contact with\ the host. The protein subunits of the system are very similar to those of \ bacterial flagellar biosynthesis PUBMED:10564516. However, while the latter forms a\ ring structure to allow secretion of flagellin and is an integral part of\ the flagellum itself, type III subunits in the outer membrane\ translocate secreted proteins through a channel-like structure.

    \ \ It is believed that the family of type III inner membrane proteins are \ used as structural moieties in a complex with several other subunits PUBMED:9618447. \ One such set of inner membrane proteins, labeled "S" here for nomenclature \ purposes, includes the Salmonella and Shigella SpaS, the Yersinia YscU, \ Rhizobium Y4YO, and the Erwinia HrcU genes. The flagellar protein FlhB \ also shares similarity, probably due to evolution of the type III secretion\ system from the flagellar biosynthetic pathway.

    \ ' '1316' 'IPR007780' '\ This family consists of several bacterial proteins which are closely related to NAD-glutamate dehydrogenase found in Streptomyces clavuligerus. Glutamate dehydrogenases (GDHs) are a broadly distributed group of enzymes that catalyse the reversible oxidative deamination of glutamate to ketoglutarate and ammonia PUBMED:10924516.\ ' '1317' 'IPR001486' '\

    Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well studied family that is widely distributed in many organisms PUBMED:17540514. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors PUBMED:16600051. Several functionally different haemoglobins can coexist in the same species. The major types of globins include:

    \

    \ \

    This entry represents a group of haemoglobin-like proteins found in eubacteria, cyanobacteria, protozoa, algae and plants, but not in animals or yeast. These proteins have a truncated 2-over-2 rather than the canonical 3-over-3 alpha-helical sandwich fold PUBMED:15598493. This entry includes:

    \

    \ ' '1318' 'IPR016048' '\

    Bacterial luciferase is a flavin monooxygenase that catalyses the oxidation of long-chain aldehydes and releases energy in the form of visible light, and which uses flavin as a substrate rather than a cofactor PUBMED:8703001. Bacterial luciferase is an alpha/beta (LuxA/LuxB) heterodimer, where each individual subunit folds into a single TIM (beta/alpha)8-barrel domain. There are structural similarities between bacterial luciferase and nonfluorescent flavoproteins (LuxF, FP390), alkanesulphonate monooxygenase (SsuD), and coenzyme F420-dependent tetrahydromethanopterin reductase, which make up clearly related families with somewhat different folds PUBMED:7776372, PUBMED:12445781, PUBMED:10891279.

    \

    More information about these proteins can be found at Protein of the Month: Luciferase.

    \ ' '1319' 'IPR001425' '\ The bacterial opsins are retinal-binding proteins that provide light-\ dependent ion transport and sensory functions to a family of halophilic \ bacteria PUBMED:2468194, PUBMED:2591367. They are integral membrane proteins believed to contain\ seven transmembrane (TM) domains, the last of which contains the attachment\ point for retinal (a conserved lysine).

    There are several classes of these\ bacterial proteins: they include bacteriorhodopsin and archaerhodopsin,\ which are light-driven proton pumps; halorhodopsin, a light-driven \ chloride pump; and sensory rhodopsin, which mediates both photoattractant\ (in the red) and photophobic (in the UV) responses.

    \ ' '1320' 'IPR000184' '\ The protein sequences of d15 from various strains of Haemophilus influenzae are highly conserved, with only a small variable region identified near the carboxyl terminus of the protein PUBMED:7737523. D15 is a highly conserved antigen that is protective in animal models and it may be a useful component of a universal subunit vaccine against Haemophilus infection and disease PUBMED:9284140. Membrane proteins from other bacteria have been shown to elicit protective immunity. Oma87 is a protective outer membrane antigen of Pasteurella multocida PUBMED:8757848.\ ' '1321' 'IPR001615' '\ Bacillus thuringiensis produces toxins active against insects PUBMED:8632451. The toxin kills the larvae of dipteran insects by making pores in the epithelial cell membrane of the insect midgut. The crystal protein is produced during sporulation and is accumulated both as an inclusion and as part of the spore coat.\ ' '1322' 'IPR002585' '\ These proteins are cytochrome bd type terminal oxidases that catalyse quinol-dependent, Na+-independent oxygen uptake PUBMED:8626304. Members of this family are integral membrane proteins and contain a protohaem IX centre B558. \

    Cytochrome bd may play an important role in microaerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae, where it is expressed under all conditions that permit diazotrophy PUBMED:9274021. Subunit I binds a single b-haem, through ligands at His186 and Met393 (using SW:P11026 numbering). In addition, His19 is a ligand for the haem b found in subunit II.

    \ ' '1323' 'IPR000249' '\

    This domain is found in a variety of polyhedral organelle shell proteins, including CsoS1A, CsoS1B and CsoS1C of Thiobacillus neapolitanus (Halothiobacillus neapolitanus) and their orthologs from other bacteria.

    \ \

    Some autotrophic and non-autotrophic organisms form polyhedral organelles, carboxysomes/enterosomes PUBMED:11722879. \ The best studied is the carboxysome of Halothiobacillus neapolitanus, which is composed of at least 9 proteins: six shell proteins, CsoS1A, CsoS1B, CsoS1C, CsoS2A, CsoS2B and CsoS3 (carbonic anhydrase) PUBMED:14729686, one protein of unknown function and the large and small subunits of RuBisCo (CbbL and CbbS). \ Carboxysomes appear to be approximately 120 nm in diameter, most often observed as regular hexagons, with a solid interior bounded by a unilamellar protein shell. The interior is filled with type I RuBisCo, which is composed of 8 large subunits and 8 small subunits; it accounts for 60% of the carboxysomal protein, which amounts to approximately 300 molecules of enzyme per carboxysome. Carboxysomes are required for autotrophic growth at low CO2 concentrations and are thought to function as part of a CO2-concentrating mechanism PUBMED:15012219, PUBMED:9891798.

    \ \ \

    Polyhedral organelles, enterosomes, from non-autotrophic organisms are involved in coenzyme B12-dependent 1,2-propanediol utilisation (e.g., in Salmonella enterica PUBMED:10498708) and ethanolamine utilisation (e.g., in\ Salmonella typhimurium PUBMED:7868611). Genes needed for enterosome formation are located in the 1,2-propanediol utilisation pdu PUBMED:11844753, PUBMED:10498708 or ethanolamine utilisation eut PUBMED:7868611, PUBMED:10464203 operons, respectively. Although enterosomes of non-autotrophic organisms are apparently related to\ carboxysomes structurally, a functional relationship is uncertain. A role in CO2 concentration, similar to that of the carboxysome, is unlikely since there is no known association between CO2 and coenzyme B12-dependent 1,2-propanediol or ethanolamine utilisation PUBMED:11844753. It seems probable that enterosomes help protect the cells from reactive aldehyde species in the degradation pathways of 1,2-propanediol and ethanolamine PUBMED:11722879.

    \ ' '1324' 'IPR003362' '\

    This entry represents a conserved region from a number of different bacterial sugar transferases, involved in diverse biosynthesis pathways. Examples include galactosyl-P-P-undecaprenol synthetase, which transfers galactose-1-phosphate to the lipid precursor undecaprenol phosphate in the first steps of O-polysaccharide biosynthesis; UDP-galactose-lipid carrier transferase, which is involved in the biosynthesis of amylovoran; and galactosyl transferase CpsD, which is essential for assembly of the group B Streptococci (GBS) type III capsular polysaccharide.

    \ ' '1325' 'IPR002633' '\

    Many Gram-positive bacteria produce ribosomally synthesized antimicrobial peptides, often termed bacteriocins. One important and well studied class of bacteriocins is the class IIa or pediocin-like bacteriocins produced by lactic acid bacteria. All class IIa bacteriocins are produced by food-associated strains, isolated from a variety of food products of industrial and natural origins, including meat products, dairy products and vegetables. Class IIa bacteriocins are all cationic, display anti-Listeria activity, and kill target cells by permeabilizing the cell membrane PUBMED:16232543, PUBMED:15611086, PUBMED:16059970.

    \

    Class IIa bacteriocins contain between 37 and 48 residues. Based on their primary structures, the peptide chains of class IIa bacteriocins may be divided roughly into two regions: a hydrophilic, cationic and highly conserved N-terminal region, and a less conserved hydrophobic/amphiphilic C-terminal region. The N-terminal region contains the conserved Y-G-N-G-V/L \'pediocin box\' motif and two conserved cysteine residues joined by a disulphide bridge. It forms a three-stranded antiparallel beta-sheet supported by the conserved disulphide bridge. This cationic N-terminal beta-sheet domain mediates binding of the class IIa bacteriocin to the target cell membrane. The C-terminal region forms a hairpin-like domain that penetrates into the hydrophobic part of the target cell membrane, thereby mediating leakage through the membrane. The two domains are joined by a hinge, which enables movement of the domains relative to each other PUBMED:15611086, PUBMED:16059970.
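As a rough illustration of the sequence features described above, the following Python sketch screens a peptide for the Y-G-N-G-V/L pediocin box, the 37 to 48 residue length range and the presence of at least two cysteines. The function names and the toy peptide are hypothetical, and the screen is only a heuristic based on the features named in this entry, not a validated classifier.

```python
import re

# Pediocin box as described in the entry: Y-G-N-G followed by V or L.
PEDIOCIN_BOX = re.compile("YGNG[VL]")


def find_pediocin_box(seq: str) -> int:
    """Return the 0-based start of the first pediocin box, or -1 if absent."""
    match = PEDIOCIN_BOX.search(seq.upper())
    return match.start() if match else -1


def is_candidate_class_iia(seq: str) -> bool:
    """Heuristic screen: length 37-48, pediocin box present, two or more Cys."""
    seq = seq.upper()
    has_length = 37 <= len(seq) <= 48
    has_box = find_pediocin_box(seq) != -1
    has_cys_pair = seq.count("C") >= 2
    return has_length and has_box and has_cys_pair


if __name__ == "__main__":
    # Toy 40-residue peptide (hypothetical, padded with alanines).
    toy = "KYYGNGVTCGKHSC" + "A" * 26
    print(find_pediocin_box(toy), is_candidate_class_iia(toy))
```

The check ignores where the motif falls within the peptide; a stricter screen could require the box to lie in the N-terminal region, as the entry describes.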

    \

    Some proteins known to belong to the class IIa bacteriocin family are listed below:

    \ \ ' '1326' 'IPR006883' '\ This is a family of Baculovirus proteins of approximate mass 19 kDa.\ ' '1327' 'IPR006725' '\ This family includes several hypothetical baculoviral proteins, with predicted molecular weights of approximately 44 kDa.\ ' '1328' 'IPR006871' '\ This is a family of Baculovirus ssDNA-binding proteins.\ ' '1329' 'IPR006733' '\

    This family represents the E56 protein, which is localized to the occlusion derived virus (ODV) envelope, but not to the budded virus (BV) envelope PUBMED:8599240. Signals necessary for transport and/or retention into this structure are believed to be found within the C-terminal portion of ODV-E56.

    \ ' '1330' 'IPR006934' '\ Baculovirus occlusion-derived virus (ODV) derives its envelope from an intranuclear membrane source. The N-terminal amino acid sequence of the Autographa californica nuclear polyhedrosis virus (AcMNPV) envelope protein ODV-E66 is highly hydrophobic. This defined hydrophobic domain was shown to direct the protein, E66, to induce membrane microvesicles within a baculovirus-infected cell nucleus and the viral envelope. In addition, it was suggested that movement of this protein into the nuclear envelope may initiate through cytoplasmic membranes, such as endoplasmic reticulum, and that transport into the nucleus may be mediated through the outer and inner nuclear membrane PUBMED:9108103.\ ' '1331' 'IPR004941' '\ The function of the FP protein is not known. The protein is missing in baculovirus FP (few polyhedra) mutants PUBMED:8760443. \ ' '1332' 'IPR006790' '\ This is a family of viral structural glycoproteins PUBMED:1629955.\ ' '1333' 'IPR004955' '\

    Gp64 is the major envelope fusion glycoprotein in some, though not all, baculoviruses PUBMED:16997010, PUBMED:16364739. It is found on the surface of both infected cells and budded virions as a homotrimer, and by determining the viral receptor preferences it defines the host range and infection efficiency. Baculovirus enters its host cells by endocytosis followed by a low-pH-induced fusion of the viral envelope with the endosomal membrane, allowing viral entrance into the cell cytoplasm. This membrane fusion, and also the efficient budding of virions from the infected cell, is dependent on gp64. Gp75 is a homologous envelope glycoprotein from Thogotoviruses that, like gp64, forms homotrimers and is involved in both binding and entering the host cell PUBMED:1279100.

    \ ' '1334' 'IPR006824' '\ This is a family of Baculovirus DNA helicases, which are essential for the initiation of viral DNA replication and may contribute to other functions, such as controlling the switch to the late phase and leading to the inhibition of host protein synthesis.\ ' '1335' 'IPR006923' '\

    This is a family of Baculoviridae late expression factor 5, required for late and very late gene expression.

    \ ' '1336' 'IPR007765' '\ The Culex nigripalpus NPV (Culex nigripalpus nucleopolyhedrovirus) protein p24 is associated with nucleocapsids of budded and polyhedra-derived virions PUBMED:11602755, PUBMED:8423444.\ ' '1337' 'IPR006853' '\ This is a family of Baculovirus p26 proteins.\ ' '1338' 'IPR007879' '\ This family consists of a series of unidentified baculoviral P33 protein homologues of unknown function.\ ' '1339' 'IPR004122' '\

    Barrier-to-autointegration factor (BAF) is an essential protein that is highly conserved in metazoan evolution, and which may act as a DNA-bridging protein PUBMED:12902403. BAF binds directly to double-stranded DNA, to transcription activators, and to inner nuclear membrane proteins, including lamin A filament proteins that anchor nuclear-pore complexes in place, and nuclear LEM-domain proteins that bind to lamin filaments and chromatin. New findings suggest that BAF has structural roles in nuclear assembly and chromatin organization, represses gene expression and might interlink chromatin structure, nuclear architecture and gene regulation in metazoans PUBMED:15130582.

    \

    BAF can be exploited by retroviruses to act as a host component of pre-integration complexes, which promote the integration of the retroviral DNA into the host chromosome by preventing autointegration of retroviral DNA PUBMED:14645565. BAF might contribute to the assembly or activity of retroviral pre-integration complexes through direct binding to the retroviral proteins p55 Gag and matrix, as well as to DNA.

    \ \ ' '1340' 'IPR007799' '\

    This family consists of unidentified baculoviral p47 proteins. p47 is one of the primary components of the Autographa californica nuclear polyhedrosis virus (AcMNPV)-encoded RNA polymerase, which initiates transcription from late and very late promoters PUBMED:9733837.\

    \ ' '1341' 'IPR006962' '\

    This family comprises the Baculovirus P48 proteins. They contain two possible membrane-spanning\ domains and a cysteine-rich domain that are conserved in all of the proteins. The Bombyx mori (Silk moth) nuclear polyhedrosis\ virus protein has been described as a putative DNA helicase.

    \ ' '1342' 'IPR007663' '\ Baculoviruses are distinct from other virus families in that there are two viral phenotypes: budded virus (BV) and occlusion-derived virus (ODV). BVs disseminate viral infection throughout the tissues of the host and ODVs transmit baculovirus between insect hosts. GFP tagging experiments implicate p74 as an ODV envelope protein PUBMED:2688302, PUBMED:11514740.\ ' '1343' 'IPR007601' '\ Polyhedra are large crystalline occlusion bodies containing nucleopolyhedrovirus virions, and surrounded by an electron-dense structure called the polyhedron envelope or polyhedron calyx. The polyhedron envelope (associated) protein PEP is thought to be an integral part of the polyhedron envelope. PEP is concentrated at the surface of polyhedra, and is thought to be important for the proper formation of the periphery of polyhedra. It is thought that PEP may stabilise polyhedra and protect them from fusion or aggregation PUBMED:8176372.\ ' '1344' 'IPR007600' '\ Polyhedra are large crystalline occlusion bodies containing nucleopolyhedrovirus virions, and surrounded by an electron-dense structure called the polyhedron envelope or polyhedron calyx. The polyhedron envelope (associated) protein PEP is thought to be an integral part of the polyhedron envelope. PEP is concentrated at the surface of polyhedra, and is thought to be important for the proper formation of the periphery of polyhedra. It is thought that PEP may stabilise polyhedra and protect them from fusion or aggregation PUBMED:8176372.\ ' '1345' 'IPR007589' '\ This family constitutes the 39 kDa major capsid protein of the Baculoviridae PUBMED:2644736.\ ' '1346' 'IPR006997' '\

    This is a family of Baculovirus proteins of unknown function.

    \ ' '1347' 'IPR006774' '\ ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture PUBMED:11756546. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in Saccharomyces cerevisiae) PUBMED:11756546. The N-terminal two thirds of the protein are necessary for DNA binding, and the N terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure PUBMED:1594441.\ ' '1348' 'IPR003103' '\

    Apoptosis, or programmed cell death (PCD), is a common and evolutionarily conserved property of all metazoans PUBMED:11341280. In many biological processes, apoptosis is required to eliminate supernumerary or dangerous (such as pre-cancerous) cells and to promote normal development. Dysregulation of apoptosis can, therefore, contribute to the development of many major diseases including cancer, autoimmunity and neurodegenerative disorders. In most cases, proteins of the caspase family execute the genetic programme that leads to cell death.

    \

    Bcl-2 proteins are central regulators of caspase activation, and play a key role in cell death by regulating the integrity of the mitochondrial and endoplasmic reticulum (ER) membranes PUBMED:12631689. At least 20 Bcl-2 proteins have been reported in mammals, and several others have been identified in viruses. Bcl-2 family proteins fall roughly into three subtypes, which either promote cell survival (anti-apoptotic) or trigger cell death (pro-apoptotic). All members contain at least one of four conserved motifs, termed Bcl-2 Homology (BH) domains. Bcl-2 subfamily proteins, which contain at least BH1 and BH2, promote cell survival by inhibiting the adapters needed for the activation of caspases.

    \ \

    Pro-apoptotic members potentially exert their effects by displacing the adapters from the pro-survival proteins; these proteins belong either to the Bax subfamily, which contain BH1-BH3, or to the BH3 subfamily, which mostly only feature BH3 PUBMED:9735050. Thus, the balance between antagonistic family members is believed to play a role in determining cell fate. Members of the wider Bcl-2 family, which also includes Bcl-x, Bcl-w and Mcl-1, are described by their similarity to Bcl-2 protein, a member of the pro-survival Bcl-2 subfamily PUBMED:9735050. Full-length Bcl-2 proteins feature all four BH domains, seven alpha-helices, and a C-terminal hydrophobic motif that targets the protein to the outer mitochondrial membrane, ER and nuclear envelope.

    \

    BAG domains are present in Bcl-2-associated athanogene 1 and silencer of death domains. The BAG proteins are modulators of chaperone activity; they bind to HSP70/HSC70 proteins and promote substrate release. The proteins have anti-apoptotic activity and increase the anti-cell death function of BCL-2 induced by various stimuli. BAG-1 binds to the serine/threonine kinase Raf-1 or Hsc70/Hsp70 in a mutually exclusive interaction. BAG-1 promotes cell growth by binding to and stimulating Raf-1 activity. The binding of Hsp70 to BAG-1 diminishes Raf-1 signalling, inhibits subsequent events such as DNA synthesis, and arrests the cell cycle. BAG-1 has been suggested to function as a molecular switch that encourages cells to proliferate in normal conditions but become quiescent under a stressful environment PUBMED:12406544.

    \ \

    BAG-family proteins contain a single BAG domain, except for human BAG-5 which has four BAG repeats. The BAG domain is a conserved region located at the C-terminus of the BAG-family proteins that binds the ATPase domain of Hsc70/Hsp70. The BAG domain is evolutionarily conserved, and BAG domain containing proteins have been described and/or proven in a variety of organisms including Mus musculus (Mouse), Xenopus spp., Drosophila spp., Bombyx mori (Silk moth), Caenorhabditis elegans, Saccharomyces cerevisiae (Baker\'s yeast), Schizosaccharomyces pombe (Fission yeast), and Arabidopsis thaliana (Mouse-ear cress).

    \

    The BAG domain has 110-124 amino acids and comprises three anti-parallel alpha-helices, each approximately 30-40 amino acids in length. The first and second helices interact with the serine/threonine kinase Raf-1, and the second and third helices are the sites of the BAG domain interaction with the ATPase domain of Hsc70/Hsp70. Binding of the BAG domain to the ATPase domain is mediated by both electrostatic and hydrophobic interactions in BAG-1 and is energy-requiring.

    \ ' '1349' 'IPR004194' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the BamHI restriction endonuclease, which recognises the DNA sequence GGATCC and cleaves after G-1 PUBMED:8145855, PUBMED:10882125.
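The recognition and cleavage rule stated above (site GGATCC, cut after the first G) can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions: the DNA is linear, only the top strand is modelled, and the function names are hypothetical rather than taken from any library.

```python
# Recognition sequence given in the entry.
BAMHI_SITE = "GGATCC"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """A site is palindromic when it equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


def bamhi_digest(seq: str) -> list[str]:
    """Split a linear top-strand sequence after G-1 of each GGATCC site."""
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(BAMHI_SITE)
    while pos != -1:
        cut = pos + 1  # cleavage after the first G: G^GATCC
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(BAMHI_SITE, pos + 1)
    fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    print(is_palindromic(BAMHI_SITE))
    print(bamhi_digest("AAGGATCCTT"))
```

Because the site is palindromic, the same rule applied to the bottom strand gives the staggered cut that leaves a four-base GATC overhang; the sketch does not model that second strand.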

    \ ' '1350' 'IPR019748' '\

    The FERM domain (F for 4.1 protein, E for ezrin, R for radixin and M for moesin) is a widespread protein module involved in localising proteins to the plasma membrane PUBMED:9757824. FERM domains are found in a number of cytoskeletal-associated proteins that associate with various proteins at the interface between the plasma membrane and the cytoskeleton. The FERM domain is located at the N-terminus of the majority of FERM-containing proteins PUBMED:9757824, PUBMED:10847681, which includes:

    \

    \

    Ezrin, moesin, and radixin are highly related proteins (ERM protein family), but the other proteins in which the FERM domain is found do not share any region of similarity outside of this domain. ERM proteins are made of three domains, the FERM domain, a central helical domain and a C-terminal tail domain, which binds F-actin. The amino-acid sequence of the FERM domain is highly conserved among ERM proteins and is responsible for membrane association by direct binding to the cytoplasmic domain or tail of integral membrane proteins. ERM proteins are regulated by an intramolecular association of the FERM and C-terminal tail domains that masks their binding sites for other molecules. For cytoskeleton-membrane cross-linking, the dormant molecules become activated and the FERM domain attaches to the membrane by binding specific membrane proteins, while the last 34 residues of the tail bind actin filaments. Aside from binding to membranes, the activated FERM domain of ERM proteins can also bind the guanine nucleotide dissociation inhibitor of Rho GTPase (RhoGDI), which suggests that in addition to functioning as a cross-linker, ERM proteins may influence Rho signalling pathways. The crystal structure of the FERM domain reveals that it is composed of three structural modules (F1, F2, and F3) that together form a compact clover-shaped structure PUBMED:10970839.

    \

    The FERM domain has also been called the amino-terminal domain, the 30kDa domain, 4.1N30, the membrane-cytoskeletal-linking domain, the ERM-like domain, the ezrin-like domain of the band 4.1 superfamily, the conserved N-terminal region, and the membrane attachment domain PUBMED:9757824.

    \ ' '1351' 'IPR004148' '\

    Endocytosis and intracellular transport involve several mechanistic steps: \

    \ Members of the Amphiphysin protein family are key regulators in the early steps of endocytosis. They are involved in the formation of clathrin-coated vesicles by promoting the assembly of a protein complex at the plasma membrane and by directly assisting in the induction of the high curvature of the membrane at the neck of the vesicle. Amphiphysins contain a characteristic domain, known as the BAR (Bin-Amphiphysin-Rvs)-domain, which is required for their in vivo function and their ability to tubulate membranes PUBMED:14993925. \

    The crystal structure of these proteins suggests the domain forms a crescent-shaped dimer of a three-helix coiled coil with a characteristic set of conserved hydrophobic, aromatic and hydrophilic amino acids. Proteins containing this domain have been shown to homodimerise, heterodimerise or, in a few cases, interact with small GTPases.

    \ ' '1352' 'IPR000468' '\ Barnase is the extracellular ribonuclease of Bacillus amyloliquefaciens, and barstar its specific intracellular inhibitor PUBMED:2696173, PUBMED:3050134. Expression of barstar is necessary to counter the lethal effect of expressed active barnase. The structure of the barnase-barstar complex is known PUBMED:8043575.\ ' '1353' 'IPR001153' '\

    Barwin is a basic protein isolated from aqueous extracts of barley seeds. It is\ 125 amino acids in length, and contains six cysteine residues that combine to form\ three disulphide bridges PUBMED:1390663, PUBMED:1390664. Comparative analysis shows the sequence to be highly similar to a 122 amino acid stretch in the C-terminal of the products of two wound-induced genes (win1 and win2) from potato, the product of the hevein gene of rubber trees, and pathogenesis-related protein 4 from tobacco. The high levels of similarity to these proteins, and their ability to bind saccharides, suggest that the barwin domain may be involved in a common defence mechanism in plants.

    \ ' '1354' 'IPR006949' '\ The temperate bacteriophage P2 has four defined tail genes: V, J, W and I. Their order is the late gene promoter, VWJI, followed by the tail fibre genes H and G and then a transcription terminator. BAP V protein is the small spike at the tip of the tail and basal plate assembly protein J lies at the edge of the baseplate PUBMED:7483254. This family also includes a number of bacterial homologues, which are thought to have been horizontally transferred.\ ' '1355' 'IPR002546' '\ This basic domain is found in the MyoD family of muscle-specific proteins \ that control muscle development. The bHLH region of the MyoD family\ includes the basic domain and the helix-loop-helix (HLH) motif.\ The bHLH region mediates specific DNA binding PUBMED:9343420, with 12 residues\ of the basic domain involved in DNA binding PUBMED:8790335. The basic domain\ forms an extended alpha helix in the structure.\ ' '1356' 'IPR000060' '\

    These prokaryotic transport proteins belong to a family known as BCCT (for Betaine /\ Carnitine / Choline Transporters) and are specific for compounds containing\ a quaternary nitrogen atom. The BCCT proteins contain 12 transmembrane regions\ and are energized by proton symport. They contain a conserved region with four\ tryptophans in their central region PUBMED:8752321.

    \ ' '1357' 'IPR003426' '\

    Bacteriochlorophyll A (or FMO) protein is involved in the energy transfer system of photosynthetic bacteria, such as Green Sulphur Bacteria. Bacteriochlorophyll A acts as a light-harvesting complex that directs light energy from the chlorosomes attached to the cell membrane to the reaction centre PUBMED:16245093. The protein forms a homotrimer, with each monomer unit containing seven molecules of bacteriochlorophyll A.

    \ ' '1358' 'IPR000712' '\

    Apoptosis, or programmed cell death (PCD), is a common and evolutionarily conserved property of all metazoans PUBMED:11341280. In many biological processes, apoptosis is required to eliminate supernumerary or dangerous (such as pre-cancerous) cells and to promote normal development. Dysregulation of apoptosis can, therefore, contribute to the development of many major diseases including cancer, autoimmunity and neurodegenerative disorders. In most cases, proteins of the caspase family execute the genetic programme that leads to cell death.

    \

    Bcl-2 proteins are central regulators of caspase activation, and play a key role in cell death by regulating the integrity of the mitochondrial and endoplasmic reticulum (ER) membranes PUBMED:12631689. At least 20 Bcl-2 proteins have been reported in mammals, and several others have been identified in viruses. Bcl-2 family proteins fall roughly into three subtypes, which either promote cell survival (anti-apoptotic) or trigger cell death (pro-apoptotic). All members contain at least one of four conserved motifs, termed Bcl-2 Homology (BH) domains. Bcl-2 subfamily proteins, which contain at least BH1 and BH2, promote cell survival by inhibiting the adapters needed for the activation of caspases.

    \ \

    Pro-apoptotic members potentially exert their effects by displacing the adapters from the pro-survival proteins; these proteins belong either to the Bax subfamily, which contain BH1-BH3, or to the BH3 subfamily, which mostly only feature BH3 PUBMED:9735050. Thus, the balance between antagonistic family members is believed to play a role in determining cell fate. Members of the wider Bcl-2 family, which also includes Bcl-x, Bcl-w and Mcl-1, are described by their similarity to Bcl-2 protein, a member of the pro-survival Bcl-2 subfamily PUBMED:9735050. Full-length Bcl-2 proteins feature all four BH domains, seven alpha-helices, and a C-terminal hydrophobic motif that targets the protein to the outer mitochondrial membrane, ER and nuclear envelope.

    \

    Active cell suicide (apoptosis) is induced by events such as growth factor withdrawal and toxins. It is controlled by regulators, which have either an inhibitory effect on programmed cell death (anti-apoptotic) or block the protective effect of inhibitors (pro-apoptotic) PUBMED:15335822,\ PUBMED:8918887. Many viruses have found a way of countering defensive apoptosis by encoding their own anti-apoptosis genes, preventing their target cells from dying too soon.

    \ \

    All proteins belonging to the Bcl-2 family PUBMED:8910675 contain at least one of the BH1, BH2, BH3 or BH4 domains. All anti-apoptotic\ proteins contain BH1 and BH2 domains; some of them contain an additional N-terminal BH4 domain (Bcl-2, Bcl-x(L), Bcl-w), which is never seen in pro-apoptotic proteins, except for Bcl-x(S). On the other hand, all pro-apoptotic proteins contain a BH3 domain (except for Bad) necessary for\ dimerisation with other proteins of the Bcl-2 family and crucial for their killing activity; some of them also contain BH1 and BH2 domains (Bax, Bak). The BH3 domain is also present in some anti-apoptotic proteins, such as Bcl-2 or Bcl-x(L). Proteins that are known to contain these domains include vertebrate\ Bcl-2 (alpha and beta isoforms) and Bcl-x (isoforms Bcl-x(L) and Bcl-x(S)); mammalian proteins Bax and Bak; mouse protein Bid; Xenopus laevis proteins Xr1 and Xr11; human induced myeloid leukemia cell\ differentiation protein MCL1 and Caenorhabditis elegans protein ced-9.

    \ ' '1359' 'IPR007623' '\

    This family includes the human p75NTR-associated cell death executor (Nerve growth factor receptor associated protein 1), which may be a signalling adaptor molecule involved in p75NTR-apoptosis induced by nerve growth factor. It may be important in neurogenetic diseases.

    \ ' '1360' 'IPR006804' '\

    The members of this group of sequences contain a conserved N-terminal domain which is found in the BCL7 family. The function of BCL7 proteins is unknown, though they may be involved in early development. Notably, BCL7B is commonly hemizygously deleted in patients with Williams syndrome PUBMED:9931421.

    \ ' '1361' 'IPR002731' '\ This domain is found in the BadF () and BadG ()\ proteins that are two subunits of Benzoyl-CoA reductase, that may\ be involved in ATP hydrolysis.\ The family also includes an activase subunit from the enzyme\ 2-hydroxyglutaryl-CoA dehydratase (). The hypothetical protein AQ_278 from Aquifex aeolicus\ contains two copies of this region suggesting that the family may structurally dimerise.\ ' '1362' 'IPR018513' '\

    An operon encoding 4 proteins required for bacterial cellulose biosynthesis\ (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation\ with strains lacking cellulose synthase activity PUBMED:2146681. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, \ designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinus.

    \

    The calculated molecular mass of the protein encoded by bcsB is 85.3kDa PUBMED:2146681. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5\'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.

    \ ' '1363' 'IPR000409' '\

    The "beige" mouse is established as an animal model of Chediak-Higashi\ Syndrome (CHS) PUBMED:8896560. The BEACH domain was described in the BEIGE protein\ (D1035670) and in the highly homologous CHS protein . It is also\ found in distantly related proteins like, for example, \ and which are factor associated with neutral\ sphingomyelinase activation PUBMED:9620659.

    \ \

    The BEACH domain is usually followed by a series of WD repeats (). The function of the BEACH domain is\ unknown.

    \ ' '1364' 'IPR000916' '\

    Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.

    \

    The allergens in this family include allergens with the following designations: Aln g 1, Api g 1, Bet v 1, Car b 1, Cor a 1, Dau c 1, Mal d 1 and Pru a 1.

    \

    Trees within the order Fagales possess particularly potent allergens, e.g. Bet v 1, the major White Birch (Betula verrucosa) pollen antigen. Bet v 1 is the main cause of type I allergies observed in early spring. Type I, or immunoglobulin E-mediated (IgE-mediated), allergies affect 1 in 5 people in Europe and North America. Commonly-observed symptoms are hay fever, dermatitis, asthma and, in severe cases, anaphylactic shock. First contact with these allergens results in sensitisation; subsequent contact produces a cross-linking reaction of IgE on mast cells and concomitant release of histamine. The inevitable symptoms of an allergic reaction ensue.

    \

    Recent NMR analysis PUBMED:8702605 has confirmed earlier predictions of the protein structure and site of the major T-cell epitope PUBMED:8660368. The Bet v 1 protein comprises 6 anti-parallel beta-strands and 3 alpha-helices. Four of the strands dominate the global fold, and 2 of the helices form a C-terminal amphipathic helical motif. This motif is believed to be the T-cell epitope. Other proteins belonging to this family include the major pollen allergens:\

    \ The motif is also found in:\

    \ ' '1365' 'IPR013803' '\

    Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer\'s disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques PUBMED:16301322, PUBMED:16364896. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down\'s syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer\'s symptoms.

    \

    APP is important for neurogenesis and neuronal regeneration, either through the intact protein, or through its many breakdown products PUBMED:16406235. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, a short hydrophobic transmembrane domain, and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility PUBMED:16406235. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death PUBMED:12611883. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters.

    \

    APP can be processed by different sets of enzymes:

    \

    \ \

    This entry represents the amyloid-beta peptide (A-beta), which originates as a breakdown product from the cleavage of amyloid-beta precursor protein (APP, or A4), an integral, glycosylated membrane brain protein.

    \

    More information about this protein can be found at Protein of the Month: Amyloid-beta Precursor Protein PUBMED:.

    \ ' '1366' 'IPR001466' '\

    This entry represents the serine beta-lactamase-like superfamily. It is a diverse group of sequences that includes D-alanyl-D-alanine carboxypeptidase B, aminopeptidase (DmpB), alkaline D-peptidase, animal D-Ala-D-Ala carboxypeptidase homologues and the class A and C beta-lactamases and eukaryotic beta-lactamase homologues which are variously described as: transesterases, non-ribosomal peptide synthetases and hypothetical proteins. Many are serine peptidases belonging to MEROPS peptidase families S11 (D-Ala-D-Ala carboxypeptidase A family) and S12 (D-Ala-D-Ala carboxypeptidase B family, clan SE). The beta-lactamases are classified as both S11 and S12 non-peptidase homologues; these either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.

    \ \ \

    Beta-lactamase catalyses the opening and hydrolysis of the beta-lactam ring\ of beta-lactam antibiotics such as penicillins and cephalosporins PUBMED:1856867. There are four groups, classed A, B, C and D according to sequence, substrate specificity, and kinetic behaviour: class A (penicillinase-type) is the most common PUBMED:1856867. The genes for class A beta-lactamases are widely distributed in bacteria, frequently located on transmissible plasmids in Gram-negative organisms, although an equivalent chromosomal gene has been found in a few species PUBMED:2788410.

    \ \

    Class A, C and D beta-lactamases are serine-utilising hydrolases - class B enzymes utilise a catalytic zinc centre instead. The 3 classes of serine beta-lactamase are evolutionarily related and belong to a superfamily that also includes DD-peptidases and other penicillin-binding proteins PUBMED:3128280. All these proteins contain an S-x-x-K motif, the Ser being the active site residue. Although clearly related, however, the sequences of the 3 classes of serine beta-lactamases vary considerably outside the active site.

    \ ' '1367' 'IPR001597' '\

    This domain is found in many tryptophanases (tryptophan indole-lyase, TNase), tyrosine phenol-lyases (TPL) and threonine aldolases. It is involved in the degradation of amino acids. The glycine cleavage system is composed of four proteins: P, T, L and H. In Bacillus subtilis, the P \'protein\' is a heterodimer of two subunits. The glycine cleavage system catalyses the degradation of glycine. The P protein binds the alpha-amino group of glycine through its pyridoxal phosphate cofactor; CO(2) is released and the remaining methylamine moiety is then transferred to the lipoamide cofactor of the H protein.

    \ ' '1369' 'IPR004199' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Beta-galactosidase enzymes () belong to the glycosyl hydrolase 42 family . Beta-galactosidase is the product of the lac operon Z gene of Escherichia coli. This enzyme catalyses the hydrolysis of the disaccharide lactose to galactose and glucose, and can also convert lactose to allolactose, the inducer of the lac operon. This domain is found in single chain beta-galactosidases, which are comprised of five domains. The active site is located in a deep pocket built around the central alpha-beta barrel, with the other domains conferring specificity for a disaccharide substrate. This entry represents domain 5, which contains an N-terminal loop that swings towards the active site upon the deep binding of a ligand to produce a closed conformation PUBMED:11732897. This domain is also found in the amino-terminal portion of the small chain of dimeric beta-galactosidases.

    \ ' '1370' 'IPR003093' '\

    Apoptosis, or programmed cell death (PCD), is a common and evolutionarily conserved property of all metazoans PUBMED:11341280. In many biological processes, apoptosis is required to eliminate supernumerary or dangerous (such as pre-cancerous) cells and to promote normal development. Dysregulation of apoptosis can, therefore, contribute to the development of many major diseases including cancer, autoimmunity and neurodegenerative disorders. In most cases, proteins of the caspase family execute the genetic programme that leads to cell death.

    \

    Bcl-2 proteins are central regulators of caspase activation, and play a key role in cell death by regulating the integrity of the mitochondrial and endoplasmic reticulum (ER) membranes PUBMED:12631689. At least 20 Bcl-2 proteins have been reported in mammals, and several others have been identified in viruses. Bcl-2 family proteins fall roughly into three subtypes, which either promote cell survival (anti-apoptotic) or trigger cell death (pro-apoptotic). All members contain at least one of four conserved motifs, termed Bcl-2 Homology (BH) domains. Bcl-2 subfamily proteins, which contain at least BH1 and BH2, promote cell survival by inhibiting the adapters needed for the activation of caspases.

    \ \

    Pro-apoptotic members potentially exert their effects by displacing the adapters from the pro-survival proteins; these proteins belong either to the Bax subfamily, which contain BH1-BH3, or to the BH3 subfamily, which mostly only feature BH3 PUBMED:9735050. Thus, the balance between antagonistic family members is believed to play a role in determining cell fate. Members of the wider Bcl-2 family, which also includes Bcl-x, Bcl-w and Mcl-1, are described by their similarity to Bcl-2 protein, a member of the pro-survival Bcl-2 subfamily PUBMED:9735050. Full-length Bcl-2 proteins feature all four BH domains, seven alpha-helices, and a C-terminal hydrophobic motif that targets the protein to the outer mitochondrial membrane, ER and nuclear envelope.

    \

    Active cell suicide (apoptosis) is induced by events such as growth factor withdrawal and toxins. It is controlled by regulators, which have either an inhibitory effect on programmed cell death (anti-apoptotic) or block the protective effect of inhibitors (pro-apoptotic) PUBMED:15335822,\ PUBMED:8918887. Many viruses have found a way of countering defensive apoptosis by encoding their own anti-apoptosis genes, preventing their target cells from dying too soon.

    \ \

    All proteins belonging to the Bcl-2 family PUBMED:8910675 contain at least one of the BH1, BH2, BH3 or BH4 domains. All anti-apoptotic\ proteins contain BH1 and BH2 domains; some of them contain an additional N-terminal BH4 domain (Bcl-2, Bcl-x(L), Bcl-w), which is never seen in pro-apoptotic proteins, except for Bcl-x(S). On the other hand, all pro-apoptotic proteins contain a BH3 domain (except for Bad) necessary for\ dimerisation with other proteins of the Bcl-2 family and crucial for their killing activity; some of them also contain BH1 and BH2 domains (Bax, Bak). The BH3 domain is also present in some anti-apoptotic proteins, such as Bcl-2 or Bcl-x(L). Proteins that are known to contain these domains include vertebrate\ Bcl-2 (alpha and beta isoforms) and Bcl-x (isoforms Bcl-x(L) and Bcl-x(S)); mammalian proteins Bax and Bak; mouse protein Bid; Xenopus laevis proteins Xr1 and Xr11; human induced myeloid leukemia cell\ differentiation protein MCL1 and Caenorhabditis elegans protein ced-9.

    \ ' '1371' 'IPR003343' '\ This domain is found in a variety of bacterial and\ phage surface proteins such as intimins. \ Intimin is a bacterial cell-adhesion molecule that mediates the intimate bacterial host-cell interaction. It contains three domains: two immunoglobulin-like domains and a C-type lectin-like module, implying that carbohydrate recognition may be important in intimin-mediated cell adhesion PUBMED:10201396.\ ' '1372' 'IPR006862' '\ This entry represents the N-termini of acyl-CoA thioester hydrolase and bile acid-CoA:amino acid N-acyltransferase (BAAT) PUBMED:11673457. This region is not thought to contain the active site of either enzyme. Thioesterase isoforms have been identified in peroxisomes, cytoplasm and mitochondria, where they are thought to have distinct functions in lipid metabolism PUBMED:10567408. For example, in peroxisomes, the hydrolase acts on bile-CoA esters PUBMED:11673457.\ ' '1373' 'IPR003540' '\ A large group of bacterial exotoxins are referred to as "A/B toxins", \ essentially because they are formed from two subunits. The "A" subunit\ possesses enzyme activity, and is transferred to the host cell following\ a conformational change in the membrane-bound transport "B" subunit PUBMED:8225592.\ \

    Clostridial species are one of the major causes of food poisoning/gastro-\ intestinal illnesses. They are Gram-positive, spore-forming rods that occur\ naturally in the soil PUBMED:8225592. Included in the family are: Clostridium botulinum, which produces one of the most potent toxins in existence; Clostridium tetani, causative agent of tetanus; and Clostridium perfringens, commonly found in wound infections and diarrhoea cases.

    \ \

    Among the toxins produced by certain Clostridium spp. are the binary \ exotoxins. These proteins consist of two independent polypeptides, which\ correspond to the A/B subunit moieties. The enzyme component (A) enters \ the cell through endosomes produced by the oligomeric binding/translocation\ protein (B), and prevents actin polymerisation through ADP-ribosylation of \ monomeric G-actin PUBMED:8225592, PUBMED:8645309, PUBMED:10802189.

    \ \

    Members of the "A" binary toxin family include C. perfringens iota toxin Ia\ PUBMED:8225592, C. botulinum C2 toxin CI PUBMED:8645309, and Clostridium difficile ADP-ribosyltransferase \ PUBMED:10802189. Other homologous proteins have been found in Clostridium spiroforme PUBMED:8645309, PUBMED:10802189.

    \ ' '1374' 'IPR003896' '\

    A large group of bacterial exotoxins are referred to as "A/B toxins", \ essentially because they are formed from two subunits PUBMED:8225592. The "A" subunit possesses enzyme activity, and is transferred to the host cell following\ a conformational change in the membrane-bound transport "B" subunit. Clostridial species are one of the major causes of food \ poisoning/gastro-intestinal illnesses. They are Gram-positive, spore-forming rods that occur naturally in the soil PUBMED:8225592. Among the toxins produced by certain Clostridium spp. are the binary exotoxins. These proteins consist of two independent polypeptides, which correspond to the A/B subunit moieties. The enzyme component (A) enters the cell through endosomes produced by the oligomeric binding/translocation protein (B), and prevents actin polymerisation through ADP-ribosylation of monomeric G-actin PUBMED:8225592, PUBMED:9659689, PUBMED:10802189.

    \ \

    Members of the "B" binary toxin family also include the Bacillus anthracis protective antigen (PA) protein PUBMED:8225592, most likely due to a common evolutionary ancestor. B. anthracis, a large Gram-positive spore-forming rod, is the causative agent of anthrax. Its two virulence factors are the \ poly-D-glutamate polypeptide capsule, and the actual anthrax exotoxin PUBMED:1910002. The toxin comprises three factors: the protective antigen (PA); the oedema factor (EF); and the lethal factor (LF). Each is a thermolabile \ protein of ~80kDa. PA forms the "B" part of the exotoxin and allows passage\ of the "A" moiety (consisting of EF and LF) into target cells. PA protein forms the central part of the complete anthrax toxin, and translocates the A moiety into host cells after assembling as a heptamer in the membrane PUBMED:1910002, PUBMED:3148491.

    \ ' '1375' 'IPR000775' '\ Bindin, the major protein component of the acrosome granule of sea urchin sperm, mediates species-specific adhesion of sperm to the egg surface during fertilisation PUBMED:1991551, PUBMED:1775065. The \ protein coats the acrosomal process after externalisation by the acrosome reaction; it binds to \ sulphated, fucose-containing polysaccharides on the vitelline-layer receptor proteoglycans that \ cover the egg plasma membrane. Bindins from different genera show high levels of sequence similarity \ in both the mature bindin domain and in the probindin precursor region. The most highly conserved \ region is a 42-residue segment in the central portion of the mature bindin protein. This domain may \ be responsible for conserved functions of bindin, while the more highly divergent flanking regions \ may be responsible for its species-specific properties PUBMED:1991551.\ ' '1376' 'IPR019774' '\

    Phenylalanine, tyrosine and tryptophan hydroxylases constitute a family of \ tetrahydrobiopterin-dependent aromatic amino acid hydroxylases, all of which are \ rate-limiting catalysts for important metabolic pathways PUBMED:3475690. The proteins \ are structurally and functionally related, each containing iron, and catalysing ring \ hydroxylation of aromatic amino acids, using tetrahydrobiopterin (BH4) as a substrate. \ All are regulated by phosphorylation at serines in their N-termini. It has been suggested \ that the proteins each contain a conserved C-terminal catalytic (C) domain and an unrelated N-terminal regulatory (R) domain. It is possible that the R domains arose from \ genes that were recruited from different sources to combine with the common gene for the \ catalytic core. Thus, by combining with the same C domain, the proteins acquired\ the unique regulatory properties of the separate R domains.

    \ \

    A variety of enzymes belong to this family. Phenylalanine-4-hydroxylase catalyzes the conversion of phenylalanine to tyrosine; it is copper-dependent in Chromobacterium violaceum and \ iron-dependent in Pseudomonas aeruginosa. \ In humans, deficiencies are the cause of phenylketonuria, the most common inborn error \ of amino acid metabolism PUBMED:9406548. Tryptophan 5-hydroxylase catalyzes the rate-limiting step in serotonin biosynthesis: \ the conversion of tryptophan to 5-hydroxy-L-tryptophan. Tyrosine 3-hydroxylase catalyzes the rate-limiting step in catecholamine biosynthesis: \ the conversion of tyrosine to 3,4-dihydroxy-L-phenylalanine.

    \ ' '1377' 'IPR000089' '\ The biotin / lipoyl attachment domain has a conserved lysine residue that binds biotin or lipoic acid. Biotin plays a catalytic role in some carboxyl transfer reactions and is covalently attached, via an amide bond, to a lysine residue in enzymes requiring this coenzyme PUBMED:1526981. E2 acyltransferases have an essential cofactor, lipoic acid, which is covalently bound via an amide linkage to a lysine group PUBMED:1825611. The lipoic acid cofactor is found in a variety of proteins that include the H-protein of the glycine cleavage system (GCS), mammalian and yeast pyruvate dehydrogenases and fast migrating protein (FMP) (gene acoC) from Ralstonia eutropha (Alcaligenes eutrophus).\ ' '1378' 'IPR005499' '\ This family contains the enzyme 6-carboxyhexanoate--CoA ligase . This enzyme is involved in the first step of biotin synthesis, where it converts pimelate into pimeloyl-CoA PUBMED:1445232. The enzyme requires magnesium as a cofactor and forms a homodimer PUBMED:1445232.\ ' '1379' 'IPR003784' '\

    BioMNY proteins are considered to constitute tripartite biotin transporters in prokaryotes. One-third of the widespread bioY genes are linked to bioMN. Many bioY genes are located at loci encoding biotin biosynthesis, while others are unlinked to biotin metabolic or transport genes. BioY is a high-capacity transporter that is converted to a high-affinity system in the presence of BioMN. BioMNY-mediated biotin uptake is severely impaired by the replacement of the Walker A lysine residue in BioM, demonstrating the dependency of high-affinity transport on a functional ATPase PUBMED:17301237.

    \ ' '1380' 'IPR001370' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    The baculovirus inhibitor of apoptosis protein repeat (BIR) is a domain of tandem repeats separated by a variable length linker that seems to confer cell death-preventing activity PUBMED:8139034, PUBMED:8552191. The BIR domains characterise the Inhibitor of Apoptosis (IAP) family of proteins (MEROPS proteinase inhibitor family I32, clan IV) that suppress apoptosis by interacting with and inhibiting the enzymatic activity of both initiator and effector caspases (MEROPS peptidase family C14, ). Several distinct mammalian IAPs, including XIAP, c-IAP1, c-IAP2, and ML-IAP, have been identified, and they all exhibit antiapoptotic activity in cell culture. The functional unit in each IAP protein is the baculoviral IAP repeat (BIR), which contains approximately 80 amino acids folded around a zinc atom. Most mammalian IAPs have more than one BIR domain, with the different BIR domains performing distinct functions. For example, in XIAP, the third BIR domain (BIR3) potently inhibits the catalytic activity of caspase-9, whereas the linker sequences immediately preceding the second BIR domain (BIR2) selectively target caspase-3 or -7.

    \

    The first-recognised members of MEROPS inhibitor family I32 were viral proteins that inhibited the apoptosis of infected cells: Cp-IAP from Cydia pomonella granulosis virus (CpGV) PUBMED:8445726 and Op-IAP from Orgyia pseudotsugata multicapsid polyhedrosis virus (OpMNPV) PUBMED:8139034. The discovery of homologous proteins in mammals followed soon after with the recognition that mutations in the gene for neuronal apoptosis inhibitory protein (NAIP) underlie spinal muscular atrophy PUBMED:7813013. The inhibitors in family I32 all possess one or more 80-residue domains known as BIR (baculovirus inhibitor repeat) domains and have accordingly been termed \'BIR-containing\' or \'BIRC\' proteins as well as IAP proteins.

    \

    The mechanism of inhibition of caspases by the IAP proteins is complex, and reactive site residues cannot yet be identified with any confidence. Despite the conservation of the BIR or IAP (inhibitor of apoptosis) domains throughout the family it seems clear that other parts of the molecules also make essential contributions to inhibitory activity.

    \ \ \ \

    Homologs of most components in the mammalian apoptotic pathway have been identified in fruit flies. The Drosophila Apaf-1, known as Dapaf-1, HAC-1 or Dark, shares significant sequence similarity with its mammalian counterpart, and is critically important for the activation of the Drosophila initiator caspase Dronc. Dronc, in turn, cleaves and activates the effector caspase DrICE. The Drosophila IAP, DIAP1, binds to and inactivates both DrICE and Dronc through its BIR1 and BIR2 domains. During apoptosis, the anti-death function of DIAP1 is countered by at least four pro-apoptotic proteins, Reaper, Hid, Grim and Sickle, through direct physical interactions. These four proteins represent the functional homologs of the mammalian protein Smac, and they all share a conserved IAP-binding motif at their N termini. The three proteins Reaper, Hid and Grim are collectively referred to as the RHG proteins PUBMED:11511363, PUBMED:15273300.

    \

    Both XIAP and DIAP1 contain a RING domain at their C termini, and can act as E3 ubiquitin ligases. Indeed, both XIAP and DIAP1 have been shown to promote self-ubiquitination and degradation as well as to negatively regulate the target caspases. Nonetheless, important differences exist between XIAP and DIAP1. The primary function of XIAP is thought to be the inhibition of the catalytic activities of caspases; to what extent the ubiquitinating activity of XIAP contributes to its function remains unclear. For DIAP1, however, the ubiquitinating activity appears to be essential for its function.

    \

    Recently a Drosophila p53 protein has been identified that mediates apoptosis via a novel pathway involving the activation of the Reaper gene and subsequent inhibition of the inhibitors of apoptosis (IAPs). CIAP1, a major mammalian homologue of Drosophila IAPs, is irreversibly inhibited (cleaved) during p53-dependent apoptosis and this cleavage is mediated by a serine protease. Serine protease inhibitors that block CIAP1 cleavage inhibit p53-dependent apoptosis. Furthermore, activation of the p53 protein increases the transcription of the HTRA2 gene, which encodes a serine protease that interacts with CIAP1 and potentiates apoptosis. Therefore mammalian p53 protein activates apoptosis through a novel pathway functionally similar to that in Drosophila, which involves HTRA2 and subsequent inhibition of CIAP1 by cleavage PUBMED:12569127.

    \ ' '1381' 'IPR007100' '\

    RNA-directed RNA polymerase (RdRp) is an essential protein encoded in the genomes of all RNA-containing viruses with no DNA stage PUBMED:2759231, PUBMED:8709232. It catalyses synthesis of the RNA strand complementary to a given RNA template, but the precise molecular mechanism remains unclear. The postulated RNA replication process is a two-step mechanism. First, the initiation step of RNA synthesis begins at or near the 3\' end of the RNA template by means of a primer-independent (de novo) mechanism. De novo initiation consists of the addition of a nucleoside triphosphate (NTP) to the 3\'-OH of the first initiating NTP. During the following so-called elongation phase, this nucleotidyl transfer reaction is repeated with subsequent NTPs to generate the complementary RNA product PUBMED:11531403.

    \

    All the RNA-directed RNA polymerases, and many DNA-directed polymerases, employ a fold whose organisation has been likened to the shape of a right hand with three subdomains termed fingers, palm and thumb PUBMED:9309225. Only the palm subdomain, composed of a four-stranded antiparallel beta-sheet with two alpha-helices, is well conserved among all of these enzymes. In RdRp, the palm subdomain comprises three well conserved motifs (A, B and C). Motif A (D-x(4,5)-D) and motif C (GDD) are spatially juxtaposed; the Asp residues of these motifs are implicated in the binding of Mg2+ and/or Mn2+. The Asn residue of motif B is involved in selection of ribonucleoside triphosphates over dNTPs and thus determines whether RNA is synthesised rather than DNA PUBMED:10827187. The domain organisation PUBMED:9878607 and the 3D structure of the catalytic centre of a wide range of RdRps, even those with a low overall sequence homology, are conserved. The catalytic centre is formed by several motifs containing a number of conserved amino acid residues.
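
    As a rough illustration, the motif A and motif C descriptions above can be read as PROSITE-style patterns and translated into regular expressions. The following is a minimal sketch in Python under that assumption; the helper name find_palm_motifs and the toy sequence are hypothetical and are not taken from any real polymerase. A pattern match is only a hint that a motif may be present, not evidence of a functional palm subdomain.

```python
import re

# Motif A, D-x(4,5)-D: two Asp residues separated by four or five residues.
MOTIF_A = re.compile("D.{4,5}D")
# Motif C: the Gly-Asp-Asp (GDD) tripeptide.
MOTIF_C = re.compile("GDD")


def find_palm_motifs(sequence):
    """Return 1-based start positions of non-overlapping motif A and C matches."""
    seq = sequence.upper()
    return {
        "A": [m.start() + 1 for m in MOTIF_A.finditer(seq)],
        "C": [m.start() + 1 for m in MOTIF_C.finditer(seq)],
    }


# Hypothetical toy sequence, for illustration only.
toy = "MKTDYSKFDSLLAAGDDNVV"
print(find_palm_motifs(toy))
```

    Positions are reported 1-based to match the usual residue numbering convention; finditer reports non-overlapping matches only, so closely spaced candidates may need a different scan.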

    \

    There are 4 superfamilies of viruses that cover all RNA containing viruses with no DNA stage:

    \ The RNA-directed RNA polymerases in the first of the above superfamilies can be divided into the following three subgroups:\

    \ \

    This family consists of the Birnaviridae enzymes. These proteins lack the highly conserved Gly-Asp-Asp (GDD) sequence, a component of the proposed catalytic site of this enzyme family that exists in the conserved motif VI of the palm domain of other RNA-directed RNA polymerases PUBMED:12069523.

    \ ' '1382' 'IPR002662' '\

    Infectious pancreatic necrosis virus (IPNV), a birnavirus, is an important pathogen in fish farms. Analyses of viral proteins showed that VP2 is the major structural and immunogenic polypeptide of the virus PUBMED:8525637, PUBMED:17976679. All neutralizing monoclonal antibodies are specific to VP2 and bind to continuous or discontinuous epitopes. The variable domain of VP2 and the 20 adjacent amino acids of the conserved C-terminal are probably the most important in inducing an immune response for the protection of animals PUBMED:8525637.

    \ \

    The large RNA segment of the Birnaviridae codes for a polyprotein (N-VP2-VP4-VP3-C), most of which is then processed to generate the constituent polypeptides. VP4 protein is involved in generating VP2 and VP3 PUBMED:2828658. Recombinant VP3 is more immunogenic than recombinant VP2 PUBMED:15669113.

    \ ' '1383' 'IPR002663' '\ VP3 is a minor structural component of the virus. The large RNA segment of Birnaviridae codes for a polyprotein (N-VP2-VP4-VP3-C) PUBMED:2828658.\ ' '1384' 'IPR002664' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.
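
    The relationship between triad order and clan described above can be sketched as a simple lookup. The following Python fragment is illustrative only; the function names triad_order and guess_clan are hypothetical, and the table holds just the three orders quoted in the text. Residue order is a hint of clan membership at best, so the result should be treated as a guess.

```python
# Linear order of the catalytic residues, as quoted in the text above.
TRIAD_ORDER_TO_CLAN = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(positions):
    """positions maps a one-letter residue code (H, D, S) to its sequence position."""
    # Sort the residue codes by their position along the chain.
    return "".join(sorted(positions, key=positions.get))


def guess_clan(positions):
    """Return the clan whose triad order matches, or None if none matches."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))


# Chymotrypsin numbering (His57, Asp102, Ser195) gives the HDS order.
print(guess_clan({"H": 57, "D": 102, "S": 195}))
# Subtilisin numbering (Asp32, His64, Ser221) gives the DHS order.
print(guess_clan({"D": 32, "H": 64, "S": 221}))
```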

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S50 (clan SF).

    \ \

    The large RNA segment, segment A, of birnaviruses codes for a polyprotein (N-VP2-VP4-VP3-C) PUBMED:2828658 that is processed into the major structural proteins of the virion, VP2 and VP3, and into the putative protease VP4 PUBMED:2828658.

    \ ' '1385' 'IPR004284' '\ Birnaviruses are dsRNA viruses. The non-structural protein VP5 is encoded by RNA segment A. The function of this small viral protein is unknown.\ ' '1386' 'IPR003929' '\

    Potassium channels are the most diverse group of the ion channel family\ PUBMED:1772658, PUBMED:1879548. They are important in shaping the action potential, and in neuronal excitability and plasticity PUBMED:2451788. The potassium channel family is\ composed of several functionally distinct isoforms, which can be broadly\ separated into 2 groups PUBMED:2555158: the practically non-inactivating \'delayed\' group and the rapidly inactivating \'transient\' group.

    \

    These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers PUBMED:2448635. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes PUBMED:1373731. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis PUBMED:11178249.

    \

    All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK) PUBMED:11178249. The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.
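
    For illustration, the K+ selectivity sequence quoted above (T/SxxTxGxG) can be written as a regular expression and used to count candidate pore-loop signatures in a subunit sequence. This is a minimal Python sketch; the function name selectivity_hits and the toy sequence are hypothetical, and a pattern match alone does not establish a functional P-domain.

```python
import re

# T/SxxTxGxG as a regular expression: Thr or Ser, two free positions,
# Thr, one free position, Gly, one free position, Gly.
SELECTIVITY = re.compile("[TS]..T.G.G")


def selectivity_hits(sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group()) for m in SELECTIVITY.finditer(seq)]


# Hypothetical toy subunit carrying two pore-loop signatures.
toy = "AAATMTTVGYGDLLLLSITTIGFGKK"
hits = selectivity_hits(toy)
print(len(hits), hits)
```

    Under this reading, one hit would be consistent with a one P-domain subunit and two hits with a two P-domain subunit, although real sequences should be checked against curated domain models rather than a bare pattern.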

    \

    Ca2+-activated K+ channels are a diverse group of channels that are activated by an increase in intracellular Ca2+ concentration. They are found in the majority of nerve cells, where they modulate cell excitability and action potential. Three types of Ca2+-activated K+ channel have been characterised, termed small-conductance (SK), intermediate conductance (IK) and large conductance (BK) respectively PUBMED:9687354.

    \

    BK channels (also referred to as maxi-K channels) are widely expressed in the body, being found in glandular tissue, smooth and skeletal muscle, as well as in neural tissues. They have been demonstrated to regulate arteriolar and airway diameter, and also neurotransmitter release. Each channel complex is thought to be composed of 2 types of subunit; the pore-forming (alpha) subunits and smaller accessory (beta) subunits.

    \ \

    The alpha subunit of the BK channel was initially thought to share the characteristic 6TM organisation of the voltage-gated K+ channels. However, the molecule is now thought to possess an additional TM domain, with an extracellular N-terminus and intracellular C-terminus. This C-terminal region contains 4 predominantly hydrophobic domains, which are also thought to lie intracellularly. The extracellular N-terminus and the first TM region are required for modulation by the beta subunit. The precise location of the Ca2+-binding site that modulates channel activation remains unknown, but it is thought to lie within the C-terminal hydrophobic domains.

    \ ' '1387' 'IPR007024' '\ An FAD-binding domain, BLUF, exemplified by the N-terminus of the AppA protein from Rhodobacter sphaeroides, is present in various proteins, primarily from Bacteria. The BLUF domain is involved in sensing blue light (and possibly redox) using FAD and is similar to the flavin-binding PAS domains and cryptochromes. The predicted secondary structure reveals that the BLUF domain is a novel FAD-binding fold PUBMED:12368079.\ ' '1388' 'IPR003760' '\ This is a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. All of these proteins are outer membrane proteins and are thus antigenic in nature when possessed by the pathogenic members of the family PUBMED:9350727. \

    The Bacillus subtilis degR, a positive regulator of the production of degradative enzymes, is also a member of this group PUBMED:9335269.

    \ ' '1389' 'IPR002634' '\ This family consists of the morpho-protein BolA from Escherichia coli and its various homologs. In E. coli, over-expression of this protein causes round morphology, and the protein may be involved in switching the cell between elongation and septation systems during cell division PUBMED:10361282. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase PUBMED:10361282. BolA is also induced by stress during early stages of growth PUBMED:10361282 and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5 PUBMED:2684651, PUBMED:10361282.\ ' '1390' 'IPR000874' '\ Bombesin-like peptides comprise a large family of peptides which were initially isolated from amphibian skin, where they stimulate smooth muscle contraction. They were later found to be widely distributed in mammalian neural and endocrine cells. The amphibian peptides which belong to this family are currently classified into three subfamilies PUBMED:6141890, PUBMED:3868775: the Bombesin group, which includes bombesin and alytesin; the Ranatensin group, which includes ranatensins, litorin, and Rohdei litorin; and the Phyllolitorin group, which includes Leu(8)- and Phe(8)-phyllolitorins. In mammals and birds two categories of bombesin-like peptides are known PUBMED:1726343, PUBMED:2458345: gastrin-releasing peptide (GRP), which stimulates the release of gastrin as well as other gastrointestinal hormones, and neuromedin B (NMB), a neuropeptide whose function is not yet clear. Bombesin-like peptides, like many other active peptides, are synthesized as larger protein precursors that are enzymatically converted to their mature forms. The final peptides are eight to fourteen residues long.\ ' '1391' 'IPR003459' '\ The proteins in this entry are encoded by an open reading frame in plasmid-borne DNA repeats of Borrelia species.
This protein is known as ORF-A PUBMED:8636030. The function of this putative protein is unknown.\ ' '1392' 'IPR004874' '\ This is a group of Borrelia proteins that have not yet been characterised, but contain repeated regions.\ ' '1393' 'IPR000877' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    This family of eukaryotic proteinase inhibitors belongs to MEROPS inhibitor family I12, clan IF. They inhibit serine peptidases of the S1 family PUBMED:14705960.

    \ \

    The Bowman-Birk inhibitor family PUBMED:6996568 is one of the numerous families of serine proteinase inhibitors. They have a duplicated structure and generally possess two distinct inhibitory sites.\ These inhibitors are primarily found in plants and in particular in the seeds of legumes as well as in cereal grains. In cereals they exist in two forms, one of which is a duplication of the basic structure PUBMED:3667571. \ Proteins of the Bowman-Birk inhibitor family of serine proteinase inhibitors interact with the enzymes they inhibit via an exposed surface loop that adopts the canonical proteinase inhibitory conformation. The resulting non-covalent complex renders the proteinase inactive. This inhibition mechanism is common for the majority of serine proteinase inhibitor proteins and many analogous examples are known. A particular feature of the Bowman-Birk inhibitor protein, however, is that the interacting loop is a particularly well-defined disulphide-linked short beta-sheet region PUBMED:11375759, PUBMED:12325158, PUBMED:12643767.

    \ ' '1394' 'IPR001851' '\

    Bacterial binding protein-dependent transport systems PUBMED:3527048, PUBMED:2229036 are multicomponent systems typically composed of a periplasmic substrate-binding protein, one or two reciprocally homologous integral inner-membrane proteins and one or two peripheral membrane ATP-binding proteins that couple energy to the active transport system.

    \

    The integral inner-membrane proteins translocate the substrate across the membrane. It has been shown PUBMED:3000770, PUBMED:7934906 that most of these proteins contain a conserved region located about 80 to 100 residues from their C-terminal extremity. This region seems PUBMED:1738314 to be located in a cytoplasmic loop between two transmembrane domains. Apart from the conserved region, the sequences of these proteins are quite divergent; however, they can be classified into seven families, which have been respectively termed: araH, cysTW, fecCD, hisMQ, livHM, malFG and oppBC.

    \ ' '1395' 'IPR002093' '\

    The breast cancer type 2 susceptibility protein has a number of 39 amino acid repeats PUBMED:8673099 that are critical for binding to RAD51 (a key protein in DNA recombinational repair) and resistance to methyl methanesulphonate treatment PUBMED:9405383, PUBMED:9560268, PUBMED:9811893. BRCA2 is a breast tumour suppressor with a potential function in the cellular response to DNA damage. At the cellular level, expression\ is regulated in a cell-cycle dependent manner and peak expression of BRCA2 mRNA is found in S phase, suggesting BRCA2 may participate in regulating cell proliferation. There are eight repeats in BRCA2 designated as BRC1 to BRC8. BRC1, BRC2, BRC3, BRC4, BRC7, and BRC8 are highly conserved and bind to Rad51, whereas BRC5 and BRC6 are less well conserved and do not bind to Rad51 PUBMED:10551859. It has been suggested that BRCA2 plays a role in positioning Rad51 at the site of DNA repair or in removing Rad51 from DNA once repair has been completed.

    \ ' '1396' 'IPR003497' '\ This entry represents the N-terminus of the Baculoviridae BRO and ALI motif proteins. The function of BRO proteins is unknown. It\ has been suggested that BRO-A and BRO-C are DNA binding proteins that influence host DNA replication and/or transcription\ PUBMED:10888617. This Pfam domain does not include the characteristic invariant alanine, leucine, isoleucine motif of the ALI proteins PUBMED:9847359.\ ' '1397' 'IPR004275' '\

    In addition to the highly specific cell-mediated immune system, vertebrates possess an efficient host-defence mechanism against invading microorganisms which involves the synthesis of highly potent antimicrobial peptides with a large spectrum of activity. This entry represents a number of these defence peptides secreted from the skin of amphibians, including the opiate-like dermorphins and deltorphins, and the antimicrobial dermaseptins and temporins.

    \ ' '1398' 'IPR002009' '\ This family consists of Bromovirus coat proteins. RNA-protein interactions stabilise many viruses and also the nucleoprotein cores of enveloped animal viruses (e.g. retroviruses). The nucleoprotein particles are frequently pleomorphic and generally unstable due to the lack of strong protein-protein interactions in their capsids.\

    The structure is known for Cowpea chlorotic mottle virus (CCMV) PUBMED:7743132. It shows novel quaternary structure interactions based on interwoven carboxyterminal polypeptides that extend from canonical capsid beta-barrel subunits. Additional particle stability is provided by intercapsomere contacts between metal ion mediated carboxyl cages and by protein interactions with regions of ordered RNA.

    \ ' '1399' 'IPR002538' '\ Members of this family are found in the Bromoviridae, assembling into long tubular structures at the surface of the infected protoplast. These proteins aid the infection of the virus PUBMED:9267012, PUBMED:9514964.\ ' '1400' 'IPR007541' '\ These basic secretory proteins (BSPs) are believed to be part of the plant defence mechanism against pathogens PUBMED:10202814.\ ' '1402' 'IPR001562' '\

    The Btk-type zinc finger or Btk motif (BM) is a conserved zinc-binding motif containing conserved cysteines and a histidine that is present in certain eukaryotic signalling proteins. The motif is named after Bruton\'s tyrosine kinase (Btk), an enzyme which is essential for B cell maturation in humans and mice PUBMED:8070576, PUBMED:15661031. Btk is a member of the Tec family of protein tyrosine kinases (PTK). These kinases contain a conserved Tec homology (TH) domain between the N-terminal pleckstrin homology (PH) domain and the Src homology 3 (SH3) domain. The N-terminal of the TH domain is highly conserved and known as the Btk motif, while the C-terminal region of the TH domain contains a proline-rich region (PRR). The Btk motif contains a conserved His and three Cys residues that form a zinc finger (although these differ from known zinc finger topologies), while PRRs are commonly involved in protein-protein interactions, including interactions with G proteins PUBMED:9280283, PUBMED:9796816. The TH domain may be of functional importance in various signalling pathways in different species PUBMED:8070576. A complete TH domain, containing both the Btk and PRR regions, has not been found outside the Tec family; however, the Btk motif on its own does occur in other proteins, usually C-terminal to a PH domain (note that although a Btk motif always occurs C-terminal to a PH domain, not all PH domains are followed by a Btk motif).

    \

    The crystal structures of Btk show that the Btk-type zinc finger has a globular core, formed by a long loop which is held together by a zinc ion, and that the Btk motif is packed against the PH domain PUBMED:8070576. The zinc-binding residues are a histidine and three cysteines, which are fully conserved in the Btk motif PUBMED:9218782.

    \

    Proteins known to contain a Btk-type zinc finger include:

    \ \

    \ ' '1403' 'IPR007602' '\ This family includes NS2 proteins from other members of the Orbivirus genus. NS2 is a non-specific single-stranded RNA-binding protein that forms large homomultimers and accumulates in viral inclusion bodies of infected cells. Three RNA-binding regions have been identified in Bluetongue virus 17 at residues 2-11, 153-166 and 274-286 PUBMED:11752140. NS2 multimers also possess nucleotidyl phosphatase activity PUBMED:11162836. The precise function of NS2 is not known, but it may be involved in the transport and condensation of viral mRNAs PUBMED:11752140.\ ' '1404' 'IPR007520' '\

    This domain is the C terminus of Saccharomyces cerevisiae (Baker\'s yeast) Bul1. Bul1 binds the ubiquitin ligase Rsp5, via an N-terminal PPSY motif (157-160 in ) PUBMED:9931424. The complex containing Bul1 and Rsp5 is involved in intracellular trafficking of the general amino acid permease Gap1 PUBMED:11500494, degradation of Rog1 in cooperation with Bul2 and GSK-3 PUBMED:10958669, and mitochondrial inheritance PUBMED:10366593. Bul1 may contain HEAT repeats. The N terminus is .

    \ ' '1405' 'IPR007519' '\

    This domain is the N terminus of Saccharomyces cerevisiae (Baker\'s yeast) Bul1. Bul1 binds the ubiquitin ligase Rsp5, via an N-terminal PPSY motif (157-160 in ) PUBMED:9931424. The complex containing Bul1 and Rsp5 is involved in intracellular trafficking of the general amino acid permease Gap1 PUBMED:11500494, degradation of Rog1 in cooperation with Bul2 and GSK-3 PUBMED:10958669, and mitochondrial inheritance PUBMED:10366593. Bul1 may contain HEAT repeats. The C terminus is .

    \ ' '1406' 'IPR005167' '\

    Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This entry represents the polyprotein region forming the G1 glycoprotein, which is the viral attachment protein PUBMED:8553534. It interacts with the G2 polyprotein.

    \ ' '1407' 'IPR005168' '\

    Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This entry represents the polyprotein region forming the G2 glycoprotein, which interacts with the G1 glycoprotein PUBMED:7645217.

    \ ' '1408' 'IPR000797' '\

    The NSS proteins are encoded in the S RNA from ssRNA negative-strand viruses PUBMED:8760423. The S RNA also codes for the nucleoprotein N. The two main products are read from overlapping reading frames in the viral complementary sequence.

    \ ' '1409' 'IPR004915' '\

    Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein (NSs). The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase.

    \ \

    This entry represents the segment S non-structural protein, NSs. This protein is present in infected cells, but absent from purified virus particles PUBMED:1826573, PUBMED:8445364. It is not essential for virus replication, but contributes substantially to pathogenesis PUBMED:11209062. This may be due to the capacity of NSs to interfere with the host immune response; it has been shown to inhibit the production of alpha/beta interferons, and to inhibit interferon regulatory factor-mediated cell death PUBMED:12133999, PUBMED:12829839. Studies indicate that NSs suppresses host mRNA synthesis by sequestering subunits of the basal transcription factor TFIIH, explaining the dramatic drop in RNA synthesis observed upon infection PUBMED:14980221.

    \ \ ' '1411' 'IPR001784' '\ Orthobunyaviruses are enveloped viruses with a genome consisting of 3 ssRNA segments (called L, M and S). The nucleocapsid protein is encoded on the small (S) genomic RNA. The N protein is the major component of the nucleocapsids. This protein is thought to interact with the L protein, virus RNA and/or other N proteins PUBMED:7897347.\ ' '1412' 'IPR007322' '\ The bunyaviruses are enveloped viruses with a genome consisting of 3 ssRNA segments (called L, M and S). The nucleocapsid protein is encoded by the small (S) genomic RNA. The L segment codes for an RNA polymerase. This family contains the RNA-dependent RNA polymerase encoded on the L segment.\ ' '1413' 'IPR004873' '\

    The BURP domain is a ~230-residue module, which has been named for the four members of the group initially identified, BNM2, USP, RD22, and PG1beta. It is found in the C-terminal part of a number of plant cell wall proteins, which are defined not only by the BURP domain, but also by the overall similarity in their modular construction. The BURP domain proteins consist of either three or four modules: (i) an N-terminal hydrophobic domain - a presumptive transit peptide, joined to (ii) a short conserved segment or other short segment, (iii) an optional segment consisting of repeated units which is unique to each member, and (iv) the C-terminal BURP domain. Although the BURP domain proteins share primary structural features, their expression patterns and the conditions under which they are expressed differ. The presence of the conserved BURP domain in diverse plant proteins suggests an important and fundamental functional role for this domain PUBMED:9790599. It is possible that the BURP domain represents a general motif for localization of proteins within the cell wall matrix. The other structural domains associated with the BURP domain may specify other target sites for intermolecular interactions PUBMED:12172833.

    \ \

    Some proteins known to contain a BURP domain are listed below PUBMED:9790599, PUBMED:12172833, PUBMED:14612572:\

    \

    \ ' '1414' 'IPR004619' '\

    This is a family of proteins found in a single copy in at least ten different early completed bacterial genomes. The only characterised member of the family is Bvg accessory factor (Baf), a protein required, in addition to the regulatory operon bvgAS, for heterologous transcription of the Bordetella pertussis toxin operon (ptx) in Escherichia coli PUBMED:11094274. Pertussis toxin is an important virulence factor of B. pertussis, the causative agent of pertussis or whooping cough. The BvgAS two-component system controls the expression of pertussis toxin and a number of other B. pertussis virulence factors. Baf acts with BvgAS to further activate ptx transcription in E. coli grown in minimal medium without affecting the growth rate, and functional Baf appears to be required for viability of B. pertussis.

    \ ' '1415' 'IPR006771' '\

    The pathogenic dimorphic fungal organism Blastomyces dermatitidis exists as a budding yeast at 37 degrees C and as a mycelium at 25\ degrees C. Bys1 is expressed specifically in the high temperature, unicellular yeast morphology and codes for a protein of 18.6 kDa that contains multiple\ putative phosphorylation sites, a hydrophobic N terminus, and two 34-amino-acid domains with similarly spaced nine-amino-acid\ degenerative repeating motifs PUBMED:11811639. The molecular function of this protein is not known.

    \ ' '1416' 'IPR011616' '\

    The basic-leucine zipper (bZIP) transcription factors PUBMED:7780801, PUBMED: of eukaryotes are proteins that contain a basic region mediating sequence-specific DNA-binding followed by a leucine zipper region (see ) required for dimerization.

    \ ' '1417' 'IPR001073' '\ \ C1q is a subunit of the C1 enzyme complex that activates the serum complement\ system. C1q comprises 6 A, 6 B and 6 C chains. These share the same topology, each\ possessing a small, globular N-terminal domain, a collagen-like Gly/Pro-rich central\ region, and a conserved C-terminal region, the C1q domain PUBMED:1706597. The C1q\ protein is produced in collagen-producing cells and shows sequence and structural\ similarity to collagens VIII and X PUBMED:2591537, PUBMED:2019595.\ \ ' '1418' 'IPR001442' '\

    This duplicated domain is present at the C terminus of type IV collagen, the major structural component of glomerular basement membranes (GBM), forming a \'chicken-wire\' meshwork together with laminins, proteoglycans and entactin/nidogen. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome.

    \ ' '1419' 'IPR004695' '\

    Two members of the Tellurite-Resistance/Dicarboxylate Transporter (TDT) family have been functionally characterised. One is the TehA protein of Escherichia coli, which has been implicated in resistance to tellurite; the other is the Mae1 protein of Schizosaccharomyces pombe, which functions in the uptake of malate and other dicarboxylates by a proton symport mechanism. These proteins exhibit 10 putative transmembrane alpha-helical spanners (TMSs).

    \ \ ' '1420' 'IPR002601' '\

    This domain of unknown function is found at the C-terminus in a number of Caenorhabditis elegans proteins. It may be an extracellular domain. Most copies of the C6 domain contain six conserved cysteine residues. However, some copies of the domain are missing cysteine residues 1 and 3, suggesting that these form a disulphide bridge. In there are 18 copies of the domain.

    \ ' '1421' 'IPR004676' '\ These proteins are members of the Cadmium Resistance (CadD) Family. To date, this family of proteins has only been found in Gram-positive bacteria. The CadD family includes two close orthologues in two Staphylococcus species that have been reported to function in cadmium resistance, and another staphylococcal protein that has been reported to possibly function in quaternary ammonium ion export.\ ' '1422' 'IPR002126' '\

    Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and modulate a wide variety of processes, including cell polarisation and migration PUBMED:2197976, PUBMED:,PUBMED:14570569. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins.

    \

    Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; five tandemly repeated extracellular cadherin domains, four of which are cadherin repeats while the fifth contains four conserved cysteines; a single transmembrane domain; and a C-terminal cytoplasmic domain PUBMED:11736639. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and calcium is essential for cadherin adhesive function and for protection against protease digestion.

    \ ' '1423' 'IPR000233' '\ Cadherins are transmembrane glycoproteins vital in calcium-dependent cell-cell adhesion during tissue differentiation PUBMED:3061804. Cadherins cluster to form foci of homophilic binding units. A key determinant of the strength of the binding mediated by cadherins is the juxtamembrane region of the cadherin. This region induces clustering and also binds to the protein p120ctn PUBMED:9566976. The cytoplasmic region is highly conserved in sequence and has been shown experimentally to regulate the cell-cell binding function of the extracellular domain of E-cadherin, possibly through interaction with the cytoskeleton PUBMED:3061804. This domain is found upstream of the cadherin domain.\ ' '1424' 'IPR005169' '\

    Helicobacter pylori causes one of the most common infections worldwide and plays an important role in the pathogenesis of peptic ulcers. The CagA (cytotoxin-associated gene A) protein is a cell-surface antigen which may play a role in determining the relative virulence of the bacterial strains.

    \ ' '1425' 'IPR010258' '\

    Several bacterial pathogens utilise conjugation machines to export effector molecules during infection. Such systems are members of the type IV or \'adapted conjugation\' secretion family. The prototypical type IV system is the Agrobacterium tumefaciens T-DNA transfer machine, which delivers oncogenic nucleoprotein particles to plant cells. Other pathogens, including Bordetella pertussis, Legionella pneumophila, Brucella spp. and Helicobacter pylori (Campylobacter pylori), use type IV machines to export effector proteins to the extracellular milieu or the mammalian cell cytosol.

    \

    Conjugation machines of Gram-negative bacteria consist of two surface structures, the mating channel through which the DNA transfer intermediate and proteins are translocated and the conjugal pilus for contacting recipient cells. Various conjugative pili have been visualised, but to date there is no ultrastructural information about the mating channel. Recent work on the A. tumefaciens T-DNA transfer system has focused on identifying interactions among the VirB protein subunits and defining steps in the transporter assembly pathway. There are three functional groups of VirB proteins: proteins localised exocellularly forming the T-pilus or other adhesive structures; mating-channel components; and cytoplasmic membrane ATPases. Although all of these proteins probably assemble as a supramolecular complex, as yet there is no direct evidence for a physical association between the conjugative pilus and the mating channel.

    \

    Several lines of evidence suggest that VirB6-VirB10 are probable channel subunits. VirB6, a highly hydrophobic protein, is thought to span the cytoplasmic membrane several times and presently is the best candidate for a channel-forming protein. VirB7, an outer membrane lipoprotein, interacts with itself and with VirB9 via disulphide bonds between unique reactive cysteines present in each protein. The VirB7-VirB9 heterodimer localises at the outer membrane and plays a critical role in stabilising other VirB proteins during assembly of the transfer machine. VirB9 is also required for formation of chemically crosslinked VirB10 oligomers probably corresponding to homotrimers PUBMED:10920394.

    \ ' '1426' 'IPR003930' '\

    Potassium channels are the most diverse group of the ion channel family\ PUBMED:1772658, PUBMED:1879548. They are important in shaping the action potential, and in neuronal excitability and plasticity PUBMED:2451788. The potassium channel family is\ composed of several functionally distinct isoforms, which can be broadly\ separated into 2 groups PUBMED:2555158: the practically non-inactivating \'delayed\' group and the rapidly inactivating \'transient\' group.

    \

    These are all highly similar proteins, with only small amino acid\ changes causing the diversity of the voltage-dependent gating mechanism,\ channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or\ other second messengers PUBMED:2448635. In eukaryotic cells, K+ channels\ are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes PUBMED:1373731. In prokaryotic cells, they play a role in the\ maintenance of ionic homeostasis PUBMED:11178249.

    \

    All K+ channels discovered so far possess a core of \ alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has\ been termed the K+ selectivity sequence.\ In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane.\ However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains.\ The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK)\ PUBMED:11178249, PUBMED:. The 2TM domain family comprises inward-rectifying K+ \ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.
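    The selectivity sequence quoted above can be read as a simple pattern: T or S, two arbitrary residues, T, one arbitrary residue, G, one arbitrary residue, G. The following is a minimal Python sketch, assuming that reading, which scans a sequence for the pattern and reports candidate P-domain positions. The function name and the toy sequences are hypothetical illustrations; real signatures rely on curated models rather than a bare regular expression.

```python
import re

# T/SxxTxGxG written as a regular expression; "." stands for any residue (x).
SELECTIVITY_PATTERN = re.compile("[TS]..T.G.G")


def find_selectivity_motifs(sequence):
    """Return 0-based start positions of non-overlapping T/SxxTxGxG matches."""
    return [m.start() for m in SELECTIVITY_PATTERN.finditer(sequence.upper())]


# Toy sequences (hypothetical, for illustration only).
one_p_domain = "AAAATMTTVGYGDAAAA"
two_p_domains = "AATMTTVGYGAAAASATTIGYGAA"

print(find_selectivity_motifs(one_p_domain))    # [4]
print(find_selectivity_motifs(two_p_domains))   # [2, 14]
```

    A subunit with one match would correspond to the one-P-domain families described above, and a subunit with two matches to the two-P-domain leak channels, subject to the usual false positives of short motifs.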

    \

    Ca2+-activated K+ channels are a diverse group of channels that are activated by an increase in intracellular Ca2+ concentration. They are found in the majority of nerve cells, where they modulate cell excitability and action potential. Three types of Ca2+-activated K+ channel have been characterised, termed small-conductance (SK), intermediate conductance (IK) and large conductance (BK) respectively PUBMED:9687354.

    \

    BK channels (also referred to as maxi-K channels) are widely expressed in the body, being found in glandular tissue, smooth and skeletal muscle, as well as in neural tissues. They have been demonstrated to regulate arteriolar and airway diameter, and also neurotransmitter release. Each channel complex is thought to be composed of 2 types of subunit: the pore-forming (alpha) subunits and smaller accessory (beta) subunits.

    \ \ \ \

    The beta subunit (which is thought to possess 2 TM domains) increases the\ Ca2+ sensitivity of the BK channel PUBMED:7695911. It does this by enhancing the time\ spent by the channel in burst-like open states. However, it has little \ effect on the durations of closed intervals between bursts, or on the\ numbers of open and closed states entered during gating PUBMED:10051518.\

    \ ' '1427' 'IPR001693' '\

    Calcitonin PUBMED:3060108 is a 32 amino acid polypeptide hormone that causes a rapid but short-lived drop in the level of calcium and phosphate in the blood, by promoting the incorporation of these ions in the bones, alpha type. Alternative splicing of the gene coding for calcitonin produces a distantly related peptide of 37 amino acids, called calcitonin gene-related peptide (CGRP), beta type. CGRP induces vasodilatation in a variety of vessels, including the coronary, cerebral and systemic vasculature. Its abundance in the CNS also points toward a neurotransmitter or neuromodulator role.

    \

    Islet amyloid polypeptide (IAPP) PUBMED:2407732 (also known as diabetes-associated peptide (DAP), or amylin) is a peptide of 37 amino acids that selectively inhibits insulin-stimulated glucose utilisation and glycogen deposition in muscle, while not affecting adipocyte glucose metabolism. Structurally, IAPP is closely related to CGRP.

    \

    Two conserved cysteines in the N-terminal of these peptides are known to be involved in a disulphide bond. The C-terminal residue of all three peptides is amidated.

    \
    \
                    xCxxxxxCxxxxxxxxxxxxxxxxxxxxxxxxxxxx-NH(2)\
                     |     |                             Amide group\
                     +-----+\
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    
    \ ' '1428' 'IPR006931' '\

    Calcipressin 1 negatively regulates calcineurin by direct binding and is essential for the survival of T helper type 1 cells. Calcipressin 1 is a phosphoprotein that increases its capacity to inhibit calcineurin when phosphorylated at the FLISPP motif, and this phosphorylation also controls the half-life of calcipressin 1 by accelerating its degradation PUBMED:12809556.

    \ \

    Calcineurin is a calcium-responsive enzyme that dephosphorylates the nuclear factor of activated T cells (NFAT). In so doing it promotes the nuclear translocation of NFAT and uniquely links calcium signalling to transcriptional regulation PUBMED:12942079. Calcipressins are a family of proteins derived from three genes. Calcipressin 1 is also known as modulatory calcineurin-interacting protein 1 (MCIP1), Adapt78 and Down syndrome critical region 1 (DSCR1). Calcipressin 2 is variously known as MCIP2, ZAKI-4 and DSCR1-like 1. Calcipressin 3 is also called MCIP3 and DSCR1-like 2 PUBMED:12942079. DSCR1 (Adapt78) is associated with successful adaptation to oxidative stress and calcium stress as well as with diseases like Alzheimer\'s and Down syndrome.

    \ \

    The DSCR1 (Adapt78) isoform 1 protein, calcipressin 1, inhibits calcineurin and protects against acute calcium-mediated stress damage, including transient oxidative stress PUBMED:12039863. Calcipressin 1 is encoded by DSCR1, a gene on human chromosome 21. Calcipressin 1 isoform 1 has an N-terminal coding region, which generates a new polypeptide of 252 amino acids. Endogenous calcipressin 1 exists as a complex together with the calcineurin A and B heterodimer PUBMED:12809556.\

    \ ' '1429' 'IPR006018' '\

    This group of proteins includes two protein families: caldesmon and lymphocyte specific protein.

    \

    Caldesmon (CDM) is an actin- and myosin-binding protein implicated in the regulation of actomyosin interactions in smooth muscle and non-muscle cells, possibly acting as a bridge between myosin and actin filaments PUBMED:1555769. CDM is believed to be an elongated molecule, with an N-terminal myosin/calmodulin-binding domain and a C-terminal tropomyosin/actin/calmodulin-binding domain, separated by a 40 nm-long central helix PUBMED:1555769.

    \

    A high-molecular-weight form of CDM is predominantly expressed in smooth muscles, while a low-molecular-weight form is widely distributed in non-muscle tissues and cells (the protein is not expressed in skeletal muscle or heart).

    \ ' '1430' 'IPR007736' '\ This family contains plant proteins related to caleosin. Caleosins contain calcium-binding domains and have an oleosin-like association with lipid bodies. Caleosins are present at relatively low levels and are mainly bound to microsomal membrane fractions at the early stages of seed development. As the seeds mature, overall levels of caleosins increase dramatically and they become associated almost exclusively with storage lipid bodies PUBMED:11171180. The calcium-binding domain is probably related to the calcium-binding EF-hand motif.\ ' '1431' 'IPR004005' '\

    Bovine calicivirus is a positive-stranded ssRNA virus that causes gastroenteritis PUBMED:1840711. The calicivirus genome contains two open reading frames, ORF1 and ORF2 PUBMED:8892921, PUBMED:8642693. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses PUBMED:8892921, PUBMED:1551442. ORF2 encodes a structural protein PUBMED:8892921. This signature finds ORF2, the structural coat protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs.

    \ \

    Rabbit hemorrhagic disease virus (RHDV), which causes a highly contagious disease of wild and domestic rabbits, belongs to the family Caliciviridae PUBMED:16733562. The capsid protein self-assembles to form an icosahedral capsid with T=3 symmetry. It is about 38 nm in diameter and consists of 180 capsid proteins. The capsid encapsulates the genomic RNA and VP2 proteins and attaches the virion to target cells by binding histo-blood group antigens present on gastroduodenal epithelial cells. The Shell domain (S domain) contains elements essential for the formation of the icosahedron. The Protruding domain (P domain) is divided into sub-domains P1 and P2. A hypervariable region in P2 is thought to play an important role in receptor binding and immune reactivity.
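    The figure of 180 capsid proteins follows from the triangulation number. Under the standard Caspar-Klug description of icosahedral capsids, a shell with triangulation number T contains 60T subunits, where T = h^2 + hk + k^2 for non-negative integers h and k. The following is a minimal Python sketch of that arithmetic; the function names are hypothetical and chosen only for illustration.

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def capsid_subunit_count(t_number):
    """Number of protein subunits in an icosahedral shell with the given T."""
    return 60 * t_number


t = triangulation_number(1, 1)      # T = 3, as stated for the capsid above
print(t, capsid_subunit_count(t))   # 3 180
```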

    \ \

    This entry includes the Calicivirus coat protein as well as the non-structural polyprotein.

    \ ' '1432' 'IPR001300' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium PUBMED:2539381. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals PUBMED:7845226, PUBMED:2555341: a highly calcium-sensitive (i.e., micromolar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the millimolar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only PUBMED:2555341.

    \ \

    All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases PUBMED:7845226, PUBMED:2539381. The crystallographic structure of m-calpain reveals six "domains" in the 80-kDa subunit:

    \ \
      \
    1. A 19-amino acid NH2-terminal sequence;
    2. Active site domain IIa;
    3. Active site domain IIb. Domain II shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related PUBMED:7845226;
    4. Domain III;
    5. An 18-amino acid extended sequence linking domain III to domain IV;
    6. Domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity PUBMED:7845226. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft allowing for the proper formation of the catalytic triad PUBMED:11914728.
    \ \ \

    Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in the cells of these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma PUBMED:12843408.

    \ ' '1433' 'IPR001259' '\

    Calpain inhibitor (calpastatin) is restricted to the metazoa and specifically inhibits calpain (calcium-dependent cysteine protease). Calpastatin belongs to MEROPS inhibitor family I27, clan II. It plays a key role in post-mortem tenderisation of meat and may be involved in muscle\ protein degradation in living tissue.

    \ \ \

    The calpain system originally comprised three molecules: two Ca2+-dependent proteases, mu-calpain and m-calpain, and a third polypeptide, calpastatin, whose only known function is to inhibit the two calpains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases. The single calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent.

    \ \

    How calpain activity is regulated in cells is still unclear, but the calpains\ ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma PUBMED:12843408.

    \ \ ' '1434' 'IPR000557' '\ Calponin PUBMED:8130072, PUBMED:8144658 is a thin filament-associated protein that is implicated in the regulation and modulation of smooth muscle contraction. It is capable of binding to actin, calmodulin, troponin C and tropomyosin. The interaction of calponin with actin inhibits the actomyosin MgATPase activity. Calponin is a basic protein of approximately 34 kDa. Multiple isoforms are found in smooth muscles. Calponin contains three repeats of a well conserved 26 amino acid domain. Such a domain is also found in vertebrate smooth muscle protein (SM22 or transgelin), and a number of other proteins whose physiological role is not yet established, including Drosophila synchronous flight muscle protein SM20, Caenorhabditis elegans unc-87 protein PUBMED:7929573, rat neuronal protein NP25 PUBMED:8015377, and an Onchocerca volvulus antigen PUBMED:7935620.\ ' '1435' 'IPR001580' '\

    Synonym(s): Calregulin, CRP55, HACBP

    \

    Calreticulin PUBMED:1497605 is a high-capacity calcium-binding protein which is present in most tissues and located at the periphery of the endoplasmic reticulum (ER) and sarcoplasmic reticulum (SR) membranes. It probably plays a role in the storage of calcium in the lumen of the ER and SR and it may well have other important functions.

    \

    Structurally, calreticulin is a protein of about 400 amino acid residues consisting of three domains:\

    \

    Calreticulin is evolutionarily related to several other calcium-binding proteins, including Onchocerca volvulus antigen RAL-1, calnexin PUBMED:8203019 and calmegin PUBMED:8126001.

    \ ' '1436' 'IPR001393' '\ Calsequestrin is the principal calcium-binding protein present in the\ sarcoplasmic reticulum of cardiac and skeletal muscle PUBMED:3379055. It is a highly \ acidic protein that is able to bind over 40 calcium ions and acts as an internal\ calcium store in muscle. Sequence analysis has suggested that calcium is\ not bound in distinct pockets via EF-hand motifs, but rather via \ presentation of a charged protein surface.

    Two forms of calsequestrin have been identified. The cardiac form is present in cardiac and slow skeletal muscle and the fast skeletal form is found in fast skeletal muscle. The release of calsequestrin-bound calcium (through a calcium release channel) triggers muscle contraction. The active protein is not highly structured, more than 50% of it adopting a random coil conformation PUBMED:3427023. When calcium binds there is a structural change whereby the alpha-helical content of the protein increases from 3 to 11% PUBMED:3427023. Both forms of calsequestrin are phosphorylated by casein kinase II, but the cardiac form is phosphorylated more rapidly and to a higher degree PUBMED:1985907.

    \ ' '1437' 'IPR003644' '\

    The calx-beta motif is present as a tandem repeat in the cytoplasmic domains of Calx Na-Ca exchangers, which are used to expel calcium from cells. This motif overlaps domains used for calcium binding and regulation. The calx-beta motif is also present in the cytoplasmic tail of mammalian integrin-beta4, which mediates the bi-directional transfer of signals across the plasma membrane, as well as in some cyanobacterial proteins. This motif contains a series of beta-strands and turns that form a self-contained beta-sheet PUBMED:9294196, PUBMED:10390612.

    \ ' '1438' 'IPR013992' '\

    Cyclase-associated proteins (CAPs) are highly conserved actin-binding proteins present in a wide range of organisms including yeast, fly, plants, and mammals. CAPs are multifunctional proteins that contain several structural domains. CAP is involved in species-specific signalling pathways PUBMED:11919151, PUBMED:17635992, PUBMED:10658207, PUBMED:12351838. In Drosophila, CAP functions in Hedgehog-mediated eye development and in establishing oocyte polarity. In Dictyostelium (slime mold), CAP is involved in microfilament reorganisation near the plasma membrane in a PIP2-regulated manner and is required to perpetuate the cAMP relay signal to organise fruiting body formation. In plants, CAP is involved in plant signalling pathways required for co-ordinated organ expansion. In yeast, CAP is involved in adenylate cyclase activation, as well as in vesicle trafficking and endocytosis. In both yeast and mammals, CAPs appear to be involved in recycling G-actin monomers from ADF/cofilins for subsequent rounds of filament assembly PUBMED:17376963, PUBMED:15004221. In mammals, there are two different CAPs (CAP1 and CAP2) that share 64% amino acid identity.

    \

    All CAPs appear to contain a C-terminal actin-binding domain that regulates actin remodelling in response to cellular signals and is required for normal cellular morphology, cell division, growth and locomotion in eukaryotes. CAP directly regulates actin filament dynamics and has been implicated in a number of complex developmental and morphological processes, including mRNA localisation and the establishment of cell polarity. Actin exists both as globular (G) (monomeric) actin subunits and assembled into filamentous (F) actin. In cells, actin cycles between these two forms. Proteins that bind F-actin often regulate F-actin assembly and its interaction with other proteins, while proteins that interact with G-actin often control the availability of unpolymerised actin. CAPs bind G-actin.

    \

    In addition to actin-binding, CAPs can have additional roles, and may act as bifunctional proteins. In Saccharomyces cerevisiae (Baker\'s yeast), CAP is a component of the adenylyl cyclase complex (Cyr1p) that serves as an effector of Ras during normal cell signalling. S. cerevisiae CAP functions to expose adenylate cyclase binding sites to Ras, thereby enabling adenylate cyclase to be activated by Ras regulatory signals. In Schizosaccharomyces pombe (Fission yeast), CAP is also required for adenylate cyclase activity, but not through the Ras pathway. In both organisms, the N-terminal domain is responsible for adenylate cyclase activation, but the S. cerevisiae and S. pombe N-termini cannot complement one another. Yeast CAPs are unique among the CAP family of proteins, because they are the only ones to directly interact with and activate adenylate cyclase PUBMED:10594005. S. cerevisiae CAP has four major domains. In addition to the N-terminal adenylate cyclase-interacting domain, and the C-terminal actin-binding domain, it possesses two other domains: a proline-rich domain that interacts with Src homology 3 (SH3) domains of specific proteins, and a domain that is responsible for CAP oligomerisation to form multimeric complexes (although oligomerisation appears to involve the N- and C-terminal domains as well). The proline-rich domain interacts with profilin, a protein that catalyses nucleotide exchange on G-actin monomers and promotes addition to barbed ends of filamentous F-actin PUBMED:17376963. Since CAP can bind profilin via a proline-rich domain, and G-actin via a C-terminal domain, it has been suggested that a ternary G-actin/CAP/profilin complex could be formed.

    \ \

    This entry represents the N-terminal domain of CAP proteins. This domain has an all-alpha structure consisting of six helices in a bundle with a left-handed twist and an up-and-down topology PUBMED:12962635.

    \ ' '1439' 'IPR007542' '\

    The entry includes major capsid proteins (vp54 and vp72) found in Iridoviruses, Phycodnaviruses, Asfarviruses and Ascoviruses, which are all type II dsDNA viruses with no RNA stage. This is the most abundant structural protein and can account for up to 45% of virion protein PUBMED:10082389. The structure of vp54 has been determined from Paramecium bursaria Chlorella virus 1 (PBCV-1), a very large icosahedral virus containing an internal membrane enclosed within a glycoprotein coat. The vp54 protein is a duplication consisting of two domains with a similar fold packed together like the nucleoplasmin subunits. The vp54 protein forms a trimer, where the domains are arranged around a pseudo 6-fold axis. The domains have a beta-sandwich structure consisting of 8 strands in two sheets with a jelly-roll topology PUBMED:12411581.

    \ ' '1440' 'IPR007833' '\

    This family includes export proteins involved in capsule polysaccharide biosynthesis, such as KpsS and LipB. Capsule polysaccharide modification protein lipB/A is involved in the phospholipid modification of the capsular polysaccharide and is a strong requirement for its translocation to the cell surface. The capsules of Neisseria meningitidis serogroup B, of other meningococcal serogroups and of other Gram-negative bacterial pathogens are anchored in the outer membrane through a 1,2-diacylglycerol moiety. The lipA and lipB genes are located on the 3\' end of the ctr operon. lipA and lipB do not encode proteins responsible for diacylglycerophosphatidic acid substitution of the meningococcal capsule polymer, but they are required for proper translocation and surface expression of the lipidated polymer PUBMED:15731047.

    \ \

    KpsS is an unusual sulphate-modified form of the capsular polysaccharide in Rhizobium loti (Mesorhizobium loti). Many plants, including the Lotus hosts of R. loti, enter into symbiotic relationships with bacteria that allow survival in nutrient-limiting environments. KpsS functions as a fucosyl sulphotransferase in vitro. The kpsS gene product shares no significant amino acid similarity with previously identified sulphotransferases PUBMED:18430142. Sulphated cell surface polysaccharides are required for optimum nodule formation but limit growth rate and nodule colonisation in M. loti PUBMED:17028279.

    \ ' '1441' 'IPR001148' '\

    Carbonic anhydrases (CA) are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate PUBMED:18336305, PUBMED:10978542. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site PUBMED:9336012. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion.

    \ \

    This entry represents alpha class carbonic anhydrases.

    \

    More information about these proteins can be found at Protein of the Month: Carbonic Anhydrase.

    \ ' '1442' 'IPR004231' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain, either by interaction with a second peptidase or by autocatalytic cleavage, activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    This family is represented by the well-characterised metallocarboxypeptidase A inhibitor (MCPI) from potatoes, which belongs to the MEROPS inhibitor family I37, clan IE. It inhibits metallopeptidases belonging to MEROPS peptidase family M14, carboxypeptidase A. In Russet Burbank potatoes, it is a mixture of approximately equal amounts of two polypeptide chains containing 38 or 39 amino acid residues. The chains differ in their amino terminal sequence only PUBMED:1122280 and are resistant to fragmentation by proteases PUBMED:444453. The structure of the complex between bovine carboxypeptidase A and the 39-amino-acid carboxypeptidase A inhibitor from potatoes has been determined at 2.5 Å resolution PUBMED:6933511.

    \ \

    The potato inhibitor is synthesised as a precursor, having a 29 residue N-terminal signal peptide, a 27 residue pro-peptide, the 39 residue mature inhibitor region and a 7 residue C-terminal extension. The 7 residue C-terminal extension is involved in inhibitor inactivation and may be required for targeting to the vacuole where the mature active inhibitor accumulates PUBMED:9862450.

    \ \

    The N-terminal region and the mature inhibitor are weakly related to other solanaceous proteins found in this entry, from potato, tomato and henbane, which have been incorrectly described as metallocarboxypeptidase inhibitors PUBMED:11488477.

    \ \ ' '1443' 'IPR001315' '\

    The caspase recruitment domain (CARD) is a homotypic protein interaction module composed of a bundle of six alpha-helices. CARD is related in sequence and structure to the death domain (DD) and the death effector domain (DED), which work in similar pathways and show similar interaction properties PUBMED:11504623. The CARD domain typically associates with other CARD-containing proteins, forming either dimers or trimers. CARD domains can be found in isolation, or in combination with other domains. Domains associated with CARD include: NACHT (in Nal1 and Bir1), NB-ARC (in Apaf-1), pyrin/dapin domains (in Nal1), leucine-rich repeats (in Nal1), WD repeats (in Apaf-1), Src homology domains, PDZ, RING, kinase and DD domains PUBMED:15226512.

    \

    CARD-containing proteins are involved in apoptosis through their regulation of caspases that contain CARDs in their N-terminal pro-domains, including human caspases 1, 2, 9, 11 and 12 PUBMED:9175472. CARD-containing proteins are also involved in inflammation through their regulation of NF-kappaB PUBMED:12101092. The mechanisms by which CARDs activate caspases and NF-kappaB involve the assembly of multi-protein complexes, which can facilitate dimerisation or serve as scaffolds on which proteases and kinases are assembled and activated.

    \ \ \ \ ' '1444' 'IPR002568' '\

    This family of carlavirus nucleic acid binding proteins includes a motif for a potential C-4 type zinc finger, which has four highly conserved cysteine residues and is a conserved feature of the carlavirus 3\' terminal ORF PUBMED:2265707. These proteins may function as viral transcriptional regulators. The carlavirus family includes Garlic latent virus, Potato virus S and Potato virus M; these viruses are positive-strand ssRNA viruses with no DNA stage.

    \ ' '1446' 'IPR000117' '\ Kappa-casein is a mammalian milk protein involved in a\ number of important physiological processes PUBMED:9409842. In the gut,\ the ingested protein is split into an insoluble peptide\ (para kappa-casein) and a soluble hydrophilic glycopeptide\ (caseinomacropeptide). Caseinomacropeptide is responsible\ for increased efficiency of digestion, prevention of neonate\ hypersensitivity to ingested proteins, and inhibition of\ gastric pathogens.\ ' '1447' 'IPR001588' '\

    Caseins PUBMED:3074304 are the major protein constituent of milk. Caseins can be classified into two families: the first consists of the kappa-caseins, and the second groups the alpha-s1, alpha-s2, and beta-caseins. The alpha/beta caseins are a rapidly diverging family of proteins. However, two regions are conserved: a cluster of phosphorylated serine residues and the signal sequence.

    \

    Alpha-s2 casein is known as epsilon-casein in mouse, gamma-casein in rat and casein-A in guinea pig. Alpha-s1 casein is known as alpha-casein in rat and rabbit and as casein-B in guinea pig.

    \ ' '1448' 'IPR001707' '\

    Chloramphenicol acetyltransferase (CAT) PUBMED:1867713 catalyzes the acetyl-CoA dependent acetylation of chloramphenicol (Cm), an antibiotic which inhibits prokaryotic peptidyltransferase activity. Acetylation of Cm by CAT inactivates the antibiotic. A histidine residue, located in the C-terminal section of the enzyme, plays a central role in its catalytic mechanism.

    \

    There is a second family of CATs PUBMED:1314803, evolutionarily unrelated to the main family described above. These CATs belong to the bacterial hexapeptide-repeat-containing transferase family.

    \

    The crystal structure of the type III enzyme from Escherichia coli with chloramphenicol bound has been determined. CAT is a trimer of identical subunits (monomer Mr 25,000) and the trimeric structure is stabilised by a number of hydrogen bonds, some of which result in the extension of a beta-sheet across the subunit interface. Chloramphenicol binds in a deep pocket located at the boundary between adjacent subunits of the trimer, such\ that the majority of residues forming the binding pocket belong to one subunit while the catalytically essential histidine belongs to the adjacent subunit. His195 is appropriately positioned to act as a general base catalyst in the reaction, and the required tautomeric stabilisation is provided by an unusual interaction with a main-chain carbonyl oxygen PUBMED:2187098.

    \ ' '1449' 'IPR004341' '\ The CAT RNA-binding domain is found at the amino terminus of a family of transcriptional antiterminator proteins, the Co-AntiTerminator (CAT) domain. This domain forms a dimer in the crystal structure PUBMED:9305644. Transcriptional antiterminators of the BglG/SacY family are\ regulatory proteins that mediate the induction of sugar metabolizing operons in Gram-positive and Gram-negative bacteria. Upon activation, these proteins bind to specific targets in nascent mRNAs, thereby preventing abortive dissociation of the RNA polymerase from the DNA template PUBMED:10610766.\ ' '1450' 'IPR018028' '\

    Catalases are antioxidant enzymes that catalyse the conversion of hydrogen peroxide to water and molecular oxygen, serving to protect cells from its toxic effects PUBMED:11351128. Hydrogen peroxide is produced as a consequence of oxidative cellular metabolism and can be converted to the highly reactive hydroxyl radical via transition metals, this radical being able to damage a wide variety of molecules within a cell, leading to oxidative stress and cell death. Catalases act to neutralise hydrogen peroxide toxicity, and are produced by all aerobic organisms ranging from bacteria to man. Most catalases are mono-functional, haem-containing enzymes, although there are also bifunctional haem-containing peroxidase/catalases that are closely related to plant peroxidases, and non-haem, manganese-containing catalases that are found in bacteria PUBMED:14745498.

    \ \

    This entry represents a subgroup within catalase enzymes.

    \ ' '1451' 'IPR001894' '\

    The precursor sequences of a number of antimicrobial peptides secreted by neutrophils (polymorphonuclear leukocytes) upon activation have been found to be evolutionarily related and are collectively known as cathelicidins PUBMED:7589491.

    \

    Structurally, these proteins consist of three domains: a signal sequence, a conserved region of about 100 residues that contains four cysteines involved in two disulphide bonds, and a highly divergent C-terminal section of variable size. It is in this C-terminal section that the antibacterial peptides are found; they are proteolytically processed from their precursor by enzymes such as elastase. This structure is shown in the following schematic representation:

    \
    \
       +---+--------------------------------+--------------------+\
       |Sig| Propeptide     C  C  C  C      | Antibacterial pep. |\
       +---+----------------|--|--|--|------+--------------------+\
                            |  |  |  |\
                            +--+  +--+\
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    
    \ ' '1452' 'IPR006820' '\ This domain occurs at the N terminus of proteins belonging to the caudal-related homeobox protein family. This region is thought to mediate transcription activation. The level of activation caused by mouse Cdx2 is affected by phosphorylation at serine 60 via the mitogen-activated protein kinase pathway PUBMED:11729123. Caudal family proteins are involved in the transcriptional regulation of multiple genes expressed in the intestinal epithelium, and are important in differentiation and maintenance of the intestinal epithelial lining. Caudal proteins always have a homeobox DNA binding domain.\ ' '1453' 'IPR006068' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    P-ATPases (sometimes known as E1-E2 ATPases) are found in bacteria and in a number of eukaryotic plasma membranes and organelles PUBMED:9419228. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.

    \

    This entry represents the conserved C-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '1454' 'IPR004986' '\ The gene III product (P15) of cauliflower mosaic virus (CaMV) is a DNA binding protein in which the DNA binding activity\ is located on its C-terminal part. A family of related proteins is expressed by other members of the Caulimoviridae.\ ' '1455' 'IPR004917' '\ This protein is found in various caulimoviruses. It codes for an 18 kDa protein (PII), which is dispensable for infection but which is\ required for aphid transmission of the virus PUBMED:6311674. This protein interacts with the PIII protein PUBMED:10601029. \ \ ' '1457' 'IPR002609' '\ This family consists of various Caulimovirus viroplasmin\ proteins. The viroplasmin protein is encoded by gene VI \ and is the main component of viral inclusion bodies or viroplasms PUBMED:2402462.\ Inclusions are the site of viral assembly, DNA synthesis and \ accumulation PUBMED:2402462. Two domains exist within gene VI, corresponding\ approximately to the 5\' third and middle third of gene VI; these influence\ systemic infection in a light-dependent manner PUBMED:8372449.\ ' '1458' 'IPR006858' '\

    Chicken anaemia virus (CAV) is a circovirus which can cause severe depletion of some cell types, such as lymphocytes, by the induction of apoptosis PUBMED:14741120. Studies indicate that expression of the viral VP3 protein, also known as apoptin, is sufficient to induce apoptosis in susceptible cells and to produce the doughnut-shaped apoptosis body formations observed in cells during CAV infections. This protein induces apoptosis in tumor cells and transformed cells, but is incapable of inducing apoptosis in normal cells unless transforming signals are supplied which induce its phosphorylation. Cell death occurs in a p53- and Bcl-2-independent manner PUBMED:15258463. Apoptin contains putative nuclear localisation and nuclear export domains, and has a positively charged C-terminus which is thought to allow direct interaction with nucleic acids. Localisation to the nucleus appears to be necessary for apoptosis, but is not sufficient on its own. The biologically active form of the protein appears to be large complexes of 30-40 molecules which form distinct superstructures upon DNA binding. A number of cellular factors, including Nmi and several death domain superfamily proteins, have been shown to interact with apoptin. It has been suggested that, in the nucleus, it is this binding of phosphorylated apoptin to DNA and other factors which starts the apoptotic machinery, triggering cell death.

    \ ' '1459' 'IPR001612' '\

    Caveolins PUBMED:8567687, PUBMED:8552590, PUBMED:15003112 are a family of integral membrane proteins which are the principal components of caveolae membranes. Caveolae are flask-shaped plasma membrane invaginations whose exact cellular function is not yet clear. Caveolins may act as scaffolding proteins within caveolar membranes by compartmentalizing and concentrating signalling molecules. Various classes of signalling molecules, including G-protein subunits, receptor and non-receptor tyrosine kinases, endothelial nitric oxide synthase (eNOS), and small GTPases, bind Cav-1 through its \'caveolin-scaffolding domain\'.

    \ \

    Currently, three different forms of caveolins are known: caveolin-1 (or VIP21), caveolin-2 and caveolin-3 (or M-caveolin).

    \ \

    Caveolins are proteins of about 20 kDa that form high molecular mass homo-oligomers. Structurally they seem to have N-terminal and C-terminal hydrophilic segments and a long central transmembrane domain that probably forms a hairpin in the membrane. Both extremities are known to face the cytoplasm. Caveolae are enriched with cholesterol and Cav-1 is one of the few proteins that binds cholesterol tightly and specifically.

    \ ' '1460' 'IPR003199' '\ This family of choloylglycine hydrolases includes conjugated bile acid hydrolase (CBAH) and penicillin acylase which cleave carbon-nitrogen bonds, other than peptide bonds, in linear amides.\ ' '1461' 'IPR005612' '\

    This domain is present in the CCAAT-binding protein which is essential for growth and necessary for 60S ribosomal subunit biogenesis. Other proteins containing this domain stimulate transcription from the HSP70 promoter.

    \ ' '1462' 'IPR003417' '\

    Core binding factor (CBF) is a heterodimeric transcription factor essential for genetic regulation of hematopoiesis and osteogenesis. The beta subunit binds to the core site, 5\'-PYGPYGGT-3\', of a number of enhancers and promoters, including Murine leukemia virus, Polyomavirus enhancer, T-cell receptor enhancers etc. The beta subunit enhances DNA-binding ability of the alpha subunit in vitro, and has been shown to have a structure related to the OB fold PUBMED:10404215. Also included in this family are the Drosophila melanogaster brother and big brother proteins, which regulate the DNA-binding properties of Runt.

    \ ' '1463' 'IPR001289' '\ The CCAAT-binding factor (CBFB/NF-YA) is a mammalian transcription factor that binds to a \ CCAAT motif in the promoters of a wide variety of genes, including type I collagen and \ albumin PUBMED:2266139. The factor is a heteromeric complex of A and B subunits, both of \ which are required for DNA-binding PUBMED:1549471. The subunits can \ interact in the absence of DNA-binding, conserved regions in each being important in \ mediating this interaction.

    The B subunit contains a region of similarity with the yeast \ protein HAP2 PUBMED:2000400. For the B subunit it has been suggested that the N-terminal \ portion of the conserved region is involved in subunit interaction and the C-terminal\ region involved in DNA-binding PUBMED:1569083.

    \ ' '1464' 'IPR003958' '\

    The CCAAT-binding factor (CBF) is a mammalian transcription factor that binds to a CCAAT motif in the\ promoters of a wide variety of genes, including type I collagen and albumin. The factor is a heteromeric\ complex of A and B subunits, both of which are required for DNA-binding PUBMED:2266139, PUBMED:1549471. The \ subunits can interact in the absence of DNA-binding, conserved regions in each being important in mediating \ this interaction.

    The A subunit can be split into 3 domains on the basis of sequence similarity, a \ non-conserved N-terminal \'A domain\'; a highly-conserved central \'B domain\' involved in DNA-binding; and a \ C-terminal \'C domain\', which contains a number of glutamine and acidic residues involved in protein-protein \ interactions PUBMED:1549471. The A subunit shows striking similarity to the HAP3 subunit of the yeast \ CCAAT-binding heterotrimeric transcription factor PUBMED:1549471, PUBMED:7845362. The Kluyveromyces lactis HAP3 protein \ has been predicted to contain a 4-cysteine zinc finger, which is thought to be present in similar HAP3\ and CBF subunit A proteins, in which the third cysteine is replaced by a serine PUBMED:7845362. This domain is found in the CCAAT transcription factor and archaeal histones.

    \ ' '1465' 'IPR002586' '\ This entry consists of various cobyrinic acid a,c-diamide synthases. \ These include CbiA and CbiP from \ Salmonella typhimurium PUBMED:7635831, and CobQ from Rhodobacter capsulatus.\ These amidases catalyse amidations to various side chains of \ hydrogenobyrinic acid or cobyrinic acid a,c-diamide in the biosynthesis \ of cobalamin (vitamin B12) from uroporphyrinogen III.\ Vitamin B12 is an important cofactor and an essential nutrient for many plants and animals and is primarily produced by bacteria PUBMED:7635831.\ ' '1466' 'IPR003722' '\

    Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adocobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase PUBMED:12869542. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase PUBMED:17163662.

    \ \

    There are at least two distinct cobalamin biosynthetic pathways in bacteria PUBMED:11153269:

    \ \

    Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways) PUBMED:11215515. There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.

    \ \

    This entry represents CbiC and CobH precorrin-8X methylmutase (also known as precorrin isomerase), both as stand-alone enzymes and when CobJ forms part of a bifunctional enzyme. CobH and CbiC from the aerobic and anaerobic pathways, respectively, catalyse a methyl rearrangement in precorrin-8 that moves the methyl group from C-11 to C-12 to produce hydrogenobyrinic acid PUBMED:11470433. Hydrogenobyrinic acid now contains all the major framework alterations associated with corrin synthesis PUBMED:11215515.

    \

    CobH and CbiC can sometimes be fused to other enzymes in the cobalamin pathway to make bifunctional enzymes: e.g., with CobJ/CbiH (precorrin-3B C17-methylase/precorrin isomerase) and with CbiX (precorrin isomerase).

    \ \ ' '1467' 'IPR002748' '\

    Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adocobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase PUBMED:12869542. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase PUBMED:17163662.

    \ \

    There are at least two distinct cobalamin biosynthetic pathways in bacteria PUBMED:11153269:

    \ \

    Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways) PUBMED:11215515. There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.

    \ \

    This entry represents CbiD, an essential protein for cobalamin biosynthesis in both Salmonella typhimurium and Bacillus megaterium. A deletion mutant of CbiD suggests that this enzyme is involved in C-1 methylation and deacylation reactions required during the ring contraction process in the anaerobic pathway to cobalamin (similar role as CobF) PUBMED:15741157. The CbiD protein has a putative S-AdoMet binding site PUBMED:9742225. CbiD has no counterpart in the aerobic pathway.

    \ ' '1468' 'IPR002750' '\

    Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adocobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase PUBMED:12869542. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase PUBMED:17163662.

    \ \

    There are at least two distinct cobalamin biosynthetic pathways in bacteria PUBMED:11153269:

    \ \

    Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways) PUBMED:11215515. There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.

    \ \

    CbiG proteins are specific for anaerobic cobalamin biosynthesis. CbiG, which shows homology with CobE of the aerobic pathway, participates in the conversion of cobalt-precorrin 5 into cobalt-precorrin 6 PUBMED:12196148. CbiG is responsible for the opening of the delta-lactone ring and extrusion of the C2-unit PUBMED:16866557. The aerobic pathway uses molecular oxygen to trigger the events at C-20 leading to contraction and expulsion of the C2-unit as acetic acid from a metal-free intermediate, whereas the anaerobic route involves the internal delivery of oxygen from a carboxylic acid terminus to C-20 followed by extrusion of the C2-unit as acetaldehyde, using cobalt complexes as substrates PUBMED:16866557.

    \

    This entry represents the core domain of CbiG.

    \ ' '1469' 'IPR003723' '\

    Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adocobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase PUBMED:12869542. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase PUBMED:17163662.

    \ \

    There are at least two distinct cobalamin biosynthetic pathways in bacteria PUBMED:11153269:

    \ \

    Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways) PUBMED:11215515. There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.

    \ \

    This entry represents CobK and CbiJ precorrin-6X reductase. In the aerobic pathway, CobK catalyses the reduction of the macrocycle of precorrin-6X to produce precorrin-6Y, while in the anaerobic pathway CbiJ catalyses the reduction of the macrocycle of cobalt-precorrin-6X into cobalt-precorrin-6Y PUBMED:16243778, PUBMED:10559155.

    \ ' '1470' 'IPR002751' '\

    Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adocobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase PUBMED:12869542. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase PUBMED:17163662.

    \ \

    There are at least two distinct cobalamin biosynthetic pathways in bacteria PUBMED:11153269:

    \ \

    Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways) PUBMED:11215515. There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.

    \ \

    This entry represents the integral membrane protein CbiM, which is involved in cobalamin synthesis, although its exact function is unknown.

    \ ' '1471' 'IPR003705' '\

    The cobalt transport protein CbiN is part of the active cobalt transport system that takes up cobalt into the cell for cobalamin (vitamin B12) biosynthesis. It has been suggested that CbiN may function as\ the periplasmic binding protein component of the active cobalt transport system PUBMED:.

    \ ' '1472' 'IPR003339' '\ Cobalt transport proteins are most often found in cobalamin (vitamin B12)\ biosynthesis operons. Salmonella typhimurium synthesizes cobalamin (vitamin B12) de novo under anaerobic conditions. Not all Salmonella and Pseudomonas cobalamin synthetic genes have apparent homologs in the other species suggesting that the cobalamin biosynthetic pathways differ between the two organisms PUBMED:.\ ' '1473' 'IPR002762' '\

    Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adocobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase PUBMED:12869542. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase PUBMED:17163662.

    \ \

    There are at least two distinct cobalamin biosynthetic pathways in bacteria PUBMED:11153269:

    \ \

    Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways) PUBMED:11215515. There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.

    \ \

    This entry represents the CbiX protein, which functions as a cobalt-chelatase in the anaerobic biosynthesis of cobalamin. It catalyses the insertion of cobalt into sirohydrochlorin. The structure of CbiX from Archaeoglobus fulgidus consists of a central mixed beta-sheet flanked by four alpha-helices, although it is about half the size of other Class II tetrapyrrole chelatases PUBMED:16835730. The CbiX proteins found in archaea appear to be shorter than those found in eubacteria PUBMED:12686546.

    \ ' '1474' 'IPR003153' '\

    Cbl adaptor proteins are RING-type E3 ubiquitin ligases. Cbl may be involved in the negative regulation of thymocyte development, targeting its substrate for ubiquitination PUBMED:11864842. The ubiquitin ligase activity of Cbl, and of its homologue Cbl-b, plays a role in the negative regulation of upstream kinases, such as Lck, Syk and PI3K, in T and B cells PUBMED:12787751. Cbl can interact with the EGF receptor (EGFR), causing the ubiquitination of the receptor following EGF ligand binding and Grb2 association. Ubiquitination is required for ligand-induced endocytosis of the EGFR PUBMED:15194809.

    \

    The N-terminal region is composed of three evolutionarily conserved domains: an N-terminal four-helix bundle domain, an EF hand-like domain and an SH2-like domain, which together are known to bind to phosphorylated tyrosine residues. This entry represents the N-terminal four-helix bundle domain.

    \ \ ' '1475' 'IPR014741' '\

    Cbl adaptor proteins are RING-type E3 ubiquitin ligases. Cbl may be involved in the negative regulation of thymocyte development, targeting its substrate for ubiquitination PUBMED:11864842. The ubiquitin ligase activity of Cbl, and of its homologue Cbl-b, plays a role in the negative regulation of upstream kinases, such as Lck, Syk and PI3K, in T and B cells PUBMED:12787751. Cbl can interact with the EGF receptor (EGFR), causing the ubiquitination of the receptor following EGF ligand binding and Grb2 association. Ubiquitination is required for ligand-induced endocytosis of the EGFR PUBMED:15194809.

    \

    The N-terminal region is composed of three evolutionarily conserved domains: an N-terminal four-helix bundle domain, an EF hand-like domain and an SH2-like domain, which together are known to bind to phosphorylated tyrosine residues. This entry represents the EF hand-like domain.

    \ ' '1476' 'IPR014742' '\

    Cbl adaptor proteins are RING-type E3 ubiquitin ligases. Cbl may be involved in the negative regulation of thymocyte development, targeting its substrate for ubiquitination PUBMED:11864842. The ubiquitin ligase activity of Cbl, and of its homologue Cbl-b, plays a role in the negative regulation of upstream kinases, such as Lck, Syk and PI3K, in T and B cells PUBMED:12787751. Cbl can interact with the EGF receptor (EGFR), causing the ubiquitination of the receptor following EGF ligand binding and Grb2 association. Ubiquitination is required for ligand-induced endocytosis of the EGFR PUBMED:15194809.

    \

    The N-terminal region is composed of three evolutionarily conserved domains: an N-terminal four-helix bundle domain, an EF hand-like domain and an SH2-like domain, which together are known to bind to phosphorylated tyrosine residues. This entry represents the SH2-like domain.

    \ \ ' '1477' 'IPR000254' '\ The microbial degradation of cellulose and xylans requires several types of enzymes such as endoglucanases (), cellobiohydrolases () (exoglucanases), or xylanases () PUBMED:1886523. Structurally, cellulases and xylanases generally consist of a catalytic domain joined to a cellulose-binding domain (CBD) by a short linker sequence rich in proline and/or hydroxy-amino acids. The CBD of a number of fungal cellulases has been shown to consist of 36 amino acid residues, and it is found either at the N-terminal or at the C-terminal extremity of the enzymes. As shown in the following schematic representation, there are four conserved cysteines in this type of CBD, all involved in disulphide bonds.\
    \
                             +----------------+\
                             |          +-----|---------+\
                             |          |     |         |\
                      xxxxxxxCxxxxxxxxxxCxxxxxCxxxxxxxxxCx\
    
    \ ' '1478' 'IPR002883' '\

    This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates), but in anaerobic fungi they are protein-binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria.

    \ \

    The recycling of photosynthetically fixed carbon in plant cell walls is a key microbial process. Enzyme systems that attack the plant cell wall contain noncatalytic carbohydrate-binding modules that mediate attachment to this composite structure and play a pivotal role in maximizing the hydrolytic process. In anaerobes, the degradation is carried out by a high molecular weight, multifunctional complex termed the cellulosome. This consists of a number of independent enzyme components, each of which contains a conserved 40-residue dockerin domain, which functions to bind the enzyme to a cohesin domain within the scaffoldin protein PUBMED:7492333, PUBMED:7493964.

    \ \

    In anaerobic bacteria that degrade plant cell walls, exemplified by Clostridium thermocellum, the dockerin domains of the catalytic polypeptides can bind equally well to any cohesin from the same organism. More recently, anaerobic fungi, typified by Piromyces equi, have been suggested to also synthesise a cellulosome complex, although the dockerin sequences of the bacterial and fungal enzymes are completely different PUBMED:11524680. For example, the fungal enzymes contain one, two or three copies of the dockerin sequence in tandem within the catalytic polypeptide. In contrast, all the C. thermocellum cellulosome catalytic components contain a single dockerin domain. The anaerobic bacterial dockerins are homologous to EF hands (calcium-binding motifs) and require calcium for activity, whereas the fungal dockerin does not require calcium. Finally, whereas the interaction between cohesin and dockerin appears to be species-specific in bacteria, there is almost no species specificity of binding within fungal species and no identified sites that distinguish different species.

    \ \

    The structure of dockerin from P. equi contains two helical stretches and four short beta-strands which form an antiparallel sheet structure adjacent to an additional short twisted parallel strand. The N- and C-termini are adjacent to each other.

    \ \

    Aerobic bacteria contain related regions; however, these appear to function as cellulose/carbohydrate-binding domains.

    \ ' '1479' 'IPR005087' '\

    Carbohydrate-binding modules (CBMs) of microbial glycoside hydrolases play a central role in the recycling of photosynthetically fixed carbon through their binding to specific plant structural polysaccharides PUBMED:11598143. CBMs can recognise both crystalline and amorphous cellulose forms PUBMED:15136030. CBMs are the most common non-catalytic modules associated with enzymes active in plant cell-wall hydrolysis. Many putative CBMs have been identified by amino acid sequence alignments but only a few representatives have been shown experimentally to have a carbohydrate-binding function PUBMED:15210353.

    \ \

    binds both beta-1,4-glucan and beta-1,3-1,4-mixed linked glucans. binds to xylan and xylooligosaccharides. CBM25 has a starch-binding function. binds to amorphous cellulose and soluble beta-1,4-glucans, with a minimal binding requirement of cellotriose and optimal affinity for cellohexaose. Family 17 CBMs appear to have a very shallow binding cleft that may be more accessible to cellulose chains in non-crystalline cellulose than the deeper binding clefts of family 4 CBMs PUBMED:11733998. CBM28 does not compete with CBM17 modules when binding to non-crystalline cellulose but does have a "beta-jelly roll" topology, which is similar in structure to the CBM17 domains. Sequence and structural conservation in families 17 and 28 suggests that they have evolved through gene duplication and subsequent divergence PUBMED:15136030.

    \ \

    This entry includes family 11 and the domain is found in a number of bacterial cellulases.

    \ ' '1480' 'IPR005088' '\

    Carbohydrate-binding modules (CBMs) of microbial glycoside hydrolases play a central role in the recycling of photosynthetically fixed carbon through their binding to specific plant structural polysaccharides PUBMED:11598143. CBMs can recognise both crystalline and amorphous cellulose forms PUBMED:15136030. CBMs are the most common non-catalytic modules associated with enzymes active in plant cell-wall hydrolysis. Many putative CBMs have been identified by amino acid sequence alignments but only a few representatives have been shown experimentally to have a carbohydrate-binding function PUBMED:15210353.

    \ \

    binds both beta-1,4-glucan and beta-1,3-1,4-mixed linked glucans. binds to xylan and xylooligosaccharides. CBM25 has a starch-binding function. binds to amorphous cellulose and soluble beta-1,4-glucans, with a minimal binding requirement of cellotriose and optimal affinity for cellohexaose. Family 17 CBMs appear to have a very shallow binding cleft that may be more accessible to cellulose chains in non-crystalline cellulose than the deeper binding clefts of family 4 CBMs PUBMED:11733998. CBM28 does not compete with CBM17 modules when binding to non-crystalline cellulose but does have a "beta-jelly roll" topology, which is similar in structure to the CBM17 domains. Sequence and structural conservation in families 17 and 28 suggests that they have evolved through gene duplication and subsequent divergence PUBMED:15136030.

    \ \

    This entry includes family 15 and the domain is found in a number of bacterial cellulases.

    \ ' '1481' 'IPR005086' '\

    Carbohydrate-binding modules (CBMs) of microbial glycoside hydrolases play a central role in the recycling of photosynthetically fixed carbon through their binding to specific plant structural polysaccharides PUBMED:11598143. CBMs can recognise both crystalline and amorphous cellulose forms PUBMED:15136030. CBMs are the most common non-catalytic modules associated with enzymes active in plant cell-wall hydrolysis. Many putative CBMs have been identified by amino acid sequence alignments but only a few representatives have been shown experimentally to have a carbohydrate-binding function PUBMED:15210353.

    \ \

    binds both beta-1,4-glucan and beta-1,3-1,4-mixed linked glucans. binds to xylan and xylooligosaccharides. CBM25 has a starch-binding function. binds to amorphous cellulose and soluble beta-1,4-glucans, with a minimal binding requirement of cellotriose and optimal affinity for cellohexaose. Family 17 CBMs appear to have a very shallow binding cleft that may be more accessible to cellulose chains in non-crystalline cellulose than the deeper binding clefts of family 4 CBMs PUBMED:11733998. CBM28 does not compete with CBM17 modules when binding to non-crystalline cellulose but does have a "beta-jelly roll" topology, which is similar in structure to the CBM17 domains. Sequence and structural conservation in families 17 and 28 suggests that they have evolved through gene duplication and subsequent divergence PUBMED:15136030.

    \ \

    This entry includes family 17 and 28 which show structural homology. The domain is found in a number of alkaline cellulases.

    \ ' '1482' 'IPR005089' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ Carbohydrate-binding module, family 25 CAZy GH_25 has a starch-binding function as demonstrated in one case.\ ' '1483' 'IPR001919' '\

    The microbial degradation of cellulose and xylans requires several types of enzyme such as endoglucanases (), cellobiohydrolases () (exoglucanases), or xylanases () PUBMED:1886523.\ Structurally, cellulases and xylanases generally consist of a catalytic domain joined to a cellulose-binding domain (CBD) by a short linker sequence rich in proline and/or hydroxy-amino acids.

    \

    The CBD is found either at the N-terminal or at the C-terminal extremity of these enzymes. As shown in the following schematic representation, there are two conserved cysteines in this CBD, one at each extremity of the domain, which have been shown PUBMED:1761039 to be involved in a disulphide bond. There are also four conserved tryptophans, two of which are involved in cellulose binding.\ The CBD of a number of bacterial cellulases has been shown to consist of about 105 amino acid residues PUBMED:1812490, PUBMED:10973978.

    \
    \
               +-------------------------------------------------+\
               |                                                 |\
              xCxxxxWxxxxxNxxxWxxxxxxxWxxxxxxxxWNxxxxxGxxxxxxxxxxCx\
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    
    \ ' '1484' 'IPR002044' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This domain binds to starch, and is often found at the C-terminus of a variety of glycosyl hydrolases acting on polysaccharides more rapidly than on oligosaccharides. Reactions include: the hydrolysis of terminal 1,4-linked alpha-D-glucose residues successively from non-reducing ends of the chains with release of beta-D-glucose, the degradation of starch to cyclodextrins by formation of a 1,4-alpha-D-glucosidic bond, and hydrolysis of 1,4-alpha-glucosidic linkages in polysaccharides to remove successive maltose units from the non-reducing ends of the chains.

    \ \ ' '1485' 'IPR005036' '\

    This family consists of several eukaryotic proteins that are thought to be involved in the regulation of glycogen metabolism. For instance, the mouse PTG protein has been shown to interact with glycogen synthase, phosphorylase kinase and phosphorylase a; these three enzymes have key roles in the regulation of\ glycogen metabolism. PTG also binds the catalytic subunit of protein phosphatase 1 (PP1C) and localizes it to glycogen. Subsets of similar interactions have been\ observed with several other members of this family, such as the yeast PIG1, PIG2, GAC1 and GIP2 proteins. While the precise function of these proteins is not\ known, they may serve a scaffold function, bringing together the key enzymes in glycogen metabolism. This entry represents a carbohydrate-binding domain.

    \ ' '1486' 'IPR005085' '\

    Carbohydrate-binding modules (CBMs) of microbial glycoside hydrolases play a central role in the recycling of photosynthetically fixed carbon through their binding to specific plant structural polysaccharides PUBMED:11598143. CBMs can recognise both crystalline and amorphous cellulose forms PUBMED:15136030. CBMs are the most common non-catalytic modules associated with enzymes active in plant cell-wall hydrolysis. Many putative CBMs have been identified by amino acid sequence alignments but only a few representatives have been shown experimentally to have a carbohydrate-binding function PUBMED:15210353.

    \ \

    binds both beta-1,4-glucan and beta-1,3-1,4-mixed linked glucans. binds to xylan and xylooligosaccharides. CBM25 has a starch-binding function. binds to amorphous cellulose and soluble beta-1,4-glucans, with a minimal binding requirement of cellotriose and optimal affinity for cellohexaose. Family 17 CBMs appear to have a very shallow binding cleft that may be more accessible to cellulose chains in non-crystalline cellulose than the deeper binding clefts of family 4 CBMs PUBMED:11733998. CBM28 does not compete with CBM17 modules when binding to non-crystalline cellulose but does have a "beta-jelly roll" topology, which is similar in structure to the CBM17 domains. Sequence and structural conservation in families 17 and 28 suggests that they have evolved through gene duplication and subsequent divergence PUBMED:15136030.

    \ \

    This entry includes carbohydrate-binding module, family 25 PUBMED: which has a starch-binding function as has been demonstrated in one case.

    \ ' '1487' 'IPR003610' '\

    The carbohydrate-binding domain (CBD) is a short domain found in many different glycosyl hydrolase enzymes, such as the C-terminal cellulose-binding domain of endoglucanase Z PUBMED:9405041. The domain has a core structure consisting of a 3-stranded meander beta-sheet, which contains six aromatic groups that may be important for binding.

    \

    The overall topology of the CBD is structurally similar to the C-terminal chitin-binding domains (ChBD) of chitinase A1 and chitinase B; however, the binding mechanism for the ChBD may be different from that of the CBD PUBMED:10788483.

    \ \ \ ' '1488' 'IPR007026' '\ This short domain contains four conserved cysteines that are probably required for the formation of two disulphide bonds. The domain is only found in proteins from Caenorhabditis species. The domain is named after the characteristic CC motif.\ ' '1489' 'IPR002712' '\

    CcdB protein is a topoisomerase poison from Escherichia coli PUBMED:9917404.\ It is responsible for killing plasmid-free segregants, and interferes with the activity of DNA gyrase. It acts to inhibit partitioning of the chromosomal DNA.

    \ \ ' '1490' 'IPR007078' '\ The CcmD protein is part of a C-type cytochrome biogenesis operon PUBMED:7635817. The exact function of this protein is uncertain. It has been proposed that CcmC, CcmD and CcmE interact directly with each other, establishing a cytoplasm to periplasm haem delivery pathway for cytochrome c maturation PUBMED:10998170. This protein is found fused to CcmE in . These proteins contain a predicted transmembrane helix.\ ' '1491' 'IPR004329' '\ CcmE is the product of one of a cluster of Ccm genes that are necessary for cytochrome c biosynthesis in eubacteria.\ Expression of these proteins is induced when the organisms are grown under anaerobic conditions with nitrate or nitrite as\ the final electron acceptor.\ ' '1492' 'IPR005616' '\ Members of this family include NrfF, CcmH, CycL, Ccl2.\ ' '1493' 'IPR004714' '\

    Cytochrome cbb3 oxidases are found almost exclusively in Proteobacteria, and represent a distinctive class of proton-pumping respiratory haem-copper oxidases (HCO) that lack many of the key structural features that contribute to the reaction cycle of the intensely studied mitochondrial cytochrome c oxidase (CcO). Expression of cytochrome cbb3 oxidase allows human pathogens to colonise anoxic tissues and agronomically important diazotrophs to sustain nitrogen fixation PUBMED:15100055.

    Genes encoding a cytochrome cbb3 oxidase were initially designated fixNOQP (ccoNOQP). The ccoNOQP operon is always found close to a second gene cluster, known as fixGHIS (ccoGHIS), whose expression is necessary for the assembly of a functional cbb3 oxidase. On the basis of their derived amino acid sequences, each of the four proteins encoded by the ccoGHIS operon is thought to be membrane-bound. It has been suggested that they may function in concert as a multi-subunit complex, possibly playing a role in the uptake and metabolism of copper required for the assembly of the binuclear centre of cytochrome cbb3 oxidase.

    \ ' '1494' 'IPR004852' '\

    This is a group of distinct cytochrome c peroxidases (CCPs) that contain two haem groups. Similar to other cytochrome c peroxidases, they reduce hydrogen peroxide to water using c-type haem as an oxidizable substrate. However, since they possess two, instead of one, haem prosthetic groups, bacterial CCPs reduce hydrogen peroxide without the need to generate semi-stable free radicals. The two haem groups have significantly different redox potentials. The high potential (+320 mV) haem feeds electrons from electron shuttle proteins to the low potential (-330 mV) haem, where peroxide is reduced (indeed, the low potential site is known as the peroxidatic site) PUBMED:8591033. The CCP protein itself is structured into two domains, each\ containing one c-type haem group, with a calcium-binding site at the domain interface. This family also includes MauG proteins, whose similarity to di-haem CCP was previously recognised PUBMED:9202457.

    \ ' '1495' 'IPR007593' '\ This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression PUBMED:7559564.\ ' '1496' 'IPR002159' '\

    CD36 is a transmembrane, highly glycosylated, 88kDa glycoprotein expressed by monocytes, macrophages, platelets, microvascular endothelial cells and adipose tissues. Platelet glycoprotein IV (GP IV)(GPIIIb) (CD36 antigen) is also called GPIV, OKM5-antigen or PASIV. CD36 recognises oxidized low density lipoprotein, long chain fatty acids, anionic phospholipids, collagen types I, IV and V, thrombospondin (TSP) and Plasmodium falciparum infected erythrocytes. The recognition of apoptotic neutrophils is in co-operation with TSP and avb3. Other ligands may still be unknown.

    \ \

    CD36 is a scavenger receptor for oxidized LDL and shed photoreceptor outer segments, functions in the recognition and phagocytosis of apoptotic cells, and is a cell adhesion molecule in platelet adhesion and aggregation, and in platelet-monocyte and platelet-tumor cell interactions PUBMED:9478926.

    \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).\

    \ ' '1497' 'IPR013147' '\

    This family represents the transmembrane region of CD47 leukocyte antigen PUBMED:8794870, PUBMED:12124426.

    \ ' '1498' 'IPR013855' '\ In Saccharomyces cerevisiae (Baker\'s yeast), cell division control protein Cdc37 is required for the productive formation of Cdc28-cyclin complexes. Cdc37 may be a kinase targeting subunit of Hsp90 PUBMED:9242486.\ ' '1499' 'IPR003874' '\ CDC45 is an essential gene required for initiation of DNA replication in Saccharomyces cerevisiae (cell division control protein 45), forming a complex with MCM5/CDC46. Homologs of CDC45 have been identified in human PUBMED:9660782, mouse and the smut fungus, Melampsora spp., (tsd2 protein) among others.\ ' '1500' 'IPR004201' '\ This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases () is a 185-residue substrate recognition domain PUBMED:10531028.\ ' '1501' 'IPR003338' '\

    AAA ATPases (ATPases Associated with diverse cellular Activities) form a large protein family and play a number of roles in the cell including cell-cycle regulation, protein proteolysis and disaggregation, organelle biogenesis and intracellular transport. Some of them function as molecular chaperones, subunits of proteolytic complexes or independent proteases (FtsH, Lon). They also act as DNA helicases and transcription factors PUBMED:17201069.

    \ \

    AAA ATPases belong to the AAA+ superfamily of ring-shaped P-loop NTPases, which act via the energy-dependent unfolding of macromolecules PUBMED:15037233, PUBMED:16828312. There are six major clades of AAA domains (proteasome subunits, metalloproteases, domains D1 and D2 of ATPases with two AAA domains, the MSP1/katanin/spastin group and BCS1 and its homologues), as well as a number of deeply branching minor clades PUBMED:15037233.

    \ \

    They assemble into oligomeric assemblies (often hexamers) that form a ring-shaped structure with a central pore. These proteins produce a molecular motor that couples ATP binding and hydrolysis to changes in conformational states that act upon a target substrate, either translocating or remodelling it PUBMED:16919475.

    \ \ \

    They are found in all living organisms and share a highly conserved AAA domain called the AAA module. This domain is responsible for ATP binding and hydrolysis. It contains 200-250 residues, including two classical motifs, Walker A (GX4GKT) and Walker B (HyDE) PUBMED:17201069.

    \ \

    The VAT protein of the archaebacterium Thermoplasma acidophilum, like all other members of the Cdc48/p97 family of AAA ATPases, has two ATPase domains and a 185-residue amino-terminal substrate-recognition domain, VAT-N. VAT shows activity in protein folding and unfolding and thus shares the common function of these ATPases in disassembly and/or degradation of protein complexes.

    \

    VAT-N is composed of two equally sized subdomains. The amino-terminal subdomain VAT-Nn forms a double-psi beta-barrel whose pseudo-twofold symmetry is mirrored by an internal sequence repeat of 42 residues. The carboxy-terminal subdomain VAT-Nc forms a novel six-stranded beta-clam fold PUBMED:10531028. Together, VAT-Nn and VAT-Nc form a kidney-shaped structure, in close agreement with results from electron microscopy. VAT-Nn is related to numerous proteins including prokaryotic transcription factors, metabolic enzymes, the protease cofactors UFD1 and PrlF, and aspartic proteinases.

    \ ' '1502' 'IPR005045' '\ Members of this family have no known function. They have predicted transmembrane helices.\ ' '1503' 'IPR003763' '\ The CDP-diacylglycerol pyrophosphatases play a role in the regulation of phospholipid metabolism by inositol, as well as regulating the cellular levels\ of phosphatidylinositol PUBMED:11016943.\ ' '1504' 'IPR004461' '\ The carbon monoxide dehydrogenase alpha subunit () catalyses the interconversion of CO and CO2 and the synthesis of acetyl-CoA from the methylated corrinoid/iron sulphur protein, CO and CoA. Nomenclature follows the description for Methanosarcina thermophila. The complex is also found in Archaeoglobus fulgidus, not considered a methanogen, but is otherwise generally associated with methanogenesis.\ ' '1505' 'IPR016041' '\

    This entry represents a conserved region predicted to form a TIM alpha/beta barrel, and is found in the delta subunit of a number of CO dehydrogenase/acetyl-CoA synthase enzymes.

    \ ' '1506' 'IPR003175' '\ Cell cycle progression is negatively controlled by cyclin-dependent kinase inhibitors (CDIs). CDIs are involved in cell cycle arrest at the G1 phase. \ ' '1507' 'IPR004944' '\ These proteins are activators of cyclin-dependent kinase 5. They are heterodimers of a catalytic subunit and a regulatory subunit. \ ' '1508' 'IPR007847' '\ This domain is found individually and at the N terminus of a number of multi-domain proteins, including several found in the bacterium Deinococcus radiodurans, which is capable of surviving ionizing irradiation and other DNA-damaging assaults at doses that are lethal to all other organisms.\ ' '1509' 'IPR003558' '\ Escherichia coli, Haemophilus spp. and Campylobacter spp. all produce \ a toxin that is seen to cause distension in certain cell lines PUBMED:8112838, PUBMED:10203548, \ which eventually disintegrate and die. This novel toxin, termed cytolethal \ distending toxin (cdt), has three subunits: A, B and C. Their sizes are \ approx. 27.7, 29.5 and 19.9kDa respectively PUBMED:8112838, and they appear to be \ entirely novel PUBMED:10203548. \ \

    Further research on the complete toxin has revealed that it blocks the cell\ cycle at stage G2, through inactivation of the cyclin-dependent kinase Cdk1, and without induction of DNA breaks. This leads to multipolar abortive \ mitosis and micronucleation, associated with centrosomal amplification PUBMED:10777111.\ The roles of each subunit are unclear, but it is believed that they have\ separate roles in pathogenicity.

    \ \

    This entry represents the A and C subunits.

    \ ' '1511' 'IPR000875' '\ Cecropins PUBMED:3318666, PUBMED:2015623, PUBMED:1915368 are potent antibacterial proteins that constitute a \ main part of the cell-free immunity of insects. Cecropins are small proteins of about 35 amino acid \ residues active against both Gram-positive and Gram-negative bacteria. They seem to exert a lytic \ action on bacterial membranes. Cecropins isolated from insects other than Hyalophora cecropia (Cecropia moth) have been given \ various names; bactericidin, lepidopteran, sarcotoxin, etc. All of these peptides are structurally \ related. Cecropin P1, an intestinal antibacterial peptide from Sus scrofa (Pig), also belongs to this family.\ ' '1512' 'IPR004197' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Cellulases (endoglucanases) catalyse the endohydrolysis of 1,4-beta-D-glucosidic linkages in cellulose.\ This entry represents the N-terminal Ig-like domain of cellulase; enzymes containing this domain belong to family 9 of the glycoside hydrolases ().

    \ ' '1513' 'IPR003922' '\

    An operon encoding 4 proteins required for bacterial cellulose biosynthesis\ (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation\ with strains lacking cellulose synthase activity PUBMED:2146681. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, \ designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum.

    \

    The calculated molecular mass of the protein encoded by bcsD is 17.3kDa PUBMED:2146681. The function of BcsD is unknown.

    \ ' '1514' 'IPR005150' '\

    Cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues, is the major component of wood and thus paper, and is synthesized by plants, most algae, some bacteria and fungi, and even some animals. The genes that synthesize cellulose in higher plants differ greatly from the well-characterised genes found in Acetobacter and Agrobacterium spp. More correctly designated as "cellulose synthase catalytic subunits", plant cellulose synthase (CesA) proteins are integral membrane proteins, approximately 1,000 amino acids in length. There are a number of highly conserved residues, including several motifs shown to be necessary for processive glycosyltransferase activity PUBMED:8901635.

    \ ' '1515' 'IPR004282' '\ Members of this family are probable integral membrane proteins. Their molecular function is unknown. CemA proteins\ are found in the inner envelope membrane of chloroplasts but not in the thylakoid membrane PUBMED:8633006. A cyanobacterial\ member of this family has been implicated in CO2 transport, but is probably not a CO2 transporter itself PUBMED:8633006.\ ' '1516' 'IPR005579' '\

    Cgr1 is involved in nucleolar integrity and is required for processing pre-rRNA for the 60S ribosome subunit. In Saccharomyces cerevisiae, this protein is conserved and contributes to compartmentalisation of nucleolar constituents PUBMED:11932453. Cgr1 is a small hydrophilic protein and members of this family are coiled-coil proteins PUBMED:11932453. Its primary role appears to be in ribosome biogenesis PUBMED:11116400, PUBMED:11342110. Expression of CGR1 is also associated with a cessation of yeast cell growth, which is a prerequisite for germination in this organism PUBMED:11342110.

    \ ' '1517' 'IPR006840' '\ The ChaC protein is thought to be associated with the putative ChaA Ca2+/H+ cation transport protein in Escherichia coli. Its function is not known. This family also includes homologous regions from several other bacterial and eukaryotic proteins.\ ' '1518' 'IPR001099' '\ Synonym(s): Chalcone synthase, Flavanone synthase, 6\'-deoxychalcone synthase \

    Naringenin-chalcone synthases () and stilbene synthases (STS) (formerly known as resveratrol synthases) are related plant enzymes. CHS is an important enzyme in flavonoid biosynthesis and STS is a key enzyme in stilbene-type phytoalexin biosynthesis. Both enzymes catalyse the addition of three molecules of malonyl-CoA to a starter CoA ester (a typical example is 4-coumaroyl-CoA), producing either a chalcone (with CHS) or stilbene (with STS) PUBMED:.

    \ \

    These enzymes have a conserved cysteine residue, located in the central section\ of the protein sequence, which is essential for the catalytic activity of both\ enzymes and probably represents the binding site for the 4-coumaroyl-CoA group\ PUBMED:2033084.

    \ ' '1519' 'IPR012328' '\ Synonym(s): Chalcone synthase, Flavanone synthase, 6\'-deoxychalcone synthase \

    Naringenin-chalcone synthases () and stilbene synthases (STS) \ (formerly known as resveratrol synthases) are related plant enzymes. CHS is an\ important enzyme in flavonoid biosynthesis and STS is a key enzyme in \ stilbene-type phytoalexin biosynthesis. Both enzymes catalyze the addition of three\ molecules of malonyl-CoA to a starter CoA ester (a typical example is\ 4-coumaroyl-CoA), producing either a chalcone (with CHS) or stilbene (with\ STS) PUBMED:.

    \ \

    These enzymes have a conserved cysteine residue, located in the central section\ of the protein sequence, which is essential for the catalytic activity of both\ enzymes and probably represents the binding site for the 4-coumaroyl-CoA group\ PUBMED:2033084.

    \

    This domain of chalcone synthase is reported to be structurally similar to domains in thiolase and beta-ketoacyl synthase. The differences in activity are accounted for by differences in the N-terminal domain.

    \ ' '1520' 'IPR003466' '\

    Chalcone isomerase (), also known as chalcone-flavanone isomerase, is a plant enzyme responsible for the isomerisation of chalcone to naringenin, a key step in the biosynthesis of flavonoids. The Petunia hybrida (Petunia) genome contains two genes coding for very similar enzymes, ChiA and ChiB, but only the first seems to encode a functional chalcone isomerase. Chalcone isomerase has a core 2-layer alpha/beta structure consisting of beta(3)-alpha(2)-beta-alpha(2)-beta(3) PUBMED:10966651. This entry represents a subgroup of chalcone isomerases.

    \ ' '1521' 'IPR018013' '\

    The tsx gene of Escherichia coli encodes an outer membrane protein, Tsx,\ which constitutes the receptor for colicin K and bacteriophage T6, and \ functions as a substrate-specific channel for nucleosides and deoxynucleosides PUBMED:2265760. The protein contains 294 amino acids, the first 22 of which are characteristic of a bacterial signal sequence peptide. The putative mature form of Tsx contains 272 residues with a calculated Mr of\ 31418. The Tsx sequence shows an even distribution of charged residues\ and lacks extensive hydrophobic stretches PUBMED:2265760. Tsx shows no significant similarity to the channel-forming proteins OmpC, OmpF, PhoE and LamB from the E. coli outer membrane.

    \ \

    This entry also contains related proteins of unknown function.

    \ ' '1522' 'IPR006189' '\

    The CHASE domain is an extracellular domain of 200-230 amino acids, which is found in transmembrane receptors from bacteria, lower eukaryotes and plants. It has been named CHASE (Cyclases/Histidine kinases Associated Sensory Extracellular) because of its presence in diverse receptor-like proteins with histidine kinase and nucleotide cyclase domains. The CHASE domain always occurs N-terminally in extracellular or periplasmic locations, followed by an intracellular tail housing diverse enzymatic signalling domains such as histidine kinase (), adenyl cyclase, GGDEF-type nucleotide cyclase and EAL-type phosphodiesterase domains, as well as non-enzymatic domains such as PAS (), GAF (), phosphohistidine and response regulatory domains. The CHASE domain is predicted to bind diverse low molecular weight ligands, such as the cytokinin-like adenine derivatives or peptides, and mediate signal transduction through the respective receptors PUBMED:11590001.

    \ \

    The CHASE domain has a predicted alpha+beta fold, with two extended alpha helices on both boundaries and two central alpha helices separated by beta sheets. The termini are less conserved compared with the central part of the domain, which shows strongly conserved motifs.

    \ ' '1523' 'IPR004866' '\

    This domain represents the N-terminal domain in chitobiases and beta-hexosaminidases . Chitobiases degrade chitin, which forms the exoskeleton in insects and crustaceans, and which is one of the most abundant polysaccharides on earth PUBMED:8673609. Beta-hexosaminidases are composed of either a HexA/HexB heterodimer or a HexB homodimer, and can hydrolyse diverse substrates, including GM(2)-gangliosides; mutations in this enzyme are associated with Tay-Sachs disease PUBMED:12662933. HexB is structurally similar to chitobiase, consisting of a beta sandwich structure; this structure is similar to that found in the cellulose-binding domain of cellulase from Cellulomonas fimi (), suggesting that it may function as a carbohydrate-binding domain.

    \ ' '1524' 'IPR004867' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ This short domain is found in members of the glycoside hydrolase family 20 () and represents the C-terminal domain in chitobiases and beta-hexosaminidases . It is composed of a beta sandwich structure PUBMED:8673609. The function of this domain is unknown. \ \ ' '1525' 'IPR000673' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems are composed of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    This entry represents the signal transduction response regulator CheB involved in chemotaxis. CheB methylesterase is responsible for removing the methyl group from the gamma-glutamyl methyl ester residues in the methyl-accepting chemotaxis proteins (MCP). The enzyme catalyses the reaction: protein L-glutamate O-methyl ester and water is converted to protein L-glutamate and methanol. CheB\ is regulated through phosphorylation by CheA. The N-terminal region of the protein is similar to that of other regulatory components of sensory transduction systems. The Myxococcus xanthus FrzG protein also belongs to this family, and is required for the normal aggregation of cells during fruiting body formation.

    \ ' '1526' 'IPR007597' '\ The precise function of these proteins is unclear, but some of them are involved in the flagellar motor switch PUBMED:11722727. The region represented in this entry is found in the CheC, CheX, CheA and FliY proteins. In some cases, this region is repeated in multiple copies.\ ' '1527' 'IPR005659' '\

    CheD deamidates glutamine residues to glutamate on methyl-accepting chemotaxis receptors (MCPs). CheD-mediated MCP deamidation is required for productive communication of the conformational signals of the chemoreceptors to the CheA kinase PUBMED:17908686. CheC is a CheY-P phosphatase (CheY controls flagellar rotation and is activated by phosphorylation). The activity of CheC is enhanced by its interaction with CheD, forming a CheC-CheD heterodimer. It is suggested that CheC exerts its effect on MCP methylation in Bacillus subtilis by controlling the binding of CheD to the MCPs PUBMED:8866475.

    \ ' '1528' 'IPR000780' '\

    Methyl transfer from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalysed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol, for regulatory purposes. The role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes, including gene regulation and differentiation, is well documented.

    \ \

    Three classes of DNA Mtases transfer the methyl group from AdoMet to the target base to form either N-6-methyladenine, or N-4-methylcytosine, or C-5-methylcytosine. In C-5-cytosine Mtases, ten conserved motifs are arranged in the same order PUBMED:8127644. Motif I (a glycine-rich or closely related consensus sequence; FAGxGG in M.HhaI PUBMED:8343957), shared by other AdoMet-Mtases PUBMED:2684970, is part of the cofactor binding site and motif IV (PCQ) is part of the catalytic site. In contrast, sequence comparison among N-6-adenine and N-4-cytosine Mtases identified two conserved segments PUBMED:2690010, although more conserved segments may be present. One of them corresponds to motif I in C-5-cytosine Mtases, and the other is named (D/N/S)PP(Y/F). Crystal structures are known for a number of Mtases PUBMED:7607476, PUBMED:8343957, PUBMED:8127644, PUBMED:7971991. The cofactor binding sites are almost identical and the essential catalytic amino acids coincide. The comparable protein folding and the existence of equivalent amino acids in similar secondary and tertiary positions indicate that many (if not all) AdoMet-Mtases have a common catalytic domain structure. This permits tertiary structure prediction of other DNA, RNA, protein, and small-molecule AdoMet-Mtases from their amino acid sequences PUBMED:7897657.

    \ \

    Flagellated bacteria swim towards favourable chemicals and away from deleterious ones. Sensing of \ chemoeffector gradients involves chemotaxis receptors, transmembrane (TM) proteins that detect \ stimuli through their periplasmic domains and transduce the signals via their cytoplasmic domains \ PUBMED:, PUBMED:9115443. Signalling outputs from these \ receptors are influenced both by the binding of the chemoeffector ligand to their periplasmic \ domains and by methylation of specific glutamate residues on their cytoplasmic domains. Methylation \ is catalysed by CheR, an S-adenosylmethionine-dependent methyltransferase PUBMED:9115443, which \ reversibly methylates specific glutamate residues within a coiled coil region, to form gamma-glutamyl methyl ester residues PUBMED:9115443, PUBMED:9628482. The structure of the Salmonella typhimurium chemotaxis receptor methyltransferase CheR, bound to S-adenosylhomocysteine, has been determined \ to a resolution of 2.0 A PUBMED:9115443. The structure reveals CheR to be a two-domain protein, with \ a smaller N-terminal helical domain linked via a single polypeptide connection to a larger \ C-terminal alpha/beta domain. The C-terminal domain has the characteristics of a nucleotide-binding \ fold, with an insertion of a small anti-parallel beta-sheet subdomain. The S-adenosylhomocysteine-binding site is formed mainly by the large domain, with contributions from residues within the \ N-terminal domain and the linker region PUBMED:9115443.

    \ ' '1529' 'IPR000780' '\

    Methyl transfer from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalysed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol, for regulatory purposes. The role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes, including gene regulation and differentiation, is well documented.

    \ \

    Three classes of DNA Mtases transfer the methyl group from AdoMet to the target base to form either N-6-methyladenine, or N-4-methylcytosine, or C-5-methylcytosine. In C-5-cytosine Mtases, ten conserved motifs are arranged in the same order PUBMED:8127644. Motif I (a glycine-rich or closely related consensus sequence; FAGxGG in M.HhaI PUBMED:8343957), shared by other AdoMet-Mtases PUBMED:2684970, is part of the cofactor binding site and motif IV (PCQ) is part of the catalytic site. In contrast, sequence comparison among N-6-adenine and N-4-cytosine Mtases identified two conserved segments PUBMED:2690010, although more conserved segments may be present. One of them corresponds to motif I in C-5-cytosine Mtases, and the other is named (D/N/S)PP(Y/F). Crystal structures are known for a number of Mtases PUBMED:7607476, PUBMED:8343957, PUBMED:8127644, PUBMED:7971991. The cofactor binding sites are almost identical and the essential catalytic amino acids coincide. The comparable protein folding and the existence of equivalent amino acids in similar secondary and tertiary positions indicate that many (if not all) AdoMet-Mtases have a common catalytic domain structure. This permits tertiary structure prediction of other DNA, RNA, protein, and small-molecule AdoMet-Mtases from their amino acid sequences PUBMED:7897657.

    \ \

    Flagellated bacteria swim towards favourable chemicals and away from deleterious ones. Sensing of \ chemoeffector gradients involves chemotaxis receptors, transmembrane (TM) proteins that detect \ stimuli through their periplasmic domains and transduce the signals via their cytoplasmic domains \ PUBMED:, PUBMED:9115443. Signalling outputs from these \ receptors are influenced both by the binding of the chemoeffector ligand to their periplasmic \ domains and by methylation of specific glutamate residues on their cytoplasmic domains. Methylation \ is catalysed by CheR, an S-adenosylmethionine-dependent methyltransferase PUBMED:9115443, which \ reversibly methylates specific glutamate residues within a coiled coil region, to form gamma-glutamyl methyl ester residues PUBMED:9115443, PUBMED:9628482. The structure of the Salmonella typhimurium chemotaxis receptor methyltransferase CheR, bound to S-adenosylhomocysteine, has been determined \ to a resolution of 2.0 A PUBMED:9115443. The structure reveals CheR to be a two-domain protein, with \ a smaller N-terminal helical domain linked via a single polypeptide connection to a larger \ C-terminal alpha/beta domain. The C-terminal domain has the characteristics of a nucleotide-binding \ fold, with an insertion of a small anti-parallel beta-sheet subdomain. The S-adenosylhomocysteine-binding site is formed mainly by the large domain, with contributions from residues within the \ N-terminal domain and the linker region PUBMED:9115443.

    \ ' '1530' 'IPR002545' '\

    CheW proteins are part of the chemotaxis signalling mechanism in bacteria. CheW interacts with the methyl-accepting chemotaxis proteins (MCPs) and relays signals to CheY, which affects flagellar rotation. This family includes CheW and other related proteins that are involved in chemotaxis. The CheW-like regulatory domain in CheA PUBMED:9989504 binds to CheW, suggesting that these domains can interact with each other.

    \ ' '1531' 'IPR000789' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases PUBMED:15320712.

    \ \

    In eukaryotes, cyclin-dependent protein kinases interact with cyclins to regulate cell cycle\ progression, and are required for the G1 and G2 stages of cell division PUBMED:3322810. The\ proteins bind to a regulatory subunit, cyclin-dependent kinase regulatory subunit (CKS),\ which is essential for their function. This regulatory subunit is a small protein of 79 to 150\ residues. In yeast (gene CKS1) and in fission yeast (gene suc1) a single isoform is known,\ while mammals have two highly related isoforms. The regulatory subunits exist as hexamers,\ formed by the symmetrical assembly of 3 interlocked homodimers, creating an unusual \ 12-stranded beta-barrel structure PUBMED:8211159. Through the barrel centre runs a 12A diameter\ tunnel, lined by 6 exposed helix pairs PUBMED:8491379. Six kinase units can be modelled to bind the\ hexameric structure, which may thus act as a hub for cyclin-dependent protein kinase\ multimerisation PUBMED:8491379, PUBMED:8211159.

    \ ' '1532' 'IPR007439' '\

    This family represents the bacterial chemotaxis phosphatase, CheZ. This protein forms a dimer characterised by a long four-helix bundle, composed of two helices from each monomer. CheZ dephosphorylates CheY in a reaction that is essential to maintain a continuous chemotactic response to environmental changes. It is thought that CheZ\'s conserved residue Gln 147 orientates a water molecule for nucleophilic attack at the CheY active site.

    \ ' '1533' 'IPR001002' '\

    A number of plant and fungal proteins that bind N-acetylglucosamine (e.g. solanaceous lectins of tomato and potato, plant endochitinases, the wound-induced proteins: hevein, win1 and win2, and the Kluyveromyces lactis killer toxin alpha subunit) contain this domain PUBMED:1757999. The domain may occur in one or more copies and is thought to be involved in recognition or binding of chitin subunits PUBMED:2070799, PUBMED:1375935. In chitinases, as well as in the potato wound-induced proteins, the 43-residue domain directly follows the signal sequence and is therefore at the N-terminus of the mature protein; in the killer toxin alpha subunit it is located in the central section of the protein.

    \ ' '1534' 'IPR004834' '\ This region is commonly found in chitin synthase classes I, II and III. Chitin, a linear homopolymer of GlcNAc residues, is an important component of the cell wall of fungi and is synthesised on the cytoplasmic surface of the cell membrane by membrane-bound chitin synthases PUBMED:7773595. \ ' '1535' 'IPR004835' '\ Chitin synthase (), also known as chitin-UDP acetyl-glucosaminyl transferase, is a plasma membrane-bound protein which catalyses the conversion of UDP-N-acetyl-D-glucosamine and {(1,4)-(N-acetyl-beta-D-glucosaminyl)}(N) to UDP and {(1,4)-(N-acetyl-beta-D-glucosaminyl)}(N+1). It plays a major role in cell wall biogenesis. \ ' '1536' 'IPR003517' '\

    Three cysteine-rich proteins (also believed to be lipoproteins) make up the\ extracellular matrix of the Chlamydial outer membrane PUBMED:2287277. They are involved \ in the essential structural integrity of both the elementary body (EB) and \ reticulate body (RB) phase. As these bacteria lack the peptidoglycan layer\ common to most Gram-negative microbes, such proteins are highly important \ in the pathogenicity of the organism.

    \

    The largest of these is the major outer membrane protein (MOMP), and \ constitutes around 60% of the total protein for the membrane PUBMED:8477811. OMP2\ is the second largest, with a molecular mass of 58kDa, while the OMP3\ protein is ~15kDa PUBMED:2287277. MOMP is believed to elicit the strongest immune \ response, and has recently been linked to heart disease through its sequence\ similarity to a murine heart-muscle specific alpha myosin PUBMED:10037605.

    \

    The OMP3 family plays a structural role in the outer membrane during \ the EB stage of the Chlamydial cell, and different biovars show a small, yet \ highly significant, change at peptide charge level PUBMED:2287277. Members of this \ family are found in Chlamydia trachomatis, Chlamydia pneumoniae, and Chlamydia psittaci.

    \ ' '1537' 'IPR003506' '\

    Three cysteine-rich proteins (also believed to be lipoproteins) make up the\ extracellular matrix of the Chlamydial outer membrane PUBMED:2287277. They are involved in the essential structural integrity of both the elementary body (EB) and reticulate body (RB) phase. As these bacteria lack the peptidoglycan layer common to most Gram-negative microbes, such proteins are highly important \ in the pathogenicity of the organism.

    \

    The largest of these is the major outer membrane protein (MOMP), and \ constitutes around 60% of the total protein for the membrane PUBMED:8477811. OMP6 is the second largest, with a molecular mass of 58kDa, while the OMP3 protein is ~15kDa PUBMED:2287277. MOMP is believed to elicit the strongest immune response, and has recently been linked to heart disease through its sequence similarity to a murine heart-muscle specific alpha myosin PUBMED:10037605.

    \

    The OMP6 family plays a structural role in the outer membrane during \ the EB stage of the Chlamydial cell, and different biovars show a small, yet \ highly significant, change at peptide charge level PUBMED:2287277. Members of this family are found in Chlamydia trachomatis, Chlamydia pneumoniae and Chlamydia psittaci.

    \ ' '1538' 'IPR000604' '\ The major outer membrane protein of Chlamydia contains four symmetrically spaced variable domains (VDs I\ to IV). This protein maintains the structural rigidity of the outer membrane and facilitates porin formation,\ permitting diffusion of solutes through the intracellular reticulate body membrane. It is believed to play a role\ in pathogenesis and possibly adhesion. Along with the lipopolysaccharide, the major outer membrane protein\ (MOMP) makes up the surface of the elementary body cell. Disulphide bond interactions within and between\ MOMP molecules and other components form high molecular weight oligomers. The MOMP is the protein used\ to determine the different serotypes.\ ' '1539' 'IPR001344' '\

    The light-harvesting complex (LHC) consists of chlorophylls A and B and the chlorophyll A-B binding protein. LHC functions as a light receptor that captures and delivers excitation energy to photosystems I and II with which it is closely associated. Under changing light conditions, the reversible phosphorylation of light harvesting chlorophyll a/b binding proteins (LHCII) represents a system for balancing the excitation energy between the two photosystems PUBMED:15225658.

    \

    The N-terminus of the chlorophyll A-B binding protein extends into the stroma where it is involved with adhesion of granal membranes and photo-regulated by reversible phosphorylation of its threonine residues PUBMED:10682866. Both these processes are believed to mediate the distribution of excitation energy between photosystems I and II.

    \

    This family also includes the photosystem II protein PsbS, which plays a role in energy-dependent quenching that increases thermal dissipation of excess absorbed light energy in the photosystem PUBMED:15033974.

    \ \ ' '1540' 'IPR004220' '\ 5-carboxymethyl-2-hydroxymuconate isomerase transforms 5-carboxymethyl-2-hydroxy-muconic acid into 5-oxo-pent-3-ene-1,2,5-tricarboxylic acid during the third step of the homoprotocatechuate catabolic pathway PUBMED:2194841. Homoprotocatechuate (HPC; 3,4-dihydroxyphenylacetate) is catabolized to Krebs cycle intermediates via extradiol (meta-) cleavage and the necessary enzymes are chromosomally encoded in a variety of bacteria PUBMED:8384293. 5-carboxymethyl-2-hydroxymuconate isomerase is probably a dimer of two identical subunits PUBMED:2194841. A comparison of the N-terminal half of the isomerase/decarboxylase sequence from the pathway (both encoded by the gene hpcE), with the second half showed significant similarity. This suggests that a duplication may have occurred to produce a bifunctional gene PUBMED:8223600.\ ' '1541' 'IPR007521' '\

    This domain is found N-terminal to choline/ethanolamine kinase regions () in some plant and fungal choline kinase enzymes (). This region is only found in some members of the choline kinase family, and is therefore unlikely to contribute to catalysis.

    \ ' '1542' 'IPR002573' '\

    Choline kinase, (ATP:choline phosphotransferase, ) belongs to the choline/ethanolamine kinase family.

    \ \

    Ethanolamine and choline are major membrane phospholipids, in the form of glycerophosphoethanolamine and glycerophosphocholine. Ethanolamine is also a component of the glycosylphosphatidylinositol (GPI) anchor, which is necessary for cell-surface protein attachment PUBMED:18489261. The de novo synthesis of these phospholipids begins with the creation of phosphoethanolamine and phosphocholine by ethanolamine and choline kinases in the first step of the CDP-ethanolamine pathway PUBMED:16861741, PUBMED:9506987. There are two putative choline/ethanolamine kinases (C/EKs) in the Trypanosoma brucei genome.

    \ \

    Ethanolamine kinase has no choline kinase activity PUBMED:18489261 and its activity is inhibited by ADP PUBMED:9506987. Inositol supplementation represses ethanolamine kinase, decreasing the incorporation of ethanolamine into the CDP-ethanolamine pathway and into phosphatidylethanolamine and phosphatidylcholine PUBMED:15201274.

    \ ' '1543' 'IPR007440' '\

    Chorismate--pyruvate lyase catalyses the first step in ubiquinone synthesis, the removal of pyruvate from chorismate, to yield 4-hydroxybenzoate in Escherichia coli and other Gram-negative bacteria PUBMED:1644758. The yeast Saccharomyces cerevisiae can synthesize ubiquinone from either chorismate or tyrosine PUBMED:11583838; however, this enzyme is found only in bacteria. Its activity does not require metal cofactors PUBMED:8012607.

    \ ' '1544' 'IPR002635' '\ This family consists of the chorion superfamily proteins classes A, B, CA, CB and high-cysteine HCB from silk, gypsy and polyphemus moths. The chorion proteins make up the moth egg shell, a complex extracellular structure PUBMED:3462711.\ ' '1545' 'IPR005649' '\ The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary PUBMED:1908228.\ ' '1546' 'IPR015890' '\ This entry represents the catalytic regions of the chorismate-binding enzymes anthranilate synthase, isochorismate synthase, aminodeoxychorismate synthase and para-aminobenzoate synthase.\ Anthranilate synthase catalyses the reaction:\ \ The enzyme is a tetramer comprising 2 I and 2 II components: this entry is restricted to component I, which \ catalyses the formation of anthranilate using ammonia rather than glutamine, while component II\ provides glutamine amidotransferase activity .\ ' '1547' 'IPR002701' '\

    Chorismate mutase catalyses the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine PUBMED:9642265, PUBMED:9497350. Prephenate dehydratase (PDT) catalyses the decarboxylation of prephenate into phenylpyruvate. In microorganisms PDT is involved in the terminal pathway of the biosynthesis of phenylalanine. In some bacteria, such as Escherichia coli, PDT is part of a bifunctional enzyme (P-protein) that also catalyzes the transformation of chorismate into prephenate (chorismate mutase) while in other bacteria it is a monofunctional enzyme. The sequence of monofunctional chorismate mutase aligns well with the N-terminal part of P-proteins PUBMED:9642265.

    \ ' '1548' 'IPR000453' '\ Chorismate synthase catalyzes the last of the \ seven steps in the\ shikimate pathway which is used in prokaryotes, fungi and plants for the\ biosynthesis of aromatic amino acids. It catalyzes the 1,4-trans elimination\ of the phosphate group from 5-enolpyruvylshikimate-3-phosphate (EPSP) to form\ chorismate which can then be used in phenylalanine, tyrosine or tryptophan\ biosynthesis. Chorismate synthase requires the presence of a reduced flavin\ mononucleotide (FMNH2 or FADH2) for its activity.\ Chorismate synthase from various sources shows a high degree of sequence\ conservation PUBMED:1718979,\ PUBMED:1837329. It is a protein of about 360 to 400 amino-acid residues.\ ' '1549' 'IPR003370' '\ This region is found in known and predicted chromate transporters PUBMED:2152903, PUBMED:2180932 in both bacteria and archaebacteria. These proteins reduce chromate accumulation and are essential for chromate resistance. They are composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP.\ ' '1550' 'IPR000953' '\ The CHROMO (CHRomatin Organization MOdifier) domain PUBMED:1982376, PUBMED:1708124, PUBMED:7667093, PUBMED:7501439 \ is a conserved region of around 60 amino acids, originally identified in Drosophila modifiers of variegation.\ These are proteins that alter the structure of chromatin to the condensed morphology of heterochromatin, \ a cytologically visible condition where gene expression is repressed. In one of these proteins, Polycomb, \ the chromo domain has been shown to be important for chromatin targeting. Proteins that contain a chromo \ domain appear to fall into 3 classes. The first class includes proteins having an N-terminal chromo domain \ followed by a region termed the chromo shadow domain PUBMED:7667093, eg. Drosophila and human heterochromatin \ protein Su(var)205 (HP1). The second class includes proteins with \ a single chromo domain, eg. Drosophila protein Polycomb (Pc); mammalian modifier 3; human Mi-2 autoantigen; \ and several yeast and Caenorhabditis elegans hypothetical proteins. In the third class paired tandem chromo domains are \ found, eg. in mammalian DNA-binding/helicase proteins CHD-1 to CHD-4 and yeast protein CHD1.\ ' '1551' 'IPR008251' '\

    Chromo shadow domain is distantly related to chromo domain. It is always found in association with a chromo domain.

    \

    The CHROMO (CHRomatin Organization MOdifier) domain PUBMED:1982376, PUBMED:1708124, PUBMED:7667093, PUBMED:7501439 \ is a conserved region of around 60 amino acids, originally identified in Drosophila modifiers of variegation.\ These are proteins that alter the structure of chromatin to the condensed morphology of heterochromatin, \ a cytologically visible condition where gene expression is repressed. In one of these proteins, Polycomb, \ the chromo domain has been shown to be important for chromatin targeting. Proteins that contain a chromo \ domain appear to fall into 3 classes. The first class includes proteins having an N-terminal chromo domain \ followed by a region termed the chromo shadow domain PUBMED:7667093, eg. Drosophila and human heterochromatin \ protein Su(var)205 (HP1); and mammalian modifier 1 and modifier 2. The second class includes proteins with \ a single chromo domain, eg. Drosophila protein Polycomb (Pc); mammalian modifier 3; human Mi-2 autoantigen; \ and several yeast and Caenorhabditis elegans hypothetical proteins. In the third class paired tandem chromo domains are \ found, eg. in mammalian DNA-binding/helicase proteins CHD-1 to CHD-4 and yeast protein CHD1.

    \ ' '1552' 'IPR000479' '\ The cation-independent mannose-6-phosphate receptor is a type I membrane protein responsible for transport of phosphorylated lysosomal enzymes from the Golgi complex and the cell surface to lysosomes. Lysosomal enzymes bearing phosphomannosyl residues bind specifically to mannose-6-phosphate receptors in the Golgi apparatus and the resulting receptor-ligand complex is transported to an acidic prelysosomal compartment where the low pH mediates the dissociation of the complex. This receptor also binds insulin-like growth factor II. It contains\ 15 copies of a repeat.\ ' '1553' 'IPR008136' '\ CinA is the first gene in the competence-inducible (cin) operon, and is thought to be specifically required at some stage in the process of transformation PUBMED:7538190. This is a C-terminal region of putative competence-damaged proteins from the cin operon.\ ' '1554' 'IPR007291' '\

    Gyroviruses are small circular single-stranded viruses. This family includes the VP1 protein from the chicken anemia virus, which is the viral capsid protein.

    \ ' '1555' 'IPR003383' '\ Circoviruses are small circular single-stranded viruses. This family is the capsid protein from viruses such as Porcine circovirus PUBMED:9573301 and Beak and feather disease virus. These proteins are about 220 amino acids long and of unknown function.\ ' '1558' 'IPR007576' '\ CITED, CBP/p300-interacting transactivator with ED-rich tail, is characterised by a conserved 32-amino acid sequence at the C terminus. CITED protein does not bind DNA directly and is thought to function as a transcriptional co-activator PUBMED:11744733.\ ' '1559' 'IPR006472' '\

    These sequences, from both Gram-positive and Gram-negative bacteria, represent the alpha subunit of the holoenzyme citrate lyase composed of alpha, beta, and acyl carrier protein subunits in a stoichiometric relationship of 6:6:6. Citrate lyase is an enzyme which converts citrate to oxaloacetate. In bacteria, this reaction is involved in citrate fermentation. The alpha subunit catalyzes the reaction Acetyl-CoA + citrate = acetate + (3S)-citryl-CoA. The protein from Lactococcus lactis subsp. lactis (Streptococcus lactis) has been experimentally characterised PUBMED:1115558.

    \ ' '1560' 'IPR002736' '\

    The citG gene is found in a gene cluster with citrate lyase\ subunits PUBMED:9457870. The CitG protein catalyzes the conversion of ATP and dephospho-CoA to adenine and\ 2\'-(5"-triphosphoribosyl)-3\'-dephospho-CoA, the predicted precursor of the citrate lyase prosthetic\ group PUBMED:11042274.

    \ ' '1561' 'IPR004680' '\

    Characterised proteins in this entry belong mostly to the divalent anion symporter family, which is found in bacteria, archaea and eukaryotes. Substrates shown to be transported by these proteins include citrate and phosphate PUBMED:8598055. This entry also contains the melanocyte-specific transporter protein P, mutation of which leads to albinism PUBMED:11310796. Another protein in this entry, SAC1, has been shown to regulate the sulphur deprivation response in Chlamydomonas by inducing cysteine biosynthesis, though its precise role in this induction is not known PUBMED:12481091.

    \ ' '1562' 'IPR002020' '\

    Citrate synthase is a member of a small family of enzymes that can directly form a carbon-carbon bond without the presence of metal ion cofactors. It catalyses the first reaction in the Krebs\' cycle, namely the conversion of oxaloacetate and acetyl-coenzyme A into citrate and coenzyme A. This reaction is important for energy generation and for carbon assimilation. The reaction proceeds via a non-covalently bound citryl-coenzyme A intermediate in a 2-step process (aldol-Claisen condensation followed by the hydrolysis of citryl-CoA).

    \

    Citrate synthase enzymes are found in two distinct structural types: type I enzymes (found in eukaryotes, Gram-positive bacteria and archaea) form homodimers and have shorter sequences than type II enzymes, which are found in Gram-negative bacteria and are hexameric in structure. In both types, the monomer is composed of two domains: a large alpha-helical domain consisting of two structural repeats, where the second repeat is interrupted by a small alpha-helical domain. The cleft between these domains forms the active site, where both citrate and acetyl-coenzyme A bind. The enzyme undergoes a conformational change upon binding of the oxaloacetate ligand, whereby the active site cleft closes over in order to form the acetyl-CoA binding site PUBMED:15147839. The energy required for domain closure comes from the interaction of the enzyme with the substrate. Type II enzymes possess an extra N-terminal beta-sheet domain, and some type II enzymes are allosterically inhibited by NADH PUBMED:17087502.

    \ \

    This entry represents types I and II citrate synthase enzymes, as well as the related enzymes 2-methylcitrate synthase and ATP citrate synthase. 2-methylcitrate synthase catalyses the conversion of oxaloacetate and propanoyl-CoA into (2R,3S)-2-hydroxybutane-1,2,3-tricarboxylate and coenzyme A. This enzyme is induced during bacterial growth on propionate, while type II hexameric citrate synthase is constitutive PUBMED:9579066. ATP citrate synthase (also known as ATP citrate lyase) catalyses the MgATP-dependent, CoA-dependent cleavage of citrate into oxaloacetate and acetyl-CoA, a key step in the reductive tricarboxylic acid pathway of CO2 assimilation used by a variety of autotrophic bacteria and archaea to fix carbon dioxide PUBMED:16952946. ATP citrate synthase is composed of two distinct subunits. In eukaryotes, ATP citrate synthase is a homotetramer of a single large polypeptide, and is used to produce cytosolic acetyl-CoA from mitochondrially produced citrate PUBMED:16007201.

    \ ' '1563' 'IPR005551' '\

    Members of this protein family are annotated as CitX and contain the CitX domain, which is also found in the CitXG bifunctional protein of the citrate lyase system. CitX transfers the prosthetic group 2\'-(5\'\'-triphosphoribosyl)-3\'-dephospho-CoA to the citrate lyase gamma chain, an acyl carrier protein. This enzyme may be designated holo-ACP synthase, holo-citrate lyase synthase, or apo-citrate lyase phosphoribosyl-dephospho-CoA transferase. In a few genera, including Haemophilus, this protein occurs as a fusion protein with CitG, an enzyme involved in prosthetic group biosynthesis. This CitX family is easily separated from the holo-ACP synthases of other enzyme systems.

    \ ' '1564' 'IPR000704' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \

    Casein kinase, a ubiquitous, well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits PUBMED:2666134, PUBMED:1856204. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha\'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta\') PUBMED:7737972. The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif PUBMED:7737972. The beta subunit is a highly conserved protein of about 25kDa that contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc PUBMED:8027080. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase PUBMED:7737972.

    \ ' '1566' 'IPR005553' '\

    Clag (cytoadherence linked asexual gene) is a malaria surface protein which has been shown to be involved in the binding of Plasmodium falciparum infected erythrocytes to host endothelial cells, a process termed cytoadherence. The cytoadherence phenomenon is associated with the sequestration of infected erythrocytes in the blood vessels of the brain, a feature of cerebral malaria. Clag is a multi-gene family in P. falciparum with at least 9 members identified to date. Orthologous proteins in the rodent malaria species Plasmodium chabaudi suggest that the gene family is found in other malaria species and may play a more generic role in cytoadherence.

    \ ' '1567' 'IPR000804' '\

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer PUBMED:15261670.

    \

    Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors PUBMED:17449236. All AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). Each subunit has a specific function. Adaptin subunits recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal appendage domains. By contrast, GGAs are monomers composed of four domains, which have functions similar to AP subunits: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The GAE domain is similar to the AP gamma-adaptin ear domain, being responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis PUBMED:12858162.

    \

    While clathrin mediates endocytic protein transport from ER to Golgi, coatomers (COPI, COPII) primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins PUBMED:14690497. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes PUBMED:17041781. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta\', gamma, delta, epsilon and zeta subunits.

    \

    This entry represents the small sigma subunit of various adaptins from different AP clathrin adaptor complexes (including AP1, AP2, AP3 and AP4), and the zeta subunit of various coatomer (COP) adaptors. The small sigma subunit of AP proteins has been characterised in several species PUBMED:8157009, PUBMED:8373805, PUBMED:9002613. The sigma subunit plays a role in protein sorting in the late-Golgi/trans-Golgi network (TGN) and/or endosomes. The zeta subunit of coatomers (zeta-COP) is required for coatomer binding to Golgi membranes and for coat-vesicle assembly PUBMED:8276893, PUBMED:14729954.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin.

    \ ' '1568' 'IPR000547' '\

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport PUBMED:15261670. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors PUBMED:17449236, PUBMED:11598180.

    \

    Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion PUBMED:15752139, PUBMED:16806884. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase PUBMED:16734666. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process PUBMED:15261670. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins PUBMED:16699812.

    \

    This entry represents the 7-fold alpha-alpha-superhelical ARM-type repeat found at the C terminus of clathrin heavy chains and in VPS (vacuolar protein sorting-associated) proteins. In clathrin heavy chains, the C-terminal 7-fold ARM-type repeats interact to form the central hub of the triskelion. VPS proteins are required for vacuolar assembly and vacuolar trafficking, and contain one clathrin-type repeat PUBMED:14745133.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin.

    \ ' '1569' 'IPR000996' '\

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport PUBMED:15261670. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors PUBMED:17449236, PUBMED:11598180.

    \

    Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion PUBMED:15752139, PUBMED:16806884. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase PUBMED:16734666. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process PUBMED:15261670. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins PUBMED:16699812.

    \

    This entry represents clathrin light chains, which are more divergent in sequence than the heavy chains PUBMED:14617352. In higher eukaryotes, two genes encode distinct but related light chains, each of which can yield two separate forms via alternative splicing. In yeast there is a single light chain whose sequence is only distantly related to that of higher eukaryotes. Clathrin light chains have a conserved acidic N-terminal domain, a central coiled-coil domain and a conserved C-terminal domain.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin.

    \ ' '1570' 'IPR001473' '\

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport PUBMED:15261670. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors PUBMED:17449236, PUBMED:11598180.

    \

    Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion PUBMED:15752139, PUBMED:16806884. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase PUBMED:16734666. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process PUBMED:15261670. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins PUBMED:16699812.

    \

    This entry represents the N-terminal beta-propeller region of clathrin heavy chains, which extends away from the hub of triskelia and is responsible for peptide binding PUBMED:9827808.

    \ \

    More information about these proteins can be found at Protein of the Month: Clathrin.

    \ ' '1571' 'IPR003897' '\

    Clostridial species are one of the major causes of food \ poisoning/gastro-intestinal illnesses. They are Gram-positive, spore-forming rods that occur naturally in the soil PUBMED:8335373. Among the family are: Clostridium botulinum, which produces one of the most potent toxins in existence; Clostridium tetani, causative agent of tetanus; and Clostridium perfringens, commonly found in wound infections and diarrhoea cases. The use of toxins to damage the host is a method deployed by many bacterial pathogens.

    \

    The major virulence factor of C. perfringens is the CPE enterotoxin,\ which is secreted upon invasion of the host gut, and contributes to food \ poisoning and other gastrointestinal illnesses PUBMED:8335373. It has a molecular weight of 35.3kDa, and is responsible for the disintegration of tight \ junctions between endothelial cells in the gut PUBMED:9087440. This mechanism is mediated by host claudins-3 and -4, situated at the tight junctions.

    \

    Recently, two more host receptors have been characterised and expressed in \ vivo PUBMED:9334247. Named CPE-R and RVP1, these may be utilised in the passage of Clostridial species through the gut wall, although the regulatory mechanisms\ have not been elucidated.

    \ ' '1572' 'IPR003058' '\

    Bacteriocins are protein antibiotics that kill bacteria closely related to the producing species. Colicins are a subgroup of bacteriocins that are produced by and target Escherichia coli. The lethal action of most colicins is exerted either by formation of a pore in the cytoplasmic membrane of the target cell, or by an enzymatic nuclease digestion mechanism. Most colicins are able to translocate the outer membrane by a two-receptor system, where one receptor is used for the initial binding and the second for translocation. The initial binding is to cell surface receptors such as the porins OmpF, FepA, BtuB, Cir and FhuA; colicins have been classified according to which receptors they bind. The presence of specific periplasmic proteins, such as TolA, TolB, TolC, or TonB, is required for translocation across the membrane PUBMED:12423783. Cloacin DF13 is a bacteriocin that inactivates ribosomes by hydrolysing 16S RNA in 30S ribosomes at a specific site PUBMED:6344017.

    \

    Colicins are composed of domains with distinct functional roles. In general they contain a central R (receptor) domain that mediates receptor binding, an N-terminal T (translocation) domain that mediates translocation of the protein from the outer membrane receptor to the colicin\'s target within the cell, and a C-terminal C (catalytic) domain that performs the catalytic cleavage PUBMED:12409205.

    \ ' '1573' 'IPR003063' '\ The cloacin immunity protein complexes with cloacin in equimolar quantities\ and inhibits it by binding with high affinity to the cloacin C-terminal\ catalytic domain. The immunity protein is relatively small, containing 85\ amino acids.

    An extra ribosome binding site has been found to precede the immunity gene on the polycistronic Clo DF13 mRNA PUBMED:6253914, which perhaps accounts for the fact that, in cloacinogenic cells, more immunity protein than cloacin is synthesised PUBMED:6253914. Comparison of the complete amino acid sequence of the Clo DF13 immunity protein with that of the Col E3 and Col E6 immunity proteins reveals extensive similarities in primary structure, although Col E3 and Clo DF13 immunity proteins are exchangeable only to a low extent in vivo and in vitro PUBMED:6253914.

    \ ' '1574' 'IPR002679' '\ This family consists of coat proteins from closteroviruses, members of the Closteroviridae, which have a positive-strand ssRNA genome with no DNA stage during replication. The viral coat protein encapsulates and protects the viral genome. Both the large cp1 and smaller cp2 coat proteins originate from the same primary transcript PUBMED:2033386. Members of the genus Closterovirus include Beet yellows virus (BYV) (Sugar beet yellows virus) and Grapevine leafroll-associated virus 7.\ ' '1575' 'IPR001907' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to the MEROPS peptidase family S14 (ClpP endopeptidase family, clan SK). ClpP is an ATP-dependent protease that cleaves a number of proteins, such as casein and albumin PUBMED:2197275. It exists as a heterodimer of ATP-binding regulatory A and catalytic P subunits, both of which are required for effective levels of protease activity in the presence of\ ATP PUBMED:2197275, although the P subunit alone does possess some catalytic activity. This family of sequences represents the P subunit.

    \ \

    Proteases highly similar to ClpP have been found to be encoded in the genome\ of bacteria, metazoa, some viruses and in the chloroplast of plants. A number of the proteins in this family are classified as non-peptidase homologues as they have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity.\

    \ ' '1576' 'IPR000753' '\

    Clusterin is a vertebrate glycoprotein PUBMED:1585460, the exact function of which is not \ yet clear. Clusterin expression is complex, appearing as different forms in\ different cell compartments. One set of proteins is directed for secretion, and other clusterin species are expressed in the\ cytoplasm and nucleus. The secretory form of the clusterin protein (sCLU) is targeted to the ER by an initial\ leader peptide. This ~60-kDa pre-sCLU protein is further glycosylated and proteolytically cleaved into alpha- and beta-subunits, held together by disulphide bonds.\ External sCLU is an 80-kDa protein and may act as a molecular chaperone, scavenging denatured proteins outside cells following specific stress-induced injury such as heat shock. sCLU possesses nonspecific binding activity to hydrophobic domains of various proteins in vitro PUBMED:12551933.

    \

    A specific nuclear form of CLU (nCLU) acts as a pro-death signal, inhibiting cell growth and\ survival. The\ nCLU protein has two coiled-coil domains, one at its N terminus that is unable to bind Ku70, and a C-terminal coiled-coil domain that is uniquely able to associate\ with Ku70 and is minimally required for cell death.

    \

    Clusterin is synthesized as a precursor \ polypeptide of about 400 amino acids which is post-translationally cleaved to form two subunits \ of about 200 amino acids each. The two subunits are linked by five disulphide bonds to form an\ antiparallel ladder-like structure PUBMED:1491011. In each of the mature subunits the five \ cysteines that are involved in disulphide bonds are clustered in domains of about 30 amino acids \ located in the central part of the subunits.

    \ \

    This entry represents the clusterin precursor and related proteins.

    \ ' '1577' 'IPR004271' '\

    This family represents the matrix protein, M1, of Influenza C virus. The M1 protein is the product of a spliced mRNA. Small quantities of the unspliced mRNA are found in the cell additionally encoding the M2 protein (see ).

    \ ' '1578' 'IPR004267' '\

    This family represents the matrix protein, M2, of Influenza C virus. The M1 protein is the product of a spliced mRNA (see ). Small\ quantities of the unspliced mRNA are found in the cell additionally encoding the M2 protein.

    \ ' '1579' 'IPR003696' '\

    The putative O-carbamoyltransferases (O-Cases) encoded by the nodU genes of Rhizobium fredii and Bradyrhizobium japonicum are involved in the synthesis of nodulation factors PUBMED:7559434. The cmcH genes of Nocardia lactamdurans and Streptomyces clavuligerus encode a functional 3\'-hydroxymethylcephem O-carbamoyltransferase for cephamycin biosynthesis that shows significant similarity to the O-carbamoyltransferases PUBMED:7557411.

    \ ' '1580' 'IPR007072' '\ Members of this family are about 220 amino acids long. The CmcI protein is presumed to represent the cephalosporin-7--hydroxylase PUBMED:9696752. However this has not been experimentally verified.\ ' '1581' 'IPR003779' '\ The catechol and protocatechuate branches of the 3-oxoadipate pathway, which are\ important for the bacterial degradation of aromatic compounds, converge at the common intermediate 3-oxoadipate enol-lactone. \ Carboxymuconolactone decarboxylase (CMD) is involved in protocatechuate catabolism. In some\ bacteria a gene fusion event leads to expression of CMD with a hydrolase involved in the same pathway PUBMED:9495744.\ ' '1582' 'IPR003010' '\

    This family contains nitrilases that break carbon-nitrogen bonds and appear to be involved in the reduction of organic nitrogen compounds and ammonia production PUBMED:7987228. They all have distinct substrate specificity and include cyanide hydratases, aliphatic amidases, beta-alanine synthase, and a few other proteins with unknown molecular function. Sequence conservation over the entire length, as well as the similarity in the reactions catalyzed by the known enzymes in this family, points to a common catalytic mechanism. They have an invariant cysteine that is part of the catalytic site in nitrilases. Another highly conserved motif includes an invariant glutamic acid that might also be involved in catalysis PUBMED:7987228.

    \ ' '1584' 'IPR000151' '\

    Ciliary neurotrophic factor (CNTF) is a member of the gp130 family of cytokines. CNTF is a survival factor for various neuronal cell types and seems to prevent the degeneration of motor axons after axotomy, suggesting that it may have therapeutic potential for treating neurodegeneration and nerve injury. CNTF acts on oligodendrocytes by favoring their final maturation, and this effect is mediated through the 130 kDa glycoprotein receptor common to the CNTF family and transduced through the Janus kinase pathway. The functional receptor complex of CNTF is composed of the CNTF receptor alpha (CNTFR), gp130 and the leukemia inhibitory factor receptor (LIFR).

    \ \

    The structure of CNTF is a four helical bundle PUBMED:7796798. CNTF acts as a homodimer. Three regions on CNTF have been identified as binding sites for its receptors. The ligand-receptor interactions are mediated through the cytokine binding domains (CBDs) and/or the immunoglobulin-like domains of the receptors. However, in the case of CNTF, the precise nature of the protein-protein contacts in the signalling complex has not yet been resolved, but there is evidence that the membrane distal CBD (CBD1) of LIFR associates in vitro with soluble CNTFR in the absence of CNTF PUBMED:11943154, PUBMED:12417647.

    \ ' '1585' 'IPR005107' '\

    Proteins containing this domain form structural complexes with other known families, such as and . The carbon monoxide (CO) dehydrogenase of Oligotropha carboxidovorans is a heterotrimeric complex composed of an apoflavoprotein, a molybdoprotein, and an iron-sulphur protein. It can be dissociated with sodium dodecylsulphate PUBMED:10636886. CO dehydrogenase catalyzes the oxidation of CO according to the following equation PUBMED:11076018:

    Subunit S represents the iron-sulphur protein of CO dehydrogenase and is clearly divided into a C- and an N-terminal domain, each binding a [2Fe-2S] cluster PUBMED:10430865.

    \ ' '1586' 'IPR000187' '\

    Corticotropin-releasing factor (CRF), urotensin-I, urocortin and sauvagine\ form a family of related neuropeptides in vertebrates. The family can be\ grouped into 2 separate paralogous lineages, with urotensin-I, urocortin and\ sauvagine in one group and CRF forming the other group. Urocortin and\ sauvagine appear to represent orthologues of fish urotensin-I in mammals and\ amphibians, respectively. The peptides have a variety of physiological\ effects on stress and anxiety, vasoregulation, thermoregulation, growth and\ metabolism, metamorphosis and reproduction in various species, and are all\ released as preprohormones PUBMED:10375459.

    \ CRF PUBMED:2200028 is a hormone found mainly in the paraventricular nucleus of the mammalian hypothalamus that regulates the release of corticotropin (ACTH) from the pituitary gland. From here, CRF\ is transported to the anterior pituitary, stimulating adrenocorticotropic\ hormone (ACTH) release via CRF type 1 receptors, thereby activating the\ hypothalamo-pituitary-adrenocortical axis (HPA) and thus glucocorticoid\ release.

    \

    \ CRF is evolutionarily related to a number of other active peptides. Urocortin acts in vitro to stimulate the secretion of adrenocorticotropic hormone. Urotensin is found in the teleost caudal neurosecretory system and may play a role in osmoregulation and as a corticotropin-releasing factor. Urotensin-I is released\ from the urophysis of fish, and produces ACTH and subsequent cortisol \ release in vivo. The nonhormonal portion of the prohormone is thought to be\ the urotensin binding protein (urophysin). Sauvagine (), isolated from frog \ skin, has a potent hypotensive and diuretic effect.

    \ ' '1587' 'IPR003704' '\ Carbon monoxide dehydrogenase (Cdh) from Methanosarcina mazei (Methanosarcina frisia) Go1 is a Ni2+-, Fe2+-, and S2-containing alpha2beta2 heterotetramer PUBMED:8662887. The CO dehydrogenase enzyme complex from Methanosarcina thermophila contains a corrinoid/iron-sulphur enzyme composed of two subunits (delta and gamma) PUBMED:8550451. \ This family consists of carbon monoxide dehydrogenase I/II beta subunit and CO dehydrogenase (acetyl-CoA synthase\ epsilon subunit).\ ' '1589' 'IPR000275' '\

    Coagulogen is a gel-forming protein of hemolymph that hinders the spread of invaders by immobilising them PUBMED:3905780, PUBMED:6469947. The protein contains a single 175-residue polypeptide chain; this is cleaved after Arg-18 and Arg-46 by a clotting enzyme contained in the hemocyte and activated by a bacterial endotoxin (lipopolysaccharide). Cleavage releases two chains of coagulin, A and B, linked by two disulphide bonds, together with the peptide C PUBMED:3905780, PUBMED:6469947. Gel formation results from interlinking of coagulin molecules. Secondary structure prediction suggests the C peptide forms an alpha-helix, which is released during the proteolytic conversion of coagulogen to coagulin gel PUBMED:3905780. The beta-sheet structure and 16 half-cystines found in the molecule appear to yield a compact protein stable to acid and heat.

    \

    Mammalian blood coagulation is based on the proteolytically induced polymerisation of fibrinogens. Initially, fibrin monomers noncovalently interact with each other. The resulting homopolymers are further stabilised when the plasma transglutaminase (TGase) intermolecularly cross-links epsilon-(gamma-glutamyl)lysine bonds. In crustaceans, hemolymph coagulation depends on the TGase-mediated cross-linking of specific plasma-clotting proteins, but without the proteolytic cascade. In horseshoe crabs, the proteolytic coagulation cascade triggered by lipopolysaccharides and beta-1,3-glucans leads to the conversion of coagulogen into coagulin, resulting in noncovalent coagulin homopolymers through head-to-tail interaction. Horseshoe crab TGase, however, does not cross-link coagulins intermolecularly. Instead, coagulins have been found to be cross-linked on hemocyte cell surface proteins called proxins. This indicates that a cross-linking reaction at the final stage of hemolymph coagulation is an important component of the innate immune system of horseshoe crabs PUBMED:15170505.

    \ ' '1590' 'IPR006822' '\

    Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer PUBMED:15261670. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins PUBMED:14690497. For example, the coatomer COPI (coat protein complex I) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi PUBMED:11208122. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes PUBMED:17041781. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta\', gamma, delta, epsilon and zeta subunits.

    \

    This entry represents the epsilon subunit of the coatomer complex, which is involved in the regulation of intracellular protein trafficking between the endoplasmic reticulum and the Golgi complex PUBMED:10469566.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin PUBMED:.

    \ ' '1591' 'IPR006692' '\

    Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer PUBMED:15261670. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins PUBMED:14690497. For example, the coatomer COPI (coat protein complex I) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi PUBMED:11208122. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes PUBMED:17041781. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta\', gamma, delta, epsilon and zeta subunits.

    \

    This entry represents the WD-associated region found in the alpha, beta and beta\' subunits of coatomer. The alpha-subunit (RET1P) of the coatomer complex in Saccharomyces cerevisiae (Baker\'s yeast) participates in membrane transport between the endoplasmic reticulum and Golgi apparatus. The protein contains six WD-40 repeat motifs in its N-terminal region PUBMED:8647451.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin PUBMED:.

    \ ' '1592' 'IPR003724' '\

    ATP:cob(I)alamin (or ATP:corrinoid) adenosyltransferases () catalyse the conversion of cobalamin (vitamin B12) into its coenzyme form, adenosylcobalamin (coenzyme B12) PUBMED:15516577. Adenosylcobalamin (AdoCbl) is required for the activity of certain enzymes. AdoCbl contains an adenosyl moiety liganded to the cobalt ion of cobalamin via a covalent Co-C bond, and its synthesis is unique to certain prokaryotes. ATP:cob(I)alamin adenosyltransferases are classed into three groups: CobA-type PUBMED:16672609, EutT-type PUBMED:15317775 and PduO-type PUBMED:11160088. Each of the three enzyme types appears to be specialised for particular AdoCbl-dependent enzymes or for the de novo synthesis of AdoCbl. PduO and EutT are distantly related, sharing short conserved motifs, while CobA is evolutionarily unrelated and is an example of convergent evolution.

    \

    This entry represents the ATP:cob(I)alamin adenosyltransferases CobA (Salmonella typhimurium), CobO (Pseudomonas denitrificans), and BtuR (Escherichia coli). There is a high degree of sequence identity between these proteins PUBMED:7916712. CobA is responsible for attaching the adenosyl moiety from ATP to the cobalt ion of the corrin ring, necessary for the conversion of cobalamin to adenosylcobalamin PUBMED:16672609, PUBMED:15516577.

    \ ' '1593' 'IPR002157' '\

    Cobalamin (Cbl or vitamin B12) is only accessible through diet in mammals. Absorption, plasma transport and cellular uptake of Cbl in mammals involves three Cbl-transporting proteins, which are listed below in order of increasing Cbl-specificity:

    \ \

    The structure of transcobalamin (TC) reveals two domains: an N-terminal alpha(6)-alpha(6) barrel and a smaller C-terminal domain PUBMED:16537422. Many interactions between Cbl and its binding site in the interface of the two domains are conserved among the other Cbl transporters. Specificity for Cbl between the different transporters may reside in a beta-hairpin motif found in the smaller C-terminal domain PUBMED:17274763.

    \ \ ' '1594' 'IPR004485' '\

    Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adenosylcobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase PUBMED:12869542. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase PUBMED:17163662.

    \ \

    There are at least two distinct cobalamin biosynthetic pathways in bacteria PUBMED:11153269:

    \ \

    Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways) PUBMED:11215515. There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.

    \ \

    This entry represents the CbiB protein, which is involved in cobalamin biosynthesis and porphyrin biosynthesis. It converts cobyric acid to cobinamide by the addition of aminopropanol on the F carboxylic group. It is part of the cob operon PUBMED:14645280.

    \ ' '1595' 'IPR003805' '\

    Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adenosylcobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase PUBMED:12869542. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase PUBMED:17163662.

    \ \

    There are at least two distinct cobalamin biosynthetic pathways in bacteria PUBMED:11153269:

    \ \

    Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways) PUBMED:11215515. There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.

    \ \

    This entry represents the CobS protein, a cobalamin-5-phosphate synthase that catalyses the reactions:

    \ \

    The protein product from these catalyses is associated with a large complex of proteins and is induced by cobinamide. CobS is involved in part III of cobalamin biosynthesis, one of the late steps in adenosylcobalamin synthesis that, together with CobU, CobT, and CobC proteins, defines the nucleotide loop assembly pathway PUBMED:10518530, PUBMED:15133100.

    \ \ \ ' '1596' 'IPR003203' '\ This family is composed of a group of bifunctional cobalamin biosynthesis enzymes which display cobinamide kinase and cobinamide phosphate guanylyltransferase activity. The crystal structure of the enzyme reveals the molecule to be a trimer with a propeller-like shape PUBMED:9601028.\ ' '1597' 'IPR002108' '\

    The actin-depolymerising factor homology (ADF-H) domain is an ~150-amino acid motif that is present in three phylogenetically distinct classes of eukaryotic actin-binding proteins PUBMED:9693358, PUBMED:12207032, PUBMED:9047337:\

    \ \

    Although these proteins are biochemically distinct and play different roles in actin dynamics, they all appear to use the ADF-H domain for their interactions with actin.

    \ \

    The ADF-H domain consists of a six-stranded mixed beta-sheet in which the four central strands (beta2-beta5) are anti-parallel and the two edge strands (beta1 and beta6) run parallel with the neighbouring strands. The sheet is surrounded by two alpha-helices on each side PUBMED:9693358, PUBMED:12207032, PUBMED:15522287.

    \ ' '1598' 'IPR002102' '\

    Cohesin domains interact with a complementary domain, termed the dockerin domain (see ). The cohesin-dockerin interaction is the crucial interaction for complex formation in the cellulosome PUBMED:9083107.

    \ \ \

    The scaffoldin component of the cellulolytic bacterium Clostridium thermocellum is a non-hydrolytic protein which\ organises the hydrolytic enzymes in a large complex, called the cellulosome. Scaffoldin comprises a series of functional domains,\ amongst which is a single cellulose-binding domain and nine cohesin domains which are responsible for integrating the individual\ enzymatic subunits into the complex.

    \ ' '1599' 'IPR000885' '\ Collagens contain a large number of globular domains in between the\ regions of triple helical repeats .\ These domains are involved in binding diverse substrates.\ One of these domains is found at the C terminus of fibrillar collagens.\ The exact function of this domain is unknown.\ ' '1600' 'IPR000293' '\

    Colicins are plasmid-encoded polypeptide toxins produced by and active against Escherichia coli and closely related bacteria. Colicins are released into the environment to reduce competition from other bacterial strains. Colicins bind to outer membrane receptors, using them to translocate to the cytoplasm or cytoplasmic membrane, where they exert their cytotoxic effect, including depolarisation of the cytoplasmic membrane, DNase activity, RNase activity, or inhibition of murein synthesis.

    \

    Channel-forming colicins (colicins A, B, E1, Ia, Ib, and N) are transmembrane proteins that depolarize the cytoplasmic membrane, leading to dissipation of cellular energy PUBMED:14731273. These colicins contain at least three domains: an N-terminal translocation domain responsible for movement across the outer membrane and periplasmic space; a central domain responsible for receptor recognition; and a C-terminal cytotoxic domain responsible for channel formation in the cytoplasmic membrane PUBMED:15519318.

    \

    This entry represents the C-terminal cytotoxic domain, which has a globin-like fold with additional helices at either end.

    \ \ ' '1601' 'IPR005557' '\

    Colicin immunity proteins are plasmid-encoded proteins necessary for protecting the cell against colicins. Colicins are toxins released by bacteria during times of stress PUBMED:11590016.

    \ ' '1602' 'IPR000290' '\ This family includes bacterial colicin and pyocin immunity proteins PUBMED:8692833, PUBMED:8755730. These immunity proteins can bind specifically to the DNase-type colicins and pyocins and inhibit their bactericidal activity. The\ 1.8-angstrom crystal structure of the ImmE7 protein consists of four antiparallel alpha-helices PUBMED:8692833. Sequence similarities between colicins E2, A and E1 PUBMED:3936034 are less striking. The colicin\ E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or\ cloacin DF13 PUBMED:6253914 immunity proteins. Pyocin protects a cell that harbours the plasmid\ ColE2 encoding colicin E2 against colicin E2; it is thus essential both for autonomous\ replication and colicin E2 immunity PUBMED:3892228.\ ' '1603' 'IPR003825' '\ Colicin V production protein is required in Escherichia coli for colicin V production from plasmid pColV-K30 PUBMED:2542219. This protein is coded for in the purF operon.\ ' '1604' 'IPR017913' '\

    This entry represents the N-terminal domain of colipase proteins. Colipase PUBMED:1567900, PUBMED:3147715 is a small protein cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It also binds to the bile-salt covered triacylglycerol interface, thus allowing the enzyme to anchor itself to the water-lipid interface. Efficient absorption of dietary fats is dependent on the action of pancreatic triglyceride lipase. Colipase binds to the C-terminal, non-catalytic domain of lipase, thereby stabilising an active conformation and considerably increasing the overall hydrophobic binding site. Structural studies of the complex and of colipase alone have revealed the functionality of its architecture PUBMED:9240923, PUBMED:10570245.

    \ \

    Colipase is a small protein with five conserved disulphide bonds. Structural analogies have been recognised between a developmental protein (Dickkopf), the pancreatic lipase C-terminal domain, the N-terminal domains of lipoxygenases and the C-terminal domain of alpha-toxin. These non-catalytic domains in the latter enzymes are important for interaction with membrane. It has not been established if these domains are also involved in eventual protein cofactor binding as is the case for pancreatic lipase PUBMED:10570245.

    \ ' '1605' 'IPR001808' '\ Numerous bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. These proteins are very diverse, but for convenience may be grouped into subfamilies on the basis of sequence similarity. This family groups together a range of proteins, including anr, crp, clp, cysR, fixK, flp, fnr, fnrN, hlyX and ntcA PUBMED:14638413, PUBMED:10550204. Within this family, the HTH motif is situated towards the C-terminus.\ \ ' '1606' 'IPR017914' '\

    This entry represents the C-terminal domain of colipase proteins. Colipase PUBMED:1567900, PUBMED:3147715 is a small protein cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It also binds to the bile-salt covered triacylglycerol interface, thus allowing the enzyme to anchor itself to the water-lipid interface. Efficient absorption of dietary fats is dependent on the action of pancreatic triglyceride lipase. Colipase binds to the C-terminal, non-catalytic domain of lipase, thereby stabilising an active conformation and considerably increasing the overall hydrophobic binding site. Structural studies of the complex and of colipase alone have revealed the functionality of its architecture PUBMED:9240923, PUBMED:10570245.

    \ \

    Colipase is a small protein with five conserved disulphide bonds. Structural analogies have been recognised between a developmental protein (Dickkopf), the pancreatic lipase C-terminal domain, the N-terminal domains of lipoxygenases and the C-terminal domain of alpha-toxin. These non-catalytic domains in the latter enzymes are important for interaction with membrane. It has not been established if these domains are also involved in eventual protein cofactor binding as is the case for pancreatic lipase PUBMED:10570245.

    \ ' '1607' 'IPR004288' '\

    This family consists exclusively of streptococcal competence stimulating peptide precursors, which are generally up to 50 amino acid residues long. In all the members of this family, the leader sequence is cleaved after two conserved glycine residues; thus the leader sequence is of the double-glycine type PUBMED:9352904. Competence stimulating peptides (CSP) are small (less than 25 amino acid residues) cationic peptides. The N-terminal amino acid residue is negatively charged, either glutamate or aspartate. The C-terminal end is positively charged. The third residue is also positively charged: a highly conserved arginine PUBMED:9352904. Some COMC proteins and their precursors (not included in this family) do not fully follow the above description.

    \ \

    Functionally, CSP act as pheromones, stimulating competence for genetic transformation in streptococci. In streptococci, the (CSP-mediated) competence response requires exponential cell growth at a critical density, a relatively simple requirement when compared to the stationary-phase requirement of Haemophilus, or the late-logarithmic-phase requirement of Bacillus PUBMED:7479953. All bacteria induced to competence by a particular CSP are said to belong to the same pherotype, because each CSP is recognised by a specific receptor (the signalling domain of a histidine kinase ComD). Pherotypes are not necessarily species-specific. In addition, an organism may change pherotype. There are two possible mechanisms for pherotype switching: horizontal gene transfer, and accumulation of point mutations. The biological significance of pherotypes and pherotype switching has not been definitively determined. Pherotype switching occurs frequently enough in naturally competent streptococci to suggest that it may be an important contributor to genetic exchange between different bacterial species PUBMED:9352904.

    \ ' '1608' 'IPR003181' '\ The virus capsid is composed of 60 icosahedral units, each of which is composed of one copy of each of the two coat proteins. This family contains the large coat protein (LCP) PUBMED:1546463 of the Comoviridae viral family.\ ' '1609' 'IPR003182' '\ The virus capsid is composed of 60 icosahedral units, each of which is composed of one copy of each of the two coat proteins. This family contains the small coat protein (SCP) PUBMED:1546463 of the Comoviridae viral family.\ ' '1610' 'IPR002023' '\

    Among the many polypeptide subunits that make up complex I, there is one with a molecular weight of 24 kDa (in mammals), which is a component of the iron-sulphur (IP) fragment of the enzyme. It seems to bind a 2Fe-2S iron-sulphur cluster. The 24 kDa subunit is nuclear encoded, as a precursor form with a transit peptide in mammals and in Neurospora crassa.\ There is a highly conserved region located in the central section of this subunit that contains two conserved cysteines, that are probably involved in the binding of the 2Fe-2S centre. The 24 kDa subunit is highly similar to PUBMED:7690854, PUBMED:1445936:\

    \

    \

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as an NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '1611' 'IPR001135' '\

    This entry represents subunit D (NuoD) of NADH-quinone oxidoreductase (), and subunit H (NdhH) of NAD(P)H-quinone oxidoreductase (). NADH-quinone (Q) oxidoreductase is a large and complex redox proton pump, which utilises the free energy derived from oxidation of NADH with lipophilic electron/proton carrier Q to translocate protons across the membrane to generate an electrochemical proton gradient PUBMED:11695831. Subunit D (NuoD) is a 49kDa polypeptide that appears to be evolutionarily important in determining the physiological function of complex I/NDH-1 PUBMED:11695831.

    \ ' '1612' 'IPR011538' '\

    This entry represents the 51 kDa subunit from NADH:ubiquinone oxidoreductase PUBMED:2029890. Among the many polypeptide subunits that make up complex I, there is one with a molecular weight of 51 kDa (in mammals), which is the second largest subunit of complex I and is a component of the iron-sulphur (IP) fragment of the enzyme. It seems to bind NAD, FMN, and a 2Fe-2S cluster. The 51 kDa subunit and the bacterial hydrogenase alpha subunit share three regions of sequence similarity. The first most probably corresponds to the NAD-binding site, the second to the FMN-binding site, and the third, which contains three cysteines, to the iron-sulphur binding region.

    \

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '1613' 'IPR001242' '\ This domain is found in many multi-domain enzymes which synthesize peptide antibiotics. This domain catalyses a\ condensation reaction to form peptide bonds in non-ribosomal peptide biosynthesis. It is usually found to the carboxy\ side of a phosphopantetheine binding domain (pp-binding). It has been shown that mutations in the HHXXXDG motif\ abolish activity suggesting this is part of the active site PUBMED:9712910. \ ' '1614' 'IPR013092' '\

    The connexins are a family of integral membrane proteins that oligomerise to form intercellular channels that are clustered at gap junctions. These channels are specialised sites of cell-cell contact that allow the passage of ions, intracellular metabolites and messenger molecules (with molecular weight less than 1-2kDa) from the cytoplasm of one cell to its opposing neighbours. They are found in almost all vertebrate cell types, and somewhat similar proteins have been cloned from plant species. Invertebrates utilise a different family of molecules, innexins, that share a similar predicted secondary structure to the vertebrate connexins, but have no sequence identity to them PUBMED:9769729.

    \ \

    Vertebrate gap junction channels are thought to participate in diverse biological functions. For instance, in the heart they permit the rapid cell-cell transfer of action potentials, ensuring coordinated contraction of the cardiomyocytes. They are also responsible for neurotransmission at specialised \'electrical\' synapses. In non-excitable tissues, such as the liver, they may allow metabolic cooperation between cells. In the brain, glial cells are extensively-coupled by gap junctions; this allows waves of intracellular Ca2+ to propagate through nervous tissue, and may contribute to their ability to spatially-buffer local changes in extracellular K+ concentration PUBMED:7685944.

    \ \

    The connexin protein family is encoded by at least 13 genes in rodents, with many homologues cloned from other species. They show overlapping tissue expression patterns, most tissues expressing more than one connexin type. Their conductances, permeability to different molecules, phosphorylation and voltage-dependence of their gating, have been found to vary. Possible communication diversity is increased further by the fact that gap junctions may be formed by the association of different connexin isoforms from apposing cells. However, in vitro studies have shown that not all possible combinations of connexins produce active channels PUBMED:8811187, PUBMED:8608591.

    \ \

    Hydropathy analysis predicts that all cloned connexins share a common transmembrane (TM) topology. Each connexin is thought to contain 4 TM\ domains, with two extracellular and three cytoplasmic regions. This model\ has been validated for several of the family members by in vitro biochemical\ analysis. Both N- and C-termini are thought to face the cytoplasm, and the\ third TM domain has an amphipathic character, suggesting that it contributes\ to the lining of the formed-channel. Amino acid sequence identity between\ the isoforms is ~50-80%, with the TM domains being well conserved. Both\ extracellular loops contain characteristically conserved cysteine residues,\ which likely form intramolecular disulphide bonds. By contrast, the single\ putative intracellular loop (between TM domains 2 and 3) and the cytoplasmic\ C-terminus are highly variable among the family members.\ Six connexins are\ thought to associate to form a hemi-channel, or connexon. Two connexons then\ interact (likely via the extracellular loops of their connexins) to form the\ complete gap junction channel.

    \ \
     \
           NH2-***        ***        *************-COOH\
                 **     **   **      **\
                 **    **     **    **   Cytoplasmic\
              ---**----**-----**----**----------------\
                 **    **     **    **   Membrane\
                 **    **     **    **\
              ---**----**-----**----**----------------\
                 **    **     **    **   Extracellular\
                  **  **       **  **\
                    **           **\
    
    \ \

    Two sets of nomenclature have been used to identify the connexins. The\ first, and most commonly used, classifies the connexin molecules according\ to molecular weight, such as connexin43 (abbreviated to Cx43), indicating\ a connexin of molecular weight close to 43kDa. However, studies have\ revealed cases where clear functional homologues exist across species\ that have quite different molecular masses; therefore, an alternative\ nomenclature was proposed based on evolutionary considerations, which\ divides the family into two major subclasses, alpha and beta, each with a\ number of members PUBMED:1320430. Due to their ubiquity and overlapping tissue distributions, it has proved difficult to elucidate the functions of individual connexin isoforms. To circumvent this problem, particular connexin-encoding genes have been subjected to targeted-disruption in mice, and the phenotype of the resulting animals investigated. Around half the connexin isoforms have been investigated in this manner PUBMED:9861669. Further insight into the functional roles of connexins has come from the discovery that a number of human diseases are caused by mutations in connexin genes. For instance, mutations in Cx32 give rise to a form of inherited peripheral neuropathy called X-linked dominant Charcot-Marie-Tooth disease PUBMED:7570999. Similarly, mutations in Cx26 are responsible for both autosomal recessive and dominant forms of nonsyndromic deafness, a disorder characterised by hearing loss, with no apparent effects on other organ systems.

    \ \

    This domain is found in the N-terminal region of these proteins.

    \ ' '1615' 'IPR013124' '\

    The connexins are a family of integral membrane proteins that oligomerise to form intercellular channels that are clustered at gap junctions. These channels are specialised sites of cell-cell contact that allow the passage of ions, intracellular metabolites and messenger molecules (with molecular weight less than 1-2kDa) from the cytoplasm of one cell to its opposing neighbours. They are found in almost all vertebrate cell types, and somewhat similar proteins have been cloned from plant species. Invertebrates utilise a different family of molecules, innexins, that share a similar predicted secondary structure to the vertebrate connexins, but have no sequence identity to them PUBMED:9769729.

    \ \

    Vertebrate gap junction channels are thought to participate in diverse biological functions. For instance, in the heart they permit the rapid cell-cell transfer of action potentials, ensuring coordinated contraction of the cardiomyocytes. They are also responsible for neurotransmission at specialised \'electrical\' synapses. In non-excitable tissues, such as the liver, they may allow metabolic cooperation between cells. In the brain, glial cells are extensively-coupled by gap junctions; this allows waves of intracellular Ca2+ to propagate through nervous tissue, and may contribute to their ability to spatially-buffer local changes in extracellular K+ concentration PUBMED:7685944.

    \ \

    The connexin protein family is encoded by at least 13 genes in rodents, with many homologues cloned from other species. They show overlapping tissue expression patterns, most tissues expressing more than one connexin type. Their conductances, permeability to different molecules, phosphorylation and voltage-dependence of their gating, have been found to vary. Possible communication diversity is increased further by the fact that gap junctions may be formed by the association of different connexin isoforms from apposing cells. However, in vitro studies have shown that not all possible combinations of connexins produce active channels PUBMED:8811187, PUBMED:8608591.

    \ \

    Hydropathy analysis predicts that all cloned connexins share a common transmembrane (TM) topology. Each connexin is thought to contain 4 TM\ domains, with two extracellular and three cytoplasmic regions. This model\ has been validated for several of the family members by in vitro biochemical\ analysis. Both N- and C-termini are thought to face the cytoplasm, and the\ third TM domain has an amphipathic character, suggesting that it contributes\ to the lining of the formed-channel. Amino acid sequence identity between\ the isoforms is ~50-80%, with the TM domains being well conserved. Both\ extracellular loops contain characteristically conserved cysteine residues,\ which likely form intramolecular disulphide bonds. By contrast, the single\ putative intracellular loop (between TM domains 2 and 3) and the cytoplasmic\ C-terminus are highly variable among the family members.\ Six connexins are\ thought to associate to form a hemi-channel, or connexon. Two connexons then\ interact (likely via the extracellular loops of their connexins) to form the\ complete gap junction channel.

    \ \
     \
           NH2-***        ***        *************-COOH\
                 **     **   **      **\
                 **    **     **    **   Cytoplasmic\
              ---**----**-----**----**----------------\
                 **    **     **    **   Membrane\
                 **    **     **    **\
              ---**----**-----**----**----------------\
                 **    **     **    **   Extracellular\
                  **  **       **  **\
                    **           **\
    
    \ \

    Two sets of nomenclature have been used to identify the connexins. The\ first, and most commonly used, classifies the connexin molecules according\ to molecular weight, such as connexin43 (abbreviated to Cx43), indicating\ a connexin of molecular weight close to 43kDa. However, studies have\ revealed cases where clear functional homologues exist across species\ that have quite different molecular masses; therefore, an alternative\ nomenclature was proposed based on evolutionary considerations, which\ divides the family into two major subclasses, alpha and beta, each with a\ number of members PUBMED:1320430. Due to their ubiquity and overlapping tissue distributions, it has proved difficult to elucidate the functions of individual connexin isoforms. To circumvent this problem, particular connexin-encoding genes have been subjected to targeted-disruption in mice, and the phenotype of the resulting animals investigated. Around half the connexin isoforms have been investigated in this manner PUBMED:9861669. Further insight into the functional roles of connexins has come from the discovery that a number of human diseases are caused by mutations in connexin genes. For instance, mutations in Cx32 give rise to a form of inherited peripheral neuropathy called X-linked dominant Charcot-Marie-Tooth disease PUBMED:7570999. Similarly, mutations in Cx26 are responsible for both autosomal recessive and dominant forms of nonsyndromic deafness, a disorder characterised by hearing loss, with no apparent effects on other organ systems.

    \ \

    Gap junction alpha-1 protein (also called connexin43, or Cx43) is a connexin\ of 381 amino acid residues (human isoform) that is widely expressed in\ several organs and cell types, and is the principal gap junction protein of\ the heart. Characterisation of genetically-engineered mice that lack Cx43,\ and also of human patients that have spontaneously-occurring mutations in\ the gene encoding it (GJA1), suggests Cx43 is essential for the development\ of normal cardiac architecture and ventricular conduction. Mice lacking Cx43\ survive to term but die shortly after birth. They have cardiac malformations\ that lead to the obstruction of the pulmonary artery, resulting in neonatal\ cyanosis and subsequent death. This phenotype is reminiscent of some forms\ of stenosis of the pulmonary artery. Human subjects with visceroatrial\ heterotaxia (a heart disorder characterised by arterial defects) have been\ found to have point mutations in the Cx43-encoding gene, as a result of \ which a potential phosphorylation site within the C-terminus is disrupted. \ Consequently, although these mutant Cx43 molecules still form functional gap\ junction channels, their response to protein kinase activation is impaired.

    \ \

    This domain is found in the C-terminal region of these proteins.

    \ ' '1616' 'IPR002266' '\

    The connexins are a family of integral membrane proteins that oligomerise to form intercellular channels that are clustered at gap junctions. These channels are specialised sites of cell-cell contact that allow the passage of ions, intracellular metabolites and messenger molecules (with molecular weight less than 1-2kDa) from the cytoplasm of one cell to its opposing neighbours. They are found in almost all vertebrate cell types, and somewhat similar proteins have been cloned from plant species. Invertebrates utilise a different family of molecules, innexins, that share a similar predicted secondary structure to the vertebrate connexins, but have no sequence identity to them PUBMED:9769729.

    \ \

    Vertebrate gap junction channels are thought to participate in diverse biological functions. For instance, in the heart they permit the rapid cell-cell transfer of action potentials, ensuring coordinated contraction of the cardiomyocytes. They are also responsible for neurotransmission at specialised \'electrical\' synapses. In non-excitable tissues, such as the liver, they may allow metabolic cooperation between cells. In the brain, glial cells are extensively-coupled by gap junctions; this allows waves of intracellular Ca2+ to propagate through nervous tissue, and may contribute to their ability to spatially-buffer local changes in extracellular K+ concentration PUBMED:7685944.

    \ \

    The connexin protein family is encoded by at least 13 genes in rodents, with many homologues cloned from other species. They show overlapping tissue expression patterns, most tissues expressing more than one connexin type. Their conductances, permeability to different molecules, phosphorylation and voltage-dependence of their gating, have been found to vary. Possible communication diversity is increased further by the fact that gap junctions may be formed by the association of different connexin isoforms from apposing cells. However, in vitro studies have shown that not all possible combinations of connexins produce active channels PUBMED:8811187, PUBMED:8608591.

    \ \

    Hydropathy analysis predicts that all cloned connexins share a common transmembrane (TM) topology. Each connexin is thought to contain 4 TM\ domains, with two extracellular and three cytoplasmic regions. This model\ has been validated for several of the family members by in vitro biochemical\ analysis. Both N- and C-termini are thought to face the cytoplasm, and the\ third TM domain has an amphipathic character, suggesting that it contributes\ to the lining of the formed-channel. Amino acid sequence identity between\ the isoforms is ~50-80%, with the TM domains being well conserved. Both\ extracellular loops contain characteristically conserved cysteine residues,\ which likely form intramolecular disulphide bonds. By contrast, the single\ putative intracellular loop (between TM domains 2 and 3) and the cytoplasmic\ C-terminus are highly variable among the family members.\ Six connexins are\ thought to associate to form a hemi-channel, or connexon. Two connexons then\ interact (likely via the extracellular loops of their connexins) to form the\ complete gap junction channel.

    \ \
     \
           NH2-***        ***        *************-COOH\
                 **     **   **      **\
                 **    **     **    **   Cytoplasmic\
              ---**----**-----**----**----------------\
                 **    **     **    **   Membrane\
                 **    **     **    **\
              ---**----**-----**----**----------------\
                 **    **     **    **   Extracellular\
                  **  **       **  **\
                    **           **\
    
    \ \

    Two sets of nomenclature have been used to identify the connexins. The\ first, and most commonly used, classifies the connexin molecules according\ to molecular weight, such as connexin43 (abbreviated to Cx43), indicating\ a connexin of molecular weight close to 43kDa. However, studies have\ revealed cases where clear functional homologues exist across species\ that have quite different molecular masses; therefore, an alternative\ nomenclature was proposed based on evolutionary considerations, which\ divides the family into two major subclasses, alpha and beta, each with a\ number of members PUBMED:1320430. Due to their ubiquity and overlapping tissue distributions, it has proved difficult to elucidate the functions of individual connexin isoforms. To circumvent this problem, particular connexin-encoding genes have been subjected to targeted-disruption in mice, and the phenotype of the resulting animals investigated. Around half the connexin isoforms have been investigated in this manner PUBMED:9861669. Further insight into the functional roles of connexins has come from the discovery that a number of human diseases are caused by mutations in connexin genes. For instance, mutations in Cx32 give rise to a form of inherited peripheral neuropathy called X-linked dominant Charcot-Marie-Tooth disease PUBMED:7570999. Similarly, mutations in Cx26 are responsible for both autosomal recessive and dominant forms of nonsyndromic deafness, a disorder characterised by hearing loss, with no apparent effects on other organ systems.

    \ \

    Gap junction alpha-8 protein (also called connexin50, Cx50, or lens fibre\ protein MP70) is a connexin of ~431 amino acid residues. The chicken isoform\ is shorter (399 residues) and is hence known as Cx45.6. Cx50 and Cx46 are\ the two gap junction proteins normally found in lens fibre cells of the eye.\ Evidence from both genetically-engineered mice, and from the identification\ of mutations in the human Cx50-encoding gene, highlights the importance of\ this connexin in maintaining lens transparency. Deletion of Cx50 in mice\ produces a viable phenotype, but these animals start to develop cataracts\ (of the zonular pulverulent type) at about one week old. They also have\ abnormally small eyes and lenses. Similarly, mutations in the human gene\ encoding Cx50 have been associated with the occurrence of congenital\ cataracts. Affected individuals develop cataracts (with zonular pulverulent\ opacities), and analysis shows they have a single point mutation in the Cx50\ coding region, resulting in a non-conservative substitution in the second\ putative TM domain of a serine residue for a proline.

    \ ' '1617' 'IPR018383' '\

    This entry represents the UPF0324 family of uncharacterised, multi-pass membrane proteins.

    \ ' '1618' 'IPR016065' '\ This is a family of conserved hypothetical proteins, which includes a putative methylase.\ ' '1619' 'IPR007348' '\ CopC is a bacterial blue copper protein that binds 1 atom of copper per protein molecule. Along with CopA, CopC mediates copper resistance by sequestration of copper in the periplasm PUBMED:1924351.\ ' '1620' 'IPR000923' '\

    Blue (type 1) copper proteins are small proteins which bind a single copper atom and which are characterised by an intense electronic absorption band near 600 nm PUBMED:6698995, PUBMED:8433378. The best-known members of this class of proteins are the plant chloroplastic plastocyanins, which exchange electrons with cytochrome c6, and the distantly related bacterial azurins, which exchange electrons with cytochrome c551. This family of proteins also includes amicyanin from bacteria such as Methylobacterium extorquens or Paracoccus versutus (Thiobacillus versutus) that can grow on methylamine; auracyanins A and B from Chloroflexus aurantiacus PUBMED:1313011; blue copper protein from Alcaligenes faecalis; cupredoxin (CPC) from Cucumis sativus (Cucumber) peelings PUBMED:1468551; cusacyanin (basic blue protein; plantacyanin, CBP) from cucumber; halocyanin from Natronomonas pharaonis (Natronobacterium pharaonis) PUBMED:8195126, a membrane associated copper-binding protein; pseudoazurin from Pseudomonas; rusticyanin from Thiobacillus ferrooxidans PUBMED:1879547; stellacyanin from Rhus vernicifera (Japanese lacquer tree); umecyanin from the roots of Armoracia rusticana (Horseradish); and allergen Ra3 from ragweed. This pollen protein is evolutionarily related to the above proteins, but seems to have lost the ability to bind copper. Although there is an appreciable amount of divergence in the sequences of all these proteins, the copper ligand sites are conserved.

    \ ' '1621' 'IPR001083' '\

    Some fungal transcription factors contain an N-terminal domain, the copper fist,\ which seems to be involved in copper-dependent DNA-binding PUBMED:8262047, PUBMED:8509391.\ These proteins activate the transcription of the metallothionein gene in response to\ copper. Metallothionein maintains copper levels in yeast \ PUBMED:3052856, PUBMED:8262047. \ The copper fist domain, which is similar in structure to metallothionein itself, undergoes\ a large conformational change on copper-binding that allows DNA-binding. The domain contains a conserved array of zinc-binding residues (Cys-X2-Cys-X8-Cys-X-His) and forms a three-stranded antiparallel beta-sheet with two short helical segments that project from one end of the beta-sheet PUBMED:9665167. Conserved residues form a basic patch that may be important for DNA binding.\

    \ ' '1622' 'IPR001260' '\ Coprogen oxidase (i.e. coproporphyrin III oxidase or coproporphyrinogenase) catalyses \ the oxidative decarboxylation of coproporphyrinogen III to protoporphyrinogen IX in the \ haem and chlorophyll biosynthetic pathways PUBMED:8407975, PUBMED:8219054. The protein is a \ homodimer containing two internally bound iron atoms per molecule of native protein \ PUBMED:3516695. The enzyme is active in the presence of molecular oxygen, which acts\ as an electron acceptor. The enzyme is widely distributed, having been found in a variety of eukaryotic and \ prokaryotic sources.\ ' '1623' 'IPR004916' '\ Ubiquinone biosynthesis proteins, COQ7, are central metabolic regulatory proteins. They are members of a protein family characterised by two repeats of about 90 amino acids and two conserved motifs. One of these motifs, DXEXXH, may be part of an enzyme active site.\ ' '1624' 'IPR006888' '\ Cor1 is a component of the chromosome core in the meiotic prophase chromosomes PUBMED:7876343. Xlr is a lymphoid cell specific protein PUBMED:7821804. Xmr is abundantly transcribed in testis in a tissue-specific and developmentally regulated manner. The protein is located in the nuclei of spermatocytes, early in the prophase of the first meiotic division, and later becomes concentrated in the XY nuclear subregion where it is in particular associated with the axes of sex chromosomes PUBMED:8306953.\ ' '1625' 'IPR002523' '\ The CorA transport system is the primary Mg2+ influx system of Salmonella typhimurium and Escherichia coli PUBMED:9775386, PUBMED:9786860. CorA is virtually ubiquitous in the\ Bacteria and Archaea. There are also eukaryotic relatives of this protein.\ ' '1626' 'IPR003377' '\

    The Drosophila cornichon protein (gene: cni) PUBMED:7540118 is required in the germline for dorsal-ventral signalling. Dorsal-ventral pattern formation involves a reorganisation of the microtubule network correlated with the movement of the oocyte nucleus, and depends on the initial correct establishment of the anterior-posterior axis via a signal from the oocyte produced by cornichon and gurken and received by torpedo protein in the follicle cells. The biochemical function of the cornichon protein is currently not known. It is a protein of 144 residues that seems to contain three transmembrane regions.

    \ ' '1627' 'IPR006784' '\ This family represents the Coronavirus ORF3 protein, also known as the X2A protein.\ ' '1628' 'IPR004945' '\ The function of the Coronavirus 6B and 7B proteins is not known.\ ' '1629' 'IPR003449' '\ This is a family of proteins from Coronavirus, which may function in the formation of membrane-bound replication complexes or in viral assembly.\ ' '1630' 'IPR004876' '\ Members of this family are Coronavirus proteins that are located in the nucleocapsid. They have no known function.\ ' '1631' 'IPR002574' '\ This family consists of various Coronavirus matrix proteins which are\ transmembrane glycoproteins. The M protein or E1 glycoprotein is implicated in virus assembly PUBMED:6325918.\ The E1 viral membrane protein is required for formation of the viral \ envelope and is transported via the Golgi complex PUBMED:2305554.\ ' '1632' 'IPR006841' '\

    This is a family of Coronavirus nonstructural protein NS2. Phosphoamino acid analysis confirmed the phosphorylated nature of NS2 and identified serine and threonine as its phosphorylated amino acid residues PUBMED:1833877. It was also demonstrated that the ns2 gene product is not essential for Murine hepatitis virus replication in transformed murine cells PUBMED:2168966.

    \ ' '1633' 'IPR007878' '\

    Members have a phosphoesterase module (2H) PUBMED:12466548 and are predicted to be involved in RNA modification. The viral group of 2H phosphoesterases contains proteins from two unrelated virus types: the type C rotaviruses (VP3 protein, ), which are double-stranded multipartite RNA viruses, and the coronaviruses (NS2 protein, this group), which are positive-strand RNA viruses. Given that these viruses have vertebrate hosts, it is likely that the 2H phosphoesterase domain was acquired from the host by one of the virus groups, followed by rapid sequence divergence PUBMED:12466548. Subsequently, it may have been exchanged between the viral families. Although the direction of the exchange is not clear, it is possible that a double-stranded replicative form of a subgenomic RNA transcript of the coronavirus NS2 was stabilised by a rotavirus and incorporated into its multiple double-stranded RNA genome PUBMED:12466548. These proteins may be useful as novel drug targets because of their predicted RNA modification role.

    \ ' '1634' 'IPR004293' '\ Members of this family are non-structural proteins that are found in\ Transmissible gastroenteritis virus (TGEV) and Porcine respiratory coronavirus (PRCV) isolates. These proteins are found on the same mRNA as another product, designated ORF3a. While ORF3a/b has been implicated in TGEV and PRCV pathogenesis, its precise role remains unclear PUBMED:10948987, PUBMED:10365166.\ ' '1635' 'IPR005603' '\

    This non-structural protein does not appear to be essential for viral growth in tissue culture and its physiological role is unknown.

    \ ' '1636' 'IPR001218' '\

    This entry represents the Coronavirus nucleocapsid protein. Sequence comparison of the N genes of five strains of the coronavirus Murine hepatitis virus suggests a three-domain structure for the nucleocapsid protein PUBMED:2171216. There seems to be a specific interaction between the coronavirus Mouse hepatitis virus A59 nucleocapsid protein and the packaging signal PUBMED:9426448.

    \ ' '1637' 'IPR002551' '\ The type I glycoprotein S of Coronavirus, trimers of which constitute the typical viral spikes, is assembled into virions through noncovalent interactions with the M protein. The spike glycoprotein is translated\ as a large polypeptide that is subsequently cleaved to S1 and S2 PUBMED:2984314. Both chimeric S proteins appeared to cause cell fusion when expressed individually, suggesting that they were biologically fully active PUBMED:10627571. The spike is a type I membrane glycoprotein that possesses a conserved transmembrane anchor and an unusual cysteine-rich (cys) domain that bridges the putative junction of the anchor and the cytoplasmic tail PUBMED:10725213.\ ' '1638' 'IPR002552' '\

    The type I glycoprotein S of Coronavirus, trimers of which constitute the typical viral spikes, is assembled into virions through noncovalent interactions with the M protein. The spike glycoprotein is translated\ as a large polypeptide that is subsequently cleaved to S1 and S2 PUBMED:2984314. Both chimeric S proteins appeared to cause cell fusion when expressed individually, suggesting that they were biologically fully active PUBMED:10627571. The spike is a type I membrane glycoprotein that possesses a conserved transmembrane anchor and an unusual cysteine-rich (cys) domain that bridges the putative junction of the anchor and the cytoplasmic tail PUBMED:10725213.

    \ ' '1639' 'IPR000883' '\ Cytochrome c oxidase () is a key enzyme in aerobic metabolism. Proton pumping haem-copper oxidases represent the terminal, energy-transfer enzymes of respiratory chains in prokaryotes and eukaryotes. The CuB-haem a3 (or haem o) binuclear centre, associated with the largest subunit I of cytochrome c and ubiquinol oxidases (), is directly involved in the coupling between dioxygen reduction and proton pumping PUBMED:8083153, PUBMED:8049679.\ Some terminal oxidases generate a transmembrane proton gradient across the plasma membrane (prokaryotes) or the mitochondrial inner membrane (eukaryotes).

    The enzyme complex consists of 3-4 subunits (prokaryotes) to up to 13 polypeptides (mammals), of which only the catalytic subunit (equivalent to mammalian subunit I (CO I)) is found in all haem-copper respiratory oxidases. The presence of a bimetallic centre (formed by a high-spin haem and copper B) as well as a low-spin haem, both ligated to six conserved histidine residues near the outer side of four transmembrane spans within CO I, is common to all family members PUBMED:8013452, PUBMED:6307356, PUBMED:2824194. In contrast to eukaryotes, the respiratory chain of prokaryotes is branched to multiple terminal oxidases. The enzyme complexes \ vary in haem and copper composition, substrate type and substrate affinity. The different respiratory oxidases allow the cells to customize their respiratory systems according to a variety of environmental growth conditions PUBMED:8083153.

    \ \

    It has been shown that eubacterial quinol oxidase was derived from cytochrome c oxidase in Gram-positive bacteria and that archaebacterial quinol oxidase has an independent origin. A considerable amount of evidence suggests that proteobacteria (Purple bacteria) acquired quinol oxidase through a lateral gene transfer from Gram-positive bacteria PUBMED:8083153.

    \ \

    Nitric oxide reductase (NOR) () exists in denitrifying species of archaea and eubacteria and is a heterodimer of cytochromes b and c. Phenazine methosulphate can act as acceptor. The PROSITE signature in this entry recognises the haem-copper site of the nitric oxidases.

    \ ' '1640' 'IPR007745' '\ Cox17p is essential for the assembly of functional cytochrome c oxidase (CCO) and for delivery of copper ions to the mitochondrion for insertion into the enzyme in Saccharomyces cerevisiae PUBMED:12370308.\ ' '1641' 'IPR000859' '\

    The CUB domain (for complement C1r/C1s, Uegf, Bmp1) is a structural motif of approximately 110 residues found almost exclusively in extracellular and plasma membrane-associated proteins, many of which are developmentally regulated PUBMED:8510165, PUBMED:2026272. These proteins are involved in a diverse range of functions, including complement activation, developmental patterning, tissue repair, axon guidance and angiogenesis, cell signalling, fertilisation, haemostasis, inflammation, neurotransmission, receptor-mediated endocytosis, and tumour suppression PUBMED:17335815, PUBMED:17051152. Many CUB-containing proteins are peptidases belonging to MEROPS peptidase families M12A (astacin) and S1A (chymotrypsin). Proteins containing a CUB domain include:

    \ \

    Several of the above proteins consist of a catalytic domain together with several CUB domains interspersed by calcium-binding EGF domains. Some CUB domains appear to be involved in oligomerisation and/or recognition of substrates and binding partners. For example, in the complement proteases, the CUB domains mediate dimerisation and binding to collagen-like regions of target proteins (e.g. C1q for C1r/C1s). The structure of CUB domains consists of a beta-sandwich with a jelly-roll fold. Almost all CUB domains contain four conserved cysteines that probably form two disulphide bridges (C1-C2, C3-C4). The CUB1 domains of C1s and Map19 have calcium-binding sites PUBMED:17446170.

    \ ' '1642' 'IPR002429' '\

    Cytochrome c oxidase () PUBMED:6307356, PUBMED:8083153 is an oligomeric enzymatic complex which is a component of the respiratory chain and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The number of polypeptides in the complex ranges from 3-4 (prokaryotes) up to 13 (mammals).

    \

    Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It contains two adjacent transmembrane regions in its N-terminus and the major part of the protein is exposed to the periplasmic or to the mitochondrial intermembrane space, respectively. CO II provides the substrate-binding site and contains a copper centre called Cu(A), probably the primary acceptor in cytochrome c oxidase. An exception is the corresponding subunit of the cbb3-type oxidase which lacks the copper A redox-centre. Several bacterial CO II have a C-terminal extension that contains a covalently bound haem c.

    \ ' '1643' 'IPR011759' '\

    Cytochrome c oxidase () PUBMED:6307356, PUBMED:8083153 is an oligomeric enzymatic complex which is a component of the respiratory chain and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The enzyme complex consists of 3-4 subunits (prokaryotes) to up to 13 polypeptides (mammals).

    \

    Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It contains two adjacent transmembrane regions in its N-terminus and the major part of the protein is exposed to the periplasmic or to the mitochondrial intermembrane space, respectively. CO II provides the substrate-binding site and contains a copper centre called Cu(A) (see ), probably the primary acceptor in cytochrome c oxidase. An exception is the corresponding subunit of the cbb3-type oxidase which lacks the copper A redox-centre. Several bacterial CO II have a C-terminal extension that contains a covalently bound haem c.

    \

    The N-terminal domain of cytochrome c oxidase contains two transmembrane alpha-helices.

    \ ' '1644' 'IPR000298' '\

    Cytochrome c oxidase () is the terminal enzyme of the respiratory chain of mitochondria and many aerobic bacteria. It catalyses the transfer of electrons from reduced cytochrome c to molecular oxygen:

    \ 4 cytochrome c (Fe2+) + 4 H+ + O2 = 4 cytochrome c (Fe3+) + 2 H2O \ \

    This reaction is coupled to the pumping of four additional protons across the mitochondrial or bacterial membrane PUBMED:10563795, PUBMED:16598262.

    \ \

    Cytochrome c oxidase is an oligomeric enzymatic complex that is located in the mitochondrial inner membrane of eukaryotes and in the plasma membrane of aerobic prokaryotes. The core structure of prokaryotic and eukaryotic cytochrome c oxidase contains three common subunits, I, II and III. In prokaryotes, subunits I and III can be fused and a fourth subunit is sometimes found, whereas in eukaryotes there are a variable number of additional small polypeptidic subunits PUBMED:8383670. The functional role of subunit III is not yet understood.

    \ \

    As the bacterial respiratory systems are branched, they have a number of distinct terminal oxidases, rather than the single cytochrome c oxidase present in the eukaryotic mitochondrial systems. Although the cytochrome o oxidases catalyse the oxidation of quinol (ubiquinol) rather than of cytochrome c, they belong to the same haem-copper oxidase superfamily as cytochrome c oxidases. Members of this family share sequence similarities in all three core subunits: subunit I is the most conserved subunit, whereas subunit II is the least conserved PUBMED:1316894, PUBMED:2162835, PUBMED:8083153.

    \ ' '1645' 'IPR003204' '\

    Cytochrome c oxidase () is an oligomeric enzymatic complex which is a component \ of the respiratory chain complex and is involved in the transfer of electrons from \ cytochrome c to oxygen PUBMED:6307356. \ In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in \ aerobic prokaryotes it is found in the plasma membrane.

    \

    In eukaryotes, in addition to the \ three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are \ a variable number of small polypeptidic subunits. One of these subunits is known as Va.

    \ ' '1646' 'IPR002124' '\

    Cytochrome c oxidase () is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen PUBMED:6307356. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.

    \

    In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptidic subunits. One of these subunits, which is known as Vb in mammals, V in Dictyostelium discoideum (Slime mold) and IV in yeast, binds a zinc atom. The sequence of subunit Vb is well conserved and includes three conserved cysteines that coordinate the zinc ion PUBMED:1661610, PUBMED:8638158. Two of these cysteines are clustered in the C-terminal section of the subunit.

    \ ' '1647' 'IPR001349' '\

    Cytochrome c oxidase () is an oligomeric enzymatic complex which is a component \ of the respiratory chain complex and is involved in the transfer of electrons from \ cytochrome c to oxygen PUBMED:6307356. \ In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in \ aerobic prokaryotes it is found in the plasma membrane.

    \

    In eukaryotes, in addition to the \ three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are \ a variable number of small polypeptidic subunits. One of these subunits is known as VIa \ in vertebrates and fungi. Mammals have two tissue-specific isoforms of VIa, a liver and a \ heart form. Only one form is found in fish PUBMED:9107314.

    \ ' '1648' 'IPR003213' '\

    Cytochrome c oxidase () is an oligomeric enzymatic complex that is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen PUBMED:6307356. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.

    \

    In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptide subunits. One of these subunits is the potentially haem-binding subunit, VIb, which is encoded in the nucleus PUBMED:11136449.

    \ ' '1649' 'IPR003177' '\

    Cytochrome c oxidase () is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen PUBMED:6307356. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.

    \

    In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptidic subunits. This family is composed of the heart and liver isoforms of cytochrome c oxidase subunit VIIa.

    \ ' '1650' 'IPR003205' '\

    Cytochrome c oxidase () is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen PUBMED:6307356. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.

    \

    In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptidic subunits. This family is composed of cytochrome c oxidase subunit VIII.

    \ ' '1651' 'IPR003823' '\ This entry represents an uncharacterised domain in proteins of unknown function. This domain is found associated with CBS domains in\ some proteins .\ ' '1652' 'IPR008213' '\ The phycobilisome linker polypeptide determines the state of aggregation and the location of the disc-shaped phycobiliprotein units within the phycobilisome and modulates their spectroscopic properties in order to mediate a directed and optimal energy transfer. The phycobilisome is a hemidiscoidal structure that is composed of two distinct substructures, a core complex (that contains the phycobiliproteins) and a number of rods radiating from the core. The N-terminal domain of the petH gene product from Anabaena sp. (strain PCC 7119) shows homology to the CpcD phycobilisome linker polypeptide PUBMED:8343609.\ ' '1653' 'IPR001476' '\

    The chaperonins are \'helper\' molecules required for correct folding and subsequent assembly of some proteins PUBMED:1349837. These are required for normal cell growth PUBMED:2897629, \ and are stress-induced, acting to stabilise or protect disassembled \ polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10) PUBMED:12354603.

    \

    The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of six to eight identical subunits, while the 60 kDa \ chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 \ stacked rings, each ring containing 7 identical subunits PUBMED:2897629. These ring \ structures assemble by self-stimulation in the presence of Mg2+-ATP. The \ central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding, whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner PUBMED:1350777. The binding of cpn10 to \ cpn60 inhibits the weak ATPase activity of cpn60.

    \

    Escherichia coli GroES has also been shown to bind ATP cooperatively, and \ with an affinity comparable to that of GroEL PUBMED:7901771. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical\ domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric\ contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer,\ cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the \'Anfinsen cage\',\ that provides an isolated environment for protein folding. The identical 10 kDa subunits of GroES form a dome-like heptameric oligomer in solution. ATP binding to GroES may\ be important in charging the seven subunits of the interacting GroEL ring\ with ATP, to facilitate cooperative ATP binding and hydrolysis for \ substrate protein release.

    \ ' '1654' 'IPR007870' '\

    COMPASS (Set1C, Complex Proteins Associated with Set1) is a nuclear complex that acts as a histone H3 (Lysine 4) methyltransferase that is required for telomeric silencing of gene expression in Saccharomyces cerevisiae (Baker\'s yeast) PUBMED:11805083. Components of the COMPASS (Set1C) complex include Set1(2), Bre2(2), Spp1(2), Sdc1(1), Shg1(1), Swd1(1), Swd2(1), and Swd3(1).

    \ \

    This entry includes S. cerevisiae Shg1, which is a SET domain-containing protein related to the human Trx protein ASH2 PUBMED:11687631.

    \ ' '1655' 'IPR005481' '\

    Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine () or ammonia (), and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates PUBMED:10387030, PUBMED:11212301. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate PUBMED:8916922. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase () (ACC), propionyl-CoA carboxylase () (PCCase), pyruvate carboxylase () (PC) and urea carboxylase ().

    \

    Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain PUBMED:10089390. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites PUBMED:12379099. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.

    \

    Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein PUBMED:7907330. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP PUBMED:17397987. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia PUBMED:17451989. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains PUBMED:.

    \ \ \

    This entry represents the N-terminal domain of the large subunit of carbamoyl phosphate synthase. This domain can also be found in certain other related proteins.

    \ ' '1656' 'IPR005479' '\

    Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine () or ammonia (), and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates PUBMED:10387030, PUBMED:11212301. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate PUBMED:8916922. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase () (ACC), propionyl-CoA carboxylase () (PCCase), pyruvate carboxylase () (PC) and urea carboxylase ().

    \

    Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain PUBMED:10089390. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites PUBMED:12379099. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.

    \

    Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein PUBMED:7907330. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP PUBMED:17397987. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia PUBMED:17451989. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains PUBMED:.

    \ \ \

    This entry represents the ATP-binding domain found in the large subunit of carbamoyl phosphate synthase, as well as in related proteins.

    \ ' '1657' 'IPR001251' '\ This entry defines the C-terminal of various retinaldehyde/retinal-binding proteins that may be\ functional components of the visual cycle. Cellular retinaldehyde-binding protein (CRALBP) carries 11-cis-retinol or 11-cis-retinaldehyde as endogenous ligands and may function as a substrate carrier protein that modulates interaction of these retinoids with visual cycle enzymes PUBMED:1715867. \ The multidomain protein Trio binds the LAR transmembrane tyrosine phosphatase, contains a protein kinase domain, and has separate rac-specific and rho-specific guanine nucleotide exchange factor domains PUBMED:8643598. Trio is a multifunctional protein that integrates and amplifies signals involved in coordinating actin remodeling, which is necessary for cell migration and growth.\

    Other members of the family are \ transfer proteins that include a guanine nucleotide exchange factor that may \ function as an effector of RAC1, a phosphatidylinositol/phosphatidylcholine transfer \ protein that is required for the transport of secretory proteins from the Golgi\ complex, and an alpha-tocopherol transfer protein that enhances the transfer of the \ ligand between separate membranes.

    \ ' '1658' 'IPR008273' '\ This entry defines the N-terminal of various retinaldehyde/retinal-binding proteins that may be\ functional components of the visual cycle. Cellular retinaldehyde-binding protein (CRALBP) carries 11-cis-retinol or 11-cis-retinaldehyde as endogenous ligands and may function as a substrate carrier protein that modulates interaction of these retinoids with visual cycle enzymes PUBMED:1715867. \ The multidomain protein Trio binds the LAR transmembrane tyrosine phosphatase, contains a protein kinase domain, and has separate rac-specific and rho-specific guanine nucleotide exchange factor domains PUBMED:8643598. Trio is a multifunctional protein that integrates and amplifies signals involved in coordinating actin remodeling, which is necessary for cell migration and growth.\

    Other members of the family are \ transfer proteins that include a guanine nucleotide exchange factor that may \ function as an effector of RAC1, a phosphatidylinositol/phosphatidylcholine transfer \ protein that is required for the transport of secretory proteins from the Golgi\ complex, and an alpha-tocopherol transfer protein that enhances the transfer of the \ ligand between separate membranes.

    \ ' '1659' 'IPR003691' '\

    Three genes, crcA, cspE and crcB, when present in high copy, confer camphor resistance on a cell and suppress mutations in the chromosomal partition gene mukB in Escherichia coli. The cspE gene has been previously identified as a cold shock-like protein with homologues in all organisms tested PUBMED:8844142.

    \ \

    Camphor and mukB mutations may interfere with chromosome condensation and high copy crcA, cspE and crcB have been implicated as promoting or protecting chromosome folding PUBMED:8844142.

    \ ' '1660' 'IPR000587' '\

    Creatinase or creatine amidinohydrolase () catalyses the conversion of creatine and water to sarcosine and urea. The enzyme works as a homodimer, and is induced by choline chloride. Each monomer of creatinase has two clearly defined domains, a small N-terminal domain, and a large C-terminal domain.

    \ \

    The structure of the C-terminal region represents the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase (), aminopeptidase P (), prolidase (), agropine synthase and creatinase (). Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme PUBMED:8146141, PUBMED:12136144, PUBMED:8471602.

    \ ' '1661' 'IPR001064' '\

    The crystallins are water-soluble structural proteins that occur in high concentration in the cytoplasm of eye lens fiber cells. Four major groups of crystallin have been distinguished on the basis of size, charge and immunological properties: alpha-, beta- and gamma-crystallins occur in all vertebrate classes (though gamma-crystallins are low or absent in avian lenses); and delta-crystallin is found exclusively in reptiles and birds PUBMED:2688200, PUBMED:7634077.

    \ \

    This entry represents the beta- and gamma-crystallins, which form a family of related proteins PUBMED:2107329, PUBMED:3064189. Structurally, beta- and gamma-crystallins are composed of two similar domains which, in turn, are each composed of two similar motifs, with the two domains connected by a short connecting peptide. Each motif, which is about forty amino acid residues long, is folded in a distinctive \'Greek key\' pattern.

    \ ' '1662' 'IPR003785' '\

    Creatininase () catalyses the hydrolysis of creatinine to creatine, which can then be metabolised to urea and sarcosine by creatinase (). Creatininase is a member of the urease-related amidohydrolase superfamily PUBMED:7670196. Creatininase from Pseudomonas putida has a core structure consisting of 3-layers, alpha/beta/alpha PUBMED:15003455.

    \ ' '1663' 'IPR005558' '\

    Arthropods express a family of neuropeptides PUBMED:8590372 which so far consists of the following types of neurohormones:\

    \ \ These neurohormones are peptides of 70 to 80 residues which are processed from larger size precursors. They contain six conserved cysteines that are involved in disulphide bonds, as shown in the following schematic representation.

    \

    Crustacean neurohormone H proteins are referred to as precursor-related peptides as they are typically co-transcribed and translated with the CHH neurohormone (). However, in some species this neuropeptide is synthesized as a separate protein. Furthermore, neurohormone H can undergo proteolysis to give rise to 5 different neuropeptides PUBMED:3298549.

    \ ' '1664' 'IPR001166' '\

    Arthropods express a family of neuropeptides PUBMED:8590372 which so far consists of the following types of neurohormones:\

    \ \ These neurohormones are peptides of 70 to 80 residues which are processed from larger size precursors. They contain six conserved cysteines that are involved in disulphide bonds, as shown in the following schematic representation.

    \ ' '1665' 'IPR003090' '\

    The crystallins are water-soluble structural proteins that occur in high concentration in the cytoplasm of eye lens fiber cells. Four major groups of crystallin have been distinguished on the basis of size, charge and immunological properties: alpha-, beta- and gamma-crystallins occur in all vertebrate classes (though gamma-crystallins are low or absent in avian lenses); and delta-crystallin is found exclusively in reptiles and birds PUBMED:2688200, PUBMED:7634077.

    \

    Alpha-crystallin occurs as large aggregates, comprising two types of related subunits (A and B) that are highly similar to the small (15-30kDa) heat shock proteins (HSPs), particularly in their C-terminal halves. The relationship between these families is one of classic gene duplication and divergence, from the small HSP family, allowing adaptation to novel functions. Divergence probably occurred prior to evolution of the eye lens, alpha-crystallin being found in small amounts in tissues outside the lens PUBMED:2688200.

    \ \

    Alpha-crystallin has chaperone-like properties including the ability to prevent the precipitation of denatured proteins and to increase cellular tolerance to stress PUBMED:15575808. It has been suggested that these functions are important for the maintenance of lens transparency and the prevention of cataracts. This is supported by the observation that alpha-crystallin mutations show an association with cataract formation.

    \

    This entry represents the N-terminal domain of alpha-crystallin. It is not necessary for dimerisation or chaperone activity, but appears to be required for the formation of higher order aggregates PUBMED:9650080, PUBMED:11278766.

    \ ' '1666' 'IPR005534' '\

    CsgG is an outer membrane-located lipoprotein that is highly resistant to protease digestion. During the assembly of curli, an adhesive surface fibre, CsgG is required to maintain the stability of CsgA and CsgB PUBMED:9383186.

    \ ' '1667' 'IPR003751' '\

    The RNA-binding protein CsrA (carbon storage regulator) is a new kind of global regulator, which facilitates specific mRNA decay PUBMED:9211896. CsrA is entirely contained within a globular complex of approximately 18 CsrA-H6 subunits and a single RNA, CsrB. CsrA binds to the CsrB RNA molecule to form the Csr regulatory system, which has a strong negative regulatory effect on glycogen biosynthesis, gluconeogenesis and glycogen catabolism and a positive regulatory effect on glycolysis PUBMED:9211896.

    \ ' '1668' 'IPR003706' '\ Escherichia coli induces the synthesis of at least 30 proteins at the onset of carbon starvation, two-thirds of which are positively regulated by the cyclic AMP (cAMP) and cAMP receptor protein (CRP) complex. \ This family consists of carbon starvation protein CstA, a predicted membrane protein. It has been suggested that\ CstA is involved in peptide utilization PUBMED:1848300.\ ' '1669' 'IPR000647' '\

    Nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF) PUBMED:2504497, PUBMED:2339052 (also known as TGGCA-binding proteins) are a family of vertebrate nuclear proteins which recognise and bind, as dimers, the palindromic DNA sequence 5\'-TGGCANNNTGCCA-3\'. CTF/NF-I binding sites are present in viral and cellular promoters and in the origin of DNA replication of Human adenovirus 2 (HAdV-2). The CTF/NF-I proteins were first identified as nuclear factor I, a collection of proteins that activate the replication of several Adenovirus serotypes (together with NF-II and NF-III) PUBMED:6216480. The family of proteins was also identified as the CTF transcription factors, before the NFI and CTF families were found to be identical PUBMED:3398920. The CTF/NF-I proteins are individually capable of activating transcription and DNA replication. In a given species, there are a large number of different CTF/NF-I proteins, generated both by alternative\ splicing and by the occurrence of four different genes. CTF/NF-I proteins contain 400 to 600 amino acids. The N-terminal 200 amino-acid sequence, almost perfectly conserved in all species and genes sequenced, mediates site-specific DNA recognition, protein dimerisation and Adenovirus DNA replication. The C-terminal\ 100 amino acids contain the transcriptional activation domain. This activation domain is the target of gene expression regulatory pathways elicited by growth factors and it interacts with basal transcription factors\ and with histone H3 PUBMED:8543151.

    \ \ ' '1670' 'IPR004820' '\

    This family includes PUBMED:10208837:

    \ \

    CTP:cholinephosphate cytidylyltransferase (CCT) is a key regulatory enzyme in phosphatidylcholine biosynthesis that catalyzes the formation of CDP-choline.\ A comparison of the catalytic domains of CCTs from a wide variety of organisms reveals a large number of completely conserved residues. There may be a role for the conserved HXGH sequence in catalysis. The membrane-binding domain in rat CCT has been defined, and it has been suggested that lipids may play a role in inactivating the enzyme. A phosphorylation domain has been described PUBMED:9370319.

    \ \ ' '1671' 'IPR003329' '\

    Synonym(s): CMP-N-acetylneuraminic acid synthetase

    \

    Acylneuraminate cytidylyltransferase () (CMP-NeuAc synthetase) catalyzes the reaction of CTP and NeuAc to form CMP-NeuAc, which is the nucleotide sugar donor used by sialyltransferases PUBMED:8663048. The outer membrane lipooligosaccharides of some microorganisms contain terminal sialic acid attached to N-acetyllactosamine and so this modification may be important in pathogenesis.

    \ ' '1672' 'IPR006893' '\

    This family of proteins contains p23 from the citrus tristeza virus, which is a member of the Closteroviridae. CTV produces more positive than negative RNA strands, and p23 controls this asymmetrical RNA accumulation. Amino acids 42-180 are essential for function and are thought to contain RNA-binding and zinc finger domains PUBMED:11752137.

    \ ' '1673' 'IPR015798' '\

    Amine oxidases (AO) are enzymes that catalyse the oxidation of a wide range of biogenic amines including many neurotransmitters, histamine and xenobiotic amines. There are two classes of amine oxidases: flavin-containing () and copper-containing (). Copper-containing AO act as a disulphide-linked homodimer. They catalyse the oxidation of primary amines to aldehydes, with the subsequent release of ammonia and hydrogen peroxide, which requires one copper ion per subunit and topaquinone as cofactor PUBMED:8591028: \

    \

    Copper-containing amine oxidases are found in bacteria, fungi, plants and animals. In prokaryotes, the enzyme enables various amine substrates to be used as sources of carbon and nitrogen PUBMED:9048544, PUBMED:9405045. In eukaryotes they have a broader range of functions, including cell differentiation and growth, wound healing, detoxification and cell signalling PUBMED:8805580.

    \

    The copper amine oxidases occur as mushroom-shaped homodimers of 70-95 kDa, each monomer containing a copper ion and a covalently bound redox cofactor, topaquinone (TPQ). TPQ is formed by post-translational modification of a conserved tyrosine residue. The copper ion is coordinated with three histidine residues and two water molecules in a distorted square pyramidal geometry, and has a dual function in catalysis and TPQ biogenesis. The catalytic domain is the largest of the 3-4 domains found in copper amine oxidases, and consists of a beta sandwich of 18 strands in two sheets. The active site is buried and requires a conformational change to allow the substrate access. The two N-terminal domains share a common structural fold, its core consisting of a five-stranded antiparallel beta sheet twisted around an alpha helix. The D1 domains from the two subunits comprise the stalk of the mushroom-shaped dimer, and interact with each other but do not pack tightly against each other PUBMED:8591028, PUBMED:10576737.

    \ \

    This entry represents the C-terminal catalytic domain of copper amine oxidases, and has a super-sandwich fold consisting of 18 beta-strands in 2 sheets PUBMED:8591028. A domain with a similar structural fold can be found as the third domain in lysyl oxidase PplO PUBMED:14690425.

    \ ' '1674' 'IPR015800' '\

    Amine oxidases (AO) are enzymes that catalyse the oxidation of a wide range of biogenic amines including many neurotransmitters, histamine and xenobiotic amines. There are two classes of amine oxidases: flavin-containing () and copper-containing (). Copper-containing AO act as a disulphide-linked homodimer. They catalyse the oxidation of primary amines to aldehydes, with the subsequent release of ammonia and hydrogen peroxide, which requires one copper ion per subunit and topaquinone as cofactor PUBMED:8591028: \

    \

    Copper-containing amine oxidases are found in bacteria, fungi, plants and animals. In prokaryotes, the enzyme enables various amine substrates to be used as sources of carbon and nitrogen PUBMED:9048544, PUBMED:9405045. In eukaryotes they have a broader range of functions, including cell differentiation and growth, wound healing, detoxification and cell signalling PUBMED:8805580.

    \

    The copper amine oxidases occur as mushroom-shaped homodimers of 70-95 kDa, each monomer containing a copper ion and a covalently bound redox cofactor, topaquinone (TPQ). TPQ is formed by post-translational modification of a conserved tyrosine residue. The copper ion is coordinated with three histidine residues and two water molecules in a distorted square pyramidal geometry, and has a dual function in catalysis and TPQ biogenesis. The catalytic domain is the largest of the 3-4 domains found in copper amine oxidases, and consists of a beta sandwich of 18 strands in two sheets. The active site is buried and requires a conformational change to allow the substrate access. The two N-terminal domains share a common structural fold, its core consisting of a five-stranded antiparallel beta sheet twisted around an alpha helix. The D1 domains from the two subunits comprise the stalk of the mushroom-shaped dimer, and interact with each other but do not pack tightly against each other PUBMED:8591028, PUBMED:10576737.

    \ \

    This entry represents one (N2) of the two N-terminal domains (N2/N3) that share a similar structure.

    \ ' '1675' 'IPR015802' '\

    Amine oxidases (AO) are enzymes that catalyse the oxidation of a wide range of biogenic amines including many neurotransmitters, histamine and xenobiotic amines. There are two classes of amine oxidases: flavin-containing () and copper-containing (). Copper-containing AO act as a disulphide-linked homodimer. They catalyse the oxidation of primary amines to aldehydes, with the subsequent release of ammonia and hydrogen peroxide, which requires one copper ion per subunit and topaquinone as cofactor PUBMED:8591028: \

    \

    Copper-containing amine oxidases are found in bacteria, fungi, plants and animals. In prokaryotes, the enzyme enables various amine substrates to be used as sources of carbon and nitrogen PUBMED:9048544, PUBMED:9405045. In eukaryotes they have a broader range of functions, including cell differentiation and growth, wound healing, detoxification and cell signalling PUBMED:8805580.

    \

    The copper amine oxidases occur as mushroom-shaped homodimers of 70-95 kDa, each monomer containing a copper ion and a covalently bound redox cofactor, topaquinone (TPQ). TPQ is formed by post-translational modification of a conserved tyrosine residue. The copper ion is coordinated with three histidine residues and two water molecules in a distorted square pyramidal geometry, and has a dual function in catalysis and TPQ biogenesis. The catalytic domain is the largest of the 3-4 domains found in copper amine oxidases, and consists of a beta sandwich of 18 strands in two sheets. The active site is buried and requires a conformational change to allow the substrate access. The two N-terminal domains share a common structural fold, its core consisting of a five-stranded antiparallel beta sheet twisted around an alpha helix. The D1 domains from the two subunits comprise the stalk of the mushroom-shaped dimer, and interact with each other but do not pack tightly against each other PUBMED:8591028, PUBMED:10576737.

    \ \

    This entry represents one (N3) of the two N-terminal domains (N2/N3) that share a similar structure.

    \ ' '1676' 'IPR003245' '\

    Blue (type 1) copper proteins are small proteins which bind a single copper atom and which are characterised by an intense electronic absorption band near 600 nm PUBMED:6698995, PUBMED:8433378. The best-known members of this class of proteins are the plant chloroplastic plastocyanins, which exchange electrons with cytochrome c6, and the distantly related bacterial azurins, which exchange electrons with cytochrome c551. This family of proteins also includes amicyanin from bacteria such as Methylobacterium extorquens or Paracoccus versutus (Thiobacillus versutus) that can grow on methylamine; auracyanins A and B from Chloroflexus aurantiacus PUBMED:1313011; blue copper protein from Alcaligenes faecalis; cupredoxin (CPC) from Cucumis sativus (Cucumber) peelings PUBMED:1468551; cusacyanin (basic blue protein; plantacyanin, CBP) from cucumber; halocyanin from Natronomonas pharaonis (Natronobacterium pharaonis) PUBMED:8195126, a membrane associated copper-binding protein; pseudoazurin from Pseudomonas; rusticyanin from Thiobacillus ferrooxidans PUBMED:1879547; stellacyanin from Rhus vernicifera (Japanese lacquer tree); umecyanin from the roots of Armoracia rusticana (Horseradish); and allergen Ra3 from ragweed. Although there is an appreciable amount of divergence in the sequences of all these proteins, the copper ligand sites are conserved. This domain is found in a variety of plant cyanins and pollen allergens.

    \ \

    Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.

    \

    The allergens in this family include allergens with the following designations: Amb a 3.
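The designation rule described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the binomial name Ambrosia artemisiifolia for ragweed is an assumption not stated in this entry, and the rule for disambiguating clashing designations with extra letters is not implemented.

```python
def allergen_designation(genus: str, species: str, number: int) -> str:
    """Build a designation such as "Amb a 3" from a binomial name and a number."""
    if len(genus) < 3 or not species:
        raise ValueError("genus needs at least three letters and species must be non-empty")
    # First three letters of the genus, a space, the first letter of the
    # species name, a space, then an arabic number.
    return f"{genus[:3].capitalize()} {species[0].lower()} {number}"


# Assumed binomial name for ragweed (not given in the entry text).
print(allergen_designation("Ambrosia", "artemisiifolia", 3))
```

With the assumed species name, the call reproduces the designation listed for this family.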

    \ ' '1677' 'IPR004946' '\

    This family of cucumovirus proteins may be long-distance movement proteins.

    \ ' '1678' 'IPR000247' '\ Cucumoviruses are tripartite RNA plant viruses believed to share a close\ evolutionary relationship with Brome mosaic virus (BMV). The cucumoviruses include: Cucumber mosaic virus (cucumber mosaic cucumovirus) PUBMED:2230731, \ Peanut stunt virus PUBMED:1926787 and Tomato aspermy virus (TAV) PUBMED:1990057. The viral coat proteins show a high degree of sequence\ similarity PUBMED:2230731.\ ' '1679' 'IPR003350' '\ A class, also called ONECUT, of homeodomain proteins. \ The CUT domain is a DNA-binding motif which can bind independently or in cooperation with the homeodomain (), often found downstream of the CUT domain. Proteins display two modes of DNA binding, which hinge on the homeodomain and on the linker that separates it from the cut domain, and two modes of transcriptional stimulation, which hinge on the homeodomain PUBMED:9593691.\ ' '1680' 'IPR001373' '\

    Cullins are a family of hydrophobic proteins that act as scaffolds for ubiquitin ligases (E3). Cullins are found throughout eukaryotes. Humans express seven cullins (Cul1, 2, 3, 4A, 4B, 5 and 7), each forming part of a multi-subunit ubiquitin complex. Cullin-RING ubiquitin ligases (CRLs), such as Cul1 (SCF) PUBMED:8681378, play an essential role in targeting proteins for ubiquitin-mediated destruction; as such, they are diverse in terms of composition and function, regulating many different processes from glucose sensing and DNA replication to limb patterning and circadian rhythms. The catalytic core of CRLs consists of a RING protein and a cullin family member. For Cul1, the C-terminal cullin-homology domain binds the RING protein. The RING protein appears to function as a docking site for ubiquitin-conjugating enzymes (E2s). Other proteins contain a cullin-homology domain, such as the APC2 subunit of the anaphase-promoting complex/cyclosome and the p53 cytoplasmic anchor PARC; both APC2 and PARC have ubiquitin ligase activity. The N-terminal region of cullins is more variable, and is used to interact with specific adaptor proteins PUBMED:15688063, PUBMED:11961546, PUBMED:15537541.

    \

    This entry represents the N-terminal region of cullin proteins, which consists of several domains, including a cullin repeat domain, a 4-helical bundle domain, an alpha+beta domain, and a winged helix-like domain.

    \ ' '1681' 'IPR004323' '\

    CutA1 is a widespread protein of about 12 kDa found in bacteria, plants, and animals, including humans PUBMED:12949080. The protein was originally identified in a gene locus of\ Escherichia coli called cutA involved in divalent metal tolerance PUBMED:7623666. The cutA locus consists of two operons, one containing a single gene encoding a cytoplasmic\ protein, CutA1, and the other composed of two genes encoding 50-kDa (CutA2) and 24-kDa (CutA3) inner membrane proteins. Molecular genetics studies on the\ E. coli cutA locus showed that some mutations lead to copper sensitivity due to its increased uptake PUBMED:9260936. However, the specific function of CutA1 in E. coli is still\ unknown.

    \

    However, a possible role has been proposed for mammalian CutA1 in the anchoring of the enzyme\ acetylcholinesterase (AChE) in neuronal cell membranes. CutA1 does not directly interact with AChE, but the CutA1 gene is widely expressed in different regions of the brain with an expression\ pattern that parallels that of AChE. In addition, CutA1 co-purified with AChE from human caudate nucleus. CutA1, thus, might provide an intriguing link between copper tolerance in bacteria and a\ complex process in the brain of the most evolved organisms.

    \

    Both rat and E. coli CutA1 have been crystallised PUBMED:12949080. Both\ proteins are trimeric in the crystals and in solution through an inter-subunit beta-sheet formation. Each monomer exhibits the same overall structure, adopting a ferredoxin-like fold made of an alpha-beta sandwich with antiparallel beta-sheet and containing an additional short strand and a C-terminal helix. In the beta-sheet, alternate strands are connected by helices with positive crossovers, resulting in a double beta-alpha-beta motif\ where the antiparallel beta-sheet packs against antiparallel alpha-helices. The C-terminal helix packs orthogonal to the N terminus.

    \ \

    \ The strong structural similarity of CutA1 with PII proteins might point to a role for CutA1 in signalling through allosteric communication between monomers. CutA1 may be involved in the tuning of a disulphide bond cascade in bacteria and mammals, acting as the PII proteins do in the nitrogen signal cascade in bacteria and plants.

    \ ' '1682' 'IPR005627' '\ Copper transport in Escherichia coli is mediated by the products of at least six genes, cutA, cutB, cutC, cutD, cutE, and cutF. A mutation in one or more of these genes results in an increased copper sensitivity. Members of this family are between 200 and 300 amino acids in length and are found in both eukaryotes and bacteria.\ ' '1683' 'IPR000675' '\

    Aerial plant organs are protected by a cuticle composed of an insoluble polymeric structural compound,\ cutin, which is a polyester composed of hydroxy and hydroxyepoxy fatty acids PUBMED:. Plant pathogenic\ fungi produce extracellular degradative enzymes PUBMED:1557023 that play an important role in pathogenesis.\ They include cutinase, which hydrolyses cutin, facilitating fungus penetration through the cuticle. Inhibition\ of the enzyme can prevent fungal infection through intact cuticles. Cutin monomers released from the cuticle\ by small amounts of cutinase on fungal spore surfaces can greatly increase the amount of cutinase secreted by\ the spore, the mechanism for which process is as yet unknown PUBMED:, PUBMED:1557023.

    \

    Cutinase is a serine esterase containing the classical Ser, His, Asp triad of serine hydrolases PUBMED:.\ The protein belongs to the alpha-beta class, with a central beta-sheet of 5 parallel strands covered by 5\ helices on either side of the sheet. The active site cleft is partly covered by 2 thin bridges formed by amino\ acid side chains, by contrast with the hydrophobic lid possessed by other lipases PUBMED:1560844. The protein \ also contains 2 disulphide bridges, which are essential for activity, their cleavage resulting in complete \ loss of enzymatic activity PUBMED:. Two cutinase-like proteins (MtCY39.35 and MtCY339.08c) have been \ found in the genome of the bacterium Mycobacterium tuberculosis.

    \ ' '1684' 'IPR002479' '\ This repeat is found in multiple tandem copies in several\ proteins. The repeat is 20 amino acid residues long. \ It has been suggested that these repeats in \ might be responsible for the specific recognition of\ choline-containing cell walls PUBMED:3422470. Similar repeats are found in the glucosyltransferases and glucan-binding protein of\ oral streptococci, dextransucrases of Leuconostoc mesenteroides as well as toxins of Clostridium difficile PUBMED:15576779.\ ' '1685' 'IPR007253' '\ This repeat is found in multiple tandem copies in proteins including amidase enhancers PUBMED:1356138 and adhesins PUBMED:11254569.\ ' '1686' 'IPR005172' '\

    This entry includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin PUBMED:10191092 and TSO1 PUBMED:10769245. This group of proteins is called a CXC domain in PUBMED:10769245.
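As a rough illustration, the dash-separated motif quoted above can be turned into a regular expression and used to scan a protein sequence. This is a minimal sketch, not the method used to build this entry. It assumes that X stands for any single residue and that Xn stands for exactly n residues; the function and variable names are hypothetical.

```python
import re

# Motif string as quoted in the entry text.
CXC_MOTIF = "C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C"


def motif_to_regex(motif: str) -> str:
    """Translate a dash-separated motif into a regular expression string."""
    parts = []
    for token in motif.split("-"):
        wildcard = re.fullmatch("X([0-9]*)", token)
        if wildcard:
            count = wildcard.group(1)
            # X -> any one residue, Xn -> exactly n residues (assumed reading).
            parts.append(f".{{{count}}}" if count else ".")
        else:
            # Literal residues such as C or YC are kept unchanged.
            parts.append(token)
    return "".join(parts)


def find_motif(sequence: str, motif: str = CXC_MOTIF) -> list:
    """Return the start offsets of non-overlapping motif matches in a sequence."""
    return [m.start() for m in re.finditer(motif_to_regex(motif), sequence)]


print(motif_to_regex(CXC_MOTIF))
```

Note that the wildcard used here matches any character, so the input should be validated as a protein sequence before scanning.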

    \ ' '1687' 'IPR004153' '\ This repeat contains the conserved pattern CXCXC where X can be any amino acid. The repeat is found in up to five copies in Vascular endothelial growth factor C PUBMED:8612600. In the salivary glands of the dipteran Chironomus tentans, a specific messenger ribonucleoprotein (mRNP) particle, the Balbiani ring (BR) granule, can be visualized during its assembly on the gene and during its nucleocytoplasmic transport. This repeat is found in over 70 copies in the Balbiani ring protein 3 (). It is also found in some silk proteins PUBMED:9089085.\ ' '1688' 'IPR003712' '\ Some bacteria can overcome the toxicity of environmental cyanate by hydrolysis of cyanate. This reaction is catalyzed by cyanate lyase (also known as cyanase) PUBMED:3049588. Cyanate lyase is found in bacteria and plants and catalyzes the reaction of cyanate with bicarbonate to produce ammonia and carbon dioxide. \

    The cyanate lyase monomer is composed of two domains. The N-terminal domain shows structural similarity to the DNA-binding alpha-helix bundle motif. The C-terminal domain has an \'open fold\' with no structural homology to other proteins. The dimer structure reveals the C-terminal domains to be intertwined, and the decamer is formed by a pentamer of these dimers. The active site of the enzyme is located between dimers and is comprised of residues from four adjacent subunits of the homodecamer PUBMED:10801492.

    \ ' '1689' 'IPR007325' '\ Proteins in this family are thought to be cyclase enzymes. They are found in proteins involved in antibiotic synthesis. However they are also found in organisms that do not make antibiotics pointing to a wider role for these proteins. The proteins contain a conserved motif HXGTHXDXPXH that is likely to form a part of the active site.\ ' '1690' 'IPR004367' '\

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles PUBMED:12910258, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    \

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi\'s sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus PUBMED:11056549.

    \ \

    This is the C-terminal domain of cyclins.

    \ ' '1691' 'IPR005535' '\

    Cyclotides (cyclo peptides) are plant peptides of ~30 amino acids with a head-to-tail cyclic backbone and six cysteine residues involved in three disulphide\ bonds. The cyclotides are extremely resistant to proteolysis and are remarkably stable. Cyclotides display a diverse range of biological activities, including uterotonic activity, inhibition of neurotensin binding, hemolytic, anti-HIV and anti-microbial activity. This range of biological activities makes cyclotides amenable to potential pharmaceutical and agricultural applications. Although their precise role in plants has not yet been reported, it appears that they are most likely present as defence molecules PUBMED:10600388, PUBMED:12482862, PUBMED:12482868, PUBMED:12946412.

    \ \

    The three-dimensional structure of cyclotides is compact and contains a number of beta-turns, three beta strands arranged in a distorted triple-stranded beta-sheet, a short helical segment, and a network of disulphide bonds which form a cystine knot. The cystine knot consists of an embedded ring in the structure, formed by two disulphide bonds and their connecting backbone segments, which is threaded by a third disulphide bond. Although the cystine knot motif is now well known in a wide variety of proteins, the cyclotides remain the only example in which a cystine knot is embedded within a circular protein backbone, a motif that is referred to as the cyclic cystine knot (CCK) PUBMED:10600388, PUBMED:12482862, PUBMED:12482868, PUBMED:12946412.

    \ \

    Cyclotides can be separated into two sub-families, one of which tends to contain a larger number of positively charged residues and has a bracelet-like circularisation of the backbone. The second subfamily contains a backbone twist due to a cis-Pro peptide bond and may conceptually be regarded as a molecular Moebius strip PUBMED:10600388, PUBMED:12482868. Bracelet and Moebius families of cyclotides possess a Knottin scaffold. The cyclotide family of proteins is abundant in plants from the Rubiaceae and Violaceae families and includes:

    \ \

    \ ' '1692' 'IPR000199' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin PUBMED:8164744, the type example for clan PA.

    \ \

    Picornaviral proteins are expressed as a single polyprotein\ which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.\

    \ ' '1693' 'IPR006208' '\

    This domain is found at the C-terminal of glycoprotein hormones and various extracellular proteins. It is believed to be involved in disulphide-linked dimerisation.

    \ ' '1694' 'IPR000277' '\

    Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination PUBMED:8690703, PUBMED:7748903, PUBMED:15189147. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors PUBMED:17109392. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy PUBMED:16763894.

    \

    PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic PUBMED:15581583.

    \ \ \

    A number of pyridoxal-dependent enzymes involved in the metabolism of cysteine, homocysteine and methionine have been shown PUBMED:1577698, PUBMED:8511966 to be evolutionarily related. These enzymes are proteins of about 400 amino-acid residues. The pyridoxal-P group is attached to a lysine residue located in the central section of these enzymes.

    \ ' '1695' 'IPR001893' '\

    This cysteine-rich repeat contains four cysteines. It is found in multiple copies in metazoan proteins and in single copies in some bacterial species, and is absent from fungi.

    \ \

    The Golgi apparatus protein 1 (GLG1,), which is located in Golgi cisterns of various cell types, can bind fibroblast growth factor and E-selectin. Sixteen cysteine-rich GLG1 repeats form the core of the protein and are located in the lumen. The C-terminal part of GLG1 is composed of a transmembrane region and a short cytoplasmic tail. The Cys-rich GLG1 repeat is a ~60 amino acid module that contains 4 Cys residues, which can form intrachain disulphide bridges PUBMED:1448090. Homologues of the vertebrate GLG1/Golgi sialoglycoprotein MG-160 (Mg160)/E-selectin ligand 1 (ESL1)/cysteine-rich fibroblast growth factor receptor 1 (CFR1)/latent transforming growth factor-beta complex protein 1 (LTCP-1) have been found in insects and the nematode Caenorhabditis elegans PUBMED:12029485.

    \ ' '1696' 'IPR000010' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    The cystatins are cysteine proteinase inhibitors belonging to MEROPS inhibitor family I25, clan IH PUBMED:2107324, PUBMED:14587292, PUBMED:1855589. They mainly inhibit peptidases belonging to peptidase families C1 (papain family) and C13 (legumain family). The cystatin family includes:

    \ \ \

    All true cystatins inhibit cysteine peptidases of the papain family (MEROPS peptidase family C1), and some also inhibit legumain family enzymes (MEROPS peptidase family C13). These peptidases play key roles in physiological processes, such as intracellular protein degradation (cathepsins B, H and L), are pivotal in the remodelling of bone (cathepsin K), and may be important in the control of antigen presentation (cathepsin S, mammalian legumain). Moreover, the activities of such peptidases are increased in pathophysiological conditions, such as cancer metastasis and inflammation. Additionally, such peptidases are essential for several pathogenic parasites and bacteria. Thus in animals cystatins not only have capacity to regulate normal body processes and perhaps cause disease when down-regulated, but in other organisms may also participate in defence against biotic and abiotic stress.

    \ ' '1697' 'IPR002541' '\ This entry consists of various proteins involved in cytochrome c\ assembly from mitochondria and bacteria; CycK from Rhizobium leguminosarum PUBMED:7665469, \ CcmC from Escherichia coli and Paracoccus denitrificans PUBMED:7635817, PUBMED:9043133\ and orf240 from Triticum aestivum (Wheat) mitochondria PUBMED:7529870. \ The members of this family are probably integral membrane proteins\ with six predicted transmembrane helices that may comprise the membrane component of an \ ABC (ATP binding cassette) transporter complex. This transporter may be necessary for transport of some component \ needed for cytochrome c assembly. \

    One member, R. leguminosarum CycK, contains a putative haem-binding motif PUBMED:7665469. Wheat \ orf240 also contains a putative haem-binding motif and is a proposed \ ABC transporter with c-type haem as its proposed substrate PUBMED:7529870.\ However it seems unlikely that all members of this family transport\ haem or c-type apocytochromes because P. denitrificans CcmC transports neither PUBMED:9043133.

    \ ' '1698' 'IPR011994' '\

    Cytidylate kinase () catalyses the phosphorylation of cytidine 5\'-monophosphate (CMP) to cytidine 5\'-diphosphate (CDP) in the presence of ATP or GTP.

    \ ' '1699' 'IPR000511' '\ Cytochrome c haem-lyase (CCHL) () and cytochrome c1 haem-lyase (CC1HL) PUBMED:1499554 are mitochondrial enzymes that catalyse the covalent attachment of a haem group on two cysteine residues of cytochrome c and c1. These two enzymes are functionally and evolutionarily related. There are two conserved regions: the first is located in the central section and the second in the C-terminal section. Both patterns contain conserved histidine, tryptophan and acidic residues which could be important for the interaction of the enzymes with the apoproteins and/or the haem group.\ ' '1700' 'IPR003317' '\ These proteins are cytochrome bd type terminal oxidases that catalyse quinol dependent, Na+ independent oxygen uptake PUBMED:8626304. Members of this family are integral membrane proteins and contain a protohaem IX centre B558. \

    Cytochrome bd may play an important role in microaerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae, where it is expressed under all conditions that permit diazotrophy PUBMED:9274021.

    \ ' '1701' 'IPR013081' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    Cytochrome b559, which forms part of the reaction centre core of PSII, is a heterodimer composed of one alpha subunit (PsbE), one beta subunit (PsbF), and a haem cofactor. Two histidine residues, one from each subunit, coordinate the haem. Although cytochrome b559 is a redox-active protein, it is unlikely to be involved in the primary electron transport in PSII due to its very slow photo-oxidation and photo-reduction kinetics. Instead, cytochrome b559 could participate in a secondary electron transport pathway that helps protect PSII from photo-damage. Cytochrome b559 is essential for PSII assembly PUBMED:12560096.

    \ \

    This domain occurs in both the alpha and beta subunits of cytochrome b559. In the alpha subunit it occurs together with a lumenal domain, while in the beta subunit it occurs on its own.

    \ ' '1702' 'IPR013082' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    The alpha subunit (PsbE) of cytochrome b559 forms a haem-binding heterodimer with the beta subunit (PsbF) within the reaction centre core of PSII. Both PsbE and PsbF are essential components for PSII assembly, and are probably involved in secondary electron transport mechanisms that help to protect PSII from photo-damage PUBMED:12560096.

    \

    This domain occurs in the lumenal region of the alpha subunit. It is usually found in conjunction with an N-terminal domain.

    \ ' '1703' 'IPR003321' '\ The enzyme cytochrome c nitrite reductase (c552) catalyses the six-electron reduction of nitrite to ammonia as one of the key steps in the biological nitrogen cycle, where it participates in the anaerobic energy metabolism of dissimilatory nitrate ammonification. Cytochrome c nitrite reductase from Sulfurospirillum deleyianum is a functional dimer, with 10 close-packed haem groups of type c and an unusual lysine-coordinated high-spin haem at the active site PUBMED:10440380.\ ' '1704' 'IPR004877' '\

    Cytochrome b561 is a secretory vesicle-specific electron transport protein. It is an integral membrane protein that binds two haem groups non-covalently. This entry represents the eukaryotic family. Members of the \'bacterial cytochrome b561\' family are covered by a separate entry.

    \ ' '1705' 'IPR005798' '\

    In the mitochondrion of eukaryotes and in aerobic prokaryotes, cytochrome b is a component of respiratory chain complex III, also known as the bc1 complex or ubiquinol-cytochrome c reductase. In plant chloroplasts and cyanobacteria, there is an analogous protein, cytochrome b6, a component of the plastoquinone-plastocyanin reductase, also known as the b6f complex.

    \

    Cytochrome b/b6 PUBMED:2509716, PUBMED:8329437 is an integral membrane protein of approximately 400 amino acid residues that probably has 8 transmembrane segments. In plants and cyanobacteria, cytochrome b6 consists of two subunits encoded by the petB and petD genes. The sequence of petB is colinear with the N-terminal part of mitochondrial cytochrome b, while petD corresponds to the C-terminal part.\ Cytochrome b/b6 non-covalently binds two haem groups, known as b562 and b566. Four conserved histidine residues are postulated to be the ligands of the iron atoms of these two haem groups.

    \

    Apart from regions around some of the histidine haem ligands, there are a few conserved regions in the sequence of b/b6. The best conserved of these regions includes an invariant P-E-W triplet which lies in the loop that separates the fifth and sixth transmembrane segments. It seems to be important for electron transfer at the ubiquinone redox site - called Qz or Qo (where o stands for outside) - located on the outer side of the membrane. This entry is the C-terminus of these proteins.

    \ ' '1706' 'IPR005797' '\

    In the mitochondrion of eukaryotes and in aerobic prokaryotes, cytochrome b is a component of respiratory chain complex III, also known as the bc1 complex or ubiquinol-cytochrome c reductase. In plant chloroplasts and cyanobacteria, there is an analogous protein, cytochrome b6, a component of the plastoquinone-plastocyanin reductase, also known as the b6f complex.

    \

    Cytochrome b/b6 PUBMED:2509716, PUBMED:8329437 is an integral membrane protein of approximately 400 amino acid residues that probably has 8 transmembrane segments. In plants and cyanobacteria, cytochrome b6 consists of two subunits encoded by the petB and petD genes. The sequence of petB is colinear with the N-terminal part of mitochondrial cytochrome b, while petD corresponds to the C-terminal part.\ Cytochrome b/b6 non-covalently binds two haem groups, known as b562 and b566. Four conserved histidine residues are postulated to be the ligands of the iron atoms of these two haem groups.

    \

    Apart from regions around some of the histidine haem ligands, there are a few conserved regions in the sequence of b/b6. The best conserved of these regions includes an invariant P-E-W triplet which lies in the loop that separates the fifth and sixth transmembrane segments. It seems to be important for electron transfer at the ubiquinone redox site - called Qz or Qo (where o stands for outside) - located on the outer side of the membrane. This entry is the N-terminus of these proteins.

    \ ' '1707' 'IPR003088' '\

    Cytochromes c (cytC) can be defined as electron-transfer proteins having \ one or several haem c groups, bound to the protein by one or, more \ generally, two thioether bonds involving sulphydryl groups of cysteine \ residues. The fifth haem iron ligand is always provided by a histidine \ residue. CytC possess a wide range of properties and function in a large \ number of different redox processes.

    \

    Ambler PUBMED:1646017 recognised four classes of cytC.

    \

    Class I includes the low-spin soluble cytC of mitochondria and bacteria, with the haem-attachment site towards the N-terminus, and the sixth ligand provided by a methionine residue about 40 residues further on towards the C-terminus. On the basis of sequence similarity, class I cytC were further subdivided into five classes, IA to IE. Class IB includes the eukaryotic mitochondrial cytC and prokaryotic \'short\' cyt c2 exemplified by Rhodopila globiformis cyt c2; class IA includes \'long\' cyt c2, such as Rhodospirillum rubrum cyt c2 and Aquaspirillum itersonii cyt c-550, which have several extra loops by comparison with class IB cytC.

    \ ' '1708' 'IPR002326' '\ Cytochrome bc1 complex (ubiquinol:ferricytochrome c oxidoreductase) is \ found in mitochondria, photosynthetic bacteria and other prokaryotes.\ It is minimally composed of three subunits: cytochrome b, carrying a low-\ and a high-potential haem group; cytochrome c1 (cyt c1); and a high-potential Rieske iron-sulphur protein. The general function of the complex \ is electron transfer between two mobile redox carriers, ubiquinol and \ cytochrome c; the electron transfer is coupled with proton translocation \ across the membrane, thus generating proton-motive force in the form of an\ electrochemical potential that can drive ATP synthesis. In its structure and\ functions, the cytochrome bc1 complex bears extensive analogy to the\ cytochrome b6f complex of chloroplasts and cyanobacteria; cyt c1 plays an\ analogous role to cytochrome f, in spite of their different structures PUBMED:7631417.\ ' '1709' 'IPR012127' '\

    Cytochromes c (cytC) can be defined as electron-transfer proteins having \ one or several haem c groups, bound to the protein by one or, more \ generally, two thioether bonds involving sulphydryl groups of cysteine \ residues. The fifth haem iron ligand is always provided by a histidine \ residue. CytC possess a wide range of properties and function in a large \ number of different redox processes. Ambler PUBMED:1646017 recognised four classes of cytC.

    \ \

    Class II includes the high-spin cytC\' and a number of low-spin cytochromes, e.g. cyt c-556. The haem-attachment site is close to the C-terminus. The cytC\' are capable of binding such ligands as CO, NO or CN(-), albeit with rate and equilibrium constants 100 to 1,000,000-fold smaller than other high-spin haemoproteins PUBMED:1646027. This, coupled with its relatively low redox potential, makes it unlikely that cytC\' is a terminal oxidase. Thus cytC\' probably functions as an electron transfer protein PUBMED:1646016.

    \

    The 3D structures of a number of cytC\' have been determined. The molecule \ usually exists as a dimer, each monomer folding as a four-alpha-helix bundle\ incorporating a covalently-bound haem group at the core PUBMED:1646016. The Chromatium vinosum cytC\' exhibits dimer dissociation upon ligand binding PUBMED:8230224.

    \ ' '1710' 'IPR002322' '\ Cytochromes c (cytC) can be defined as electron-transfer proteins having \ one or several haem c groups, bound to the protein by one or, more \ generally, two thioether bonds involving sulphydryl groups of cysteine \ residues. The fifth haem iron ligand is always provided by a histidine \ residue. CytC possess a wide range of properties and function in a large \ number of different redox processes. \

    Ambler PUBMED:1646017 recognised four classes of cytC.

    \

    Class III comprises the low \ redox potential multiple haem cytochromes: cyt C7 (trihaem), C3 (tetrahaem),\ and high-molecular-weight cytC, HMC (hexadecahaem), with only 30-40 \ residues per haem group. The haem c groups, all bis-histidinyl coordinated,\ are structurally and functionally nonequivalent and present different redox\ potentials in the range 0 to -400 mV PUBMED:7830606. \ The 3D structures of a number of cyt C3 proteins have been determined. The proteins\ consist of 4-5 alpha-helices and 2 beta-strands wrapped around a compact\ core of four non-parallel haems, which present a relatively high degree of \ exposure to the solvent. The overall protein architecture, haem plane \ orientations and iron-iron distances are highly conserved PUBMED:7830606.

    \ ' '1711' 'IPR005126' '\ Within the NapC/NirT family of cytochrome c proteins, some members, such as NapC and NirT, bind four haem groups, while others, such as TorC, bind five haems. This family aligns the common N-terminal region that contains four haem-binding C-X(2)-CH motifs.\ ' '1712' 'IPR002689' '\ Glycoprotein L from Cytomegalovirus serves as a chaperone for the correct folding and surface expression of glycoprotein H (gH) PUBMED:7964634. Glycoprotein L is a member of the heterotrimeric gCIII complex of glycoproteins, which also includes gH and gO and has an essential role in viral fusion PUBMED:10196283.\ ' '1713' 'IPR003143' '\ Cytochrome cd1 (nitrite reductase) catalyses the conversion of nitrite to nitric oxide in the nitrogen cycle. This family represents the d1 haem-binding domain of cytochrome cd1, in which His/Tyr side chains ligate the d1 haem iron of the active site in the oxidized state PUBMED:9311786.\ ' '1714' 'IPR003038' '\ Members of this family are thought to be integral membrane\ proteins. Some members of this family have been shown to\ cause apoptosis if mutated PUBMED:8413235; these proteins are known as\ DAD, for defender against death. The family also includes\ the epsilon subunit of the oligosaccharyltransferase that\ is involved in N-linked glycosylation PUBMED:7593165.\ ' '1715' 'IPR002219' '\

    Diacylglycerol (DAG) is an important second messenger. Phorbol esters (PE) are analogues of DAG and potent tumour promoters that cause a variety of physiological changes when administered to both cells and tissues. DAG activates a family of serine/threonine protein kinases, collectively known as protein kinase C (PKC) PUBMED:1396661. Phorbol esters can directly stimulate PKC. The N-terminal region of PKC, known as C1, has been shown PUBMED:2500657 to bind PE and DAG in a phospholipid- and zinc-dependent fashion. The C1 region contains one or two copies (depending on the isozyme of PKC) of a cysteine-rich domain, which is about 50 amino-acid residues long, and which is essential for DAG/PE-binding. The DAG/PE-binding domain binds two zinc ions; the ligands of these metal ions are probably the six cysteines and two histidines that are conserved in this domain.

    \ ' '1716' 'IPR007130' '\ The terminal step of triacylglycerol (TAG) formation is catalysed by the enzyme diacylglycerol acyltransferase (DAGAT) PUBMED:11751830, PUBMED:11751875.\ ' '1717' 'IPR000829' '\

    Diacylglycerol kinase (DAGK) is an enzyme that catalyses the formation of phosphatidic acid from diacylglycerol and ATP, an important step in phospholipid biosynthesis. In bacteria, DAGK is a very small (13 to 15 kDa) membrane protein which seems to contain three transmembrane domains PUBMED:8071224. The best conserved region is a stretch of 12 residues located in a cytoplasmic loop between the second and third transmembrane domains.

    \ ' '1718' 'IPR000756' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases PUBMED:15320712.

    \ \

    Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The DAG kinase domain is assumed to be an accessory domain. Upon cell stimulation, DAG kinase converts DAG into phosphatidate, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. It catalyses the reaction: ATP + 1,2-diacylglycerol = ADP +\ 1,2-diacylglycerol 3-phosphate. The enzyme is stimulated by calcium and phosphatidylserine and phosphorylated by protein kinase C. This domain is always associated with the diacylglycerol kinase catalytic domain.

    \ ' '1719' 'IPR001206' '\

    Diacylglycerol kinase (DGK) phosphorylates diacylglycerol (DAG) to yield phosphatidic acid. This enzyme initiates resynthesis of phosphoinositides consumed by phospholipase C during cellular signal transduction. Mammalian DGK consists of nine isozymes encoded by separate genes PUBMED:11983067. In addition to PKC-like zinc fingers and catalytic regions commonly conserved in all DGKs, these isozymes contain a variety of regulatory domains of known and/or predicted functions. The mammalian isozymes are named according to the order of their cDNA cloning and are subdivided into five groups based on their characteristic structural features. Each DGK isozyme is a critical downstream component of a DAG-dependent signalling system.

    \ \

    Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic role of this domain is inferred from the existence of bacterial homologues. YegS is the Escherichia coli protein in this family; its crystal structure shows an active site in the inter-domain cleft formed by four conserved sequence motifs, and reveals a novel metal-binding site. The residues of this site are conserved across the family PUBMED:2156169.

    \ \ \

    This domain is usually associated with an accessory domain.

    \ ' '1720' 'IPR006218' '\ Members of the 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthetase family catalyse the first step in aromatic amino acid biosynthesis via chorismate. Class I includes bacterial and yeast enzymes; class II includes higher plants and various microorganisms PUBMED:8760910. \

    The first step in the common pathway leading to the biosynthesis of aromatic compounds is the stereospecific condensation of phosphoenolpyruvate (PEP) and D-erythrose-4-phosphate (E4P) giving rise to 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP). This reaction is catalyzed by DAHP synthase, a metal-activated enzyme, which in microorganisms is the target for negative-feedback regulation by pathway intermediates or by end products. In Escherichia coli there are three DAHP synthetase isoforms, each specifically inhibited by one of the three aromatic amino acids. The crystal structure of the phenylalanine-regulated form of DAHP synthetase shows that the fold is a (beta/alpha)8 barrel with several additional beta strands and alpha helices PUBMED:10425687.

    \ ' '1721' 'IPR002480' '\

    Members of the 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthetase family catalyse the first step in aromatic amino acid biosynthesis via chorismate. Class I includes bacterial and yeast enzymes; class II includes higher plants and various microorganisms PUBMED:8760910.

    \

    The first step in the common pathway leading to the biosynthesis of aromatic compounds is the stereospecific condensation of phosphoenolpyruvate (PEP) and D-erythrose-4-phosphate (E4P) giving rise to 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP). This reaction is catalyzed by DAHP synthase, a metal-activated enzyme, which in microorganisms is the target for negative-feedback regulation by pathway intermediates or by end products.

    \ ' '1722' 'IPR004006' '\ Dihydroxyacetone kinase (glycerone kinase) catalyses the phosphorylation of glycerone in the presence of ATP to glycerone phosphate in the glycerol utilization pathway. This is the kinase domain of the dihydroxyacetone kinase family.\ ' '1723' 'IPR004007' '\ Dihydroxyacetone kinase (glycerone kinase) catalyses the phosphorylation of glycerone in the presence of ATP to glycerone phosphate in the glycerol utilization pathway. This is the predicted phosphatase domain of the dihydroxyacetone kinase family.\ ' '1724' 'IPR004133' '\ This domain contains 9 conserved cysteines and is extracellular. Therefore the cysteines may form disulphide bridges. This family of proteins has been termed the DAN family PUBMED:9660951 after the first member to be reported. This family includes DAN, Cerberus and Gremlin. The Gremlin protein is an antagonist of bone morphogenetic protein signalling. It is postulated that all members of this family antagonize different TGF-beta ligands PUBMED:9660951.\ ' '1725' 'IPR006076' '\ This entry includes various FAD-dependent oxidoreductases: glycerol-3-phosphate dehydrogenase, sarcosine oxidase beta subunit, D-alanine oxidase and D-aspartate oxidase. \

    D-amino acid oxidase (DAMOX or DAO) is an FAD flavoenzyme that catalyzes the oxidation \ of neutral and basic D-amino acids into their corresponding keto acids. DAOs have been characterised \ and sequenced in fungi and vertebrates where they are known to be located in the peroxisomes. D-aspartate \ oxidase (DASOX) PUBMED:1601857 is an enzyme, structurally related to DAO, which catalyzes \ the same reaction but is active only toward dicarboxylic D-amino acids. In DAO, a conserved histidine \ has been shown PUBMED:1673125 to be important for the enzyme\'s catalytic activity.

    \ ' '1726' 'IPR001653' '\

    Bacteria, plants and fungi metabolise aspartic acid to produce four amino acids - lysine, threonine, methionine and isoleucine - in a series of reactions known as the aspartate pathway. Additionally, several important metabolic intermediates are produced by these reactions, such as diaminopimelic acid, an essential component of bacterial cell wall biosynthesis, and dipicolinic acid, which is involved in sporulation in Gram-positive bacteria. Members of the animal kingdom do not possess this pathway and must therefore acquire these essential amino acids through their diet. Research into improving the metabolic flux through this pathway has the potential to increase the yield of the essential amino acids in important crops, thus improving their nutritional value. Additionally, since the enzymes are not present in animals, they are promising targets for the development of novel antibiotics and herbicides. For more information see PUBMED:11352712.

    \

    The lysine/diaminopimelic acid branch of the aspartate pathway produces the essential amino acid lysine via the intermediate meso-diaminopimelic acid (meso-DAP), which is also a vital cell wall component in Gram-negative bacteria PUBMED:10508663. The production of dihydropicolinate from aspartate-semialdehyde controls flux into the lysine/diaminopimelic acid pathway. Three variants of this pathway exist, differing in how tetrahydropicolinate (formed by reduction of dihydropicolinate) is metabolised to meso-DAP. One variant, the most commonly found one in archaea and bacteria, uses primarily succinyl intermediates, while a second variant, found only in Bacillus, utilises primarily acetyl intermediates. In the third variant, found in some Gram-positive bacteria, a dehydrogenase converts tetrahydropicolinate directly to meso-DAP. In all variants meso-DAP is subsequently converted to lysine by a decarboxylase, or, in Gram-negative bacteria, assimilated into the cell wall. Evidence exists that a fourth, currently unknown, variant of this pathway may function in plants PUBMED:15652176.

    \

    This entry represents diaminopimelate epimerase, which catalyses the isomerisation of L,L-diaminopimelate to meso-DAP in the biosynthetic pathway leading from aspartate to lysine. It is a member of the broader family of PLP-independent amino acid racemases. This enzyme is a monomeric protein of about 30 kDa consisting of two domains which are homologous in structure though they share little sequence similarity PUBMED:9843410. Each domain consists of mixed beta-sheets which fold into a barrel around the central helix. The active site cleft is formed from both domains and contains two conserved cysteines thought to function as the acid and base in the catalytic reaction PUBMED:14747737. Other PLP-independent racemases such as glutamate racemase have been shown to share a similar structure and mechanism of catalysis.

    \ ' '1727' 'IPR000846' '\

    Dihydrodipicolinate reductase catalyzes the second step in the biosynthesis of diaminopimelic acid and lysine, the NAD- or NADP-dependent reduction of 2,3-dihydrodipicolinate into 2,3,4,5-tetrahydrodipicolinate.

    \ \

    In Escherichia coli and Mycobacterium tuberculosis, dihydrodipicolinate reductase has equal specificity for NADH and NADPH, whereas in Thermotoga maritima it has a greater affinity for NADPH PUBMED:18250105. In addition, the enzyme is inhibited by high concentrations of its substrate, which consequently acts as a feedback control on the lysine biosynthesis pathway. In T. maritima, the enzyme also lacks N-terminal and C-terminal loops which are present in the enzymes of the former two organisms.

    \ ' '1728' 'IPR005012' '\

    Daxx is a ubiquitously expressed protein that functions, in part, as a transcriptional co-repressor through its interaction with a growing number\ of nuclear, DNA-associated proteins. Human Daxx contains four\ structural domains commonly found in transcriptional regulatory proteins: two predicted paired amphipathic helices, an acid-rich domain and a\ Ser/Pro/Thr (SPT)-rich domain. The post-translational modification status of the SPT-domain of hDaxx regulates its association with\ transcription factors such as Pax3 and ETS-1, effectively bringing hDaxx to sites of active transcription.\ Through its presence at the site of active transcription, hDaxx could then be able to associate with acetylated histones present in the nucleosomes and\ Dek that is associated with chromatin. Through its association with the SPT-domain of hDaxx, histone deacetylases may also\ be brought to the site of active transcription. As a consequence, nucleosomes in the vicinity of the site of active transcription will have the histone tails\ deacetylated, allowing the deacetylated tails to bind to DNA, thereby leading to an inactive chromatin structure and transcriptional repression PUBMED:12140263.

    \

    The Daxx protein (also known as the Fas-binding protein) is thought to play a role in apoptosis as a component of nuclear promyelocytic leukemia\ protein (PML) oncogenic domains (PODs). Daxx associates with PODs through a direct interaction with\ PML, a critical component of PODs. The interaction is a dynamic, cell cycle regulated\ event and is dependent on the post-translational modification of PML by the small ubiquitin-related modifier SUMO-1.

    \ ' '1729' 'IPR003200' '\

    Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) plays a central role in the synthesis of alpha-ribazole-5\'-phosphate, an intermediate for the lower ligand of cobalamin PUBMED:8206834. It is one of the enzymes of the anaerobic pathway of cobalamin biosynthesis, and one of the four proteins (CobU, CobT, CobC, and CobS) involved in the synthesis of the lower ligand and the assembly of the nucleotide loop PUBMED:12101181, PUBMED:7592411.

    \

    Vitamin B12 (cobalamin) is used as a cofactor in a number of enzyme-catalysed reactions in bacteria, archaea and eukaryotes PUBMED:8550510. The biosynthetic pathway to adenosylcobalamin from its five-carbon precursor, 5-aminolaevulinic acid, can be divided into three sections: (1) the biosynthesis of uroporphyrinogen III from 5-aminolaevulinic acid; (2) the conversion of uroporphyrinogen III into the ring-contracted, deacylated intermediate precorrin 6 or cobalt-precorrin 6; and (3) the transformation of this intermediate to form adenosylcobalamin PUBMED:12196148. Cobalamin is synthesised by bacteria and archaea via two alternative routes that differ primarily in the steps of section 2 that lead to the contraction of the macrocycle and excision of the extruded carbon molecule (and its attached methyl group) PUBMED:11153269. One pathway (exemplified by Pseudomonas denitrificans) incorporates molecular oxygen into the macrocycle as a prerequisite to ring contraction, and has consequently been termed the aerobic pathway. The alternative, anaerobic, route (exemplified by Salmonella typhimurium) takes advantage of a chelated cobalt ion, in the absence of oxygen, to set the stage for ring contraction PUBMED:12196148.

    \ \

    This entry represents bacterial- and archaeal-type nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase enzymes involved in dimethylbenzimidazole synthesis, as well as a group of proteins of unknown function. This function is essential to de novo cobalamin (vitamin B12) production in bacteria.

    \ ' '1730' 'IPR005580' '\

    This RNA binding domain is found at the C-terminus of a number of DEAD helicase proteins PUBMED:10481020.

    \ ' '1731' 'IPR007387' '\ The function of the members of this family is unknown, but DctQ homologues are invariably found in the tripartite ATP-independent periplasmic transporters PUBMED:10627041.\ ' '1732' 'IPR004668' '\

    These proteins are members of the C4-Dicarboxylate Uptake (Dcu) family. Most proteins in this family have 12 GES-predicted transmembrane regions; however, the one member whose membrane topology has been experimentally determined has 10 transmembrane regions, with both the N- and C-termini localized to the periplasm PUBMED:9733683. The DcuA and DcuB proteins are involved in the transport of aspartate, malate, fumarate and succinate in many species PUBMED:8131924, PUBMED:14654290, PUBMED:11004174, and are thought to function as antiporters with any two of these substrates. Since DcuA is encoded in an operon with the gene for aspartase, and DcuB is encoded in an operon with the gene for fumarase, their physiological functions may be to catalyze aspartate:fumarate and fumarate:malate exchange during the anaerobic utilization of aspartate and fumarate, respectively PUBMED:7961398. The Escherichia coli DcuA and DcuB proteins have very different expression patterns PUBMED:9852003. DcuA is constitutively expressed; DcuB is strongly induced anaerobically by FNR and C4-dicarboxylates, while it is repressed by nitrate and subject to CRP-mediated catabolite repression.

    \ ' '1733' 'IPR018385' '\ Escherichia coli contains four different secondary carriers (DcuA, DcuB, DcuC, and DctA) for C4-dicarboxylates PUBMED:10482502, PUBMED:1512189, PUBMED:7961398, PUBMED:8955408. DctA is used for aerobic growth on C4-dicarboxylates PUBMED:10482502, PUBMED:5541510, whereas the Dcu carriers (encoded by the dcuA, dcuB, and dcuC genes) are used under anaerobic conditions and form a distinct family of carriers PUBMED:1512189, PUBMED:8020497, PUBMED:9889977, PUBMED:7961398, PUBMED:9230919, PUBMED:8955408. Each of the Dcu carriers is able to catalyze the uptake, antiport, and possibly also efflux of C4-dicarboxylates. DcuB is the major C4-dicarboxylate carrier for fumarate respiration with high fumarate-succinate exchange activity. It is synthesized only in the absence of oxygen and nitrate and in the presence of C4-dicarboxylates PUBMED:1512189, PUBMED:9973351, PUBMED:9852003, PUBMED:9765574. DcuA is expressed constitutively in aerobic and anaerobic growth and can substitute for DcuB PUBMED:9852003, PUBMED:7961398. \ \ These proteins are members of the C4-dicarboxylate Uptake C (DcuC) family. DcuC has 12 GES-predicted transmembrane regions, is induced only under anaerobic conditions, and is not repressed by glucose. DcuC may therefore function as a succinate efflux system during anaerobic glucose fermentation. However, when overexpressed, it can replace either DcuA or DcuB in catalyzing fumarate-succinate exchange and fumarate uptake PUBMED:8020497, PUBMED:10368146. DcuC shows the same transport modes as DcuA and DcuB (exchange, uptake, and presumably efflux of C4-dicarboxylates) PUBMED:8955408.\ ' '1734' 'IPR004875' '\

    These proteins are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, whose members contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. Interestingly, this family also includes the CENP-B protein. This\ domain in that protein appears to have lost the metal binding residues and is unlikely to have endonuclease activity. Centromere Protein B (CENP-B) is a DNA-binding protein\ localised to the centromere PUBMED:9451007.

    \ ' '1735' 'IPR004177' '\ The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases.\ ' '1736' 'IPR005013' '\

    Members of this family are involved in asparagine-linked protein glycosylation. In particular, dolichyl-diphosphooligosaccharide-protein glycosyltransferase (DDOST), also\ known as oligosaccharyltransferase (), transfers the high-mannose sugar GlcNAc(2)-Man(9)-Glc(3) from a dolichol-linked donor to an asparagine acceptor in\ a consensus Asn-X-Ser/Thr motif. In most eukaryotes, the DDOST complex is composed of three subunits, which in humans are described as a 48kDa subunit,\ ribophorin I, and ribophorin II. However, the yeast DDOST appears to consist of six subunits (alpha, beta, gamma, delta, epsilon, zeta). The yeast beta subunit is a\ 45kDa polypeptide, previously discovered as the Wbp1 protein, with known sequence similarity to the human 48kDa subunit and the other orthologues. This family\ includes the 48kDa-like subunits from several eukaryotes; it also includes the yeast DDOST beta subunit Wbp1.

    \ ' '1737' 'IPR006719' '\

    The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing PUBMED:1699826. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa).

    \

    This domain is present at the N-terminal of these proteins.

    \ ' '1738' 'IPR006718' '\

    The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing PUBMED:1699826. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa).

    This repeat is usually found in 12 copies in the central region of the protein. Its function is unknown. Length polymorphisms of Dec-1 have been observed in wild-type strains, and are caused by changes in the numbers of the first five repeats PUBMED:8350348.

    \ ' '1739' 'IPR003332' '\ Decorin is a proteoglycan that decorates collagen fibres. Borrelia burgdorferi causes Lyme disease, a tick-borne infection that can develop into a chronic, multisystemic disorder. Decorin may mediate the adherence of B. burgdorferi to collagen fibres in skin and other tissues PUBMED:7642279. B. burgdorferi decorin binding protein A (DbpA) facilitates this binding PUBMED:9784533.\ ' '1740' 'IPR001875' '\

    The death effector domain (DED) is a homotypic protein interaction module composed of a bundle of six alpha-helices. DED is related in sequence and structure to the death domain (DD, see ) and the caspase recruitment domain (CARD, see ), which work in similar pathways and show similar interaction properties PUBMED:11504623. The dimerisation of DED domains is mediated primarily by electrostatic interactions. DED domains can be found in isolation, or in combination with other domains. Domains associated with DED include: caspase catalytic domains (in caspase-8, -10), death domains (in FADD), nuclear localisation sequences (in DEDD), transmembrane domains (in Bap31 and Bar), nucleotide-binding domains (in Dap3), coiled-coil domains (in Hip and Hippi), SAM domains (in Bar), and E2-binding RING domains (in Bar) PUBMED:15226512.

    \

    Several DED-containing proteins are involved in the regulation of apoptosis through their interactions with DED-containing caspases (), such as caspases 8 and 10 in humans, both of which contain tandem pairs of DEDs. There are many DED-containing modulators of apoptosis, which can either enhance or inhibit caspase activation PUBMED:15173180.

    \ ' '1742' 'IPR001855' '\

    Defensins are 2-6 kDa, cationic, microbicidal peptides active against many Gram-negative and Gram-positive bacteria, fungi, and enveloped viruses PUBMED:8528769, containing three pairs of intramolecular disulphide bonds. On the basis of their size and pattern of disulphide bonding, mammalian defensins are classified into alpha, beta and theta categories. Every mammalian species explored thus far has beta-defensins. In cows, as many as 13 beta-defensins exist in neutrophils. However, in other species, beta-defensins are more often produced by epithelial cells lining various organs (e.g. the epidermis, bronchial tree and genitourinary tract).

    \ \

    Defensins are produced constitutively and/or in response to microbial products or proinflammatory cytokines. Some defensins are also called corticostatins (CS) because they inhibit corticotropin-stimulated corticosteroid production. The mechanism(s) by which microorganisms are killed and/or inactivated by defensins is not understood completely. However, it is generally believed that killing is a consequence of disruption of the microbial membrane. The polar topology of defensins, with spatially separated charged and hydrophobic regions, allows them to insert themselves into the phospholipid membranes so that their hydrophobic regions are buried within the lipid membrane interior and their charged (mostly cationic) regions interact with anionic phospholipid head groups and water. Subsequently, some defensins can aggregate to form \'channel-like\' pores; others might bind to and cover the microbial membrane in a \'carpet-like\' manner. The net outcome is the disruption of membrane integrity and function, which ultimately leads to the lysis of microorganisms. Some defensins are synthesised as propeptides which may be relevant to this process.

    \ \

    Human, rabbit and guinea-pig beta-defensins, as well as human beta-defensin-2 (hBD2), induce the activation and degranulation of mast cells, resulting in the release of histamine and prostaglandin D2.

    \ ' '1743' 'IPR002366' '\

    Defensins are 2-6 kDa, cationic, microbicidal peptides active against many Gram-negative and Gram-positive bacteria, \ fungi, and enveloped viruses PUBMED:8528769, containing three pairs of intramolecular disulphide bonds PUBMED:12072367. On the basis of their size and pattern of\ disulphide bonding, mammalian defensins are classified into alpha, beta and theta categories. Alpha-defensins, which have been identified in humans, monkeys and several\ rodent species, are particularly abundant in neutrophils, certain macrophage populations and Paneth cells of the small intestine. Every mammalian species\ explored thus far has beta-defensins. In cows, as many as 13 beta-defensins exist in neutrophils. However, in other species, beta-defensins are more often produced by\ epithelial cells lining various organs (e.g. the epidermis, bronchial tree and genitourinary tract). Theta-defensins are cyclic and have so far only been identified in primate\ phagocytes.

    Defensins are produced constitutively and/or in response to microbial products or proinflammatory cytokines. Some defensins are also called corticostatins (CS) because \ they inhibit corticotropin-stimulated corticosteroid production. The mechanism(s) by which microorganisms are killed and/or inactivated by defensins is not understood completely. However, it is generally believed that killing is a\ consequence of disruption of the microbial membrane. The polar topology of defensins, with spatially separated charged and hydrophobic regions, allows them to\ insert themselves into the phospholipid membranes so that their hydrophobic regions are buried within the lipid membrane interior and their charged (mostly cationic)\ regions interact with anionic phospholipid head groups and water. Subsequently, some defensins can aggregate to form \'channel-like\' pores; others might bind to and cover the microbial membrane in a \'carpet-like\' manner. The net outcome is the disruption of membrane integrity and function,\ which ultimately leads to the lysis of microorganisms. Some defensins are synthesized as propeptides which may be relevant to this process - in neutrophils only the mature peptides have been identified but in Paneth cells, the propeptide is stored in vesicles PUBMED:12021776 and appears to be cleaved by trypsin on activation.

    \ ' '1744' 'IPR006081' '\

    Defensins are 2-6 kDa, cationic, microbicidal peptides active against many Gram-negative and Gram-positive bacteria, fungi, and enveloped viruses PUBMED:8528769, containing three pairs of intramolecular disulphide bonds. On the basis of their size and pattern of disulphide bonding, mammalian defensins are classified into alpha, beta and theta categories. Alpha-defensins, which have been identified in humans, monkeys and several rodent species, are particularly abundant in neutrophils, certain macrophage populations and Paneth cells of the small intestine.

    Defensins are produced constitutively and/or in response to microbial products or proinflammatory cytokines. Some defensins are also called corticostatins (CS) because they inhibit corticotropin-stimulated corticosteroid production. The mechanism(s) by which microorganisms are killed and/or inactivated by defensins is not understood completely. However, it is generally believed that killing is a consequence of disruption of the microbial membrane. The polar topology of defensins, with spatially separated charged and hydrophobic regions, allows them to insert themselves into the phospholipid membranes so that their hydrophobic regions are buried within the lipid membrane interior and their charged (mostly cationic) regions interact with anionic phospholipid head groups and water. Subsequently, some defensins can aggregate to form \'channel-like\' pores; others might bind to and cover the microbial membrane in a \'carpet-like\' manner. The net outcome is the disruption of membrane integrity and function,\ which ultimately leads to the lysis of microorganisms. Some defensins are synthesized as propeptides which may be relevant to this process.

    \ \

    Human neutrophil-derived alpha-defensins (HNPs) are capable of enhancing phagocytosis by mouse macrophages. HNP1-3 have been reported to increase the production of tumor necrosis factor (TNF) and IL-1, while decreasing the production of IL-10 by monocytes. Increased levels of proinflammatory factors (e.g. IL-1, TNF, histamine and prostaglandin D2) and suppressed levels of IL-10 at the site of microbial infection are likely to amplify local inflammatory responses. This might be further reinforced by the capacity of some human and rabbit alpha-defensins to inhibit the production of immunosuppressive glucocorticoids by competing for the binding of adrenocorticotropic hormone to its receptor. Moreover, human alpha-defensins can enhance or suppress the activation of the classical pathway of complement in vitro by binding to solid-phase or fluid-phase complement C1q, respectively. The capacity of defensins to enhance phagocytosis, promote neutrophil recruitment, enhance the production of proinflammatory cytokines, suppress anti-inflammatory mediators and regulate complement activation argues that defensins upregulate innate host inflammatory defences against microbial invasion.

    \ ' '1745' 'IPR000653' '\

    This entry represents a family that are probably all pyridoxal-phosphate-dependent aminotransferase enzymes with a variety of molecular functions. The family includes StsA , StsC and StsS PUBMED:9238101. The aminotransferase activity was demonstrated for purified StsC protein as the L-glutamine:scyllo-inosose aminotransferase , which catalyses the first amino transfer in the biosynthesis of the streptidine subunit of streptomycin PUBMED:9238101.

    \ ' '1746' 'IPR003206' '\

    This entry represents the large subunit of adenosylcobalamin-dependent diol dehydratases () and glycerol dehydratases (). These enzymes are produced by some enterobacteria in response to growth substances. The enzymes have a TIM beta/alpha barrel fold PUBMED:10903944. Inactivated holoenzymes are reactivated by their own reactivating factors that mediate the ATP-dependent exchange of an enzyme-bound, damaged cofactor for free adenosylcobalamin through intermediary formation of apoenzyme. The reactivation takes place in two steps: (a) ADP-dependent cobalamin release and (b) ATP-dependent dissociation of the resulting apoenzyme-reactivating factor complexes PUBMED:17916188.

    \ ' '1747' 'IPR003208' '\ This family contains the medium subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances.\ ' '1748' 'IPR003207' '\ This family contains the small subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances PUBMED:9805380, PUBMED:10949584.\ ' '1749' 'IPR000167' '\

    A number of proteins are produced by plants that experience water-stress.\ Water-stress takes place when the water available to a plant falls below a\ critical level. The plant hormone abscisic acid (ABA) appears to modulate the\ response of plants to water-stress. Proteins that are expressed during water-stress are called dehydrins PUBMED:2562763, PUBMED:1387328. Dehydrins contribute to freezing stress tolerance in plants, and it has been suggested that this could be partly due to their protective effect on membranes PUBMED:15356392.

    \ \ \

    Dehydrins share a number of structural features. One of the most notable\ features is the presence, in their central region, of a continuous run of\ five to nine serines followed by a cluster of charged residues. Such a region\ has been found in all known dehydrins so far with the exception of pea\ dehydrins. A second conserved feature is the presence of two copies of a\ lysine-rich octapeptide; the first copy is located just after the cluster\ of charged residues that follows the poly-serine region and the second copy\ is found at the C-terminal extremity.

    \ ' '1750' 'IPR003433' '\ The virus capsid is composed of 60 icosahedral units of a combination of VP4, VP3, VP2 and VP1. Four different translation initiation sites of the Densovirus capsid protein mRNA give rise to these four viral proteins, VP1 to VP4. This family represents VP4.\ ' '1751' 'IPR002915' '\ This family includes the enzyme deoxyribose-phosphate aldolase, which is involved in nucleotide metabolism. \ \ The family also includes a group of related bacterial proteins of unknown function, see examples and .\ ' '1752' 'IPR014036' '\

    The deoR-type HTH domain is a DNA-binding, helix-turn-helix (HTH) domain of\ about 50-60 amino acids present in transcription regulators of the deoR\ family, involved in sugar catabolism. This family of prokaryotic regulators is\ named after Escherichia coli deoR, a repressor of the deo operon, which\ encodes nucleotide and deoxyribonucleotide catabolic enzymes. DeoR also\ negatively regulates the expression of nupG and tsx, a nucleoside-specific\ transport protein and a channel-forming protein, respectively.

    \ \

    DeoR-like transcription repressors occur in diverse bacteria as regulators of\ sugar and nucleoside metabolic systems. The effector molecules for deoR-like\ regulators are generally phosphorylated intermediates of the relevant\ metabolic pathway. The DNA-binding deoR-type HTH domain occurs usually in the\ N-terminal part. The C-terminal part can contain an effector-binding domain\ and/or an oligomerization domain. DeoR occurs as an octamer, whilst glpR and\ agaR are tetramers. Several operators may be bound simultaneously, which could\ facilitate DNA looping PUBMED:1731335, PUBMED:14731281.

    \ ' '1753' 'IPR007599' '\

    The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae (Baker\'s yeast) contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process, and the classes were called der for degradation in the ER. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein that is localized to the ER. Deletion of DER1 abolished degradation of the substrate proteins, suggesting that the function of the Der1 protein may be specifically required for the degradation process associated with the ER PUBMED:8631297. Interestingly this family seems distantly related to the Rhomboid family of membrane peptidases. This family may also mediate degradation of misfolded proteins.

    \ ' '1754' 'IPR002742' '\

    Desulfoferrodoxin contains two types of iron: an Fe-S4 site very similar to that found in desulforedoxin from Desulfovibrio gigas, and an octahedrally coordinated high-spin ferrous site most probably with nitrogen/oxygen-containing ligands. Due to this rather unusual combination of active centres, this novel protein is named desulfoferrodoxin PUBMED:2174880.

    \

    This domain comprises essentially the full length of neelaredoxin (, PUBMED:8001576), a monomeric, blue, non-haeme iron protein of D. gigas said to bind two iron atoms per monomer with identical spectral properties. Neelaredoxin was shown recently to have significant superoxide dismutase activity PUBMED:9914498. This domain is also found (in a form in which the distance between the motifs H[HWYF]IXW and CN[IL]HGXW is somewhat shorter) as the C-terminal domain of desulfoferrodoxin, which is said to bind a single ferrous iron atom.\ The N-terminal domain of desulfoferrodoxin is described by .

    \ ' '1755' 'IPR007677' '\ The precise function of this protein is unknown. A deletion/insertion mutation is associated with an autosomal dominant non-syndromic hearing impairment form PUBMED:9771715. In addition, this protein has also been found to contribute to acquired etoposide resistance in melanoma cells PUBMED:11297734.\ ' '1756' 'IPR007085' '\

    This entry represents the C-terminal domain found in DNA/pantothenate metabolism flavoproteins, which affect synthesis of DNA and pantothenate metabolism. These proteins contain ATP, phosphopantothenate, and cysteine binding sites. The structure of this domain has been determined in human phosphopantothenoylcysteine (PPC) synthetase PUBMED:12906824 and as the PPC synthase domain (CoaB) from the Escherichia coli coenzyme A bifunctional protein CoaBC PUBMED:15530362. This domain adopts a 3-layer alpha/beta/alpha fold with mixed beta-sheets, which topologically resembles a combination of Rossmann-like and ribokinase-like folds. The structure of these proteins predicts a ping pong mechanism with initial formation of an acyladenylate intermediate, followed by release of pyrophosphate and attack by cysteine to form the final products PPC and AMP.

    \ ' '1757' 'IPR007729' '\

    2-keto-3-deoxy-galactonokinase is a bacterial transferase that catalyses the second step in D-galactonate degradation. D-Galactonate is catabolized in saprophytic mycobacteria to give pyruvate and glyceraldehyde-3-phosphate by a pathway that involves galactonate dehydratase, 2-keto-3-deoxy-galactonate kinase, and 6-phospho-2-keto-3-deoxy-galactonate aldolase PUBMED:7287628.

    \ ' '1759' 'IPR000422' '\

    3,4-dihydroxy-2-butanone 4-phosphate synthase () (DHBP synthase) (RibB) catalyses the conversion of D-ribulose 5-phosphate to formate and 3,4-dihydroxy-2-butanone 4-phosphate, the latter serving as the biosynthetic precursor for the xylene ring of riboflavin PUBMED:9211332. In Photobacterium leiognathi, the riboflavin synthesis genes ribB (DHBP synthase), ribE (riboflavin synthase), ribH (lumazine synthase) and ribA (GTP cyclohydrolase II) all reside in the lux operon PUBMED:11396941. RibB is sometimes found as a bifunctional enzyme with GTP cyclohydrolase II that catalyses the first committed step in the biosynthesis of riboflavin (). No sequences with significant homology to DHBP synthase are found in the metazoa.

    \ ' '1760' 'IPR002220' '\ Dihydrodipicolinate synthase (DHDPS) is the key enzyme in lysine biosynthesis\ via the diaminopimelate pathway of prokaryotes, some phycomycetes and\ higher plants. The enzyme catalyses the condensation of L-aspartate-beta-semialdehyde and pyruvate to dihydrodipicolinic acid via a ping-pong\ mechanism in which pyruvate binds to the enzyme by forming a Schiff-base\ with a lysine residue PUBMED:7853400. Three other proteins are structurally related to DHDPS and probably also act\ via a similar catalytic mechanism. These are Escherichia coli N-acetylneuraminate lyase () (gene nanA), which\ catalyzes the condensation of N-acetyl-D-mannosamine and pyruvate to form\ N-acetylneuraminate; Rhizobium meliloti (Sinorhizobium meliloti) protein mosA PUBMED:8349559, which is involved in the biosynthesis\ of the rhizopine 3-O-methyl-scyllo-inosamine; and E. coli hypothetical protein yjhH.\ The sequences of DHDPS from different sources are well-conserved. The\ structure takes the form of a homotetramer, in which 2 monomers are\ related by an approximate 2-fold symmetry PUBMED:7853400. Each monomer comprises\ 2 domains: an 8-fold alpha-/beta-barrel, and a C-terminal alpha-helical\ domain. The fold resembles that of N-acetylneuraminate lyase. The active\ site lysine is located in the barrel domain, and has access via 2 channels\ on the C-terminal side of the barrel.\ ' '1761' 'IPR001667' '\ This is a domain of predicted phosphoesterases that includes Drosophila prune protein and bacterial RecJ exonuclease PUBMED:9478130. The RecJ protein of Escherichia coli plays an important role in a number of DNA repair and\ recombination pathways. RecJ catalyzes processive degradation of single-stranded DNA in a 5\'-to-3\' direction. 
Sequences highly related to those encoding RecJ can be found in many\ of the eubacterial genomes sequenced to date PUBMED:10633092.\ ' '1762' 'IPR004097' '\ This domain is called DHHA2 since it is often associated with the DHH domain () and is diagnostic of DHH subfamily 2 members PUBMED:9478130. The domain is about 120 residues long and contains a conserved DXK motif at its amino terminus. It is present in inorganic pyrophosphatases and in exopolyphosphatase of Saccharomyces cerevisiae.\ ' '1763' 'IPR012135' '\

    Dihydroorotate dehydrogenase (DHOD), also known as dihydroorotate oxidase, catalyses the fourth step in de novo pyrimidine biosynthesis, the stereospecific oxidation of (S)-dihydroorotate to orotate, which is the only redox reaction in this pathway. DHODs can be divided into two main classes: class 1 cytosolic enzymes found primarily in Gram-positive bacteria, and class 2 membrane-associated enzymes found primarily in eukaryotic mitochondria and Gram-negative bacteria PUBMED:9405053.

    \ \

    The class 1 DHODs can be further divided into subclasses 1A and 1B, which differ in their structural organisation and use of electron acceptors. The 1A enzyme is a homodimer of two PyrD subunits where each subunit forms a TIM barrel fold with a bound FMN cofactor located near the top of the barrel PUBMED:9655329. Fumarate is the natural electron acceptor for this enzyme. The 1B enzyme, in contrast, is a heterotetramer composed of a central, FMN-containing, PyrD homodimer resembling the 1A homodimer, and two additional PyrK subunits which contain FAD and a 2Fe-2S cluster PUBMED:11188687. These additional groups allow the enzyme to use NAD(+) as its natural electron acceptor.

    \ \

    The class 2 membrane-associated enzymes are monomers which have the FMN-containing TIM barrel domain found in the class 1 PyrD subunit, and an additional N-terminal alpha helical domain PUBMED:10673429, PUBMED:12220493. These enzymes use respiratory quinones as the physiological electron acceptor.

    \ \

    This entry represents the FMN-binding subunit common to all classes of dihydroorotate dehydrogenase.

    \ ' '1764' 'IPR002658' '\

    The 3-dehydroquinate synthase () domain is present in isolation in various bacterial 3-dehydroquinate synthases and also present as a domain in the pentafunctional AROM polypeptide () PUBMED:7556173. 3-dehydroquinate (DHQ) synthase catalyses the formation of DHQ and orthophosphate from 3-deoxy-D-arabino-heptulosonate 7-phosphate PUBMED:9613570. This reaction is part of the shikimate pathway, which is involved in the biosynthesis of aromatic amino acids.

    \ ' '1765' 'IPR001381' '\

    3-dehydroquinate dehydratase (), or dehydroquinase, catalyzes the\ conversion of 3-dehydroquinate into 3-dehydroshikimate. It is the third step\ in the shikimate pathway for the biosynthesis of aromatic amino acids from\ chorismate. Two classes of dehydroquinases exist, known as types I and II.

    \

    The\ best studied type I enzyme is from Escherichia coli (gene aroD) and related\ bacteria where it is a homodimeric protein.\ In fungi, dehydroquinase is part of a multifunctional enzyme which catalyzes\ five consecutive steps in the shikimate pathway. A histidine PUBMED:1429576 is involved in the catalytic mechanism.

    \ ' '1766' 'IPR001874' '\ 3-dehydroquinate dehydratase (), or dehydroquinase, catalyzes the conversion of 3-dehydroquinate into 3-dehydroshikimate. It is the third step in the shikimate pathway for the biosynthesis of aromatic amino acids from chorismate. Two classes of dehydroquinases exist, known as types I and II. Class-II enzymes are homododecameric enzymes of about 17 kDa. They are found in some bacteria such as actinomycetales PUBMED:1910148, PUBMED:8170389 and some fungi where they act in a catabolic pathway that allows the use of quinic acid as a carbon source.\ ' '1767' 'IPR006796' '\ Dickkopf proteins are a class of Wnt antagonists. They possess two conserved cysteine-rich regions. This family represents the N-terminal conserved region PUBMED:12167704. The C-terminal region has been found to share significant sequence similarity to the colipase fold () PUBMED:9663378.\ ' '1768' 'IPR007778' '\ This family consists of REP proteins from a number of Dictyostelium species (Slime molds). REP protein is probably involved in transcription regulation and control of DNA replication, specifically the amplification of plasmid at low copy numbers. The formation of homomultimers may be required for their regulatory activity PUBMED:10366530.\ ' '1769' 'IPR007643' '\ The Dictyostelium discoideum (Slime mold) spore coat is a polarised extracellular matrix composed of glycoproteins and cellulose. Four of the major coat glycoproteins exist as a multi-protein complex within the prespore vesicles before secretion. Of these, SP96 and SP70 are members of this family. The presence of SP96 and SP70 in the complex is necessary for the cellulose binding activity of the complex, which is in turn necessary for normal spore coat assembly PUBMED:10931888. The function of this region is not known.\ ' '1770' 'IPR001796' '\

    Dihydrofolate reductase (DHFR) () catalyses the NADPH-dependent reduction of dihydrofolate to tetrahydrofolate, an essential step in de novo synthesis both of glycine and of purines and deoxythymidine phosphate (the precursors of DNA synthesis) PUBMED:2830673, and important also in the conversion of deoxyuridine monophosphate to deoxythymidine monophosphate. Although DHFR is found ubiquitously in prokaryotes and eukaryotes, and is found in all dividing cells, maintaining levels of fully reduced folate coenzymes, the catabolic steps are still not well understood PUBMED:3383852.

    \

    Bacterial species possess distinct DHFR enzymes (based on their pattern of binding diaminoheterocyclic molecules), but mammalian DHFRs are highly similar PUBMED:500653. The active site is situated in the N-terminal half of the sequence, which includes a conserved Pro-Trp dipeptide; the tryptophan has been shown PUBMED:6815178 to be involved in the binding of substrate by the enzyme. Its central role in DNA precursor synthesis, coupled with its inhibition by antagonists such as trimethoprim and methotrexate, which are used as anti-bacterial or anti-cancer agents, has made DHFR a target of anticancer chemotherapy. However, resistance has developed against some drugs, as a result of changes in DHFR itself PUBMED:2601715.

    \ \ ' '1771' 'IPR004123' '\

    Thioredoxins PUBMED:3896121, PUBMED:2668278, PUBMED:7788289, PUBMED:7788290 are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of 2 cysteine thiol groups to a disulphide, accompanied by the transfer of 2 electrons and 2 protons. The net result is the covalent interconversion of a disulphide and a dithiol.

    \

    Compared to human thioredoxin, human U5 snRNP-specific protein U5-15kDa contains 37 additional residues that may cause structural changes which most likely form putative binding sites for other spliceosomal proteins or RNA. Although U5-15kDa apparently lacks protein disulphide isomerase activity, it is\ strictly required for pre-mRNA splicing PUBMED:10610776.

    \ ' '1772' 'IPR007837' '\ DNA damage-inducible (din) genes in Bacillus subtilis are coordinately regulated and together compose a global regulatory network that has been termed the SOS-like or SOB regulon. This family includes DinB from B. subtilis PUBMED:1847907.\ ' '1773' 'IPR000627' '\

    This entry represents the C-terminal domain common to several intradiol ring-cleavage dioxygenases. Dioxygenases catalyse the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms. Cleavage of aromatic rings is one of the most important functions of dioxygenases, which play key roles in the degradation of aromatic compounds. The substrates of ring-cleavage dioxygenases can be classified into two groups according to the mode of scission of the aromatic ring. Intradiol enzymes use a non-haem Fe(III) to cleave the aromatic ring between two hydroxyl groups (ortho-cleavage), whereas extradiol enzymes () use a non-haem Fe(II) to cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon (meta-cleavage) PUBMED:10730195. These two subfamilies differ in sequence, structural fold, iron ligands, and the orientation of second sphere active site amino acid residues.

    \

    Enzymes that belong to the intradiol family include catechol 1,2-dioxygenase (1,2-CTD) (); protocatechuate 3,4-dioxygenase (3,4-PCD) (); and chlorocatechol 1,2-dioxygenase () PUBMED:15060064.

    \ ' '1774' 'IPR007535' '\

    This domain is the N-terminal region of catechol, chlorocatechol or hydroxyquinol 1,2-dioxygenase proteins. This region is always found adjacent to the dioxygenase domain ().

    \

    Dioxygenases catalyse the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms. Cleavage of aromatic rings is one of the most important functions of dioxygenases, which play key roles in the degradation of aromatic compounds. The substrates of ring-cleavage dioxygenases can be classified into two groups according to the mode of scission of the aromatic ring. Intradiol enzymes use a non-haem Fe(III) to cleave the aromatic ring between two hydroxyl groups (ortho-cleavage), whereas extradiol enzymes () use a non-haem Fe(II) to cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon (meta-cleavage) PUBMED:10730195. These two subfamilies differ in sequence, structural fold, iron ligands, and the orientation of second sphere active site amino acid residues.

    \

    Enzymes that belong to the intradiol family include catechol 1,2-dioxygenase (1,2-CTD) (); protocatechuate 3,4-dioxygenase (3,4-PCD) (); and chlorocatechol 1,2-dioxygenase () PUBMED:15060064.

    \ ' '1775' 'IPR002728' '\ Members of this family include , a candidate tumour suppressor gene PUBMED:8603384, and DPH2 from\ yeast PUBMED:8406038, which confers resistance to diphtheria toxin and\ has been found to be involved in diphthamide synthesis. Diphtheria\ toxin inhibits eukaryotic protein synthesis by ADP-ribosylating\ diphthamide, a posttranslationally modified histidine residue present\ in EF2. The exact function of the members of this family is\ unknown.\ ' '1776' 'IPR000512' '\ Diphtheria toxin () is a 58 kDa protein secreted by lysogenic strains of Corynebacterium diphtheriae. The toxin causes the disease diphtheria in humans by gaining entry into the cell cytoplasm and inhibiting protein synthesis PUBMED:8573568. The mechanism of inhibition involves transfer of the ADP-ribose group of NAD to elongation factor-2 (EF-2), rendering EF-2 inactive. The catalysed reaction is as follows: \ \ The crystal structure of the diphtheria toxin homodimer has been determined to 2.5A resolution PUBMED:1589020. The structure reveals a Y-shaped molecule of 3 domains, a catalytic domain (fragment A), whose fold is of the alpha + beta type; a transmembrane (TM) domain, which consists of 9 alpha-helices, 2 pairs of which may participate in pH-triggered membrane insertion and translocation; and a receptor-binding domain, which forms a flattened beta-barrel with a jelly-roll-like topology PUBMED:1589020. The TM- and receptor binding-domains together constitute fragment B.\ ' '1777' 'IPR000512' '\ Diphtheria toxin () is a 58 kDa protein secreted by lysogenic strains of Corynebacterium diphtheriae. The toxin causes the disease diphtheria in humans by gaining entry into the cell cytoplasm and inhibiting protein synthesis PUBMED:8573568. The mechanism of inhibition involves transfer of the ADP-ribose group of NAD to elongation factor-2 (EF-2), rendering EF-2 inactive. 
The catalysed reaction is as follows: \ \ The crystal structure of the diphtheria toxin homodimer has been determined to 2.5A resolution PUBMED:1589020. The structure reveals a Y-shaped molecule of 3 domains, a catalytic domain (fragment A), whose fold is of the alpha + beta type; a transmembrane (TM) domain, which consists of 9 alpha-helices, 2 pairs of which may participate in pH-triggered membrane insertion and translocation; and a receptor-binding domain, which forms a flattened beta-barrel with a jelly-roll-like topology PUBMED:1589020. The TM- and receptor binding-domains together constitute fragment B.\ ' '1778' 'IPR000512' '\ Diphtheria toxin () is a 58 kDa protein secreted by lysogenic strains of Corynebacterium diphtheriae. The toxin causes the disease diphtheria in humans by gaining entry into the cell cytoplasm and inhibiting protein synthesis PUBMED:8573568. The mechanism of inhibition involves transfer of the ADP-ribose group of NAD to elongation factor-2 (EF-2), rendering EF-2 inactive. The catalysed reaction is as follows: \ \ The crystal structure of the diphtheria toxin homodimer has been determined to 2.5A resolution PUBMED:1589020. The structure reveals a Y-shaped molecule of 3 domains, a catalytic domain (fragment A), whose fold is of the alpha + beta type; a transmembrane (TM) domain, which consists of 9 alpha-helices, 2 pairs of which may participate in pH-triggered membrane insertion and translocation; and a receptor-binding domain, which forms a flattened beta-barrel with a jelly-roll-like topology PUBMED:1589020. The TM- and receptor binding-domains together constitute fragment B.\ ' '1779' 'IPR001762' '\

    Disintegrins are a family of small proteins from viper venoms that function as potent inhibitors of both platelet aggregation and integrin-dependent cell adhesion PUBMED:15578957, PUBMED:15974889. Integrin receptors are involved in cell-cell and cell-extracellular matrix interactions, serving as the final common pathway leading to aggregation via formation of platelet-platelet bridges, which are essential in thrombosis and haemostasis. Disintegrins contain an RGD (Arg-Gly-Asp) or KGD (Lys-Gly-Asp) sequence motif that binds specifically to integrin IIb-IIIa receptors on the platelet surface, thereby blocking the binding of fibrinogen to the receptor-glycoprotein complex of activated platelets. Disintegrins act as receptor antagonists, inhibiting aggregation induced by ADP, thrombin, platelet-activating factor and collagen PUBMED:12050803. The role of disintegrin in preventing blood coagulation renders it of medical interest, particularly with regard to its use as an anti-coagulant PUBMED:16918409.

    \

    Disintegrins from different snake species have been characterised: albolabrin, applagin, barbourin, batroxostatin, bitistatin, obtustatin PUBMED:12742023, schistatin PUBMED:16101289, echistatin PUBMED:15535803, elegantin, eristicophin, flavoridin PUBMED:14499613, halysin, kistrin, tergeminin, salmosin PUBMED:14661951 and triflavin.

    \

    Disintegrin-like proteins are found in various species ranging from slime mold to humans. Some other proteins known to contain a disintegrin domain are:

    \ \ ' '1780' 'IPR007817' '\ This entry is found in DIT1, a protein involved in the synthesis of dityrosine PUBMED:8183942. Dityrosine is a sporulation-specific component of the Saccharomyces cerevisiae ascospore wall that is essential for the resistance of the spores to adverse environmental conditions. is involved in the biosynthesis of pyoverdine PUBMED:8704959.\ ' '1781' 'IPR007060' '\

    DivIC, from the spore-forming, Gram-positive bacterium Bacillus subtilis, is necessary for both vegetative and sporulation septum formation PUBMED:8113187. These proteins are mainly composed of an N-terminal coiled-coil. DivIB, DivIC and FtsL depend on one another for stabilisation and localisation. The latter two form a heterodimer. DivIC is always located at the cell centre, whereas the other two associate with it during septation PUBMED:15659160.

    \ ' '1782' 'IPR007793' '\ The Bacillus subtilis divIVA1 mutation causes misplacement of the septum during cell division, resulting in the formation of small, circular, anucleate minicells PUBMED:9045828. Inactivation of divIVA produces a minicell phenotype, whereas overproduction of DivIVA results in a filamentation phenotype PUBMED:9045828. These proteins appear to contain coiled-coils.\ ' '1783' 'IPR001158' '\ Dishevelled (Dsh) protein is an important component of the Wnt signal-transduction pathway. It has three relatively conserved domains: DIX, PDZ and DEP. The DIX domain of Dvl-1 (a mammalian Dishevelled homologue) shares 37% identity with the C-terminal region of Axin. Dsh can interact with the Axin/APC/GSK3/beta-catenin complex, and may thus modulate its activity PUBMED:10330181.\

    The Wnt signalling pathway is conserved in various species from Caenorhabditis elegans to mammals, and plays important roles in development, cellular proliferation, and differentiation. The molecular mechanisms by which the Wnt signal regulates cellular functions are becoming increasingly well understood. Wnt stabilises cytoplasmic beta-catenin, which stimulates the expression of genes including c-myc, c-jun, fra-1, and cyclin D1. Axin and its homologue Axil are components of the Wnt signalling pathway that negatively regulate this pathway. Other components of the Wnt signalling pathway, including Dvl, glycogen synthase kinase-3beta (GSK-3beta), beta-catenin, and adenomatous polyposis coli (APC), interact with Axin, and the phosphorylation and stability of beta-catenin are regulated in the Axin complex. Axil has similar functions to Axin. Thus, Axin and Axil act as scaffold proteins in the Wnt signalling pathway, thereby modulating the Wnt-dependent cellular functions PUBMED:10647780.

    \ ' '1784' 'IPR002925' '\ Dienelactone hydrolases play a crucial role in chlorocatechol degradation via the modified ortho cleavage pathway. Enzymes induced in 4-fluorobenzoate-utilizing bacteria have been classified into three groups on the basis of their specificity towards cis- and trans-dienelactone PUBMED:7684040.\ Some proteins contain repeated small fragments of this domain (for example rat kan-1 protein).\ ' '1785' 'IPR006998' '\

    The dlt operon (dltA to dltD) of Lactobacillus rhamnosus 7469 encodes four proteins responsible for the esterification of lipoteichoic acid (LTA) by D-alanine. These esters play an important role in controlling the net anionic charge of the poly (GroP) moiety of LTA. DltA and DltC encode the D-alanine-D-alanyl carrier protein ligase (Dcl) and D-alanyl carrier protein (Dcp), respectively. Whereas the functions of DltA and DltC are defined, the functions of DltB and DltD are unknown. In vitro assays showed that DltD bound Dcp for ligation with D-alanine by Dcl in the presence of ATP. In contrast, the homologue of Dcp, the Escherichia coli acyl carrier protein (ACP), involved in fatty acid biosynthesis, was not bound to DltD and thus was not ligated with D-alanine. DltD also catalyzed the hydrolysis of the mischarged D-alanyl-ACP. The hydrophobic N-terminal sequence of DltD was required for anchoring the protein in the membrane. It is hypothesized that this membrane-associated DltD facilitates the binding of Dcp and Dcl for ligation of Dcp with D-alanine and that the resulting D-alanyl-Dcp is translocated to the primary site of D-alanylation PUBMED:10781555.

    \ \ \

    These sequences contain the C-terminal region of DltD.

    \ ' '1786' 'IPR007002' '\

    The dlt operon (dltA to dltD) of Lactobacillus rhamnosus 7469 encodes four proteins responsible for the esterification of lipoteichoic acid (LTA) by D-alanine. These esters play an important role in controlling the net anionic charge of the poly (GroP) moiety of LTA. DltA and DltC encode the D-alanine-D-alanyl carrier protein ligase (Dcl) and D-alanyl carrier protein (Dcp), respectively. Whereas the functions of DltA and DltC are defined, the functions of DltB and DltD are unknown. In vitro assays showed that DltD bound Dcp for ligation with D-alanine by Dcl in the presence of ATP. In contrast, the homologue of Dcp, the Escherichia coli acyl carrier protein (ACP), involved in fatty acid biosynthesis, was not bound to DltD and thus was not ligated with D-alanine. DltD also catalyzed the hydrolysis of the mischarged D-alanyl-ACP. The hydrophobic N-terminal sequence of DltD was required for anchoring the protein in the membrane. It is hypothesized that this membrane-associated DltD facilitates the binding of Dcp and Dcl for ligation of Dcp with D-alanine and that the resulting D-alanyl-Dcp is translocated to the primary site of D-alanylation PUBMED:10781555.

    \ \ \

    These sequences contain the central region of DltD.

    \ ' '1787' 'IPR006999' '\

    The dlt operon (dltA to dltD) of Lactobacillus rhamnosus 7469 encodes four proteins responsible for the esterification of lipoteichoic acid (LTA) by D-alanine. These esters play an important role in controlling the net anionic charge of the poly (GroP) moiety of LTA. DltA and DltC encode the D-alanine-D-alanyl carrier protein ligase (Dcl) and D-alanyl carrier protein (Dcp), respectively. Whereas the functions of DltA and DltC are defined, the functions of DltB and DltD are unknown. In vitro assays showed that DltD bound Dcp for ligation with D-alanine by Dcl in the presence of ATP. In contrast, the homologue of Dcp, the Escherichia coli acyl carrier protein (ACP), involved in fatty acid biosynthesis, was not bound to DltD and thus was not ligated with D-alanine. DltD also catalyzed the hydrolysis of the mischarged D-alanyl-ACP. The hydrophobic N-terminal sequence of DltD was required for anchoring the protein in the membrane. It is hypothesized that this membrane-associated DltD facilitates the binding of Dcp and Dcl for ligation of Dcp with D-alanine and that the resulting D-alanyl-Dcp is translocated to the primary site of D-alanylation PUBMED:10781555.

    \ \ \

    These sequences contain the N-terminal region of DltD.

    \ ' '1788' 'IPR005173' '\

    This region is found C-terminal to the DM DNA-binding domain \ PUBMED:10729224. DM-domain proteins with this motif are known as DMRTA proteins. The function of this region is unknown.

    \ ' '1789' 'IPR002180' '\

    6,7-dimethyl-8-ribityllumazine synthase (riboflavin synthase) catalyses the biosynthesis of riboflavin according to the reaction: .

    \ \

    The biosynthesis of one riboflavin molecule requires one molecule of GTP and two molecules of ribulose 5-phosphate as substrates. The final step in the biosynthesis of the vitamin involves the dismutation of 6,7-dimethyl-8-ribityllumazine catalyzed by riboflavin synthase. The second product, 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione, is recycled in the biosynthetic pathway by 6,7-dimethyl-8-ribityllumazine synthase PUBMED:18298940. N-[2,4-dioxo-6-d-ribitylamino-1,2,3,4-tetrahydropyrimidin-5-yl]oxalamic acid derivatives inhibit riboflavin synthase PUBMED:18331058.

    \ \

    This family includes the beta chain of 6,7-dimethyl-8-ribityllumazine synthase . The family also includes a subfamily of distant archaebacterial proteins that may also have the same function for example .

    \ ' '1790' 'IPR007059' '\ The terminal electron transfer enzyme dimethyl sulphoxide reductase of Escherichia coli is a heterotrimeric enzyme composed of a membrane extrinsic catalytic dimer (DmsAB) and a membrane intrinsic polytopic anchor subunit (DmsC) PUBMED:8429002. This family represents DmsC.\ ' '1791' 'IPR006691' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (; topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils PUBMED:7980433.

    \

    This entry represents the beta-pinwheel repeat found at the C-terminal end of subunit A of topoisomerase IV (ParC) and subunit A of DNA gyrase (GyrA). DNA gyrase is the topoisomerase II found primarily in bacteria and archaea that consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. This is distinct from the topoisomerase II found in most eukaryotes, which consists of a single polypeptide, with the N- and C-terminal regions corresponding to gyrB and gyrA, respectively, and which is not represented in this entry.

    \

    The ability of DNA gyrase to introduce negative supercoils into DNA is mediated in part by the C-terminal domain of subunit A, which forms a beta-pinwheel fold that is similar to a beta-propeller but with a different blade topology, and which forms a superhelical spiral domain PUBMED:15123801, PUBMED:15897198. This beta-pinwheel is capable of bending DNA by over 180 degrees over a 40 bp region, possibly by wrapping the DNA around the GyrA C-terminal beta-pinwheel domain.

    \

    In topoisomerase IV, although the C-terminal domain forms a similar superhelical spiral to that of DNA gyrase A, it assembles as a broken form of a beta-pinwheel as distinct from that of gyrA, due to the absence of a DNA gyrase-specific GyrA box motif PUBMED:15466871. This difference may account for parC being less efficient than gyrA in mediating DNA-bending, leading to their divergence in terms of activity, where topoisomerase IV acts to relax positive supercoils, and DNA gyrase acts to introduce negative supercoils PUBMED:16023670.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '1792' 'IPR013506' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (; topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils PUBMED:7980433.

    \

    Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions, domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA PUBMED:8982450.

    \

    Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication PUBMED:16023670. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.

    \

    This entry represents the second domain found in subunit B (gyrB and parE) of bacterial gyrase and topoisomerase IV, and the equivalent N-terminal region in eukaryotic topoisomerase II composed of a single polypeptide.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '1793' 'IPR002288' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (; topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils PUBMED:7980433.

    \

    Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions, domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA PUBMED:8982450.

    \

    Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication PUBMED:16023670. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.

    \

    This entry represents the C-terminal region (C-terminal part of domain 2) of subunit B found in topoisomerase II (gyrB) and topoisomerase IV (parE), which are primarily of bacterial origin. It does not include the topoisomerase II enzymes composed of a single polypeptide, as are found in most eukaryotes. This region is involved in subunit interaction, which accounts for the difference between subunit B and single polypeptide topoisomerase II.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '1794' 'IPR004615' '\

    DNA-directed DNA polymerase () catalyzes DNA-template-directed extension of the 3\'-end of an RNA strand by one nucleotide at a time. DNA polymerase III is a complex, multichain enzyme responsible for most of the replicative synthesis in bacteria. The enzyme also has 3\' to 5\' exonuclease activity. It has a core composed of alpha, epsilon and theta chains, that associate with a tau subunit which allows the core dimerization to form the PolIII\' complex. PolIII\' associates with the gamma complex (gamma, delta, delta\', psi and chi chains) and with the beta chain. This family is the psi subunit, the small subunit of the DNA polymerase III holoenzyme in Escherichia coli and related species, whose exact function is not known. It appears to have a narrow taxonomic distribution, being restricted to the gammaproteobacteria.

    \ ' '1795' 'IPR013839' '\

    DNA ligase (polydeoxyribonucleotide synthase) is the enzyme that joins two DNA fragments by catalyzing the formation of an internucleotide ester bond between phosphate and deoxyribose. It is active during DNA replication, DNA repair and DNA recombination. There are two forms of DNA ligase: one requires ATP (), the other NAD ().

    \ \

    This entry represents the N-terminal adenylation domain of NAD-dependent DNA ligases. These are proteins of about 75 to 85 kDa whose sequence is well conserved PUBMED:1526462, PUBMED:8390989. They also show similarity to yicF, an Escherichia coli hypothetical protein of 63 kDa. Despite a complete lack of detectable sequence similarity, the fold of the central core of this adenylation domain shares homology with the equivalent region of ATP-dependent DNA ligases PUBMED:10368271, PUBMED:10698952.

    \ ' '1796' 'IPR004150' '\

    DNA ligases catalyse the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilizing either ATP or NAD(+) as a cofactor PUBMED:10698952. This family is a small domain found after the adenylation domain DNA_ligase_N in NAD+-dependent ligases (). OB-fold domains generally are involved in nucleic acid binding.

    \ ' '1797' 'IPR004149' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the zinc finger domain found in NAD-dependent DNA ligases. DNA ligases catalyse the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilizing either ATP or NAD(+) as a cofactor PUBMED:10698952. This domain is a small zinc binding motif that is presumably DNA binding. It is found only in NAD-dependent DNA ligases.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '1798' 'IPR001525' '\ C-5 cytosine-specific DNA methylases () (C5 Mtase) are enzymes that specifically methylate the C-5 carbon of cytosines in DNA to produce C5-methylcytosine PUBMED:3248729, PUBMED:8127644, PUBMED:2716049. In mammalian cells, cytosine-specific methyltransferases methylate certain CpG sequences, which are believed to modulate gene expression and cell differentiation. In bacteria, these enzymes are a component of restriction-modification systems and serve as valuable tools for the manipulation of DNA PUBMED:7773746, PUBMED:8127644. The structure of HhaI methyltransferase (M.HhaI) has been resolved to 2.5 A PUBMED:8343957: the molecule folds into 2 domains - a larger catalytic domain containing catalytic and cofactor binding sites, and a smaller DNA recognition domain.\ ' '1799' 'IPR013507' '\

    This entry represents the C-terminal domain of DNA mismatch repair proteins, such as MutL. This domain functions in promoting dimerisation PUBMED:16024043. The dimeric MutL protein has a key function in communicating mismatch recognition by MutS to downstream repair processes. Mismatch repair contributes to the overall fidelity of DNA replication by targeting mispaired bases that arise through replication errors, during homologous recombination, and as a result of DNA damage. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex PUBMED:14527292.

    \ ' '1800' 'IPR003498' '\ This family includes proteins that are probably involved in DNA packing in Herpesviridae. This domain is found at the C-terminus\ of the protein.\ ' '1801' 'IPR003499' '\ This family includes proteins that are probably involved in DNA packing in Herpesviridae. This domain is normally found at the\ N-terminus of the protein.\ ' '1802' 'IPR001001' '\ Describes the beta chain of DNA polymerase III. This is a complex, multichain enzyme responsible for most of the replicative synthesis in bacteria. The beta chain is required for initiation of replication from an RNA primer, nucleotide triphosphate (dNTP)\ residues being added to the 5\'-end of the growing DNA chain.\ ' '1803' 'IPR001001' '\ Describes the beta chain of DNA polymerase III. This is a complex, multichain enzyme responsible for most of the replicative synthesis in bacteria. The beta chain is required for initiation of replication from an RNA primer, nucleotide triphosphate (dNTP)\ residues being added to the 5\'-end of the growing DNA chain.\ ' '1804' 'IPR001001' '\ Describes the beta chain of DNA polymerase III. This is a complex, multichain enzyme responsible for most of the replicative synthesis in bacteria. The beta chain is required for initiation of replication from an RNA primer, nucleotide triphosphate (dNTP)\ residues being added to the 5\'-end of the growing DNA chain.\ ' '1805' 'IPR007459' '\ The DNA polymerase III holoenzyme () is the polymerase responsible for the replication of the Escherichia coli chromosome. The holoenzyme is composed of the DNA polymerase III core, the sliding clamp, and the DnaX clamp loading complex. The DnaX complex contains either the tau or gamma product of gene dnax, complexed to delta.delta and to chi psi. Chi forms a 1:1 heterodimer with psi. The chi psi complex functions by increasing the affinity of tau and gamma for delta.delta allowing a functional clamp-loading complex to form at physiological subunit concentrations. 
Psi is responsible for the interaction with DnaX (gamma/tau), but psi is insoluble unless it is in a complex with chi PUBMED:7494000.\ ' '1807' 'IPR001098' '\ Synonym(s): DNA nucleotidyltransferase (DNA-directed) \

    DNA-directed DNA polymerases () are the key enzymes catalysing the accurate replication of DNA. They require either a small RNA molecule or a protein as a primer for the de novo synthesis of a DNA chain. A number of polymerases belong to this family PUBMED:2196557, PUBMED:1870963, PUBMED:8451181.

    \ \ \ ' '1809' 'IPR004868' '\ This entry is found in DNA polymerase type B proteins. Proteins in this entry are found in plant and fungal mitochondria, and in viruses.\ ' '1810' 'IPR007185' '\ DNA polymerase epsilon is essential for cell viability and chromosomal DNA replication in budding yeast. In addition, DNA polymerase epsilon may be involved in DNA repair and cell-cycle checkpoint control. The enzyme consists of at least four subunits in mammalian cells as well as in yeast. The largest subunit of DNA polymerase epsilon is responsible for polymerase activity. In mouse, the DNA polymerase epsilon subunit B is the second largest subunit of the DNA polymerase. A part of the N-terminal was found to be responsible for the interaction with SAP18. Experimental evidence suggests that this subunit may recruit histone deacetylase to the replication fork to modify the chromatin structure PUBMED:11872158.\ ' '1811' 'IPR007015' '\ Proteins of this family are predominantly nucleolar. The majority are described as transcription factor transactivators. The family also includes the fifth essential DNA polymerase (Pol5p) of Schizosaccharomyces pombe (Fission yeast) and Saccharomyces cerevisiae (Baker\'s yeast) (). Pol5p is localized exclusively to the nucleolus and binds near or at the enhancer region of rRNA-encoding DNA repeating units.\ ' '1812' 'IPR001462' '\ This domain is at the C-terminus of hepatitis B-type viruses P proteins and represents a functional domain that controls the RNase H activities of the protein. The domain is always associated with and .\ ' '1813' 'IPR000201' '\ This domain is at the N-terminus of hepadnavirus P proteins and covers the so-called terminal protein and the spacer region of the protein. This domain is always associated with and .\ ' '1814' 'IPR007238' '\ DNA primase is the polymerase that synthesises small RNA primers for the Okazaki fragments made during discontinuous DNA replication. 
DNA primase is a heterodimer of two subunits, the small subunit Pri1 (48 kDa in yeast), and the large subunit Pri2 (58 kDa in the yeast Saccharomyces cerevisiae) PUBMED:2528682. Both subunits participate in the formation of the active site, but the ATP binding site is located on the small subunit PUBMED:2023935. Primase function has also been demonstrated for human and mouse primase subunits PUBMED:8026492.\ ' '1815' 'IPR002755' '\

    DNA primase PUBMED:2023935 synthesizes the RNA primers for the Okazaki\ fragments in lagging strand DNA synthesis. DNA primase is a heterodimer of large (p60) and small (p50) subunits in eukaryotes. This family represents sequences of the small subunit and the DNA primase sequences of the Archaea PUBMED:10536154. No sequence similarity can be detected between the eukaryotic p50 and p60 subunits and the primases purified from bacteriophage and bacteria, .

    \ ' '1816' 'IPR006591' '\

    DNA-dependent RNA polymerase catalyzes the transcription of DNA into RNA using the four ribonucleoside triphosphates as substrates. Each class of RNA polymerase is assembled from 9 to 15 different polypeptides. Rbp10 (RNA polymerase CX) is a domain found in RNA polymerase subunit 10; present in RNA polymerase I, II and III.

    \ \

    Regarding the function, it shows xylanase activity as well as alpha-L-arabinofuranosidase activity and is involved in xylan degradation, endohydrolysis of 1,4-beta-D-xylosidic linkages in xylans PUBMED:1938968.

    \ \ ' '1817' 'IPR002205' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (; topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils PUBMED:7980433.

    \

    Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA PUBMED:8982450.

    \

    Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication PUBMED:16023670. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.

    \

    This entry represents subunit A (gyrA and parC) of bacterial gyrase and topoisomerase IV, and the equivalent C-terminal region in eukaryotic topoisomerase II composed of a single polypeptide. This subunit has DNA-binding capacity.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '1818' 'IPR002939' '\

    Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolyzing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold PUBMED:8016869. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.

    Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation PUBMED:15063739. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.

    This domain consists of the C-terminal region of the DnaJ protein. Although the function of this region is unknown, it is always found associated with and .\

    \ ' '1819' 'IPR001305' '\

    Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolyzing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold PUBMED:8016869. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.

    Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation PUBMED:15063739. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.

    \ ' '1820' 'IPR004947' '\ Deoxyribonuclease II () hydrolyses DNA under acidic conditions with a preference for double-stranded DNA. It catalyses the endonucleolytic cleavage of DNA to 3\'-phosphomononucleotide and 3\'-phosphooligonucleotide end-products. The enzyme may play a role in apoptosis.\ This family also includes hypothetical proteins from Caenorhabditis elegans.\ ' '1821' 'IPR002624' '\ This family consists of various deoxynucleoside kinases including cytidine (), guanosine (), adenosine () and thymidine kinase (, which also phosphorylates deoxyuridine and deoxycytosine. These enzymes catalyse the production of deoxynucleotide 5\'-monophosphate from a deoxynucleoside, using ATP and yielding ADP in the process.\ ' '1823' 'IPR018242' '\

    Gram-positive, thermophilic anaerobes such as Clostridium thermocellum or Clostridium cellulolyticum secrete a highly active and thermostable cellulase complex (cellulosome) responsible for the degradation of crystalline cellulose PUBMED:2252383, PUBMED:1478480. The cellulosome contains at least 30 polypeptides; the majority of the enzymes are endoglucanases (), but there are also some xylanases (), beta-glucosidases () and endo-beta-1,3-1,4-glucanases ().

    \ \

    Complete sequence data for many of these enzymes has been obtained. A majority of these proteins contain a highly conserved type I dockerin domain of about 65 to 70 residues, which is generally (but not always) located in the C terminus. The dockerin domain is the binding partner of the cohesin domain (see ). The cohesin-dockerin interaction is the crucial interaction for complex formation in the cellulosome PUBMED:10390637. The dockerin domain contains a tandem repeat of two calcium-binding loop-helix motifs (distinct from EF-hand Ca-binding motifs). These motifs are about 24 amino acids in length. This entry represents these repeated Ca-binding motifs.

    \ ' '1824' 'IPR007249' '\ DopA is the founding member of the Dopey family and is required for correct cell morphology and spatiotemporal organisation of multicellular structures in the filamentous fungus Emericella nidulans (Aspergillus nidulans). DopA homologues are found in mammals. Saccharomyces cerevisiae DOP1 is essential for viability and affects cellular morphogenesis PUBMED:10931277.\ ' '1825' 'IPR007637' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents type II restriction enzymes such as DpnII (), which recognises the double-stranded unmethylated sequence GATC and cleaves before G-1 PUBMED:11133943.

    \ ' '1826' 'IPR007357' '\ This family appears to be related to DNA photolyases.\ ' '1827' 'IPR007884' '\ This family contains DREV protein homologues from several eukaryotes. The function of this protein is unknown PUBMED:11132146. However, these proteins appear to be related to other methyltransferases.\ ' '1828' 'IPR003787' '\ Four small, soluble proteins (DsrE, DsrF, DsrH and DsrC) are encoded in the dsr gene region of the phototrophic sulphur bacterium Chromatium vinosum D. The dsrAB genes encoding dissimilatory sulphite reductase are part of the gene cluster, dsrABEFHCMK. The remaining proteins that are encoded are a transmembrane protein (DsrM) with similarity to haem-b-binding polypeptides and a soluble protein (DsrK) resembling [4Fe-4S]-cluster-containing\ heterodisulphide reductase from methanogenic archaea. \ DsrE is a small soluble protein involved in intracellular sulphur reduction PUBMED:9695921.\ ' '1829' 'IPR002773' '\ Eukaryotic initiation factor 5A (eIF-5A), now considered to be an elongation factor (see ), contains an unusual amino acid, hypusine [N epsilon-(4-aminobutyl-2-hydroxy)lysine]. The first step in the\ post-translational formation of hypusine is catalysed by the enzyme\ deoxyhypusine synthase (DS) . The enzyme catalyses the following reaction:\ \ The modified version of eIF-5A, and DS, are required for eukaryotic cell proliferation PUBMED:9493264. The structure is known for this enzyme PUBMED:9493264 in complex with its NAD+ cofactor.\ ' '1830' 'IPR001853' '\ DSBA is a sub-family of the Thioredoxin family PUBMED:9149147. The efficient and correct folding of bacterial disulphide bonded proteins in vivo is dependent upon a class of periplasmic oxidoreductase proteins called DsbA, after the Escherichia coli enzyme. The bacterial protein-folding factor DsbA is the most oxidizing of the thioredoxin family. DsbA catalyses disulphide-bond formation during the folding of secreted proteins. 
The extremely oxidizing nature of DsbA has been proposed to result from either domain motion or stabilising active-site interactions in the reduced form. DsbA\'s highly oxidizing nature is a result of hydrogen bond, electrostatic and helix-dipole interactions that favour the thiolate over the disulphide at the active site PUBMED:9655827. In the pathogenic bacterium Vibrio cholerae, the DsbA homologue (TcpG) is responsible for the folding, maturation and secretion of virulence factors. \

    While the overall architecture of TcpG and DsbA is similar and the surface features are retained in TcpG, there are significant differences. For example, the kinked active site helix results from a three-residue loop in DsbA, but is caused by a proline in TcpG (making TcpG more similar to thioredoxin in this respect). Furthermore, the proposed peptide binding groove of TcpG is substantially shortened compared with that of DsbA due to a six-residue deletion. Also, the hydrophobic pocket of TcpG is shallower and the acidic patch is much less extensive than that of E. coli DsbA PUBMED:9149147.

    \ ' '1831' 'IPR003752' '\

    DsbB is a protein component of the pathway that leads to disulphide bond formation in periplasmic proteins of Escherichia coli and other bacteria. The DsbB protein oxidises the periplasmic protein DsbA, which in turn oxidises cysteines in other periplasmic proteins in order to make disulphide bonds PUBMED:8430071. DsbB acts as a redox potential transducer across the cytoplasmic membrane. It is a membrane protein which spans the membrane four times, with both the N- and C-termini of the protein in the cytoplasm. Each of the periplasmic domains of the protein has two essential cysteines. The two cysteines in the first periplasmic domain are in a Cys-X-Y-Cys configuration that is characteristic of the active site of other proteins involved in disulphide bond formation, including DsbA and protein disulphide isomerase PUBMED:7957076.

    \ ' '1832' 'IPR003834' '\ DsbA and DsbC, periplasmic proteins of Escherichia coli, are two key players involved in disulphide bond formation. DsbD generates a reducing source in the periplasm, which is required for maintaining proper redox conditions PUBMED:7628442. DipZ is essential for maintaining cytochrome c apoproteins in the correct conformations for the covalent attachment of haem groups to the appropriate pairs of cysteine residues PUBMED:7623667.\ ' '1833' 'IPR002836' '\

    This protein family is found in archaea and eukaryota. The human TFAR19 encodes a protein which shares significant homology to the corresponding proteins of species ranging from yeast to mice. TFAR19 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. TFAR19 may play a general role in the apoptotic process PUBMED:9920759. Also included in this family is a DNA-binding protein from the archaeon Methanobacterium thermoautotrophicum.

    \ ' '1834' 'IPR007215' '\

    The three proteins TusB, TusC, and TusD form a heterohexamer responsible for a sulphur relay reaction. In large numbers of proteobacterial species, this complex acts on a Cys-derived persulphide moiety, delivered by the cysteine desulphurase IscS to TusA, then to TusBCD. The activated sulphur group is then transferred to TusE (DsrC), then to MnmA (TrmU) for modification of an anticodon nucleotide in tRNAs for Glu, Lys, and Gln. The sulphur relay complex TusBCD is also found, under the designation DsrEFH, in phototrophic and chemotrophic sulphur bacteria, such as Chromatium vinosum. In these organisms, it seems the primary purpose is related to sulphur flux, such as oxidation from sulphide to molecular sulphur to sulphate.

    \

    DsrH is involved in oxidation of intracellular sulphur in the phototrophic sulphur bacterium C. vinosum DSMZ 180 PUBMED:9695921.

    \ ' '1835' 'IPR007834' '\ This family contains SEM1 and DSS1 which are short acidic proteins. In Saccharomyces cerevisiae, SEM1 is a regulator of both exocyst function and pseudohyphal differentiation PUBMED:9927667. Loss of DSS1 in Homo sapiens (human) has been associated with split hand/split foot malformations PUBMED:8782053.\ ' '1836' 'IPR000888' '\

    Deoxythymidine diphosphate (dTDP)-4-keto-6-deoxy-D-hexulose 3,5-epimerase (RmlC, ) converts dTDP-4-keto-6-deoxy-D-glucose to dTDP-4-keto-L-rhamnose in the biosynthesis of dTDP-L-rhamnose, which is an essential component of the bacterial cell wall.

    \

    The crystal structure of RmlC from Methanobacterium thermoautotrophicum was determined in the presence and absence of a substrate analogue. RmlC is a homodimer comprising a central jelly roll motif, which extends in two directions into longer beta-sheets. Binding of dTDP is stabilised by ionic interactions to the phosphate group and by a combination of ionic and hydrophobic interactions with the base. The active site, which is located in the centre of the jelly roll, is formed by residues that are conserved in all known RmlC sequence homologues. The active site is lined with a number of charged residues and a number of residues with hydrogen-bonding potentials, which together comprise a potential network for substrate binding and catalysis. The active site is also lined with aromatic residues\ which provide favorable environments for the base moiety of dTDP and potentially for the sugar moiety of the substrate PUBMED:10827167.

    \ ' '1837' 'IPR005636' '\

    This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs, including a DTXW motif after which the domain is named.

    \ ' '1838' 'IPR002803' '\

    Fructose-1,6-bisphosphatase (FBPase) catalyses the hydrolysis of D-fructose-1,6-bisphosphate (FBP) to D-fructose-6-phosphate (F6P) and orthophosphate, and is a key enzyme in gluconeogenesis PUBMED:9452458. Three different groups of FBPases have been identified in eukaryotes and bacteria (FBPase I-III) PUBMED:10986273. None of these groups have been found in archaea so far, though a new group of FBPases (FBPase IV) which also show inositol monophosphatase activity has recently been identified in archaea PUBMED:11062561.

    \ \

    Proteins in this entry are thought to represent a new group of FBPases (FBPase V) which are found in thermophilic archaea and a hyperthermophilic bacterium Aquifex aeolicus PUBMED:12065581. The characterised members of this group show strict substrate specificity for FBP and are suggested to be the true FBPase in these organisms PUBMED:12065581, PUBMED:15274916. A structural study suggests that FBPase V has a novel fold for a sugar phosphatase, forming a four-layer alpha-beta-beta-alpha sandwich, unlike the more usual five-layered alpha-beta-alpha-beta-alpha arrangement PUBMED:15274916. The arrangement of the catalytic side chains and metal ligands was found to be consistent with the three-metal ion assisted catalysis mechanism proposed for other FBPases.

    \ \ ' '1841' 'IPR008203' '\ This family includes short archaebacterial proteins of\ unknown function. Archaeoglobus fulgidus has twelve\ copies of this protein, with several being clustered\ together in the genome.\ ' '1842' 'IPR002808' '\

    This prokaryotic protein includes CbiZ, which is involved in the salvage pathway of cobinamide in archaea. Archaea convert adenosylcobinamide (AdoCbi) into adenosylcobinamide phosphate (AdoCbi-P) in two steps. First, the amidohydrolase activity of CbiZ cleaves off the aminopropanol moiety of AdoCbi yielding adenosylcobyric acid (AdoCby); second, AdoCby is converted into AdoCbi-P by the action of adenosylcobinamide-phosphate synthase (CbiB, ). Adenosylcobyric acid is an intermediate of the de novo coenzyme B12 biosynthetic route PUBMED:14990804.

    \ ' '1843' 'IPR002809' '\ This prokaryotic protein family has no known function. Members are predicted to be integral membrane proteins.\ ' '1844' 'IPR002810' '\

    The nfe genes (nfeA, nfeB, and nfeD) are involved in the nodulation efficiency and competitiveness of Rhizobium meliloti (Sinorhizobium meliloti) on alfalfa roots PUBMED:10830257. The specific function of this family is unknown, although it is unlikely that NfeD is specifically involved in nodulation, as the family contains several different archaeal and bacterial species, most of which are not symbionts.

    \ \

    This entry describes archaeal and bacterial proteins which are variously described; examples include nodulation protein, nodulation efficiency protein D (nfeD), hypothetical protein and membrane-bound serine protease (ClpP class). A number of these proteins are classified in MEROPS peptidase family S49 as non-peptidase homologues or as unassigned peptidases.

    \ ' '1845' 'IPR002811' '\

    This group contains aspartate dehydrogenases that belong to a unique class of amino acid\ dehydrogenases.

    The structure of Thermotoga maritima TM1643 has been found to\ contain an N-terminal Rossmann fold domain (which binds the NAD(+) cofactor) and a C-terminal\ alpha/beta domain PUBMED:12496312. This suggested that TM1643 may be a dehydrogenase with the active\ site located at the interface between the two domains. Enzymatic characterisation of TM1643 revealed that it\ possesses NAD- or NADP-dependent dehydrogenase activity toward L-aspartate but no aspartate oxidase activity\ PUBMED:12496312. The product of the aspartate dehydrogenase activity is also iminoaspartate. It has\ been suggested that two different enzymes, an oxidase and a dehydrogenase, may have evolved to catalyse the\ first step of NAD biosynthesis PUBMED:12496312. Members of this group share some structural\ similarity to several other NAD(P)+-dependent oxidoreductases, including inositol 1-phosphate\ synthase, dihydrodipicolinate reductase, and ASA-DH PUBMED:12496312.

    It has been proposed that\ in Thermotoga maritima, TM1643 catalyses the first reaction of de novo biosynthesis\ of NAD from aspartate, and it produces iminoaspartate required for this pathway. The formation of an enzyme\ complex between TM1643 and NadA, the next enzyme of the pathway, may allow the channeling of this unstable\ product directly to the NadA active site PUBMED:12496312.

    The same domain is present in animals\ (e.g., Caenorhabditis elegans F17C8.3 protein).

    \ ' '1846' 'IPR002812' '\

    3-Dehydroquinate synthase () is an enzyme in the common pathway of aromatic amino acid biosynthesis that catalyses the conversion of 3-deoxy-D-arabino-heptulosonic acid 7-phosphate (DAHP) into 3-dehydroquinic acid PUBMED:11173489. This synthesis of aromatic amino acids is an essential metabolic function for most prokaryotic as well as lower eukaryotic cells, including plants. The pathway is absent in humans; therefore, DHQS represents a potential target for the development of novel and selective antimicrobial agents. Owing to the threat posed by the spread of pathogenic bacteria resistant to many currently used antimicrobial drugs, there is clearly a need to develop new anti-infective drugs acting at novel targets. A further potential use for DHQS inhibitors is as herbicides PUBMED:11412967.

    \ ' '1847' 'IPR001434' '\

    This group of sequences is represented by a conserved region of about 53 amino acids shared between regions, usually repeated, of proteins from a small number of phylogenetically distant prokaryotes. Examples include a 132-residue region found repeated in three of the five longest proteins of Bacillus anthracis, a 131-residue repeat in a cell wall-anchored protein of Enterococcus faecalis (Streptococcus faecalis), and a 120-residue repeat in Methanobacterium thermoautotrophicum. A similar region is found in some Chlamydia trachomatis outer membrane proteins.

    \ \

    In C. trachomatis, three cysteine-rich proteins (also believed to be lipoproteins), MOMP, OMP6 and OMP3, make up the extracellular matrix of the outer membrane PUBMED:2287277. They are involved in the essential structural integrity of both the elementary body (EB) and reticulate body (RB) phases. They are thought to be involved in porin formation and, as these bacteria lack the peptidoglycan layer common to most Gram-negative microbes, such proteins are highly important in the pathogenicity of the organism.

    \ ' '1849' 'IPR002822' '\

    The proteins in this family have no known function.

    \ ' '1850' 'IPR002823' '\ Members of this prokaryotic family have no known function. Members are predicted to be integral membrane proteins and are similar to a protein in a tartrate utilisation region (TAR) of Agrobacterium vitis a common pathogen of grapevine. Most grapevine strains utilise tartrate, an abundant compound in grapevine PUBMED:8672817.\ ' '1852' 'IPR002825' '\

    The function of the archaebacterial proteins in this family is unknown.

    \ ' '1853' 'IPR002826' '\

    The prokaryotic proteins in this family have no known function.

    \ ' '1854' 'IPR002829' '\ These archaeal and bacterial proteins have no known function.\ Members of this family contain seven conserved cysteines and\ may also be an integral membrane protein.\ ' '1855' 'IPR002831' '\

    TrmB is a protein with an apparent molecular weight of 38,800 that is involved in the maltose-specific regulation of the trehalose/maltose ABC transport operon in Thermococcus litoralis. TrmB has been shown to be a maltose-specific repressor, and this inhibition is counteracted by maltose and trehalose. TrmB binds maltose and trehalose half-maximally at 20 µM and 0.5 mM sugar concentration, respectively PUBMED:12426307. Other members of this family are annotated as either transcriptional regulators or hypothetical proteins.

    \ ' '1856' 'IPR002834' '\

    This entry represents an archaeal riboflavin kinase. The structure and activity of the previously uncharacterised protein Mj0056 from Methanocaldococcus jannaschii (Methanococcus jannaschii) () has been determined. It represents a riboflavin kinase with an N-terminal nucleic acid binding domain related to the AbrB-like superfamily of transcription factors. The riboflavin kinase activity is unusual in that it requires either CTP or UTP as the phosphate donor, with UTP being at least one order of magnitude less efficient. At reaction temperatures of up to 85 °C, the temperature of the natural habitat of M. jannaschii, riboflavin was completely converted to FMN PUBMED:18073108.\

    \ ' '1857' 'IPR002835' '\

    Members of this protein family are the CofC enzyme of coenzyme F420 biosynthesis.

    \ ' '1858' 'IPR002837' '\

    This archaebacterial domain has no known function. In Methanocaldococcus jannaschii (Methanococcus jannaschii) it occurs with an endonuclease domain .

    \ ' '1859' 'IPR002838' '\

    The proteins in this family have no known function.

    \ ' '1860' 'IPR008217' '\

    Proteins containing this entry have no known function and are predicted to be integral membrane proteins. They include the Ccc1 protein from Saccharomyces cerevisiae (Baker\'s yeast) () that may have a role in regulating calcium levels PUBMED:7941738.

    \ ' '1861' 'IPR002840' '\

    These archaebacterial proteins have no known function.

    \ ' '1862' 'IPR002845' '\

    This entry represents tRNA ribose 2\'-O-methyltransferase aTrm56, which specifically catalyzes the AdoMet-dependent 2\'-O-ribose methylation of cytidine at position 56 in tRNAs.

    \

    The crystal structure of Pyrococcus horikoshii aTrm56 complexed with S-adenosyl-L-methionine has been determined to 2.48 Å resolution. aTrm56 consists of the SPOUT domain, which contains the characteristic deep trefoil knot, and a unique C-terminal beta-hairpin PUBMED:18068186.

    \

    A conserved cytidine at position 56 of tRNA contributes to the maintenance of the L-shaped tertiary structure. aTrm56 catalyzes the 2\'-O-methylation of the cytidine residue in archaeal tRNA, using S-adenosyl-L-methionine. Biochemical assays showed that aTrm56 forms a dimer and prefers the L-shaped tRNA to the lambda form as its substrate PUBMED:15987815, PUBMED:16164996.

    \ \ ' '1863' 'IPR002846' '\ These archaebacterial proteins have no known function.\ The domain is found duplicated in some sequences.\ ' '1864' 'IPR002847' '\

    This entry contains F420-0:gamma-glutamyl ligase and related proteins. F420-0:gamma-glutamyl ligase catalyzes the GTP-dependent successive addition of multiple gamma-linked L-glutamates to the L-lactyl phosphodiester of 7,8-didemethyl-8-hydroxy-5-deazariboflavin (F420-0) to form polyglutamated F420 derivatives PUBMED:2110564, PUBMED:12867481, PUBMED:8577249, PUBMED:15215601.

    \ ' '1865' 'IPR002485' '\ This domain is found in nematode proteins. It is currently\ of unknown function.\ ' '1866' 'IPR003326' '\ This domain is found in a family of proteins from Caenorhabditis elegans. The domain has no known function, but has 4 conserved cysteine residues and is a maximum of 175 residues long.\ ' '1867' 'IPR002849' '\ This archaebacterial protein family has no known function.\ The proteins are predicted to contain two transmembrane\ helices.\ ' '1868' 'IPR002852' '\

    The bacterial and archaeal proteins in this family have no known function.

    \ ' '1871' 'IPR002855' '\

    The archaeal proteins in this family have no known function.

    \ ' '1872' 'IPR003341' '\ This signature describes a cysteine repeat C-X3-C-X3-C the function of which is unknown as is the function of the proteins in which they occur. Most of the sequences in this group are from Caenorhabditis elegans and Caenorhabditis briggsae.\ ' '1873' 'IPR003453' '\ This domain has no known function nor do any of the proteins that possess it. The aligned region is approximately 150 amino acids long.\ ' '1874' 'IPR003366' '\ This domain is found in a family of hypothetical Caenorhabditis elegans proteins. The aligned region has no known function nor do any of the proteins which possess it. However, this domain is related to the CUB domain (). The aligned region is approximately 130 amino acids long and contains two conserved cysteine residues.\ \ \ ' '1875' 'IPR004394' '\ The gene iojap is a pattern-striping gene in maize, reflecting a chloroplast development defect in some cells. Maize has two RNA polymerases in plastids, but the plastid-encoded one, similar to bacterial RNA polymerases, is missing in iojap mutants. The role of iojap in chloroplast development, and the role of its bacterial orthologs modeled here, is unclear.\ ' '1876' 'IPR003458' '\

    This family contains Bacteriophage T4 gp38 and related bacterial prophage and phage proteins. Gene 38 of phage T4 codes for a protein containing 183 amino acid residues with a molecular weight of 22.3 kDa. Together with genes 36 and 37, whose products are structural proteins of the fibre distal part, gene 38 forms one transcription unit. Gp38 is a chaperone that is required for assembly of the distal part of the long fibres and is absent from the mature phage particle. In the absence of gp38, gp37, a component of the distal part of the long tail fibre, fails to oligomerise. The carboxy-terminal region of gp37 forms the tip of the distal fibre that interacts with the cell receptors. Functionally, the role of gp38 can be replaced by pTfa of Bacteriophage lambda PUBMED:8892827, PUBMED:1531648, PUBMED:14625682.

    \

    The functions of many of the other members of this family remain to be elucidated.

    \ \ ' '1877' 'IPR003368' '\

    This repeat is found in several Chlamydia polymorphic membrane proteins (Pmps). Chlamydia pneumoniae is an obligate intracellular bacterium and a common human pathogen causing infection of the upper and lower respiratory tract. In Pmps the tetrapeptide GGA(I/V/L) motif is commonly repeated several times in the N-terminal part. The C-terminal half is characterised by conserved tryptophans and a carboxy-terminal phenylalanine. A signal peptide leader sequence is predicted in C. pneumoniae Pmps, which indicates an outer membrane localisation PUBMED:10587946. Pmp10 and Pmp11 contain a signal peptidase II cleavage site suggesting lipid modification. The C. pneumoniae pmp genes represent 17.5% of the chlamydia-specific coding capacity and they are all transcribed during chlamydial growth but the function of Pmps remains unknown PUBMED:11583841.

    \ ' '1879' 'IPR003390' '\

    The DisA protein is a bacterial checkpoint protein that dimerises into an octameric complex. The protein consists of three distinct domains. This domain is the first and is a globular, nucleotide-binding region; the next 146-289 residues constitute the DisA-linker family, which consists of an elongated bundle of three alpha helices (alpha-6, alpha-10, and alpha-11), one side of which carries an additional three helices (alpha-7 to alpha-9), and which thus forms a spine-like linker between domains 1 and 3. The C-terminal residues of domain 3 are represented by family HHH, the specific DNA-binding domain. The octameric complex thus has structurally linked nucleotide-binding and DNA-binding HhH domains, and the nucleotide-binding domains are bound to a cyclic di-adenosine phosphate such that DisA is a specific di-adenylate cyclase. The di-adenylate cyclase activity is strongly suppressed by binding to branched DNA, but not to duplex or single-stranded DNA, suggesting a role for DisA as a monitor of the presence of stalled replication forks or recombination intermediates via DNA structure-modulated c-di-AMP synthesis.

    \ ' '1880' 'IPR003677' '\ This domain has no known function.\ ' '1881' 'IPR003728' '\

    This entry describes proteins of unknown function.

    \ ' '1882' 'IPR003729' '\

    This entry describes proteins of unknown function. The structure has been determined for one member of this group, the hypothetical protein TM0160 from Thermotoga maritima, which was found to consist of a duplication of two beta(3)-alpha(2) structural repeats, forming a single barrel-like beta-sheet PUBMED:.

    \ ' '1883' 'IPR003730' '\

    Laccases are multi-copper oxidoreductases able to oxidise a wide variety of phenolic and non-phenolic compounds and are widely distributed among both prokaryotes and eukaryotes. There are two main active catalytic sites with conserved histidines that are capable of binding four copper atoms PUBMED:16740638.

    \ ' '1884' 'IPR003731' '\

    This entry represents several Nif (B, X and Y) proteins, which are involved in the biosynthesis of the iron-molybdenum cofactor (FeMo-co) found in the dinitrogenase enzyme of the nitrogenase complex in nitrogen-fixing bacteria. The nitrogenase complex catalyses the reduction of atmospheric dinitrogen to ammonia, and is composed of an iron metalloprotein (dinitrogenase reductase; homodimer of NifH; ) and a Fe-Mo metalloprotein (dinitrogenase; heterotetramer of NifD and NifK; ). The pathway for the synthesis of the Fe-Mo cofactor involves several proteins, including NifB, NifE, NifH, NifN, NifQ, NifV and NifX. NifB appears to be an iron-sulphur source for FeMo-co biosynthesis, while NifX may be associated with the mature FeMo-co, in particular with the addition of homocitrate during the last step of biosynthesis PUBMED:11279153. The NifX protein shows sequence similarity with the C-terminus of NifB PUBMED:12892890, as well as to the conserved protein MTH1175 from the archaeon Methanobacterium thermoautotrophicum, which displays a ribonuclease H-like motif of three layers, alpha/beta/alpha, with a single mixed beta-sheet PUBMED:12836677.

    \ ' '1885' 'IPR003734' '\

    This entry describes proteins of unknown function.

    \ ' '1886' 'IPR003737' '\

    A number of the members of this family have been characterised as a probable N-acetylglucosaminyl-phosphatidylinositol de-N-acetylase, () that catalyses the second step in glycosylphosphatidylinositol (GPI) biosynthesis PUBMED:10085243, PUBMED:12958317.

    \ ' '1887' 'IPR003738' '\

    This entry describes proteins of unknown function.

    \ ' '1888' 'IPR002862' '\

    Proteins that contain this domain are of unknown function. It appears to occur towards the C-terminus of proteins from Mycoplasma pneumoniae PUBMED:8948633.

    \ ' '1889' 'IPR003741' '\

    This entry describes proteins of unknown function.

    \ ' '1890' 'IPR003742' '\

    Proteins in this family are predicted to be SPOUT methyltransferases PUBMED:17338813.

    \ ' '1891' 'IPR003743' '\

    This entry describes proteins of unknown function.

    \ ' '1892' 'IPR003744' '\ This is a family of uncharacterised proteins. Conserved regions of hydrophobicity suggest that all members of the family may be integral\ membrane proteins. \ ' '1893' 'IPR003745' '\

    This entry describes proteins of unknown function.

    \ ' '1894' 'IPR003746' '\

    This entry describes proteins of unknown function. Structures for two of these proteins, YggU from Escherichia coli and MTH637 from the archaea Methanobacterium thermoautotrophicum, have been determined; they have a core 2-layer alpha/beta structure consisting of beta(2)-loop-alpha-beta(2)-alpha PUBMED:12975589, PUBMED:11854485.

    \ ' '1895' 'IPR003748' '\

    This entry describes proteins of unknown function.

    \ ' '1896' 'IPR003750' '\

    This entry describes proteins of unknown function.

    \ ' '1897' 'IPR003756' '\

    This entry describes proteins of unknown function.

    \ ' '1898' 'IPR003768' '\

    This family represents ScpA, which along with ScpB () interacts with SMC in vivo, forming a complex that is required for chromosome condensation and segregation PUBMED:12065423, PUBMED:12897137. The SMC-Scp complex appears to be similar to the MukB-MukE-MukF complex in Escherichia coli PUBMED:10545099, where MukB () is the homologue of SMC. Although ScpA and ScpB have little sequence similarity to MukE () or MukF (), they are predicted to be structurally similar, being predominantly alpha-helical with coiled coil regions.

    \ \ \

    In general, scpA and scpB form an operon in most bacterial genomes. Flanking genes are highly variable, suggesting that the operon has moved throughout evolution. Bacteria containing an smc gene also contain scpA or scpB but not necessarily both. An exception is found in Deinococcus radiodurans, which contains scpB but neither smc nor scpA. In the archaea the gene order SMC-ScpA is conserved in nearly all species, as is the very short distance between the two genes, indicating co-transcription of both in different archaeal genera and arguing that interaction of the gene products is not confined to the homologues in Bacillus subtilis. It would seem probable that, in light of all the studies, SMC, ScpA and ScpB proteins or homologues act together in chromosome condensation and segregation in all prokaryotes PUBMED:12100548.

    \ ' '1899' 'IPR003769' '\

    In the bacterial cytosol, ATP-dependent protein degradation is performed by several different chaperone-protease pairs, including ClpAP. ClpS directly influences the ClpAP machine by binding to the N-terminal domain of the chaperone ClpA. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins PUBMED:11931773.

    \

    ClpS is a small alpha/beta protein that consists of three alpha-helices connected to three antiparallel beta-strands PUBMED:12426582. The protein has a globular shape, with a curved layer of three antiparallel alpha-helices over a twisted antiparallel beta-sheet. Dimerization of ClpS may occur through its N-terminal domain. This short extended N-terminal region in ClpS is followed by the central seven-residue beta-strand, which is flanked by two other beta-strands in a small beta-sheet.

    \ ' '1900' 'IPR003770' '\

    This family contains several aminodeoxychorismate lyase proteins. Aminodeoxychorismate lyase is a pyridoxal 5\'-phosphate-dependent enzyme that converts 4-aminodeoxychorismate to pyruvate and p-aminobenzoate, a precursor of folic acid in bacteria PUBMED:11011151.

    \ \ ' '1902' 'IPR003772' '\

    This entry describes proteins of unknown function.

    \ ' '1903' 'IPR003773' '\

    This entry describes proteins of unknown function.

    \ ' '1904' 'IPR003774' '\

    This entry describes proteins of unknown function.

    \ ' '1905' 'IPR003775' '\

    The protein BSU35380 from Bacillus subtilis (renamed FliW) was characterised as a flagellar assembly factor involved in bacterial flagellum biogenesis. Experimental characterization was also carried out in Treponema pallidum (TP0658). In Campylobacter jejuni, Cj1075 has been shown to be involved in motility and flagellin biosynthesis. The two paralogs in Helicobacter pylori 26695 (HP1154 and HP1377) were found to be able to bind to flagellin. For additional reading see PUBMED:16936039, PUBMED:11895937, PUBMED:11196647.

    \ ' '1906' 'IPR003776' '\

    This domain comprises the whole of a protein in Methanocaldococcus jannaschii (Methanococcus jannaschii) and Methanobacterium thermoautotrophicum, all but the N-terminal 60 residues from a protein of Mycobacterium tuberculosis, and all but the C-terminal 180 residues from a protein in Haemophilus influenzae and Escherichia coli, among proteins from published complete genomes.

    \ ' '1907' 'IPR003788' '\

    This entry describes proteins of unknown function.

    \ ' '1908' 'IPR003790' '\

    This entry describes proteins of unknown function.

    \ ' '1909' 'IPR002542' '\ This domain has no known function. It is found in one or two\ copies in several Caenorhabditis elegans proteins. It is\ roughly 130 amino acids and contains 12 conserved\ cysteines.\ ' '1911' 'IPR003795' '\

    This entry describes proteins of unknown function.

    \ ' '1912' 'IPR003797' '\ This family of proteins is related to DegV of Bacillus subtilis and includes paralogous sets\ in several species (B. subtilis, Deinococcus radiodurans, Mycoplasma pneumoniae) that\ are closer in percent identity to each other than to most homologs from other species. This\ suggests both recent paralogy and diversity of function.\ ' '1913' 'IPR003798' '\

    This family contains several bacterial RmuC DNA recombination proteins. The function of the RmuC protein is unknown, but it is suspected to be either a structural protein that protects DNA against nuclease action or itself involved in DNA cleavage at regions of DNA secondary structure PUBMED:10886369. Proteins in this family are predicted to contain a central endonuclease-like fold domain, surrounded by coiled coils, consistent with a direct role in DNA cleavage PUBMED:15972856.

    \ ' '1915' 'IPR003801' '\

    This entry describes proteins of unknown function.

    \ ' '1916' 'IPR003802' '\

    This entry represents the sporulation regulator WhiA. The N-terminal domain of WhiA is related to the LAGLIDADG homing endonuclease domain PUBMED:10986251, while the C-terminal is predicted to be a DNA binding helix-turn-helix domain PUBMED:17603302.

    \ ' '1918' 'IPR003806' '\

    The ATP-grasp fold is one of several distinct ATP-binding folds, and is found in enzymes that catalyze the formation of amide bonds, catalyzing the ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule PUBMED:9416615. This fold is found in many different enzyme families, including various peptide synthetases, biotin carboxylase, synapsin, succinyl-CoA synthetase, pyruvate phosphate dikinase, and glutathione synthetase, amongst others PUBMED:12392708. These enzymes contribute predominantly to macromolecular synthesis, using ATP-hydrolysis to activate their substrates.

    \

    The ATP-grasp fold shares functional and structural similarities with the PIPK (phosphatidylinositol phosphate kinase) and protein kinase superfamilies. The ATP-grasp domain consists of two subdomains with different alpha+beta folds, which grasp the ATP molecule between them. Each subdomain provides a variable loop that forms part of the active site, with regions from other domains also contributing to the active site, even though these other domains are not conserved between the various ATP-grasp enzymes PUBMED:7862655.

    \ \

    This entry describes a type of ATP-grasp fold that is found in a set of proteins of unknown function.

    \ ' '1919' 'IPR003807' '\

    This entry describes proteins of unknown function.

    \ ' '1920' 'IPR003810' '\ Uncharacterised domain in proteins of unknown function.\ ' '1921' 'IPR003811' '\

    This entry describes proteins of unknown function found only in bacteria. It may be a multi-pass membrane protein.

    \ ' '1922' 'IPR003826' '\

    Polyamines such as spermidine and spermine are essential for cellular growth under most conditions, being implicated in a large number of cellular processes including DNA, RNA and protein synthesis. S-adenosylmethionine decarboxylase (AdoMetDC) plays an essential regulatory role in the polyamine biosynthetic pathway by generating the n-propylamine residue required for the synthesis of spermidine and spermine from putrescine PUBMED:2197977, PUBMED:10047786. Unlike many amino acid decarboxylases, AdoMetDC uses a covalently bound pyruvate residue as a cofactor rather than the more common pyridoxal 5\'-phosphate. These proteins can be divided into two main groups which show little sequence similarity either to each other, or to other pyruvoyl-dependent amino acid decarboxylases: class I enzymes found in bacteria and archaea, and class II enzymes found in eukaryotes. In both groups the active enzyme is generated by the post-translational autocatalytic cleavage of a precursor protein. This cleavage generates the pyruvate precursor from an internal serine residue and results in the formation of two non-identical subunits termed alpha and beta which form the active enzyme.

    \

    Members of this family are related to the amino terminus of Escherichia coli S-adenosylmethionine decarboxylase.

    \ ' '1923' 'IPR003827' '\

    The methyltransferase TYW3 (tRNA-yW-synthesising protein 3) has been identified in yeast to be involved in wybutosine (yW) biosynthesis PUBMED:16642040. yW is a complexly modified guanosine residue that contains a tricyclic base and is found at the 3\'-position adjacent to the anticodon of phenylalanine tRNA. TYW3 is an N-4 methylase that methylates yW-86 to yield yW-72 in an Ado-Met-dependent manner PUBMED:16642040.

    \ ' '1924' 'IPR003828' '\

    This entry describes proteins of unknown function.

    \ ' '1925' 'IPR003829' '\

    This entry represents the N-terminal domain of Pirin proteins from both eukaryotes and prokaryotes.

    \ \

    The function of Pirin is unknown, but the gene coding for this protein is known to be expressed in all tissues in the human body, most strongly in the liver and heart. Pirin is known to be a nuclear protein, exclusively localised within the nucleoplasm and predominantly concentrated within dot-like subnuclear structures PUBMED:9079676.

    \ \

    Pirin is composed of two structurally similar domains arranged face to face. The N-terminal domain additionally features four beta-strands, and the C-terminal domain also includes four additional beta-strands and a short alpha-helix. Although the two domains are similar, the C-terminal domain of Pirin differs from the N-terminal domain as it does not contain a metal binding site and its sequence does not contain the conserved metal-coordinating residues PUBMED:1457596.

    \ \

    Pirin is confirmed to be a member of the cupin superfamily on the basis of primary sequence and structural similarity. The presence of a metal binding site in the N-terminal beta-barrel of Pirin may be significant in its role in regulating NFI DNA replication and NF-kappaB transcription factor activity PUBMED:1457596.

    \ \

    Pirin structure has been found to closely resemble members of the cupin superfamily. Pirin contains the two characteristic sequences of the cupin superfamily, namely PG-(X)5-HXH-(X)4-E-(X)6-G and G-(X)5-PXG-(X)2-H-(X)3-N separated by a variable stretch of 15-50 amino acids. These motifs are best conserved in the N-terminal where the conserved histidine and glutamic acid residues correspond to the metal-coordinating residues. The C-terminal domain motifs lack the metal binding residues normally associated with the cupin fold PUBMED:1457596.

    \ \

    Pirin was identified as a metal-binding protein PUBMED:14573596, and the metal-binding residues of Pirins were found to be highly conserved across mammals, plants, fungi, and prokaryotic organisms. Pirin acts as a cofactor for the transcription factor NFI, the regulatory mechanism of which is generally believed to require the assistance of a metal ion PUBMED:12426136. Structural data support the hypothesis that the bound iron of Pirin may participate in this transcriptional regulation by enhancing and stabilising the formation of the p50,Bcl3,DNA complex PUBMED:14573596. Metals have been implicated directly or indirectly in the NF-kappaB family of transcription factors that control expression of a number of early response genes associated with inflammatory responses, cell growth, cell cycle progression, and neoplastic transformation PUBMED:12426136. However, most metal-dependent transcription factors are DNA-binding proteins that bind to specific sequences when the metal binds to the protein. Pirin, on the other hand, appears to function differently and bind to the transcription factor-DNA complex PUBMED:14573596.

    \ \ ' '1926' 'IPR002550' '\ This transmembrane region has no known function. Many of the sequences in this family are annotated as hemolysins, however this is due to a similarity to that does not contain this domain. This domain is found in the N terminus of the proteins adjacent to two intracellular CBS domains ().\ ' '1927' 'IPR003830' '\

    Methanogenic archaea produce methane via the anaerobic reduction of acetate or single carbon compounds PUBMED:12440773. Coenzyme M (CoM; 2-mercaptoethanesulphonic acid) serves as the terminal methyl carrier for this process. Previously thought to be unique to methanogenic archaea, CoM has also been found in methylotrophic bacteria.

    \ \

    Biosynthesis of CoM begins with the Michael addition of sulphite to phosphoenolpyruvate, forming 2-phospho-3-sulpholactate (PSL). This reaction is catalyzed by members of this family, PSL synthase (ComA) PUBMED:11830598. Subsequently, PSL is dephosphorylated by phosphosulpholactate phosphatase (ComB) to form 3-sulpholactate PUBMED:11589710, which is then converted to \ 3-sulphopyruvate by L-sulpholactate dehydrogenase (ComC; ) PUBMED:10850983. Sulphopyruvate decarboxylase (ComDE; ) converts 3-sulphopyruvate to sulphoacetaldehyde PUBMED:10940029. Reductive thiolation of sulphoacetaldehyde is the final step.

    \ ' '1928' 'IPR003831' '\

    This entry describes proteins of unknown function.

    \ ' '1929' 'IPR003832' '\

    This family is related to the acid phosphatase/vanadium-dependent haloperoxidases; members of this group are uncharacterised.

    \ ' '1932' 'IPR003847' '\

    This entry describes proteins of unknown function.

    \ ' '1933' 'IPR003848' '\

    This domain of unknown function is found in several uncharacterised proteins.

    \ ' '1934' 'IPR002572' '\ This region is found in 1 to 3 copies in archaeal proteins whose function is unknown. It only appears in multiple copies in proteins from Archaeoglobus fulgidus.\ ' '1935' 'IPR003863' '\

    This entry consists of several Arabidopsis thaliana hypothetical proteins, none of which have any known function. They contain a conserved region with two cysteine residues.

    \ ' '1936' 'IPR003864' '\ This domain is found in a family of hypothetical transmembrane proteins, none of which have any known function. The aligned region is at most 538 residues long.\ ' '1937' 'IPR003870' '\ This domain is found in a family of hypothetical proteins, mostly from Mycobacterium tuberculosis, which includes a putative transposase.\ ' '1938' 'IPR003871' '\

    This domain is found exclusively in Arabidopsis thaliana (Mouse-ear cress) proteins; its function has not been characterised, but it may be involved in nucleic acid or nucleotide binding.

    \ ' '1940' 'IPR004180' '\ This family of proteins are found in Borrelia burgdorferi and Borrelia garinii. The proteins are about 190 amino acids long and have no known function.\ ' '1941' 'IPR004239' '\

    This group comprises proteins of unknown function from Borrelia burgdorferi, the causative organism of Lyme disease.

    \ ' '1942' 'IPR004251' '\

    This is a poxvirus protein family of unknown function.

    \ ' '1943' 'IPR004256' '\

    This represents a C-terminal domain of unknown function, usually fused to a prokaryotic putative DEXX-box ATPase domain () PUBMED:9045616.

    \ ' '1944' 'IPR004296' '\

    This entry contains Caenorhabditis proteins of unknown function which contain a common C-terminal domain.

    \ ' '1945' 'IPR004306' '\ This domain is found entirely in Mycoplasma pneumoniae proteins of unknown function. Another related domain () is found entirely in mycoplasmal proteins of the MG032/MG096/MG288 family and both domains often occur together.\ ' '1947' 'IPR002577' '\

    The hxlR-type HTH domain is a domain of ~90-100 amino acids present in putative transcription regulators with a winged helix-turn-helix (wHTH) structure. The domain is named after Bacillus subtilis hxlR, a transcription activator of the hxlAB operon involved in the detoxification of formaldehyde PUBMED:10572115. The hxlR-type domain forms the core of putative transcription regulators and of hypothetical proteins occurring in eubacteria as well as in archaea. The sequence and structure of hxlR-type proteins show similarities with the marR-type wHTH PUBMED:11839496.

    \

    \ The crystal structure of ytfH resembles the DNA-binding domains of winged helix proteins, containing a three helix (H) bundle and a three-stranded antiparallel beta-sheet (B) in the topology: H1-H2-B1-H3-H4-B2-B3-H5-H6. This topology corresponds with that of the marR-type DNA-binding domain, wherein helices 3 and 4 comprise the helix-turn-helix motif and the beta-sheet is called the wing.

    \ ' '1948' 'IPR004319' '\ This domain is found entirely in Mycoplasma pneumoniae proteins of unknown function. Another related domain () is also found entirely in mycoplasmal proteins of the MG032/MG096/MG288 family and both domains often occur together.\ ' '1949' 'IPR004335' '\ Many of the proteins in this entry are Borrelia burgdorferi plasmid proteins of unknown function.\ ' '1950' 'IPR004347' '\

    This domain represents the C-terminal region of Orf6, which is localised upstream of the 20S proteasome subunit genes, prcA and prcB, in members of the Actinobacteria: Streptomyces coelicolor PUBMED:9765579, Frankia sp. ACN14a/ts-r PUBMED:10652097 and Rhodococcus erythropolis PUBMED:7583123.

    \ ' '1952' 'IPR004858' '\ This entry represents multigene family 530 proteins from African swine fever virus (ASFV). These proteins may be involved in promoting survival of infected macrophages PUBMED:11238833.\ ' '1953' 'IPR004853' '\ This family consists entirely of aligned regions from Drosophila melanogaster proteins. contains three repeats of this region. In other proteins, the aligned region is located towards the C-terminus. The function of the aligned region is unknown.\ ' '1954' 'IPR004859' '\ Signatures of this entry align residues towards the N-terminus of several proteins with multiple functions. The members of this family all appear to\ possess 5\'-3\' exonuclease activity. Thus, the aligned region may be necessary for 5\'-3\' exonuclease function.\ ' '1955' 'IPR004861' '\

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPase; ) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation PUBMED:9818190, PUBMED:14625689. The PTP superfamily can be divided into four subfamilies PUBMED:12678841:

    \

    \

    Based on their cellular localisation, PTPases are also classified as:

    \

    \

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif PUBMED:9646865. Functional diversity between PTPases is endowed by regulatory domains and subunits.

    \ \

    This entry represents protein-tyrosine phosphatases predominantly from fungi, plants and bacteria, several of which are putative enzymes. These proteins are closely related to the Y-phosphatase and DSPc families. This entry includes the PTPase SIW14 from Saccharomyces cerevisiae (Baker\'s yeast), which plays a role in actin filament organization and endocytosis.

    \ ' '1956' 'IPR004879' '\

    This is a group of uncharacterised proteins.

    \ ' '1958' 'IPR005489' '\

    This family of proteins is of unknown function.

    \ ' '1959' 'IPR004881' '\

    This protein has been shown to cleave GTP and remain bound to GDP PUBMED:12220175, and acts as an unusual circularly permuted GTPase that catalyzes rapid hydrolysis of GTP with a slow catalytic turnover. A role as a regulator of translation has been suggested PUBMED:14973029. The Aquifex aeolicus ortholog is split into consecutive open reading frames.

    \ ' '1960' 'IPR004884' '\ This is a group of uncharacterised proteins of unknown function.\ ' '1961' 'IPR004919' '\ This entry is found in prokaryotic proteins of unknown function.\ ' '1962' 'IPR004921' '\

    This family represents a group of terminase proteins and is found in a group of plasmid-encoded proteins specifically found in Borrelia that currently do not show any similarity to\ any other proteins outside the Borrelia genus. Proteins within this family are about 450 residues long and are found to be expanded in Borrelia burgdorferi (Lyme disease spirochete).

    \ ' '1963' 'IPR004948' '\ This family includes hypothetical ATP-binding proteins from prokaryotes.\ ' '1965' 'IPR004952' '\ This family includes several proteins of unknown function. Members of this family may be involved in nitrogen fixation, since they are found within nitrogen fixation operons.\ ' '1966' 'IPR004878' '\

    The otopetrins are a group of proteins that are restricted to the metazoa. The structure of otopetrin-1 () shows it to have 12 transmembrane domains, with three conserved sub-domains (OD-I to OD-III) PUBMED:18254951. Otopetrins modulate calcium homeostasis and influx of calcium in response to extracellular ATP.

    \

    The otopetrins are required for normal formation of otoconia/otoliths in the inner ear. Otoconia are minute biomineral particles embedded in a gelatinous membrane that overlies the sensory epithelium in the inner ear. Gravity and acceleration cause the otoconia to deflect the stereocilia of sensory hair cells. Otoconia are required for normal processing of information regarding spatial orientation and acceleration.

    \ ' '1967' 'IPR005069' '\

    Proteins in this family have been predicted to be nucleotide-diphospho-sugar transferases PUBMED:15215454.

    \ ' '1968' 'IPR004988' '\

    This is a family of proteins of unknown function.

    \ ' '1970' 'IPR005096' '\ This family is specific to Borrelia burgdorferi (Lyme disease spirochete). The protein is encoded on extrachromosomal DNA and is of unknown function.\ ' '1971' 'IPR005020' '\

    This is a family of Caenorhabditis elegans proteins of unknown function.

    \ ' '1972' 'IPR002876' '\

    This entry represents the core region of several hypothetical proteins found in bacteria, plants, and yeast. This core region can be subdivided into three domains: a 3-helical bundle domain, and two alpha+beta domains with different folds, where domain 3 (ferredoxin-like fold) is inserted within domain 2. This core region is found in the following hypothetical proteins: YebC from Escherichia coli, HP0162 from Helicobacter pylori (Campylobacter pylori) and aq1575 from Aquifex aeolicus PUBMED:12060744.

    \ \

    The crystal structure of a conserved hypothetical protein, Aq1575, from Aquifex aeolicus has been determined. A structural homology search reveals that this protein has a new fold with no obvious similarity to those of other proteins of known three-dimensional structure. The protein reveals a monomer consisting of three domains arranged along a pseudo threefold symmetry axis. There is a large cleft with approximate dimensions of 10 A x 10 A x 20 A in the centre of the three domains along the symmetry axis. Two possible active sites are suggested based on the structure and multiple sequence alignment. There are several highly conserved residues in these putative active sites PUBMED:12060744.

    \ ' '1974' 'IPR005098' '\ This domain is found in a number of worm proteins and has no known function. The boundaries of the presumed domain are rather uncertain.\ ' '1975' 'IPR005047' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class xa (Srxa), from the Srg superfamily PUBMED:18050473.

    \ ' '1976' 'IPR005048' '\

    This is a domain of unknown function found in proteins of unknown function.

    \ ' '1977' 'IPR005049' '\

    This is a protein family of unknown function.

    \ ' '1978' 'IPR000615' '\ Bestrophin is a 68-kDa basolateral plasma membrane protein expressed in retinal pigment epithelial cells (RPE). It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterised by a depressed light peak in the electrooculogram PUBMED:12032738. VMD2 encodes a 585-amino acid protein with an approximate mass of 68 kDa which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P), amino acid sequence motif. Bestrophin is a plasma membrane protein, localised to the basolateral surface of RPE cells consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of chloride channels, indicating a direct role for bestrophin in generating the light peak PUBMED:12032738. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal RFP-TM domain implying important functional properties PUBMED:12058047.\ ' '1979' 'IPR002636' '\ This family consists of various hypothetical proteins\ from cyanobacteria, none of which are functionally\ described. The aligned region is approximately 120-140\ amino acids long corresponding to almost the entire\ length of the proteins in the family.\ ' '1980' 'IPR005102' '\

    The structure of this module is known PUBMED:11080456 and consists of an Ig-like fold. The function of this domain is unknown, but it might be involved in mediating interaction with carbohydrates.

    \ ' '1981' 'IPR005061' '\

    This is a eukaryotic protein family of unknown function.

    \ ' '1982' 'IPR005104' '\ This domain is always found N-terminal to a pair of cystathionine-beta-synthase (CBS) domains . This region may be distantly related to the HrcA proteins of prokaryotes.\ ' '1983' 'IPR005105' '\

    This domain is found associated with an N-terminal cyclic nucleotide-binding domain () and two CBS domains (). This domain, which normally represents the C-terminal region, is uncharacterised; however, it seems to be similar to the nucleotidyltransferase domain (), conserving the DXD motif, which strongly suggests that proteins containing this domain are also nucleotidyltransferases.

    \ ' '1984' 'IPR005175' '\

    This putative conserved domain is found in proteins that contain AT-hook motifs , suggesting a DNA-binding function for the proteins as a whole; however, the function of this domain is unknown. Overexpression of a protein containing this domain, , in Arabidopsis thaliana causes late flowering and modified leaf development PUBMED:10759496.

    \ ' '1985' 'IPR004352' '\

    Eighty-one archaeal-like genes, ranging in\ size from 4-20kb, are clustered in 15 regions of the Thermotoga maritima genome PUBMED:10360571.\ Conservation of gene order between T. maritima and Archaea in many of these\ regions suggests that lateral gene transfer may have occurred between\ thermophilic Eubacteria and Archaea PUBMED:10360571.

    \

    One of the T. maritima sequences (hypothetical protein TM1410) \ shares similarity with Methanocaldococcus jannaschii (Methanococcus jannaschii) hypothetical protein MJ1477\ and with hypothetical protein DR0705 from Deinococcus radiodurans. The \ sequences are characterised by relatively variable N- and C-terminal domains,\ and a more conserved central domain. They share no similarity with any other \ known, functionally or structurally characterised proteins.

    \ ' '1986' 'IPR005177' '\

    This entry represents a family of uncharacterised proteins which are predicted to function as phosphotransferases.

    \ ' '1989' 'IPR005180' '\

    This domain is found in an undescribed set of proteins. It normally occurs uniquely within a sequence, but is found as a tandem repeat (). It has an interesting phylogenetic distribution with the majority of examples in bacteria and archaea, but it is also found in Drosophila melanogaster (e.g. ). The hypothetical protein TT1751 from Thermus thermophilus has a beta-alpha-beta(4)-alpha structural fold PUBMED:15481054.

    \ ' '1990' 'IPR005181' '\

    This domain is associated with proteins from viruses, bacteria and eukaryotes. In the latter two taxonomic groups some of the proteins are annotated as either sialic acid-specific 9-O-acetylesterase () or acetylxylan esterase related enzyme. The function of this domain is unknown.

    \ ' '1991' 'IPR005185' '\

    These proteins contain a domain that occurs as one or more copies in a small family of putative membrane proteins.

    \ ' '1992' 'IPR005325' '\

    This represents a group of short repeats that occurs in a limited number of membrane proteins. It may divide further into shorter repeats of around 7-10 residues of the pattern G-#-X(2)-#(2)-X (#=hydrophobic).

    \ ' '1993' 'IPR005500' '\

    This family consists of eubacterial and archaebacterial proteins of unknown function. The proteins contain a motif HXXXEXX(W/Y) where X can be any amino acid. This motif is likely to be functionally important and may be involved in metal binding.

    \ ' '1994' 'IPR002414' '\

    This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.

    \ ' '1995' 'IPR010149' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the C-terminal domain of a minor family of CRISPR-associated proteins. These proteins are found adjacent to a characteristic short, palindromic repeat cluster termed CRISPR, a probable mobile DNA element.

    \ ' '1997' 'IPR005512' '\

    In plants, the small GTP-binding proteins called Rops work as signalling switches that control growth, development and plant responses to various environmental stimuli. Rop proteins (Rho of plants, Rac-like and atRac in Arabidopsis thaliana (Mouse-ear cress)) belong to the Rho family of Ras-related GTP-binding proteins that turn on signalling pathways by switching from a GDP-bound inactive to a GTP-bound active conformation. Activation depends on guanine nucleotide exchange factors (GEFs) that catalyse the otherwise slow GDP dissociation for subsequent GTP binding. The plant-specific RopGEFs represent a unique family of exchange factors that display no homology to any known RhoGEFs from animals and fungi. They comprise a highly conserved catalytic domain termed PRONE (plant-specific Rop nucleotide exchanger) with exclusive substrate specificity for members of the Rop family. The PRONE domain has been shown to be necessary and sufficient to promote nucleotide release from Rop PUBMED:15980860, PUBMED:16415208, PUBMED:16754995.

    \ \

    The PRONE domain can be divided into three highly conserved subdomains separated by two short stretches of variable amino acid residues PUBMED:15980860, PUBMED:16415208. It is approximately 370 residues in length and displays an almost all alpha-helical structure except for a beta-turn that protrudes from the main body of the molecule. The overall structure of the PRONE domain can be divided into two subdomains, the first one including helices alpha1-5 and alpha13, the second alpha6-12 PUBMED:17218277.

    \ ' '1998' 'IPR005524' '\

    This is a family of predicted integral membrane proteins.

    \ ' '2001' 'IPR005529' '\

    This entry represents a group of tandem repeats, found in several plant and fungal species, whose sequence is distantly related to the FARP (FMRFamide) group of neuropeptides, . The function of these repeats is not known, as they are mostly found in uncharacterised proteins, but they are also present in the nuclear migration protein NUM1 PUBMED:11266443.

    \ ' '2002' 'IPR005531' '\

    This is a family of small proteins. It includes a protein identified as an alkaline shock protein PUBMED:7864904 so may be involved in stress response.

    \ ' '2003' 'IPR005532' '\

    This entry represents domains that have a structure homologous to the complex alpha/beta topology found in sulphatase-modifying factors (SUMF1). SUMF1 is a paralogue of oxoalanine-generating enzyme, also called C(alpha)-formylglycine generating enzyme (FGE). SUMF1 converts newly synthesized inactive sulphatases to their active form by modifying an active site cysteine residue to oxoalanine. Sulphatases, whose catalytic activity is dependent upon an oxoalanine residue, are essential for the degradation of sulphate esters PUBMED:16041070. Defects in SUMF1 or FGE cause multiple sulphatase deficiency (MSD), which leads to the impairment of all sulphatases and to the accumulation of glycosaminoglycans or sulpholipids, causing early infant death PUBMED:17206939, PUBMED:16124866, PUBMED:16174644. Known substrates for SUMF1 are: N-acetylgalactosamine-6-sulphate sulphatase (GALNS), arylsulphatase A (ARSA), steroid sulphatase (STS) and arylsulphatase E (ARSE). SUMF1 occurs in the endoplasmic reticulum or its lumen.

    \

    This domain is also found in a few methyltransferases and protein kinases.

    \ ' '2004' 'IPR005537' '\

    The members of this family have no known function. They are around 300 amino acids in length and have two conserved motifs: a PXXIG motif at the N terminus and a more strongly conserved motif, YXPGXXXKGXXR, in the central region, where X can be any amino acid.

    \ ' '2005' 'IPR003225' '\

    This family of hypothetical proteins is found in viruses.

    \ ' '2006' 'IPR005585' '\

    The proteins in this family are around 140-170 residues in length. The proteins contain many conserved residues, with the most conserved motifs found in the central and C-terminal region. The function of these proteins is unknown.

    \ ' '2007' 'IPR005583' '\

    The members of this family are functionally uncharacterised. They are about 250 amino acids in length.

    \ ' '2008' 'IPR005584' '\

    The biological function of these short proteins is unknown, but they contain four conserved cysteines, suggesting that they all bind zinc. YacG () from Escherichia coli has been shown to bind zinc and contains the structural motifs typical of zinc-binding proteins PUBMED:12211008. The conserved four cysteine motif in these proteins (-C-X(2)-C-X(15)-C-X(3)-C-) is not found in other zinc-binding proteins with known structures.

    \ ' '2009' 'IPR005586' '\

    The proteins in this family are uncharacterised. The proteins are 170-190 amino acid residues in length.

    \ ' '2010' 'IPR005589' '\

    The members of this family are uncharacterised proteins from a number of bacterial species. They range in size from 50-100 residues.

    \ ' '2012' 'IPR005590' '\

    This family consists of bacterial proteins whose function has not been characterised.

    \ ' '2013' 'IPR005602' '\

    This is a family of proteins with no characterised function, found on Staphylococcus aureus plasmids.

    \ ' '2015' 'IPR005624' '\ This entry contains uncharacterised proteins, including GlcG . The alignment contains many conserved motifs that are suggestive of cofactor binding and enzymatic activity.\ ' '2016' 'IPR005625' '\

    This domain represents a conserved transmembrane (TM) helix that is found in bacterial proteins. Coil residues are significantly more conserved than other residues and are frequently found within channels and transporters, where they introduce the flexibility and polarity required for transport across the membrane PUBMED:18511074.

    \ \

    This TM helix associates with PepSY (peptidase (M4) and YpeB of Bacillus subtilis). PepSY is a repeated region first identified in Thermoanaerobacter tengcongensis. The PepSY domain functions in the control of M4 peptidases through their propeptide and in the germination of spores. It may also play a part in regulating protease activity PUBMED:15124630.

    \ ' '2017' 'IPR005631' '\

    This entry represents a group of uncharacterised small proteins found in both eukaryotes and prokaryotes, including NMA1147 from Neisseria meningitidis PUBMED:15103637 and YgfY from Escherichia coli PUBMED:15593094. YgfY may be involved in transcriptional regulation. The structure of these proteins consists of a complex bundle of five alpha-helices, which is composed of an up-down 3-helix bundle plus an orthogonal 2-helix bundle.

    \ ' '2018' 'IPR002678' '\

    This family contains several NIF3 (NGG1p interacting factor 3) protein homologues. NIF3 interacts with the yeast transcriptional coactivator NGG1p, which is part of the ADA complex; the exact function of this interaction is unknown PUBMED:11124544, PUBMED:8663102.

    \ ' '2019' 'IPR005642' '\

    Members of this family contain a conserved core of four predicted transmembrane segments. Some members have an additional pair of N-terminal transmembrane helices. The functions of the proteins in this family are unknown.

    \ ' '2020' 'IPR005645' '\

    The function of the proteins from this family is unknown.

    \ ' '2021' 'IPR005646' '\ This family of bacterial proteins has no known function. The proteins are in the region of 500-600 amino acid residues in length.\ ' '2022' 'IPR005651' '\

    This family of short proteins has no known function. The bacterial members are about 60-70 amino acids in length and the eukaryotic examples are about 120 amino acids in length. The C-terminus contains the strongest conservation.

    \ ' '2023' 'IPR005660' '\

    This presumed domain is found in one or two copies per protein. The domain is about 230 amino acids in length and has many conserved motifs; it has polyphosphate kinase activity PUBMED:12482933, PUBMED:12486232.

    \ ' '2024' 'IPR007126' '\

    This family consists of several REV proteins from Borrelia burgdorferi (Lyme disease spirochete) and Borrelia garinii. The function of REV is unknown, although it has been shown that the gene is induced during ingestion of host blood, suggesting a role in the metabolic activation of borreliae to adapt to physiological stimuli PUBMED:11580974.

    \ ' '2025' 'IPR007132' '\ This repeat was found as seven tandem copies in one protein. It is predicted to be composed of beta-strands. Thus it is likely that it forms a beta-propeller structure. It is found in association with BNR repeats, which also form a beta-propeller.\ ' '2026' 'IPR007136' '\ This repeat is found as four tandem repeats in a family of bacterial membrane proteins. Each repeat contains two transmembrane regions and a conserved tryptophan.\ ' '2027' 'IPR002878' '\ This domain has no known function and is found in conserved hypothetical archaeal\ and bacterial proteins. The domain is approximately 120 amino acids long.\ ' '2028' 'IPR007140' '\ This motif occurs in a small set of bacterial proteins. It has two transmembrane regions, and often occurs as tandem repeats. There are no conserved catalytic residues.\ ' '2029' 'IPR007141' '\

    This domain is currently only found in a small number of proteins restricted to Streptomyces spp. All have four conserved cysteines that probably form two disulphide bonds. One of these proteins, from Streptomyces nigrescens, is the well-characterised metalloproteinase inhibitor PUBMED:2243793, PUBMED:3888972, SMPI (), which belongs to MEROPS proteinase inhibitor family I36, clan IU. The function of the other proteins is not known.

    \ \ \

    The structure of SMPI has been determined. It has 102 amino acid residues with two disulphide bridges and specifically inhibits metalloproteinases such as thermolysin, which belongs to MEROPS peptidase family M4. SMPI is composed of two beta-sheets, each consisting of four antiparallel beta-strands. The structure can be considered as two Greek key motifs with 2-fold internal symmetry, a Greek key beta-barrel. One unique structural feature found in SMPI is in its extension between the first and second strands of the second Greek key motif which is known to be involved in the inhibitory activity of SMPI. In the absence of sequence similarity, the SMPI structure shows clear similarity to both domains of the eye lens crystallins , both domains of the calcium sensor protein-S, as well as the single-domain yeast killer toxin. The yeast killer toxin structure was thought to be a precursor of the two-domain beta gamma-crystallin proteins, because of its structural similarity to each domain of the beta gamma-crystallins. SMPI thus provides another example of a single-domain protein structure that corresponds to the ancestral fold from which the two-domain proteins in the beta gamma-crystallin superfamily are believed to have evolved PUBMED:9735297.

    \ ' '2030' 'IPR007147' '\ This is a family of proteins of unknown function found in yeast.\ ' '2031' 'IPR007152' '\ Members of this family are around 350 amino acids in length. They are found in archaea and some bacteria and have no known function.\ ' '2032' 'IPR007154' '\ Members of this family are around 120 amino acids in length and are found in some archaebacteria. The function of this family is unknown. However, it contains a conserved motif IHPPAH that may be involved in its function.\ ' '2033' 'IPR007155' '\ Members of this entry are short (less than 100 amino acids) proteins found in archaebacteria. The function of these proteins is unknown.\ ' '2034' 'IPR007164' '\ This is a family of archaebacterial proteins, which are about 170 amino acids in length. They have no known function. The most conserved portion of the protein contains the sequence GEEDL that may be important for its function.\ ' '2035' 'IPR007166' '\

    This entry represents an amino terminal motif QXSXEXXXL thought to be part of a class III signal sequence for a family of archaeal proteins. The Q residue is the +1 residue of the signal peptidase cleavage site PUBMED:17114255. Two proteins containing this motif are cleaved by a type IV pilin-like signal peptidase.

    \ ' '2036' 'IPR007160' '\ This domain is found in some iron-sulphur proteins.\ ' '2037' 'IPR007161' '\

    This is a family of bacterial and archaeal proteins of unknown function.

    \ ' '2038' 'IPR007176' '\ This is an archaeal family of unknown function.\ ' '2039' 'IPR007162' '\ This is an archaeal family of unknown function.\ ' '2040' 'IPR007177' '\

    This domain is found in a family of proteins of unknown function. It appears to be found in eukaryotes and archaebacteria, and occurs associated with a potential metal-binding region in RNase L inhibitor, RLI ().

    \ ' '2041' 'IPR007163' '\ This is a predicted transmembrane family of unknown function. Proteins usually have between 6 and 9 predicted transmembrane segments.\ ' '2042' 'IPR007256' '\ Proteins of this family have no known function.\ ' '2043' 'IPR002696' '\ This is a family of short (70 amino acid) hypothetical\ proteins from various bacteria. They contain three conserved \ cysteine residues. from Aeromonas hydrophila has\ been found to have hemolytic activity.\ ' '2044' 'IPR007169' '\ This is a bacterial family of unknown function.\ ' '2045' 'IPR007171' '\ This is an archaeal family of unknown function.\ ' '2046' 'IPR007179' '\

    This is a group of proteins of unknown function. It is found N-terminal to another domain of unknown function ().

    \ ' '2047' 'IPR007254' '\ This archaeal family of unknown function is predicted to be an integral membrane protein with six transmembrane regions.\ ' '2048' 'IPR007172' '\ This is a bacterial domain of unknown function.\ ' '2050' 'IPR007184' '\

    Glycosidases or glycosyl hydrolases are a big and widespread family of enzymes that hydrolyse the glycosidic bonds between carbohydrates or between a carbohydrate and an aglycone moiety. On the basis of sequence and structural similarity, the glycoside hydrolase family belongs to the beta-fructosidase (furanosidase) superfamily of glycosyl hydrolases. This leads to the prediction that proteins of this family have a glycosidase (glycoside hydrolase) activity and, most probably, act on a furanoside residue (fructose, arabinose and ribose). The crystal structure of a member of this family from Thermotoga maritima (PDB:1VKD), determined to high resolution by Structural Genomics initiatives, reveals a five-bladed beta-propeller fold with three acidic residues forming the active site.

    \ ' '2051' 'IPR007211' '\ These are predicted membrane proteins of unknown function. The majority of the proteins have two predicted transmembrane regions.\ ' '2052' 'IPR007181' '\

    This is a strongly conserved YPLM motif. It is found C-terminal to another domain of unknown function ().

    \ ' '2053' 'IPR007205' '\

    This is a protein of unknown function. It is found N-terminal to another domain of unknown function ().

    \ ' '2054' 'IPR007206' '\

    This is a protein of unknown function. It is found C-terminal to another domain of unknown function ().

    \ ' '2055' 'IPR004375' '\

    This family consists of conserved hypothetical proteins, about 150 amino acids in length, with no known function. The family is restricted to the bacteria. It includes three members in Escherichia coli (strain K12) and three in Streptococcus pneumoniae.

    \ ' '2056' 'IPR005234' '\

    This family represents ScpB, which along with ScpA () interacts with SMC in vivo, forming a complex that is required for chromosome condensation and segregation PUBMED:12065423, PUBMED:12897137. The SMC-Scp complex appears to be similar to the MukB-MukE-MukF complex in Escherichia coli PUBMED:10545099, where MukB () is the homologue of SMC. Although ScpA and ScpB have little sequence similarity to MukE () or MukF (), they are predicted to be structurally similar, being predominantly alpha-helical with coiled coil regions.

    \ \ \

    In general, scpA and scpB form an operon in most bacterial genomes. Flanking genes are highly variable, suggesting that the operon has moved throughout evolution. Bacteria containing an smc gene also contain scpA or scpB but not necessarily both. An exception is found in Deinococcus radiodurans, which contains scpB but neither smc nor scpA. In the archaea the gene order SMC-ScpA is conserved in nearly all species, as is the very short distance between the two genes, indicating co-transcription of both in different archaeal genera and arguing that interaction of the gene products is not confined to the homologues in Bacillus subtilis. It would seem probable that, in light of all the studies, SMC, ScpA and ScpB proteins or homologues act together in chromosome condensation and segregation in all prokaryotes PUBMED:12100548.

    \ ' '2057' 'IPR005220' '\

    Proteins in this entry have an OB-fold (oligonucleotide/oligosaccharide binding motif). Analysis of the predicted nucleotide-binding site of the OB-fold suggests that they lack nucleic acid-binding properties. They contain a predicted N-terminal signal peptide which indicates that they localise to the periplasm where they may function to bind proteins, small molecules, or other typical OB-fold ligands. As hypothesised for the distantly related OB-fold containing bacterial enterotoxins, the loss of nucleotide-binding function and the rapid evolution of the OB-fold ligand-binding site may be associated with the presence of members in mobile genetic elements and their potential role in bacterial pathogenicity.

    \ ' '2058' 'IPR002708' '\

    This domain, whose function is not known, is about 320 residues long and is found in proteins that have two C-terminal CBS domains, . The\ protein is described as inosine-5\'-monophosphate dehydrogenase related protein VIII, based on the sequence similarity it shares with the CBS domains.

    \ ' '2059' 'IPR007228' '\

    This domain is found in a family of long proteins that are currently found only in rice. They have no known function. However, they may be some kind of transposable element. There is a putative gypsy type transposon domain () towards the N terminus of the proteins.

    \ ' '2061' 'IPR007263' '\

    The DCC family, named after the conserved N-terminal DxxCxxC motif, encompasses COG3011. Proteins in this family are predicted to have a thioredoxin-like fold which, together with the presence of an invariant catalytic cysteine residue, suggests that they are a novel group of thiol-disulphide oxidoreductases PUBMED:15236740. As some of the bacterial proteins are encoded near penicillin-binding proteins, it has been suggested that these may be involved in redox regulation of cell wall biosynthesis PUBMED:15236740.

    \ ' '2063' 'IPR007272' '\ This entry includes YeeE and YedE from Escherichia coli. These proteins are integral membrane proteins of unknown function. Many of these proteins contain two homologous regions that are represented by this entry. This region contains several conserved glycines and an invariant cysteine that is probably an important functional residue.\ ' '2064' 'IPR007277' '\

    Erv26 is an integral membrane protein that is packed into COPII vesicles and cycles between the ER and Golgi compartments. It directs pro-alkaline phosphatase into endoplasmic reticulum-derived COPII transport vesicles PUBMED:16957051.

    \ ' '2065' 'IPR007278' '\ The function of this family is unknown. It has been suggested that some members of this family are regulators of transcription.\ ' '2066' 'IPR007314' '\ No function is known for any member of this family.\ ' '2068' 'IPR007293' '\ This domain is found in functionally uncharacterised proteins from such pathogenic bacteria as Helicobacter pylori, Campylobacter jejuni, and Vibrio cholerae. The H. pylori protein consists of two copies of this domain.\ ' '2069' 'IPR007294' '\ Members of this family are predicted to have 10 transmembrane regions.\ ' '2070' 'IPR007295' '\ This entry contains FomD, which is a predicted protein from a fosfomycin biosynthesis gene cluster in Streptomyces wedmorensis PUBMED:7500951. Its function is unknown.\ ' '2071' 'IPR007296' '\

    This is a domain of unknown function. It sometimes occurs singly or as the C-terminal domain, in combination with another two domains of unknown function: () and ().

    \ ' '2072' 'IPR007297' '\

    This is a domain of unknown function. It often occurs, as the N-terminal domain, in combination with either one or two domains of unknown function () and ().

    \ ' '2074' 'IPR005272' '\

    These small proteins are approximately 100 amino acids in length and appear to be found only in gamma proteobacteria. The function of this protein family is unknown.

    \ \ \ \ ' '2075' 'IPR007302' '\

    This is a domain of unknown function. It sometimes occurs in combination with () and ().

    \ ' '2076' 'IPR007308' '\ This is a protein of unknown function.\ ' '2077' 'IPR007315' '\

    This is a family of eukaryotic ER membrane proteins that are involved in the synthesis of glycosylphosphatidylinositol (GPI), a glycolipid that anchors many proteins to the eukaryotic cell surface. Proteins in this family are involved in transferring the second mannose in the biosynthetic pathway of GPI PUBMED:15720390, PUBMED:15623507.

    \ ' '2078' 'IPR007317' '\ This is a family of conserved eukaryotic proteins with undetermined function.\ ' '2079' 'IPR007332' '\

    The function of the members of this bacterial protein family is unknown. Some members may be involved in conferring cation resistance.

    \ ' '2080' 'IPR007334' '\ This family consists of bacterial uncharacterised proteins.\ ' '2081' 'IPR007335' '\ This is a family of uncharacterised proteins.\ ' '2082' 'IPR007336' '\ This family includes several bacterial proteins of unknown function, although at least one member () is a putative coproporphyrinogen III oxidase.\ ' '2083' 'IPR007338' '\ This is a bacterial family of uncharacterised proteins.\ ' '2084' 'IPR007339' '\ This family of uncharacterised proteins appears to be restricted to proteobacteria.\ ' '2085' 'IPR007349' '\

    This is a probable integral membrane protein. It is usually found associated with ().

    \ ' '2086' 'IPR007351' '\ This is a family of uncharacterised proteins.\ ' '2087' 'IPR007352' '\ This is a predicted membrane protein with four transmembrane helices.\ ' '2088' 'IPR007353' '\ This family of uncharacterised proteins is known as the YDFR family.\ ' '2089' 'IPR007354' '\ The proteins in this entry are predicted to be integral membrane proteins.\ ' '2090' 'IPR007355' '\ This is an archaeal protein of unknown function.\ ' '2091' 'IPR007361' '\ This is a family of uncharacterised proteins.\ ' '2092' 'IPR007362' '\ This is a family of uncharacterised proteins.\ ' '2093' 'IPR002723' '\

    This family of prokaryotic proteins has not been characterised. All the members are 350-400 amino acids long.

    \ ' '2094' 'IPR007366' '\ This is an archaeal protein of unknown function.\ ' '2095' 'IPR007368' '\ This is a family of uncharacterised proteins.\ ' '2096' 'IPR007369' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of sequences contains aspartic endopeptidases belonging to MEROPS peptidase family A22 (presenilin family, clan AD): subfamily A22B.

    \ \ \

    \ The peptidases were originally classified by hierarchical homology to the most conserved member - IMPAS 1. They are also known as signal peptide peptidase (SPP) PUBMED:14741365. They belong to the I-CliP family of peptidases. SPP cleaves remnant signal peptides left behind in the membrane by the action of signal peptidase and also plays key roles in immune surveillance and the maturation of certain viral proteins PUBMED:12966028. SPPs do not require cofactors, as demonstrated by expression in bacteria and purification of a proteolytically active form. The C-terminal region defines the functional domain, which is in itself sufficient for proteolytic activity PUBMED:17517891.\ \

    \ ' '2097' 'IPR006340' '\

    Members of this family are uncharacterised proteins of about 180 amino acids from the Bacillus/Clostridium group of Gram-positive bacteria, found in no more than one copy per genome.

    \ ' '2098' 'IPR007374' '\

    The ASCH domain adopts a beta-barrel fold similar to that of the PUA domain (). It is thought to function as an RNA-binding domain during coactivation, RNA-processing and possibly during prokaryotic translation regulation PUBMED:16322048.

    \ ' '2099' 'IPR007380' '\ This is a group of uncharacterised proteins.\ ' '2100' 'IPR007381' '\ This is an archaeal protein of unknown function.\ ' '2101' 'IPR007376' '\

    This entry represents hypothetical proteins such as HI1450, which is believed to act as a putative dsDNA mimic. HI1450 is an acidic protein with a core structure consisting of alpha(2)-beta(4), where the alpha-helices are packed against the side of an anti-parallel 4-stranded beta meander. As such, it has some similarity to the dsDNA mimics uracil-DNA glycosylase inhibitor and nuclease A inhibitor (NuiA), including the distribution of surface charges and the position of the hydrophobic cavity PUBMED:14747986. DNA mimics act to inhibit or regulate dsDNA-binding proteins.

    \ ' '2102' 'IPR007382' '\ This entry contains proteins of unknown function. They are predicted to be transmembrane proteins with 4 TM domains.\ ' '2103' 'IPR005939' '\

    Although this domain is uncharacterised it seems likely that it performs a phosphatase function.

    \ ' '2104' 'IPR005915' '\

    The members of this family share 50 % or greater sequence identity. They are found as eleven tandem genes, arranged head-to-tail, in Staphylococcus aureus (strain COL). Distant full-length homologs are found in a Staphylococcus haemolyticus plasmid and in Bacillus halodurans. The function of these proteins is unknown.

    \ ' '2105' 'IPR006698' '\

    These are bacterial and archaeal proteins of unknown function.

    \ ' '2106' 'IPR007383' '\ This entry contains proteins of unknown function. They are predicted to be transmembrane proteins with 2 or 3 TM domains.\ ' '2107' 'IPR007384' '\

    This family includes an N-terminal region of unknown function from the Erwinia carotovora exoenzyme regulation regulon orf1 protein, which also contains a domain found in RNA pseudouridylate synthase .

    \ ' '2108' 'IPR007386' '\ This entry contains archaeal and bacterial proteins of unknown function.\ ' '2109' 'IPR007393' '\

    This entry represents a group of uncharacterised proteins. Some member sequences retain zinc-binding residues. The structure of the hypothetical cytosolic protein SP0554 from Streptococcus pneumoniae revealed an alpha+beta fold that could have evolved from a glucocorticoid receptor-like zinc finger domain PUBMED:11679764.

    \ ' '2110' 'IPR002725' '\ Members of this family are found in some archaebacteria, as well as Helicobacter pylori. The proteins are 190-240 amino acids long, with the C terminus being the most conserved region, containing three conserved histidines.\ ' '2111' 'IPR007409' '\

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    Type I restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. Type I enzymes have three different subunits - M (modification), S (specificity) and R (restriction) - that form multifunctional enzymes with restriction (), methylase () and ATPase activities PUBMED:15121719, PUBMED:12595133. The S subunit is required for both restriction and modification and is responsible for recognition of the DNA sequence specific for the system. The M subunit is necessary for modification, and the R subunit is required for restriction. These enzymes use S-Adenosyl-L-methionine (AdoMet) as the methyl group donor in the methylation reaction, and have a requirement for ATP. They recognise asymmetric DNA sequences split into two domains of specific sequence, one 3-4 bp long and another 4-5 bp long, separated by a nonspecific spacer 6-8 bp in length. Cleavage occurs a considerable distance from the recognition sites, rarely less than 400 bp away and up to 7000 bp away. Adenosyl residues are methylated, one on each strand of the recognition sequence. These enzymes are widespread in eubacteria and archaea. In enteric bacteria they have been subdivided into four families: types IA, IB, IC and ID.
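
The bipartite layout of a type I recognition sequence (two specific half-sites around a nonspecific spacer) can be illustrated with a minimal Python sketch. The function names are hypothetical, the spacer is assumed to be written as a single run of N, and the EcoKI site AACNNNNNNGTGC is used only as an illustrative input; it is not taken from this entry.

```python
def split_bipartite_site(site):
    """Split a site such as AACNNNNNNGTGC into (left, spacer, right)."""
    site = site.upper()
    first_n = site.find("N")
    last_n = site.rfind("N")
    if first_n == -1:
        raise ValueError("site has no N spacer")
    left = site[:first_n]
    spacer = site[first_n:last_n + 1]
    right = site[last_n + 1:]
    # Assumption: the spacer is one uninterrupted run of N.
    if set(spacer) != {"N"}:
        raise ValueError("spacer must be a single run of N")
    return left, spacer, right


def fits_type_i_layout(site):
    """Check half-site lengths (3-4 bp and 4-5 bp) and spacer length (6-8 bp)."""
    left, spacer, right = split_bipartite_site(site)
    # The text does not say which half-site comes first, so order is ignored.
    short, long_ = sorted((len(left), len(right)))
    return short in (3, 4) and long_ in (4, 5) and 6 <= len(spacer) <= 8


if __name__ == "__main__":
    print(fits_type_i_layout("AACNNNNNNGTGC"))
```

The check only tests lengths; it says nothing about whether a given sequence is actually recognised by any enzyme.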

    \

    Type III restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. Type III enzymes are hetero-oligomeric, multifunctional proteins composed of two subunits, Res and Mod. The Mod subunit recognises the DNA sequence specific for the system and is a modification methyltransferase; as such it is functionally equivalent to the M and S subunits of type I restriction endonuclease. Res is required for restriction, although it has no enzymatic activity on its own. Type III enzymes recognise short 5-6 bp long asymmetric DNA sequences and cleave 25-27 bp downstream to leave short, single-stranded 5\' protrusions. They require the presence of two inversely oriented unmethylated recognition sites for restriction to occur. These enzymes methylate only one strand of the DNA, at the N-6 position of adenosyl residues, so newly replicated DNA will have only one strand methylated, which is sufficient to protect against restriction. Type III enzymes belong to the beta-subfamily of N6 adenine methyltransferases, containing the nine motifs that characterise this family, including motif I, the AdoMet binding pocket (FXGXG), and motif IV, the catalytic region (S/D/N (PP) Y/F) PUBMED:15121719, PUBMED:12595133.
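
The cleavage geometry described above (a short asymmetric site, with cutting 25-27 bp downstream) can be sketched as follows. This is a simplified illustration under stated assumptions: the function name is hypothetical, offsets are counted from the last base of the site on the given strand only, and the requirement for two inversely oriented unmethylated sites is not modelled. The EcoP15I site CAGCAG is used purely as an example input.

```python
def type_iii_cut_windows(dna, site, min_offset=25, max_offset=27):
    """Return (start, end) 0-based positions of the window 25-27 bp
    downstream of each occurrence of site in dna."""
    dna = dna.upper()
    site = site.upper()
    windows = []
    pos = dna.find(site)
    while pos != -1:
        site_end = pos + len(site)
        # Keep only windows that lie entirely within the sequence.
        if site_end + max_offset <= len(dna):
            windows.append((site_end + min_offset, site_end + max_offset))
        pos = dna.find(site, pos + 1)
    return windows


if __name__ == "__main__":
    example = "TT" + "CAGCAG" + "A" * 40
    print(type_iii_cut_windows(example, "CAGCAG"))
```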

    \

    This entry represents the N-terminal domain found in both the R subunit (HsdR) of type I enzymes and the Res subunit of type III enzymes. The type I enzyme represented is EcoRI, which recognises the DNA sequence 5\'-GAATTC; the R protein (HsdR) is required for both nuclease and ATPase activity PUBMED:, PUBMED:, PUBMED:10449767, PUBMED:11555298.

    \

    This domain is often found adjacent to a methylase domain () in restriction endonucleases or methylases. In one of the proteins, , it is adjacent to a helicase domain () in a putative restriction endonuclease.

    \ ' '2113' 'IPR007398' '\ This is a family of uncharacterised proteins.\ ' '2114' 'IPR007400' '\

    PrpF is a protein found in the 2-methylcitrate pathway. It is structurally similar to DAP epimerase and proline racemase. This protein is likely to act to isomerise trans-aconitate to cis-aconitate PUBMED:17567742.

    \ ' '2115' 'IPR007401' '\ This is a predicted membrane protein.\ ' '2116' 'IPR007402' '\ This is a family of uncharacterised proteins.\ ' '2117' 'IPR007403' '\ This is a family of putative membrane proteins.\ ' '2118' 'IPR007404' '\ This is a family of predicted transmembrane proteins.\ ' '2119' 'IPR007405' '\

    This is a family of uncharacterised, mainly bacterial, proteins. While the functions of these proteins are unknown, an analysis has suggested that they may form a novel family within the RNase H-like superfamily PUBMED:16165328. These proteins appear to contain all the core secondary structural elements of the RNase H-like fold and share several conserved, possible active site, residues. It was suggested, therefore, that they function as nucleases. From the taxonomic distribution of these proteins it was further inferred that they may play a role in DNA repair under stressful conditions.

    \ ' '2120' 'IPR007407' '\ This is a putative periplasmic protein.\ ' '2121' 'IPR002726' '\ This archaebacterial protein has no known function. It\ contains several predicted transmembrane regions,\ suggesting it is an integral membrane protein.\ ' '2122' 'IPR007408' '\ This is an archaeal protein of unknown function.\ ' '2123' 'IPR007410' '\

    This entry represents a group of proteins of unknown function, including DR1885 from Deinococcus radiodurans and CC3502 from Caulobacter crescentus (Caulobacter vibrioides), which share a potential metal binding motif H(M)X10MX21HXM. DR1885 was found to bind copper(I) through a histidine and three Mets in a cupredoxin-like fold PUBMED:15753304. The surface location of the copper-binding site as well as the type of coordination are well poised for metal transfer chemistry, suggesting that DR1885 might transfer copper, taking the role of Cox17 in bacteria (Cox17 being an accessory protein required for correct assembly of eukaryotic cytochrome c oxidase).

    \ ' '2124' 'IPR007411' '\ This family consists of bacterial proteins of uncharacterised function.\ ' '2125' 'IPR007413' '\ Some members of this family are thought to possess an ATP-binding domain towards their N terminus.\ ' '2126' 'IPR007422' '\ This is a family of uncharacterised archaeal proteins.\ ' '2127' 'IPR007420' '\ Family members are found in small bacterial proteins, and also in the heavy chains of eukaryotic myosin and kinesin, C-terminal of the motor domain. Members of this family may form coiled coil structures.\ ' '2128' 'IPR007423' '\ This is a small bacterial protein of unknown function.\ ' '2129' 'IPR007421' '\

    AAA ATPases form a large, functionally diverse protein family belonging to the AAA+ superfamily of ring-shaped P-loop NTPases, which exert their activity through the energy-dependent unfolding of macromolecules. AAA ATPases contain a P-loop NTPase domain, which is the most abundant class of NTP-binding protein fold, and is found throughout all kingdoms of life PUBMED:15037234. P-loop NTPase domains act to hydrolyse the beta-gamma phosphate bond of bound nucleoside triphosphate. There are two classes of P-loop domains: the KG (kinase-GTPase) division, and the ASCE division, the latter including the AAA+ group as well as several other ATPases.

    \

    There are at least six major clades of AAA domains (metalloproteases, meiotic proteins, D1 and D2 domains of ATPases with two AAA domains, proteasome subunits, and BSC1), as well as several minor clades, some of which consist of hypothetical proteins PUBMED:15037233. The domain organisation of AAA ATPases consists of a non-ATPase N-terminal domain that acts in substrate recognition, followed by one or two AAA domains (D1 and D2), one of which may be degenerate.

    \

    This entry is related to , and presumably has the same function (ATP-binding). A number of the archaeal members of this group are annotated as ATP-dependent DNA helicases .

    \ ' '2130' 'IPR007414' '\ This is a family of uncharacterised yeast proteins.\ ' '2131' 'IPR007416' '\

    This entry represents a family of uncharacterised proteins which are predicted to function as phosphotransferases.

    \ ' '2132' 'IPR018445' '\ This family includes prokaryotic proteins of unknown function, as well as a protein annotated as the pit accessory protein from Rhizobium meliloti (Sinorhizobium meliloti) (). However, the function of this protein is also unknown (Pit stands for Phosphate transport) PUBMED:8013901.\ ' '2136' 'IPR007417' '\ This is a family of uncharacterised archaeal proteins.\ ' '2138' 'IPR007621' '\ This is a family of uncharacterised proteins. They are found in both eukarya and eubacteria. In eubacteria the region is towards the N-terminal of the protein and is accompanied by an N-terminal signal sequence. The C-terminal of eubacterial proteins typically contains one or more putative transmembrane regions. In eukaryotes the region is not accompanied by a signal sequence.\ ' '2139' 'IPR007429' '\ This family contains uncharacterised proteins encoded on trypanosomal kinetoplast minicircles.\ ' '2140' 'IPR007431' '\

    This entry contains the Escherichia coli gene yajB, now renamed acpH, which encodes an ACP hydrolase. AcpH converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine prosthetic group from ACP PUBMED:16107329.

    \

    A mutant E. coli strain with a total deletion of the acpH gene grows normally, showing that phosphodiesterase activity is not essential for growth, although it is required for turnover of the ACP prosthetic group in vivo. AcpH is found only in Gram-negative organisms, suggesting that it plays a role in some aspect of lipid metabolism that is unique to these organisms, the most obvious of which is biosynthesis of lipid A. Because AcpH is a hydrolase, it could possibly be an editing enzyme that intercepts acyl-ACPs that would give an inappropriate lipid A structure if used as acyl donors PUBMED:16107329.

    \ ' '2141' 'IPR002729' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.
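
The repeat-spacer arrangement and the idea that specificity follows from spacer-phage sequence similarity can be illustrated with a minimal Python sketch. It assumes an exact, known repeat sequence and exact spacer matches; real loci have degenerate repeats and real tools use alignment. The function names and the sequences in the example are hypothetical.

```python
def extract_spacers(locus, repeat):
    """Return the sequences lying between consecutive copies of repeat."""
    parts = locus.upper().split(repeat.upper())
    # parts[0] is the sequence before the first repeat and parts[-1] the
    # sequence after the last one; everything in between is a spacer.
    return parts[1:-1]


def spacers_matching_phage(spacers, phage_genome):
    """Return spacers that occur verbatim in the phage sequence."""
    phage = phage_genome.upper()
    return [s for s in spacers if s in phage]


if __name__ == "__main__":
    repeat = "GTTTTAGAGC"
    locus = "AAA" + repeat + "ACGTACGT" + repeat + "TTGCATGC" + repeat + "CC"
    spacers = extract_spacers(locus, repeat)
    print(spacers)
    print(spacers_matching_phage(spacers, "GGGACGTACGTGGG"))
```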

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This family of proteins is found in archaea and bacteria and is, as yet, functionally uncharacterised. It is one of four protein families in prokaryotic genomes that contain multiple CRISPR elements. CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats. The cas genes are found near the repeats PUBMED:11952905. This protein is otherwise uncharacterised.

    \ ' '2142' 'IPR007432' '\ This family consists of several proteins of uncharacterised function.\ ' '2143' 'IPR007433' '\ This family includes several proteins of uncharacterised function.\ ' '2144' 'IPR007434' '\ This family contains several proteins of uncharacterised function.\ ' '2145' 'IPR007556' '\ This is a family of uncharacterised prokaryotic proteins.\ ' '2146' 'IPR007435' '\ This family consists of several proteins of uncharacterised function.\ ' '2147' 'IPR007436' '\ This family includes several putative integral membrane proteins.\ ' '2148' 'IPR007437' '\ This family contains several proteins of uncharacterised function.\ ' '2149' 'IPR007438' '\ This family includes several proteins of uncharacterised function.\ ' '2150' 'IPR007451' '\ Protein of unknown function, cotranscribed with purB in Escherichia coli, but with function unrelated to purine biosynthesis PUBMED:8969519.\ ' '2151' 'IPR007452' '\ This family contains several proteins of uncharacterised function.\ ' '2154' 'IPR007456' '\ Members of this family of uncharacterised proteins are often named Smg.\ ' '2155' 'IPR007457' '\

    The protein represented by this entry, YggX, serves to protect Fe-S clusters from oxidative damage PUBMED:11416172. The effect is two-fold: proteins that rely on Fe-S clusters do not become inactivated, and the release of free iron and hydrogen peroxide--a DNA damaging agent--is prevented. These observations are consistent with the hypothesis that YggX chelates free iron, and recent experiments show that YggX can indeed bind Fe(II) in vitro and in vivo PUBMED:12670952. Furthermore, YggX has a positive effect on the action of at least one Fe(II)-responsive protein. The combined actions of YggX are reminiscent of iron trafficking proteins PUBMED:12033438, and YggX is therefore proposed to play a role in Fe(II) trafficking PUBMED:12670952. In Escherichia coli, YggX was shown to be under the transcriptional control of the redox-sensing SoxRS system PUBMED:14594836.\

    \ ' '2156' 'IPR007458' '\ Members of this family are uncharacterised proteins.\ ' '2157' 'IPR007460' '\ Members of this family are uncharacterised proteins.\ ' '2158' 'IPR007523' '\

    This is a family of uncharacterised proteins possibly involved in DNA repair. The crystal structure of this protein revealed a 3-layer beta+alpha/beta/alpha topology PUBMED:11746696.

    \ ' '2159' 'IPR001142' '\ A number of uncharacterised integral membrane proteins from yeast contain an internal duplication due to duplicated genes. Duplicated copies of genes may be classified in two types of cluster organization. The first type includes genes sharing a significant level of identity in the amino acid sequences of their predicted protein product. They are recovered on two different chromosomes, transcribed in the same orientation and the distance between them is conserved. The second type of cluster is based on one gene unit tandemly repeated. This duplication is itself repeated elsewhere in the genome. The basic gene unit is recovered many times in the genome and is a component of a multigene family of unknown function. These organizations in clusters of genes suggest a \'Lego organization\' of the yeast chromosomes PUBMED:9234674. The proteins belonging to this family are of unknown function.\ ' '2160' 'IPR007555' '\ This is a family of uncharacterised hypothetical prokaryotic proteins.\ ' '2161' 'IPR007461' '\

    This entry corresponds to proteins having the Ysc84 actin binding domain (YAB). This 184 amino acid domain lies at the N-terminus of the Saccharomyces cerevisiae (Baker\'s yeast) protein Ysc84 (). It is essential for the organization of the actin cytoskeleton, and interacts with the Arp2/3 complex PUBMED:10512884. Homologous domains are found across a range of species. In fungi and vertebrates the domain is at the N-terminus, while there is an SH3 domain at the C-terminus. In plants the domain seems to be at the C-terminus and in association with a FYVE domain. Interestingly, the domain is absent in invertebrates.

    \ \

    The domain is also found in prokaryotes, where presumably it is also involved in protein binding, perhaps to the prokaryotic homologue of actin PUBMED:11544518.

    \ ' '2162' 'IPR007511' '\ This is a family of uncharacterised bacterial proteins.\ ' '2163' 'IPR007462' '\ This entry contains proteins that are predicted to be integral membrane proteins.\ ' '2164' 'IPR007546' '\

    This is a family of conserved hypothetical bacterial proteins, including TT1725 from Thermus thermophilus (strain HB8 / ATCC 27634 / DSM 579), which has a ferredoxin-like alpha+beta-sandwich fold PUBMED:14579367.

    \ ' '2165' 'IPR007547' '\ This is a family of uncharacterised proteins.\ ' '2166' 'IPR007548' '\ This is a family of uncharacterised prokaryotic proteins.\ ' '2167' 'IPR007463' '\ This is a bacterial protein of unknown function.\ ' '2168' 'IPR007465' '\ This is a family of uncharacterised proteins from Caenorhabditis elegans.\ ' '2170' 'IPR002733' '\

    The contiguous gene deletion syndrome is characterised by Alport syndrome (A), mental retardation (M), midface hypoplasia (M), and elliptocytosis (E), as well as generalized hypoplasia and cardiac abnormalities. It is caused by a deletion in Xq22.3, comprising several genes including AMME chromosomal region gene 1 (AMMECR1), which encodes a protein with a nuclear location and presently unknown function. The C-terminal region of AMMECR1 (from residue 122 to 333) is well conserved, and homologues appear in species ranging from bacteria and archaea to eukaryotes. The high level of conservation of the AMMECR1 domain points to a basic cellular function, potentially in either the transcription, replication, repair or translation machinery PUBMED:10049589, PUBMED:10828604.

    \

    \ The AMMECR1 domain contains a 6-amino-acid motif (LRGCIG) that might be functionally important since it is strikingly conserved throughout evolution PUBMED:10049589. The AMMECR1 domain consists of two distinct subdomains of different sizes. The large subdomain, which contains both the N- and C-terminal regions, consists of five alpha-helices and five beta-strands. These five beta-strands form an antiparallel beta-sheet. The small subdomain consists of four alpha-helices and three beta-strands, and these beta-strands also form an antiparallel beta-sheet. The conserved \'LRGCIG\' motif is located at beta(2) and its N-terminal loop, and most of the side chains of these residues point toward the interface of the two subdomains. The two subdomains are connected by only two loops, and the interaction between the two subdomains is not strong. Thus, these subdomains may move dynamically when the substrate enters the cleft. The size of the cleft suggests that the substrate is large, e.g., the substrate may be a nucleic acid or protein. However, the inner side of the cleft is not filled with positively charged residues, and therefore it is unlikely that negatively charged nucleic acids such as DNA or RNA interact at this site PUBMED:15558565.

    \ ' '2172' 'IPR007468' '\ This is a bacterial protein of unknown function.\ ' '2173' 'IPR007549' '\

    This is a domain of uncharacterised prokaryotic proteins. It is often found C-terminal to the radical SAM domain ().

    \ ' '2174' 'IPR007470' '\

    The majority of proteins in this family are annotated as uroporphyrin-III C-methyltransferase () PUBMED:3062586; however, there is no direct evidence to support this annotation for these proteins, which come from mainly pathogenic Gram-negative organisms. There is some evidence to suggest that the proteins are membrane anchored as they have a predicted N-terminal signal peptide and transmembrane domain and may be involved in haem transport PUBMED:.

    \ ' '2176' 'IPR007509' '\ This is a family of hypothetical archaeal proteins.\ ' '2177' 'IPR007508' '\

    D-aminoacyl-tRNA deacylases hydrolyse the ester bond between the polynucleotide and the D-amino acid, thereby preventing the accumulation of such mis-acylated and metabolically inactive tRNA molecules. Several aminoacyl-tRNA synthetases have the ability to transfer the D-isomer of their amino acid onto their cognate tRNA.

    \ ' '2180' 'IPR007473' '\ This is a bacterial protein of unknown function, possibly secreted.\ ' '2181' 'IPR007551' '\ This is a family of uncharacterised proteins.\ ' '2182' 'IPR007506' '\ This is a group of hypothetical proteins.\ ' '2184' 'IPR007553' '\ This entry contains uncharacterised bacterial proteins.\ ' '2185' 'IPR007505' '\

    This domain has been identified as a member of the PD-(D/E)XK nuclease superfamily through transitive meta profile searches PUBMED:17584917. The domain has two additional beta-strands inserted to the core fold after the first core alpha-helix. It has been speculated that it could function as a methylation-dependent restriction enzyme PUBMED:17584917.

    \ ' '2186' 'IPR007474' '\

    This domain is found in the bacterial protein ApaG and at the C termini of some F-box proteins (). F-box proteins contain a carboxy-terminal domain that interacts with protein substrates PUBMED:10531037. The ApaG domain is ~125 amino acids in length, and is named after the bacterial ApaG protein, of which it forms the core. The Salmonella typhimurium ApaG domain protein, CorD, is involved in Co(2+) resistance and Mg(2+) efflux. Tertiary structures from different ApaG proteins show a fold of several beta-sheets. The ApaG domain may be involved in protein-protein interactions which could be implicated in \ substrate-specificity PUBMED:1779764, PUBMED:10945468, PUBMED:15213450.

    \ ' '2187' 'IPR007475' '\ This is a family of uncharacterised proteins.\ ' '2188' 'IPR007479' '\

    Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S] PUBMED:16221578. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.

    \

    The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins PUBMED:16211402, PUBMED:16843540. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly PUBMED:15937904.

    \

    The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA PUBMED:17350000. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA PUBMED:15278785, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.

    \

    In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins PUBMED:11498000. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen PUBMED:8875867.

    \ \

    This entry represents IscX proteins (also known as hypothetical protein YfhJ) that are part of the ISC system. IscX is active as a monomer. The structure of YfhJ is an orthogonal alpha-bundle PUBMED:15937904. YfhJ is a small acidic protein that binds IscS, and contains a modified winged helix motif that is usually found in DNA-binding proteins PUBMED:16698547. YfhJ/IscX can bind Fe, and may function as an Fe donor in the assembly of FeS clusters.

    \ ' '2189' 'IPR007480' '\ This entry represents a repeated region found in several Theileria parva proteins. The repeat is normally about 70 residues long and contains a conserved aromatic residue in the middle.\ ' '2190' 'IPR007503' '\ This is a family of hypothetical archaeal proteins.\ ' '2191' 'IPR007501' '\ This is a family of hypothetical archaeal proteins.\ ' '2192' 'IPR007486' '\ Some family members may be secreted or integral membrane proteins.\ ' '2193' 'IPR007487' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    \

    This family contains many hypothetical proteins and some ABC transporter substrate binding proteins.

    \ ' '2194' 'IPR007488' '\

    Family member Shigella flexneri VirK () is a virulence protein required for the expression or correct membrane localisation of IcsA (VirG) on the bacterial cell surface PUBMED:1406277, PUBMED:11115111. This family also includes Pasteurella haemolytica lapB (), which is thought to be membrane-associated.

    \ ' '2195' 'IPR007489' '\ This is a C-terminal region from several bacterial proteins of unknown function that may be involved in a theta-type replication mechanism.\ ' '2196' 'IPR007493' '\ This family consists of several plant proteins of unknown function.\ ' '2197' 'IPR007495' '\ This is a family of putative periplasmic proteins.\ ' '2198' 'IPR002739' '\

    These archaebacterial proteins have no known function.

    \ ' '2199' 'IPR007496' '\

    This is an uncharacterised bacterial integral membrane protein, possibly involved in cysteine biosynthesis. It is speculated to be involved in sulphate transport.

    \ ' '2200' 'IPR007497' '\ Members of this family have so far been found in bacteria and mouse UniProtKB/Swiss-Prot or UniProtKB/TrEMBL entries. However possible family members have also been identified in translated rat (GenBank:AW144450) and human (GenBank:AI478629) ESTs. A mouse family member has been named SIMPL (signalling molecule that associates with mouse pelle-like kinase). SIMPL appears to facilitate and/or regulate complex formation between IRAK/mPLK (IL-1 receptor-associated kinase) and IKK (inhibitor of kappa-B kinase) containing complexes, and thus regulate NF-kappa-B activity PUBMED:11096118. Separate experiments demonstrate that a mouse family member (named LaXp180) binds the Listeria monocytogenes surface protein ActA, which is a virulence factor that induces actin polymerisation. It may also bind stathmin, a protein involved in signal transduction and in the regulation of microtubule dynamics PUBMED:11207567. In bacteria its function is unknown, but it is thought to be located in the periplasm or outer membrane.\ ' '2201' 'IPR007512' '\ This family of short eukaryotic proteins has no known function. Most of the members of this family are only 80 amino acid residues long. However the Arabidopsis homologue is over 300 residues long. These proteins contain a conserved N-terminal cysteine and a conserved motif GXGXGXG in the carboxy terminal half that may be functionally important.\ ' '2202' 'IPR007518' '\ This is a eukaryotic protein of unknown function.\ ' '2203' 'IPR007531' '\

    Dysbindin is an evolutionarily conserved 40-kDa coiled-coil-containing protein that binds to alpha- and beta-dystrobrevin in muscle and brain. Dystrophin and alpha-dystrobrevin are co-immunoprecipitated with dysbindin, indicating that dysbindin is DPC-associated in muscle. Dysbindin co-localises with alpha-dystrobrevin at the sarcolemma and is up-regulated in dystrophin-deficient muscle. In the brain, dysbindin is found primarily in axon bundles and especially in certain axon terminals, notably mossy fibre synaptic terminals in the cerebellum and hippocampus. Dysbindin may have implications for the molecular pathology of Duchenne muscular dystrophy and may provide an alternative route for anchoring dystrobrevin and the DPC to the muscle membrane PUBMED:2098102. Genetic variation in the human dysbindin gene is also thought to be associated with schizophrenia PUBMED:11316798.

    \ ' '2204' 'IPR007536' '\ This is a protein of unknown function found in proteobacteria. In Salmonella typhimurium, expression of this protein is regulated by heat shock PUBMED:10629202.\ ' '2205' 'IPR007537' '\

    The Thg1 protein from Saccharomyces cerevisiae (Baker\'s yeast) is responsible for adding a GMP residue to the 5\' end of tRNA His PUBMED:14633974.

    \ ' '2206' 'IPR002740' '\

    This family of proteins has no known function.

    \ ' '2207' 'IPR007538' '\ This entry represents the N terminus of a protein of unknown function, found in dsDNA viruses with no RNA stage, including bacteriophages lambda and P22, and also in some Escherichia coli prophages.\ ' '2208' 'IPR007539' '\ This entry represents the C terminus of a protein of unknown function, found in dsDNA viruses with no RNA stage, including bacteriophages lambda and P22, and also in some Escherichia coli prophages.\ ' '2209' 'IPR007561' '\

    This entry represents a cell division protein, designated SepF, which is conserved in Gram-positive bacteria. SepF accumulates at the cell division site in an FtsZ-dependent manner and is required for proper septum formation PUBMED:16420366. Mutants are viable but the formation of the septum is much slower and occurs with a very abnormal morphology.

    \ ' '2210' 'IPR007562' '\

    This entry represents a transglutaminase-like domain found in a family of uncharacterised archaeal proteins that had previously been called DUF553 and UPF0252.

    \ ' '2211' 'IPR007563' '\

    This is a family of uncharacterised prokaryotic proteins. Multiple predicted transmembrane regions suggest that the protein is membrane associated.

    \ ' '2212' 'IPR007564' '\ This is a family of uncharacterised, hypothetical archaeal proteins.\ ' '2213' 'IPR007565' '\ This is a family of uncharacterised, hypothetical prokaryotic proteins.\ ' '2215' 'IPR006700' '\

    Methyltransferases (Mtases) are responsible for the transfer of methyl groups between two molecules. The transfer of the methyl group from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms. The reaction is catalyzed by Mtases and modifies DNA, RNA, proteins or small molecules, such as catechol, for regulatory purposes. Proteins in this entry belong to the RsmE family of Mtases; this is supported by crystal structure studies, which show close structural homology to other known methyltransferases PUBMED:14517985.

    \ \

    This entry contains RsmE of Escherichia coli, which specifically methylates the uridine in position 1498 of 16S rRNA in the fully assembled 30S ribosomal subunit PUBMED:16431987, PUBMED:7872509.

    \ ' '2216' 'IPR007569' '\ This is a family of uncharacterised proteins.\ ' '2217' 'IPR007655' '\ This is a family of hypothetical bacterial proteins.\ ' '2218' 'IPR007570' '\ This is a protein of unknown function found in a cyanobacterium, and the chloroplasts of algae.\ ' '2219' 'IPR006850' '\ This represents a conserved region found in a number of Chlamydophila pneumoniae proteins.\ ' '2220' 'IPR007657' '\

    This is a family of uncharacterised glycosyltransferases belonging to glycosyltransferase family 61. Sequences are further processed into a mature form.

    \ ' '2222' 'IPR007572' '\ This is a predicted transmembrane protein found in plants, chloroplasts and cyanobacteria. This family is also known as YCF20.\ ' '2223' 'IPR007612' '\ This is a family of plant and bacterial uncharacterised proteins.\ ' '2224' 'IPR007679' '\ This is a family of hypothetical proteins. Some family members contain two copies of the region.\ ' '2225' 'IPR002743' '\ This archaebacterial and bacterial protein family has no known function.\ ' '2226' 'IPR007578' '\ This is a protein of unknown function, found in herpesvirus and cytomegalovirus.\ ' '2227' 'IPR006707' '\ This is a family of hypothetical bacterial proteins.\ ' '2228' 'IPR006764' '\ This is a family of uncharacterised proteins.\ ' '2229' 'IPR006835' '\ This represents a conserved region found in a number of Chlamydophila pneumoniae proteins.\ ' '2230' 'IPR007595' '\

    This family contains several uncharacterised staphylococcal proteins. Members of this family are mostly predicted lipoproteins found in Staphylococcus aureus, but are also found clustered in Staphylococcus epidermidis.

    \ ' '2231' 'IPR007598' '\ This is a family of Arabidopsis thaliana (Mouse-ear cress) proteins. Many of these members contain a repeated region.\ ' '2233' 'IPR006514' '\

    These sequences contain an uncharacterised domain found in both Arabidopsis thaliana (at least 10 copies) and Oryza sativa (Rice). Most member proteins have only a short stretch of sequence N-terminal to this domain, but one has a long N-terminal extension that includes a protein kinase domain ().

    \ ' '2234' 'IPR002881' '\

    This domain is found in a family of prokaryotic proteins that have no known function. Proteins belonging to this family include hypothetical proteins from eubacteria and archaebacteria. Some of these proteins also contain the Von Willebrand factor, type A domain (see ).

    \ ' '2235' 'IPR007603' '\ This is a family of uncharacterised proteins.\ ' '2236' 'IPR007650' '\ This is a family of uncharacterised proteins.\ ' '2237' 'IPR007606' '\ This family contains several uncharacterised chlamydial proteins.\ ' '2238' 'IPR007607' '\ This family contains several uncharacterised hypothetical proteins.\ ' '2239' 'IPR007608' '\ This family contains several uncharacterised proteins.\ ' '2240' 'IPR007610' '\

    This region represents the N-termini of bromovirus 2a protein, and is always found N-terminal to a predicted RNA dependent RNA polymerase region ().

    \ ' '2242' 'IPR007618' '\ This domain is found at the N-termini of some human herpesvirus U58 proteins, and some cytomegalovirus UL87 proteins. This region is always found N-terminal to the UL87 (), which has no known function.\ ' '2244' 'IPR007649' '\ This entry represents a conserved region in a number of uncharacterised plant proteins.\ ' '2245' 'IPR007654' '\ This N-terminal region is found in SIR2 proteins () and its homologues. Its function is uncharacterised.\ ' '2246' 'IPR007658' '\ This is a family of uncharacterised proteins.\ ' '2248' 'IPR007670' '\ This family contains several uncharacterised proteins.\ ' '2249' 'IPR006734' '\ This family includes a conserved region in several uncharacterised plant proteins.\ ' '2251' 'IPR006747' '\ This family includes several uncharacterised proteins.\ ' '2252' 'IPR000620' '\ This domain is found in proteins including the Erwinia chrysanthemi PecM protein, which is involved in pectinase, cellulase and blue pigment regulation; and the Salmonella typhimurium PagO protein, the function of which is unknown. Many members of this family have no known function and are predicted to be integral membrane proteins and many of the proteins contain two copies of the domain.\ ' '2253' 'IPR006728' '\ This conserved region is found in several uncharacterised proteins from Gram-positive bacteria.\ ' '2254' 'IPR006739' '\ This family includes several uncharacterised proteins from Borrelia species.\ ' '2255' 'IPR006740' '\ This family includes a conserved region found in several uncharacterised plant proteins.\ ' '2256' 'IPR006750' '\

    This family contains uncharacterised bacterial proteins.

    \ ' '2257' 'IPR002746' '\ This is a family of hypothetical proteins that are found in Archaebacteria and have no known function.\ ' '2258' 'IPR006837' '\ This is a family of uncharacterised proteins that includes YibQ.\ ' '2260' 'IPR006851' '\ This is a family of chloroplast proteins of unknown function. Some members have two copies of the conserved region.\ ' '2261' 'IPR006839' '\ This family of bacterial proteins has no known function.\ ' '2262' 'IPR006852' '\ This is a family of uncharacterised proteins.\ ' '2263' 'IPR006855' '\

    This region of unknown function is found at the C terminus of Neurospora crassa acetylglutamate synthase (). It is also found C-terminal to the amino acid kinase region in some fungal acetylglutamate kinase enzymes (). These enzymes play a role in arginine biosynthesis.

    \ ' '2264' 'IPR002747' '\ Protein found in Archaebacteria and Bacteria. These proteins have no known function.\ ' '2265' 'IPR006873' '\ This is a family of uncharacterised proteins.\ ' '2266' 'IPR006907' '\ This family includes several uncharacterised mouse proteins.\ ' '2267' 'IPR006458' '\

    This group of sequences contains an uncharacterised domain of about 70 residues found exclusively in plants, generally toward the C terminus of proteins of 200 to 350 amino acids in length. At least 14 such proteins are found in Arabidopsis thaliana (Mouse-ear cress). Other regions of these proteins tend to consist largely of low-complexity sequence. Function is not known.

    \ ' '2268' 'IPR006938' '\

    This family consists of uncharacterised or hypothetical bacterial proteins.

    \ ' '2269' 'IPR006462' '\

    These sequences comprise a paralogous family of hypothetical proteins in Arabidopsis thaliana (Mouse-ear cress). No homologs are detected from other species. Length heterogeneity within the family is attributable partly to a 21-residue repeat present in zero to three tandem copies. The proteins have no known function.

    \ ' '2270' 'IPR006866' '\ This domain represents the N-terminal region of several plant proteins of unknown function.\ ' '2271' 'IPR006865' '\ This domain represents a region of several plant proteins of unknown function. A C2H2 zinc finger is predicted in this region in some family members, but the spacing between the cysteine residues is not conserved throughout the family.\ ' '2272' 'IPR002749' '\ These proteins of unknown function are found in archaebacteria and are probably transmembrane proteins.\ ' '2273' 'IPR006901' '\

    This is a family of uncharacterised bacterial proteins.

    \ ' '2274' 'IPR006912' '\ This family of plant proteins has no known function.\ ' '2275' 'IPR006913' '\

    Glutathione-dependent formaldehyde-activating enzymes catalyze the condensation of formaldehyde and glutathione to S-hydroxymethylglutathione. All known members of this family contain 5 strongly conserved cysteine residues.

    \ ' '2276' 'IPR006915' '\

    This group of sequences from Pseudomonas aeruginosa PUBMED:11021912 and Neisseria meningitidis contains a conserved region which is often associated with a second conserved domain, . These proteins may have haemagglutinin or haemolysin activity. Filamentous haemagglutinin (FHA) is a major virulence attachment factor produced by certain bacterial species that functions as both a primary adhesin and an immunomodulator. Haemolysin is a pore-forming toxin.

    \ ' '2277' 'IPR006914' '\

    This group of proteins, mainly from Neisseria meningitidis, may have haemagglutinin or haemolysin activity. A number of them have a second conserved domain, , which is found in possible Pseudomonas aeruginosa haemagglutinins PUBMED:11021912. Filamentous haemagglutinin (FHA) is a major virulence attachment factor produced by certain bacterial species that functions as both a primary adhesin and an immunomodulator. Haemolysin is a pore-forming toxin.

    \ ' '2278' 'IPR006927' '\

    The sequences in this family are plant proteins of unknown function.

    \ ' '2280' 'IPR006936' '\ This conserved region is found in plant proteins including the resistance protein-like protein ().\ ' '2281' 'IPR006946' '\ This family contains a conserved region found in a number of uncharacterised plant proteins.\ ' '2282' 'IPR006951' '\ These are proteins of unknown function found in Borrelia burgdorferi (Lyme disease spirochete).\ ' '2283' 'IPR006954' '\ This family contains a conserved region found in a number of uncharacterised Caenorhabditis elegans proteins.\ ' '2284' 'IPR006959' '\ This family contains uncharacterised proteins from Vibrio cholerae.\ ' '2285' 'IPR006967' '\ This family of proteins is found in the Caudovirales and prophages. It may be a head/tail component or be involved in tail assembly.\ ' '2286' 'IPR006974' '\

    This is a family of hypothetical proteins from Chlamydia pneumoniae.

    \ ' '2287' 'IPR006978' '\ This conserved region is found in the N-terminal region of a number of conserved archaeal proteins of unknown function.\ ' '2288' 'IPR006979' '\

    This conserved region is found in the C-terminal region of a number of conserved archaeal proteins of unknown function.

    \ ' '2289' 'IPR006994' '\

    This entry appears to represent a novel family of basic helix-loop-helix (bHLH) proteins that control differentiation and development of a variety of organs PUBMED:16574069, PUBMED:12107429.

    \ \

    Human Nulp1 () is a basic helix-loop-helix protein expressed broadly during early embryonic organogenesis. Overexpression of human Nulp1 in COS-7 cells inhibits the transcriptional activity of serum response factor (SRF), suggesting that Nulp1 may act as a novel bHLH transcriptional repressor in the SRF signalling pathway to mediate cellular functions PUBMED:12107429.

    \ \ ' '2290' 'IPR007003' '\ This family includes several uncharacterised archaeal proteins.\ ' '2291' 'IPR007004' '\ This is a family of hypothetical proteins, the majority of which are from Beet necrotic yellow vein virus.\ ' '2293' 'IPR007020' '\

    These are proteins of unknown function found in Lactococcus lactis and in its associated bacteriophages.

    \ ' '2294' 'IPR007021' '\ These are transposase-like proteins with no known function.\ ' '2296' 'IPR007023' '\

    This is a family of eukaryotic ribosomal biogenesis regulatory proteins.

    \ ' '2298' 'IPR007061' '\ This family is commonly found in Streptomyces coelicolor and is of unknown function. These proteins contain several conserved histidines at their N-terminus that may form a metal binding site.\ ' '2300' 'IPR007703' '\ This family contains several uncharacterised viral proteins of unknown function.\ ' '2301' 'IPR007714' '\ This family of proteins is highly conserved in eukaryotes. Some proteins in the family are annotated as transcription factors. However, there is currently no support for this in the literature.\ ' '2302' 'IPR007700' '\ This is a family of uncharacterised plant proteins of unknown function.\ ' '2303' 'IPR007731' '\ This is a family of phage proteins of unknown function.\ ' '2305' 'IPR007744' '\ This family includes several proteins of unknown function and seems to be specific to Caenorhabditis elegans.\ ' '2306' 'IPR007748' '\ This is a family of uncharacterised viral proteins of unknown function.\ ' '2307' 'IPR007750' '\ This family is found in Arabidopsis thaliana and contains uncharacterised proteins.\ ' '2308' 'IPR007749' '\ This family consists of AT14A-like proteins from Arabidopsis thaliana. At14a contains a small domain that has sequence similarities to integrins from fungi, insects and humans. Transcripts of At14a are found in all Arabidopsis tissues and the protein localises partly to the plasma membrane PUBMED:10196471.\ ' '2309' 'IPR007769' '\ This family contains poxvirus proteins belonging to the A19 family. 
The proteins are of unknown function.\ ' '2310' 'IPR007770' '\ This family contains uncharacterised plant proteins of unknown function.\ ' '2311' 'IPR007771' '\ This family contains uncharacterised proteins which seem to be found exclusively in Rhizobium loti (Mesorhizobium loti).\ ' '2312' 'IPR007772' '\ This family contains uncharacterised beak and feather disease virus proteins.\ ' '2313' 'IPR007773' '\ This family consists of uncharacterised baculovirus proteins.\ ' '2314' 'IPR007774' '\ This family contains several uncharacterised bacterial proteins. These proteins are found in nitrogen fixation operons, so are likely to play a role in this process.\ ' '2315' 'IPR007767' '\ This family contains uncharacterised proteins from Caenorhabditis elegans.\ ' '2316' 'IPR007777' '\ This family consists of uncharacterised proteins from Borrelia species. There is some evidence to suggest that the proteins may be outer surface proteins.\ ' '2317' 'IPR007784' '\ This family consists of uncharacterised baculovirus proteins.\ ' '2318' 'IPR007787' '\ This family contains uncharacterised Chlamydia proteins.\ ' '2319' 'IPR007785' '\ This family contains several uncharacterised eukaryotic proteins of unknown function.\ ' '2320' 'IPR007823' '\ This family consists of uncharacterised eukaryotic proteins which are related to S-adenosyl-L-methionine-dependent methyltransferases.\ ' '2321' 'IPR007801' '\ This family consists of uncharacterised bacterial proteins.\ ' '2322' 'IPR007800' '\ This family consists of uncharacterised proteins from Borrelia burgdorferi.\ ' '2323' 'IPR006482' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents TM1801-type Cas proteins found in at least five species that contain CRISPR loci situated exclusively next to other cas genes. The function of this Cas family is unknown.

    \ ' '2324' 'IPR016097' '\

    This entry is found at the N-terminus of a number of proteobacterial proteins of unknown function.

    \ ' '2325' 'IPR006507' '\

    These are putative membrane proteins from alpha and gamma proteobacteria, each making up their own clade. The two clades have less than 25% identity between them.

    \ ' '2326' 'IPR002760' '\

    This family contains archaebacterial proteins of unknown function. Members of this family may be transmembrane proteins.

    \ ' '2327' 'IPR007818' '\ This is a family of plant proteins of unknown function.\ ' '2329' 'IPR007827' '\ This family contains uncharacterised baculoviral proteins.\ ' '2330' 'IPR007828' '\ This is a family of uncharacterised eukaryotic proteins. Some members have a described putative function, but a common theme is not evident.\ ' '2331' 'IPR007877' '\ This family consists of uncharacterised proteins from Arabidopsis thaliana.\ ' '2332' 'IPR007840' '\ This family of proteins formerly called DUF709 includes the Escherichia coli gene ycgL. Homologues of YcgL are found in gammaproteobacteria. The structure of this protein shows a novel alpha/beta/alpha sandwich structure PUBMED:17221885.\ ' '2333' 'IPR002761' '\

    This domain is about 200 amino acids long with a strongly conserved motif SGGKD at the N-terminus. The structure of Q8U2K6 from Pyrococcus furiosus has been resolved to 2.7 Å, and the protein is suggested to be a putative N-type pyrophosphatase.

    \ \ \ \ \

    In some members of the family e.g.\ , this domain is associated with , another domain of unknown function. Proteins with this uncharacterised domain include two apparent ortholog families in the archaea, one of which is universal among the first four completed archaeal genomes. The domain comprises the full length of the archaeal proteins and the first third of fungal proteins.

    \ \ ' '2334' 'IPR007838' '\

    This entry represents a structural domain found in the cell division protein ZapA, as well as in related proteins. This domain has a core structure consisting of two layers alpha/beta, and has a long C-terminal helix that forms dimeric parallel and tetrameric antiparallel coiled coils PUBMED:15288790. ZapA interacts with FtsZ, where FtsZ is part of a mid-cell cytokinetic structure termed the Z-ring that recruits a hierarchy of fission-related proteins early in the bacterial cell cycle. ZapA drives the polymerisation and filament bundling of FtsZ, thereby contributing to the spatio-temporal tuning of the Z-ring.

    \ ' '2335' 'IPR007841' '\ The proteins in this family are functionally uncharacterised. The proteins are around 450 amino acids long.\ ' '2336' 'IPR007842' '\

    The HEPN (higher eukaryotes and prokaryotes nucleotide-binding) domain is a region of 110 residues found at the C-terminus of sacsin, a chaperonin implicated in an early-onset neurodegenerative disease in humans, and in many bacterial and archaebacterial proteins. There are three classes of proteins with a HEPN domain:

    \ \
  • Single-domain HEPN proteins found in many bacteria.
  • Two-domain proteins with N-terminal nucleotidyltransferase (NT) and C-terminal HEPN domains. This N-terminal NT domain belongs to a large family of NTs, which includes several classes of enzymes that are responsible for some types of bacterial resistance to aminoglycosides. These enzymes deactivate various antibiotics by transferring a nucleotidyl group to the drug.
  • A multidomain sacsin protein in genomes of fish and mammals. The HEPN domain is located at the C-terminus of the protein, directly after the DnaJ domain (see ).
    \ \

    The crystal structure of the HEPN domain from the TM0613 protein of Thermotoga maritima indicates that it is structurally similar to the C-terminal all-alpha-helical domain of kanamycin nucleotidyltransferases (KNTases). It is composed of five alpha helices, three of which form an up-and-down helical bundle, with a pair of short helices on the side. The distant structural similarity suggests that the HEPN domain might be involved in nucleotide binding PUBMED:12765831.

    \ ' '2337' 'IPR007871' '\

    This family of eukaryotic proteins specifically methylates guanosine-4 in various tRNAs with Gly(CCG), His or Pro signatures. The alignment contains some conserved cysteines and histidines that might form a zinc binding site.

    \ ' '2338' 'IPR002763' '\ The function of this family is unknown. Aquifex aeolicus has two copies of this protein. A probable aspartyl-tRNA synthetase from Escherichia coli PUBMED:2129559 belongs to this group.\ ' '2339' 'IPR002764' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the Csa2 (CRISPR/Cas subtype protein 2) family of proteins, which includes MJ0381 from Methanocaldococcus jannaschii (Methanococcus jannaschii). This archaeal clade is a member of the DevR family, which includes the DevR protein of Myxococcus xanthus, a protein whose expression appears to be regulated through a number of means, including both location and autorepression. DevR is a key regulator of development, and mutants in DevR are incapable of fruiting body development PUBMED:16292354.

    \ ' '2340' 'IPR002765' '\

    This family of bacterial proteins has not been characterised.

    \ ' '2342' 'IPR002767' '\

    This entry contains several hypothetical proteins of unknown function found in archaebacteria, eukaryotes and eubacteria. The structures of YBL001c from Saccharomyces cerevisiae and its homologue MTH1187 from the archaeon Methanobacterium thermoautotrophicum have been determined PUBMED:12866058. These proteins have a ferredoxin-like alpha/beta sandwich structure with anti-parallel beta-sheets. Generally, they have two domains that form a single beta-sheet dimer, where two dimers pack sheet-to-sheet into a tetramer, some proteins having an extra C-terminal helix.

    \ ' '2343' 'IPR002775' '\

    Members of this family include the archaeal protein Alba and a number of eukaryotic proteins with no known function. The DNA/RNA-binding protein Alba binds double-stranded DNA tightly but without sequence specificity. It binds rRNA and mRNA in vivo, and may play a role in maintaining the structural and functional stability of RNA, and, perhaps, ribosomes. It is distributed uniformly and abundantly on the chromosome. Alba has been shown to bind DNA and affect DNA supercoiling in a temperature-dependent manner PUBMED:10869069. It is regulated by acetylation (alba = acetylation lowers binding affinity) by the Sir2 protein. Alba is proposed to play a role in establishment or maintenance of chromatin architecture and thereby in transcription repression. For further information see PUBMED:16256418.

    \ ' '2345' 'IPR002779' '\

    ATP:cob(I)alamin (or ATP:corrinoid) adenosyltransferases () catalyse the conversion of cobalamin (vitamin B12) into its coenzyme form, adenosylcobalamin (coenzyme B12) PUBMED:15516577. Adenosylcobalamin (AdoCbl) is required for the activity of certain enzymes. AdoCbl contains an adenosyl moiety liganded to the cobalt ion of cobalamin via a covalent Co-C bond, and its synthesis is unique to certain prokaryotes. ATP:cob(I)alamin adenosyltransferases are classed into three groups: CobA-type PUBMED:16672609, EutT-type PUBMED:15317775 and PduO-type PUBMED:11160088. Each of the three enzyme types appears to be specialised for particular AdoCbl-dependent enzymes or for the de novo synthesis of AdoCbl. PduO and EutT are distantly related, sharing short conserved motifs, while CobA is evolutionarily unrelated and is an example of convergent evolution.

    \

    This entry represents EutT- and PduO-type ATP:cob(I)alamin adenosyltransferases. PduO functions to convert cobalamin to AdoCbl for 1,2-propanediol degradation PUBMED:9311132, while EutT produces AdoCbl for ethanolamine utilisation PUBMED:16636051.

    \ ' '2346' 'IPR002781' '\

    This family is found in uncharacterised integral membrane proteins of prokaryotes.

    \ ' '2347' 'IPR002782' '\

    This prokaryotic family of proteins has no known function. The proteins contain four conserved cysteines that may be involved in metal binding or disulphide bridges.

    \ ' '2348' 'IPR013343' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the Cas4 family of proteins. Cas4 proteins resemble the RecB family of exonucleases () and contain a cysteine-rich motif indicative of DNA binding. As such, Cas4 proteins may function as part of a hypothetical DNA repair system PUBMED:11788711, PUBMED:15972856. Cas4 is one of four protein families (Cas1 to Cas4) that are associated with CRISPR elements and always occur near a repeat cluster, usually in the order cas3-cas4-cas1-cas2.

    \ ' '2349' 'IPR002786' '\

    This is a family of prokaryotic proteins of unknown function.

    \ ' '2350' 'IPR008201' '\

    This entry describes prokaryotic proteins of unknown function.

    \ ' '2351' 'IPR002789' '\

    This prokaryotic protein family has no known function. It contains several conserved aspartates and histidines that could be metal ligands.

    \ ' '2352' 'IPR002790' '\

    This entry describes archaebacterial proteins of unknown function.

    \ ' '2353' 'IPR002791' '\

    This entry contains uncharacterised proteins. Those with structural information consist of two domains: an all-alpha domain with a 3-helical bundle fold, and an alpha-beta domain in 3 layers, alpha/beta/alpha.

    \ ' '2354' 'IPR002793' '\

    The function of these prokaryotic proteins is unknown. Computational analysis suggests that they may form a restriction endonuclease-like fold, similar to that found in a variety of endonucleases and DNA repair enzymes PUBMED:15972856.

    \ ' '2355' 'IPR002794' '\ Many members of this family have no known function and are predicted to be integral membrane proteins.\ ' '2356' 'IPR002795' '\

    A highly diverged class of S-adenosylmethionine synthetases has been identified in the archaea. S-adenosylmethionine is the primary alkylating agent in all known organisms. ATP:L-methionine S-adenosyltransferase (MAT) catalyses the only known biosynthetic route to this central metabolite. Although the amino acid sequence of MAT is strongly conserved among bacteria and eukarya (see ), no homologues had been recognised in the completed genome sequences of any archaea. The identification of a second major class of MAT emphasises the long evolutionary history of the archaeal lineage and the structural diversity found even in crucial metabolic enzymes PUBMED:10660563. Three bacterial genomes encode both the archaeal and eukaryotic/bacterial types of MAT PUBMED:10660563.

    \ ' '2357' 'IPR002798' '\ Many members of this family have no known function and are predicted to be integral membrane proteins. is annotated as "Stage II sporulation protein M related"; and weakly related to other proteins with similar annotation.\ ' '2358' 'IPR002800' '\

    Proteins that belong to this group are restricted to the Mycobacteria and the Archaea and have no known function.

    \ ' '2359' 'IPR002802' '\

    The function of the archaebacterial proteins in this family is unknown.

    \ ' '2360' 'IPR001269' '\

    Members of this family catalyse the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but it may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. Dus 1 () from Saccharomyces cerevisiae (Baker\'s yeast) acts on pre-tRNA-Phe, while Dus 2 () acts on pre-tRNA-Tyr and pre-tRNA-Leu. Dus 1 is active as a single subunit, requiring NADPH or NADH, and is stimulated by the presence of FAD PUBMED:12003496. Some family members may be targeted to the mitochondria and even have a role in mitochondria PUBMED:12003496.

    \ ' '2361' 'IPR008180' '\

    Synonym(s): dUTP diphosphatase, Deoxyuridine-triphosphatase

    \

    The essential enzyme dUTP pyrophosphatase () is specific for dUTP and is critical for the fidelity of DNA replication and repair. dUTPase hydrolyzes dUTP to dUMP and pyrophosphate, simultaneously reducing dUTP levels and providing the dUMP for dTTP biosynthesis. dUTPase decreases the intracellular concentration of dUTP so that uracil cannot be incorporated into DNA PUBMED:8805593.

    \

    The crystal structure of human dUTPase reveals that each subunit of the dUTPase trimer folds into an eight-stranded jelly-roll beta barrel, with the C-terminal beta strands interchanged among the subunits. The structure is similar to that of the Escherichia coli enzyme, despite low sequence homology between the two enzymes PUBMED:8805593.

    \

    Other enzymes like deoxycytidine triphosphate deaminase (dCTP) () that specifically bind uridine also belong to this group, suggesting that the signature may recognise a putative uridine-binding motif.

    \

    Some retroviruses encode dUTPases. Retroviral dUTPase is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, dUTPase and RNase H.

    \ ' '2362' 'IPR013512' '\ 1-deoxy-D-xylulose 5-phosphate reductoisomerase synthesises 2-C-methyl-D-erythritol 4-phosphate from 1-deoxy-D-xylulose 5-phosphate in a single step by intramolecular rearrangement and reduction and is responsible for terpenoid biosynthesis in some organisms PUBMED:. In Arabidopsis thaliana 1-deoxy-D-xylulose 5-phosphate reductoisomerase is the first committed enzyme of the non-mevalonate pathway for isoprenoid biosynthesis. The enzyme requires Mn2+, Co2+ or Mg2+ for activity, with the first being most effective.\ This domain is found at the N terminus of bacterial and plant 1-deoxy-D-xylulose 5-phosphate reductoisomerases.\ ' '2363' 'IPR001401' '\

    Membrane transport between compartments in eukaryotic cells requires proteins that allow the budding and scission of nascent cargo vesicles from one compartment and their targeting and fusion with another. Dynamins are large GTPases that belong to a protein superfamily PUBMED:15040446 that, in eukaryotic cells, includes classical dynamins, dynamin-like proteins, OPA1, Mx proteins, mitofusins and guanylate-binding proteins/atlastins PUBMED:2142876, PUBMED:2112425, PUBMED:1532158, PUBMED:2607176, and are involved in the scission of a wide range of vesicles and organelles. They play a role in many processes including budding of transport vesicles, division of organelles, cytokinesis and pathogen resistance.

    The minimal distinguishing architectural features that are common to all dynamins and are distinct from other GTPases are the structure of the large GTPase domain (300 amino acids) and the presence of two additional domains; the middle domain and the GTPase effector domain (GED), which are involved in oligomerization and regulation of the GTPase activity.

    \

    This entry represents the GTPase domain, containing the GTP-binding motifs that are needed for guanine-nucleotide binding and hydrolysis. The conservation of these motifs is absolute except for the final motif in guanylate-binding proteins. The GTPase catalytic activity can be stimulated by oligomerisation of the protein, which is mediated by interactions between the GTPase domain, the middle domain and the GED.

    \ ' '2364' 'IPR006996' '\ Dynamitin is a subunit of the microtubule-dependent motor complex; it is also implicated in cell adhesion by binding to macrophage-enriched myristoylated alanine-rich C kinase substrate (MacMARCKS) PUBMED:12082093.\ ' '2365' 'IPR001372' '\

    Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules.

    \

    Dynein is composed of a number of ATP-binding large subunits (see ), intermediate size subunits and small subunits. Among the small subunits, there is a family of highly conserved proteins which make up this family PUBMED:7744782, PUBMED:8628263.

    \

    Both type 1 (DLC1) and 2 (DLC2) dynein light chains have a similar two-layer alpha-beta core structure consisting of beta-alpha(2)-beta-X-beta(2) PUBMED:10426949, PUBMED:14561217.

    \ ' '2366' 'IPR001177' '\

    Papillomaviruses are a large family of DNA tumour viruses which give rise to warts in their host species. The helicase E1 protein is an ATP-dependent DNA helicase required for initiation of viral DNA replication PUBMED:8389467. It forms a complex with the viral E2 protein, which is a site-specific DNA-binding transcriptional activator. The E1-E2 complex binds to the replication origin which contains binding sites for both proteins PUBMED:2176744.

    \ \

    The E1 protein is a 70 kDa polypeptide with a central DNA-binding domain and a C-terminal ATPase/helicase domain. It binds specific 18 bp DNA sequences at the origin of replication, melts the DNA duplex and functions as a 3\' to 5\' helicase PUBMED:9060646. In addition to E2 it also interacts with DNA polymerase alpha and replication protein A to effect DNA replication. The DNA-binding domain forms a five-stranded antiparallel beta sheet bordered by four loosely packed alpha helices on one side and two tightly packed helices on the other PUBMED:10949036. Two structural modules within this domain, an extended loop and a helix, contain conserved residues and are critical for DNA binding. In solution E1 is a monomer, but binds DNA as a dimer. Recruitment of more E1 subunits to the complex leads to melting of the origin and ultimately to the formation of an E1 hexamer with helicase activity PUBMED:9658141.

    \ \

    This entry represents the C-terminal region of E1, containing both the DNA-binding and ATPase/helicase domains.

    \ ' '2367' 'IPR008250' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    P-ATPases (sometimes known as E1-E2 ATPases) () are found in bacteria and in a number of eukaryotic plasma membranes and organelles PUBMED:9419228. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.

    \

    This entry represents an ATPase-associated region found in P-type ATPases PUBMED:8226755. P-type (or E1-E2-type) ATPases, which form an aspartyl phosphate intermediate in the course of ATP hydrolysis, can be divided into 4 major groups PUBMED:8151716: (1) Ca2+-transporting ATPases; (2) Na+/K+- and gastric H+/K+-transporting ATPases; (3) plasma membrane H+-transporting ATPases (proton pumps) of plants, fungi and lower eukaryotes; and (4) all bacterial P-type ATPases, except the Mg2+-ATPase of Salmonella typhimurium, which is more similar to the eukaryotic sequences. However, the great variety of sequence analysis methods has led to a diversity of classifications.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '2368' 'IPR001017' '\ This entry includes a number of dehydrogenases, all of which use thiamine\ pyrophosphate as a cofactor and are members of a multienzyme complex.\ Pyruvate dehydrogenase (), a component of the multienzyme\ pyruvate dehydrogenase complex; 2-oxoglutarate dehydrogenase (),\ a component of the multienzyme 2-oxoglutarate dehydrogenase complex, which contains\ multiple copies of three enzymatic components: 2-oxoglutarate dehydrogenase (E1),\ dihydrolipoamide succinyltransferase (E2) and lipoamide dehydrogenase (E3);\ and 2-oxoisovalerate dehydrogenase (), a component of the multienzyme\ branched-chain alpha-keto acid dehydrogenase complex, all belong to this family.\ ' '2369' 'IPR014000' '\

    Papillomaviruses are a large family of DNA tumour viruses which give rise to warts in their host species. The helicase E1 protein is an ATP-dependent DNA helicase required for initiation of viral DNA replication PUBMED:8389467. It forms a complex with the viral E2 protein, which is a site-specific DNA-binding transcriptional activator. The E1-E2 complex binds to the replication origin which contains binding sites for both proteins PUBMED:2176744.

    \ \

    The E1 protein is a 70 kDa polypeptide with a centrally-located DNA-binding domain and a C-terminal ATPase/helicase domain. It binds specific 18 bp DNA sequences at the origin of replication, melts the DNA duplex and functions as a 3\' to 5\' helicase PUBMED:9060646. In addition to E2 it also interacts with DNA polymerase alpha and replication protein A to effect DNA replication. The DNA-binding domain forms a five-stranded antiparallel beta sheet bordered by four loosely packed alpha helices on one side and two tightly packed helices on the other PUBMED:10949036. Two structural modules within this domain, an extended loop and a helix, contain conserved residues and are critical for DNA binding. In solution E1 is a monomer, but binds DNA as a dimer. Recruitment of more E1 subunits to the complex leads to melting of the origin and ultimately to the formation of an E1 hexamer with helicase activity PUBMED:9658141.

    \ \

    This entry represents the N-terminal region of E1, which contains the nuclear localisation signal.

    \ ' '2370' 'IPR000427' '\

    E2 is an early regulatory protein found in the dsDNA papillomaviruses. The viral genome is a 7.9-kb circular DNA that codes for at least eight early and two late (capsid) proteins. The products of the early genes E6 and E7 are oncoproteins that destabilise the cellular tumour suppressors p53 and pRB. The product of the E1 gene is a helicase necessary for viral DNA replication. The products of the E2 gene play key roles in the regulation of viral gene transcription and DNA replication. During early stages of viral infection, the\ E2 protein represses the transcription of the oncogenes E6 and E7; reintroduction of E2 into cervical cancer cell-lines leads to repression of E6/E7 transcription, stabilisation of the tumour suppressor p53, and\ cell-cycle arrest at the G1 phase of the cell cycle. E2 can also induce apoptosis by a p53-independent mechanism.

    \ \

    E2 proteins from all papillomavirus strains bind a consensus palindromic sequence ACCgNNNNcGGT present in multiple copies in the regulatory region. E2 can either activate or repress transcription, depending on E2RE\'s position with regard to proximal promoter elements. Repression occurs by sterically hindering the assembly of the transcription initiation complex. The E2 protein is composed of a C-terminal DNA-binding domain and an N-terminal trans-activation domain. E2 exists in solution and binds to DNA as a dimer. The E2 DNA-binding domain forms a dimeric beta-barrel, with each subunit contributing an anti-parallel 4-stranded beta-sheet "half-barrel" PUBMED:1328886, PUBMED:11988474. The topology of each subunit is beta1-alpha1-beta2-beta3-alpha2-beta4. Helix 1 is the recognition helix housing all of the amino acid residues involved in direct DNA sequence specification. Upon dimerisation, strands beta2 and beta4 at the edges of each subunit participate in a continuous hydrogen-bonding network, which results in an 8-stranded beta-barrel. The dimer interface is extensive, made up of hydrogen bonds\ between subunits and a substantial hydrophobic beta-barrel core.

    \ ' '2371' 'IPR001866' '\ E2 is an early regulatory protein found in the dsDNA papillomaviruses. E2 regulates viral transcription and DNA replication. It binds to the E2RE response element (5\'-ACCNNNNNNGGT-3\') present in multiple copies in the regulatory region. It can either activate or repress transcription, depending on E2RE\'s position with regard to proximal promoter elements. Repression occurs by sterically hindering the assembly of the transcription initiation complex. The E1-E2 dimer complex binds to the origin of DNA replication PUBMED:1328886.\ ' '2372' 'IPR003316' '\ The mammalian transcription factor E2F plays an important role in regulating the\ expression of genes that are required for passage through the cell cycle. Multiple E2F family members have been identified that bind to DNA as heterodimers, interacting with proteins known as DP - the dimerisation partners PUBMED:7739537.\ ' '2373' 'IPR004167' '\ A small domain of the E2 subunit of 2-oxo-acid dehydrogenases that is responsible for the binding of the E3 subunit. Proteins containing this domain include the branched-chain alpha-keto acid dehydrogenase complex of bacteria, which catalyses the overall conversion of alpha-keto acids to acyl-CoA and carbon dioxide; and the E-3 binding protein of eukaryotic pyruvate dehydrogenase.\ ' '2375' 'IPR001334' '\

    The papillomavirus E6 oncoproteins are small zinc-binding proteins that share a conserved zinc-binding CXXC motif and do not have identified intrinsic enzymatic activity. E6 proteins are thought to act as adapter proteins, thereby altering the function of E6-associated cellular proteins. This model for E6 function is best supported by observations of human papillomavirus type 16 (HPV-16) E6 (16E6), which can alter the metabolism of the p53 tumor suppressor through association with a cellular E3 ubiquitin ligase called E6AP. HPV-16 E6 interacts with an 18-amino-acid sequence in E6AP, and in an as yet ill-defined fashion the E6AP-16E6 complex binds to p53, inducing the ubiquitin-dependent degradation of the trimolecular complex. 16E6 apparently functions as an adapter protein in the complex with p53, since E6AP does not interact with p53 in the absence of E6 and since the degradation of p53 requires both E6 and E6AP.

    \ \

    Despite the similarity in structure of the E6 oncoproteins, studies have indicated surprising biochemical diversity among E6 oncoproteins of different papillomavirus types. E6 from the cancer-associated human papillomaviruses (HPVs) complex with a cellular protein termed E6-AP and together with E6-AP bind to the p53 tumor suppressor protein thereby degrading p53 through ubiquitin-mediated proteolysis. E6 from the non-cancer-associated HPV types do not bind E6-AP or degrade p53. Bovine papilloma virus E6 (BE6) binds E6-AP but fails either to complex with p53 or to degrade associated proteins, implying that BE6 might transform cells through a mechanism different from that of the HPVs. In addition to targeting p53, E6 of both cancer-associated HPVs and BPV-1 have been shown to associate with a cellular-calcium-binding protein localized to the endoplasmic reticulum PUBMED:10623743, PUBMED:9151888.

    \ ' '2376' 'IPR000148' '\ This family includes the E7 oncoprotein from various papillomaviruses PUBMED:11422538. Along with E5 and E6 their activities seem to be especially important for viral oncogenesis. E5 is located at the cell surface and reduces cell-cell gap junction communication. In cervical cancer E5 is expressed in earlier \ stages of neoplastic transformation of the cervical epithelium during viral infection. The role of E7 is less well understood, but it has been shown to impede growth arrest signals in both NIH 3T3 cells and HFKs, and this correlates with elevated cdc25A gene expression. This deregulation of cdc25A is\ linked to disruption of cell cycle arrest PUBMED:11752153.\ \ ' '2377' 'IPR001633' '\

    This domain is found in diverse bacterial signalling proteins. It is called EAL after its conserved residues. The EAL domain is a good candidate for a diguanylate phosphodiesterase function PUBMED:11557134. The domain contains many conserved acidic residues that could participate in metal binding and might form the phosphodiesterase active site. It often but not always occurs along with and domains that are also found in many signalling proteins.

    \ ' '2378' 'IPR007286' '\

    EAP30 is a subunit of the ELL complex. ELL is an 80-kDa RNA polymerase II transcription factor. ELL interacts with three other proteins to form the complex known as the ELL complex. The ELL complex is capable of increasing the catalytic rate of transcription elongation but, unlike ELL alone, is unable to repress initiation of transcription by RNA polymerase II. EAP30 is thought to lead to the derepression of ELL\'s transcriptional inhibitory activity.

    \ ' '2379' 'IPR001913' '\

    Equine arteritis virus small envelope glycoprotein (GS) is a class I transmembrane protein which adopts a number of different conformations PUBMED:8938984, PUBMED:7745690.

    \ ' '2380' 'IPR004186' '\

    The Epstein-Barr virus (strain GD1) nuclear antigen 1 (EBNA1) binds to and activates DNA replication from the latent origin of replication. The crystal structure of the DNA-binding and dimerization domains was solved PUBMED:7553871, and it was found that EBNA1 appears to bind DNA via two independent regions, the core and the flanking DNA-binding domains. This DNA-binding domain has a ferredoxin-like fold.

    \ ' '2381' 'IPR001753' '\

    The crotonase superfamily is comprised of mechanistically diverse proteins that share a conserved trimeric quaternary structure (sometimes a hexamer consisting of a dimer of trimers), the core of which consists of 4 turns of a (beta/beta/alpha)n superhelix. Some enzymes in the superfamily have been shown to display dehalogenase, hydratase, and isomerase activities, while others have been implicated in carbon-carbon bond formation and cleavage as well as the hydrolysis of thioesters PUBMED:11263873. However, these different enzymes share the need to stabilise an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two structurally conserved peptidic NH groups that provide hydrogen bonds to the carbonyl moieties of the acyl-CoA substrates and form an "oxyanion hole". The CoA thioester derivatives bind in a characteristic hooked shape and a conserved tunnel binds the pantetheine group of CoA, which links the 3\'-phosphate ADP binding site to the site of reaction PUBMED:17198383. Enzymes in the crotonase superfamily include:

    \

    \

    This entry represents the core domain found in crotonase superfamily members.

    \ ' '2382' 'IPR006825' '\ Eclosion hormone is an insect neuropeptide that triggers the performance of ecdysis behaviour, which causes shedding of the old cuticle at the end of a molt PUBMED:11950244, PUBMED:1634328.\ ' '2383' 'IPR004221' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the restriction endonuclease EcoRI, which requires magnesium as a cofactor. EcoRI recognises the DNA sequence GAATTC and cleaves after G-1 PUBMED:11170385.

    \ ' '2384' 'IPR005658' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    This entry represents proteinase inhibitors that belong to MEROPS inhibitor family I11, clan IN. Ecotins are dimeric periplasmic proteins from Escherichia coli and related Gram-negative bacteria that have been shown to be potent inhibitors of many trypsin-fold serine proteases of widely varying substrate specificity, which belong to MEROPS peptidase family S1 () PUBMED:14705960. Phylogenetic analysis suggested that ecotin has an exogenous target, possibly neutrophil elastase. Ecotin from E. coli, Yersinia pestis, and Pseudomonas aeruginosa, all species that encounter the mammalian immune system, inhibits neutrophil elastase strongly, while ecotin from the plant pathogen Pantoea citrea inhibits neutrophil elastase 1000-fold less than the others PUBMED:14705961.

    \ \

    They all potently inhibit pancreatic digestive peptidases trypsin and chymotrypsin, while showing more variable inhibition of the blood peptidases Factor Xa, thrombin, and urokinase-type plasminogen activator.

    \ ' '2385' 'IPR014038' '\

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome PUBMED:12762045, PUBMED:15922593, PUBMED:12932732. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    \

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta) PUBMED:12762045.

    \

    This entry represents the guanine nucleotide exchange domain of the beta (EF-1beta, also known as EF1B-alpha) and delta (EF-1delta, also known as EF1B-beta) chains of EF1B proteins from eukaryotes and archaea. The beta and delta chains have exchange activity, which mainly resides in their homologous guanine nucleotide exchange domains, found in the C-terminal region of the peptides. Their N-terminal regions may be involved in interactions with the gamma chain (EF-1gamma).

    \

    More information about these proteins can be found at Protein of the Month: Elongation Factors PUBMED:.

    \ ' '2386' 'IPR001662' '\

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome PUBMED:12762045, PUBMED:15922593, PUBMED:12932732. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    \

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta) PUBMED:12762045.

    \

    This entry represents a conserved domain usually found near the C-terminus of EF1B-gamma chains, a peptide of 410-440 residues. The gamma chain appears to play a role in anchoring the EF1B complex to the beta and delta chains and to other cellular components.

    \

    More information about these proteins can be found at Protein of the Month: Elongation Factors PUBMED:.

    \ \ \ ' '2387' 'IPR014039' '\

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome PUBMED:12762045, PUBMED:15922593, PUBMED:12932732. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    \

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta) PUBMED:12762045.

    \

    This entry represents the C-terminal dimerisation domain found primarily in EF-Tu (EF1A) proteins from bacteria, mitochondria and chloroplasts.

    More information about these proteins can be found at Protein of the Month: Elongation Factors PUBMED:.

    \ ' '2388' 'IPR001059' '\

    Elongation factor P (EF-P) is a prokaryotic protein translation factor required\ for efficient peptide bond synthesis on 70S ribosomes from fMet-tRNAfMet PUBMED:9195040.\ It probably functions indirectly by altering the affinity of the ribosome for aminoacyl-tRNA,\ thus increasing their reactivity as acceptors for peptidyl transferase.

    \ \

    This entry represents the central domain of elongation factor P and its homologues. It forms an oligonucleotide-binding (OB) fold, though it is not clear if this region is involved in binding nucleic acids PUBMED:15210970.

    \ \ ' '2390' 'IPR006947' '\ Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system PUBMED:.\ ' '2391' 'IPR001379' '\

    Fertilization proteins are acrosomal proteins involved in various roles during the fertilization process. Structurally these proteins consist of a closed bundle of helices with a right-hand twist. Lysin and SP18, both characterised in abalone, are two evolutionarily related fertilization proteins that have distinctive roles. Following its release from sperm, lysin binds to the egg vitelline envelope (VE) via the VE receptor for lysin (VERL), then non-enzymatically dissolves the VE to create a hole, thereby allowing the sperm to pass through the envelope and fuse with the egg PUBMED:10666624. Lysins exhibit species-specific binding to their egg receptor, possibly through differences in charged surface residues PUBMED:10698629. SP18 is also released from sperm, acting as a potent fusogen of liposomes to mediate the fusion between the sperm and egg cell membranes. Despite a similarity in the overall fold, the variation in the surface features of SP18 and lysin accounts for their different roles in fertilization PUBMED:11331004.

    \ \ ' '2393' 'IPR001361' '\ Equine infectious anemia virus (EIAV) belongs to the family Retroviridae. EIAV gp90 is \ hypervariable in the carboxyl-end region and more stable in the amino-end region. This \ variability is a pathogenicity factor that allows the evasion of the host\'s immune \ response PUBMED:1649329.\ ' '2394' 'IPR006196' '\

    The S1 domain of around 70 amino acids, originally identified in ribosomal protein S1, is found in a large number of RNA-associated proteins. It has been shown that S1 proteins bind RNA through their S1 domains with some degree of sequence specificity. This type of S1 domain is found in translation initiation factor 1.

    \

    The solution structure of one S1 RNA-binding domain from Escherichia coli polynucleotide phosphorylase has been determined PUBMED:9008164. It displays some similarity with the cold shock domain (CSD) (). Both the S1 and the CSD domain consist of an antiparallel beta barrel of the same topology with 5 beta strands. This fold is also shared by many other proteins of unrelated function and is known as the OB fold. However, the S1 and CSD fold can be distinguished from the other OB folds by the presence of a short 3(10) helix at the end of strand 3. This unique feature is likely to form a part of the DNA/RNA-binding site.

    \

    More information about these proteins can be found at Protein of the Month: RNA Exosomes PUBMED:.

    \ ' '2395' 'IPR007783' '\ This family is made up of eukaryotic translation initiation factor 3 subunit 7 (eIF-3 zeta/eIF3 p66/eIF3d). Eukaryotic initiation factor 3 is a multi-subunit complex that is required for binding of mRNA to 40S ribosomal subunits, stabilisation of ternary complex binding to 40S subunits, and dissociation of 40S and 60S subunits. These functions and the complex nature of eIF3 suggest multiple interactions with many components of the translational machinery PUBMED:11042177. The gene coding for the protein has been implicated in cancer in mammals PUBMED:11733359.\ ' '2396' 'IPR020189' '\

    A five-stranded beta-barrel was first noted as a common structure among four proteins binding single-stranded nucleic acids (staphylococcal nuclease and\ aspartyl-tRNA synthetase) or oligosaccharides (B subunits of enterotoxin and verotoxin-1), and has been termed the oligonucleotide/oligosaccharide binding motif, or OB fold, a five-stranded beta-sheet coiled to form a closed beta-barrel capped by an alpha helix located between the third and fourth strands PUBMED:12769718. Two ribosomal proteins, S17 and S1, are members of this class, and have different variations of the OB fold theme. Comparisons with other OB fold nucleic acid binding proteins suggest somewhat different mechanisms of nucleic acid recognition in each case PUBMED:9862955.

    \

    There are many nucleic acid-binding proteins that contain domains with this OB-fold structure, including anticodon-binding tRNA synthetases, ssDNA-binding proteins (CDC13, telomere-end binding proteins), phage ssDNA-binding proteins (gp32, gp2.5, gpV), cold shock proteins, DNA ligases, RNA-capping enzymes, DNA replication initiators and RNA polymerase subunit RPB8 PUBMED:15178340.

    \

    This entry represents the RNA-binding domain of translation elongation factor IF5A PUBMED:19424157.

    \ ' '2397' 'IPR002735' '\

    The beta subunit of archaeal and eukaryotic translation initiation factor 2 (IF2beta) and the N-terminal domain of translation initiation factor 5 (IF5) show significant sequence homology PUBMED:11980477. Archaeal IF2beta contains two independent structural domains: an N-terminal mixed alpha/beta core domain (topological similarity to the common core of ribosomal proteins L23 and L15e), and a C-terminal domain consisting of a zinc-binding C4 finger PUBMED:14978306. Archaeal IF2beta is a ribosome-dependent GTPase that stimulates the binding of initiator Met-tRNA(i)(Met) to the ribosomes, even in the absence of other factors PUBMED:17608795. The C-terminal domain of eukaryotic IF5 is involved in the formation of the multi-factor complex (MFC), an important intermediate for the 43S pre-initiation complex assembly PUBMED:16781736. IF5 interacts directly with IF1, IF2beta and IF3c, which together with IF2-bound Met-tRNA(i)(Met) form the MFC.

    \

    This entry represents both the N-terminal and zinc-binding domains of IF2, as well as a domain in IF5.

    \ ' '2398' 'IPR002769' '\

    This family includes eukaryotic translation initiation factor\ 6 (eIF6) as well as presumed archaeal homologues.

    \ \

    The assembly of 80S ribosomes requires joining of the 40S and 60S subunits, which is triggered by the formation of an initiation complex on the 40S subunit. This\ event is rate-limiting for translation, and depends on external stimuli and the status of the cell. \ \ \ \ Eukaryotic translation initiation factor 6 (eIF6) binds specifically to the free 60S ribosomal subunit and \ prevents its association with the 40S ribosomal subunit PUBMED:9891075. Furthermore, eIF6 interacts in the cytoplasm with RACK1, a receptor for activated protein kinase C (PKC). RACK1 is a major component of translating ribosomes, which harbour significant amounts of PKC. Loading 60S subunits with eIF6 caused a dose-dependent translational block and impairment of 80S formation, which are reversed by expression of RACK1 and stimulation of PKC in vivo and in vitro. PKC stimulation leads to eIF6 phosphorylation and its release, promoting 80S ribosome formation. RACK1 provides a physical and functional link between PKC signalling and ribosome activation.

    \ \ \ ' '2399' 'IPR004699' '\ Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families.\
  • It is the only PTS family in which members possess a IID protein.
  • It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue.
  • Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars.

    The Gut family consists only of glucitol-specific transporters, but these occur both in Gram-negative and Gram-positive bacteria. The Escherichia coli system consists of a IIA protein, a IIC protein and a IIBC protein.

    This family is specific for the IIC component.

    \ ' '2400' 'IPR004700' '\ Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families.\
  • It is the only PTS family in which members possess a IID protein.
  • It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue.
  • Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars.

    The mannose permease of Escherichia coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine.

    \

    This family is specific for the sorbose-specific IIC subunits of this family of PTS transporters.

    \ ' '2401' 'IPR011618' '\

    Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families.\

  • It is the only PTS family in which members possess a IID protein.
  • It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue.
  • Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars.

    The Gut family consists only of glucitol-specific permeases, but these occur both in Gram-negative and Gram-positive bacteria. The Escherichia coli system consists of a IIA protein, a IIC protein and a IIBC protein.

    This entry represents the N-terminal conserved region of the IIBC component.

    \ \ ' '2402' 'IPR004703' '\ Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families.\
  • It is the only PTS family in which members possess a IID protein.
  • It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue.
  • Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars.

    The only characterised member of this family of PTS transporters is the Escherichia coli galactitol transporter. Gat family PTS systems typically have 3 components: IIA, IIB and IIC.

    This family is specific for the IIC component of the PTS Gat family.

    \ ' '2403' 'IPR004704' '\

    The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) PUBMED:8246840, PUBMED:2197982 is a major carbohydrate transport system in bacteria. The PTS catalyses the phosphorylation of incoming sugar substrates coupled with their translocation across the cell membrane, which makes the PTS a link between the uptake and metabolism of sugars.

    \ \

    The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred, via a signal transduction pathway, to enzyme I (EI), which in turn transfers it to a phosphoryl carrier, the histidine protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease, a membrane-bound complex known as enzyme II (EII), which transports the sugar into the cell. EII consists of at least three structurally distinct domains, IIA, IIB and IIC PUBMED:1537788. These can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII).

    \ \

    The first domain (IIA or EIIA) carries the first permease-specific phosphorylation site, a histidine which is phosphorylated by phospho-HPr. The second domain (IIB or EIIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the sugar transported. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate concomitantly with the sugar uptake processed by the IIC domain. This third domain (IIC or EIIC) forms the translocation channel and the specific substrate-binding site.

    \ \

    An additional transmembrane domain IID, homologous to IIC, can be found in some PTSs, e.g. for mannose PUBMED:8246840, PUBMED:1537788, PUBMED:7815935, PUBMED:11361063.

    \ \ Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families.\
  • It is the only PTS family in which members possess a IID protein.
  • It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue.
  • Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars.

    The mannose permease of Escherichia coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine.

    This family is specific for the IID subunits of this family of PTS transporters.

    \ ' '2404' 'IPR006957' '\ Ethylene insensitive 3 (EIN3) proteins are a family of plant DNA-binding proteins that regulate transcription in response to the gaseous plant hormone ethylene, and are essential for ethylene-mediated responses. \ \ In the presence of ethylene, dark-grown dicotyledonous\ seedlings undergo dramatic morphological changes collectively known as the \'triple response\'. In Arabidopsis, these changes consist of a radial swelling of the hypocotyl, an exaggeration in the\ curvature of the apical hook, and the inhibition of cell elongation in the hypocotyl and root.\ ' '2405' 'IPR004990' '\ This is a family of hypothetical proteins from cereal crops.\ ' '2406' 'IPR003424' '\ This family consists of egg-laying hormone (ELH) precursor and atrial gland peptides from the little (Aplysia parvula) and California (Aplysia californica) sea hares. The family also includes ovulation prohormone precursor from the great pond snail (Lymnaea stagnalis). This family thus represents a conserved gastropod ovulation and egg production prohormone. Note that many of the proteins present are further cleaved to give individual peptides PUBMED:9520477. Neuropeptidergic bag cells of the marine mollusc A. californica synthesize an egg-laying hormone (ELH) precursor protein which is cleaved to generate several bioactive peptides including ELH, bag cell peptides (BCP) and acidic peptide (AP) PUBMED:10518477.\ ' '2407' 'IPR002200' '\

    Elicitins are a family of small, highly-conserved proteins secreted by phytopathogenic fungi belonging to the Phytophthora species PUBMED:7753775, PUBMED:. They are toxic proteins responsible for inducing a necrotic and systemic hypersensitive response in plants from the Solanaceae and Cruciferae families. Leaf necrosis provides immediate control of fungal invasion and induces systemic acquired resistance; both responses mediate basic protection against subsequent pathogen inoculation.

    \ \

    Members of this family share a high level of sequence similarity, but they differ in net charge, dividing them into two classes: alpha and beta PUBMED:7753775, PUBMED:. Alpha-elicitins are highly acidic, with a valine residue at position 13, whereas beta-elicitins are basic, with a lysine at the same position. Residue 13 is known to be involved in the control of necrosis and, being exposed, is thought to be involved in ligand/receptor binding PUBMED:, PUBMED:9385630. Phenotypically, the two classes can be distinguished by their necrotic properties: beta-elicitins are 100-fold more toxic and provide better subsequent protection PUBMED:7753775, PUBMED:.

    \ ' '2408' 'IPR005539' '\

    This domain is required for the nuclear localisation of these proteins PUBMED:11352458. All of these proteins are members of the Tale/Knox homeodomain family, a subfamily, containing homeobox .

    \ ' '2409' 'IPR002076' '\

    The members of this group of eukaryotic integral membrane proteins are evolutionarily related, but their exact function has not yet been clearly established. The proteins have from 290 to 435 amino acid residues. Structurally, they seem to be formed of three sections: an N-terminal region with two transmembrane domains, a central hydrophilic loop and a C-terminal region that contains from one to three transmembrane domains. Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis PUBMED:8027068. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 () affects plasma membrane H+-ATPase activity, and may act on a glucose-signalling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1 PUBMED:7768822.

    \ ' '2410' 'IPR003407' '\ This family represents the immunodominant surface antigen of Theileria parasites including equi merozoite antigen-1 (EMA-1) and equi merozoite antigen-2 (EMA-2) PUBMED:9497033. The protein shows variation at a putative glycosylation site, a potential mechanism for host immune response evasion PUBMED:8538686.\ ' '2411' 'IPR000348' '\

    p24 proteins are major membrane components of COPI- and COPII-coated vesicles and are implicated in cargo selectivity of ER to Golgi transport PUBMED:9472029, PUBMED:8947548.\ \ Multiple members of the p24 family are found in all eukaryotes, from yeast to mammals. \ Members of the p24 family are type I membrane proteins with a signal peptide at the amino terminus, a lumenal coiled-coil (extracytosolic) domain, a single transmembrane domain with conserved amino acids, and a short cytoplasmic tail. They may be grouped into at least three subfamilies based on primary sequence PUBMED:8663407. One subfamily comprises yeast Emp24p and mammalian p24A. Another subfamily comprises yeast Erv25p and mammalian Tmp21, and the third subfamily comprises mammalian gp25L proteins.

    \ ' '2412' 'IPR007581' '\

    Endonuclease V is specific for single-stranded DNA, for duplex DNA that contains uracil, or that is damaged PUBMED:8990280. Matrix metalloproteinase-1 (MMP-1) is the major enzyme responsible for collagen 1 digestion. It is induced by exposure to sunlight, but is reduced with treatment of DNA repair enzyme endonuclease V PUBMED:18459971. This family consequently has potential medical importance PUBMED:18328204.

    \ ' '2413' 'IPR004211' '\

    This family of proteins includes bacteriophage T4 endonuclease VII, Mycobacteriophage D29 gene 59, and other as yet uncharacterised proteins. The T4 endonuclease VII (Endo VII) recognises a broad spectrum of DNA substrates ranging from branched DNAs to single base mismatches. The structure of this enzyme has been resolved and it was found that the monomers form an elongated, intertwined molecular dimer that exhibits extreme domain swapping. Two pairs of antiparallel helices which form a novel \'four-helix cross\' motif are the major dimerization elements PUBMED:10075917.

    \ ' '2414' 'IPR007346' '\ Bacterial periplasmic or secreted () Escherichia coli endonuclease I (EndoI) is a sequence independent endonuclease located in the periplasm. It is inhibited by different RNA species. It is thought to normally generate double strand breaks in DNA, except in the presence of high salt concentrations and RNA, when it generates single strand breaks in DNA. Its biological role is unknown PUBMED:7867949. Other family members are known to be extracellular PUBMED:3036665. This family also includes a non-specific, Mg2+-activated ribonuclease precursor () PUBMED:1396690.\ ' '2415' 'IPR006760' '\

    This is a conserved region found in both cAMP-regulated phosphoprotein 19 (ARPP-19) and alpha/beta endosulphine. No function has yet been assigned to ARPP-19. Endosulphine is the endogenous ligand for the ATP-dependent potassium channels which occupy a key position in the control of insulin release from the pancreatic beta cell by coupling cell polarity to metabolism. In both cases the region occupies the majority of the protein PUBMED:11279279, PUBMED:11213264.

    \ ' '2416' 'IPR001928' '\

    Endothelins (ET\'s) are the most potent vasoconstrictors known PUBMED:2690429, PUBMED:2168326, PUBMED:1916094. They stimulate cardiac contraction, regulate release of vasoactive substances, and stimulate mitogenesis in blood vessels in primary culture. They also stimulate contraction in almost all other smooth muscles (e.g., uterus, bronchus, vas deferens and stomach) and stimulate secretion in several tissues (e.g., kidney, liver and adrenals). Endothelin receptors have also been found in the brain, e.g. cerebral cortex, cerebellum and glial cells. Endothelins have been implicated in a variety of pathophysiological conditions associated with stress, including hypertension, myocardial infarction, subarachnoid haemorrhage and renal failure.

    \

    Endothelins are synthesised by proteolysis of large preproendothelins, which are cleaved to \'big endothelins\' before being processed to the mature peptide.

    \

    Sarafotoxins (SRTX) and bibrotoxin (BTX) are cardiotoxins from the venom of snakes of the Atractaspis family, structurally and functionally PUBMED:2549664, PUBMED:1656557 similar to endothelin.

    \

    As shown in the following schematic representation, these peptides, which are 21 residues long, contain two intramolecular disulphide bonds.\

    \
                            +-------------+\
                            |             |\
                            CxCxxxxxxxCxxxCxxxxxx\
                              |       |\
                              +-------+\
    \'C\': conserved cysteine involved in a disulphide bond.\
    

    \ ' '2417' 'IPR015790' '\

    This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins, they are activated by proteolytic cleavage. The N-terminus is cleaved in all of the proteins and a C-terminal extension is cleaved in some members. Once activated, the endotoxin binds to the gut epithelium and causes cell lysis by the formation of cation-selective channels, which leads to death. The activated region of the delta toxin is composed of three distinct structural domains: an N-terminal helical bundle domain () involved in membrane insertion and pore formation; a beta-sheet central domain involved in receptor binding; and a C-terminal beta-sandwich domain () that interacts with the N-terminal domain to form a channel PUBMED:7490762, PUBMED:11468393. This entry represents the central beta-sheet domain.

    \ ' '2418' 'IPR005638' '\

    This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins, they are activated by proteolytic cleavage. The N-terminus is cleaved in all of the proteins and a C-terminal extension is cleaved in some members. Once activated, the endotoxin binds to the gut epithelium and causes cell lysis by the formation of cation-selective channels, which leads to death. The activated region of the delta toxin is composed of three distinct structural domains: an N-terminal helical bundle domain () involved in membrane insertion and pore formation; a beta-sheet central domain () involved in receptor binding; and a C-terminal beta-sandwich domain that interacts with the N-terminal domain to form a channel PUBMED:7490762, PUBMED:11468393. This entry represents the conserved C-terminal domain.

    \ ' '2419' 'IPR005639' '\

    This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins, they are activated by proteolytic cleavage. The N-terminus is cleaved in all of the proteins and a C-terminal extension is cleaved in some members. Once activated, the endotoxin binds to the gut epithelium and causes cell lysis by the formation of cation-selective channels, which leads to death. The activated region of the delta toxin is composed of three distinct structural domains: an N-terminal helical bundle domain involved in membrane insertion and pore formation; a beta-sheet central domain () involved in receptor binding; and a C-terminal beta-sandwich domain () that interacts with the N-terminal domain to form a channel PUBMED:7490762, PUBMED:11468393. This entry represents the conserved N-terminal domain.

    \ ' '2420' 'IPR004954' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M60 (enhancin family, clan MA(E)). The active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH. \ The viral enhancin protein, or enhancing factor, is involved in disruption of the peritrophic membrane and fusion of nucleocapsids with mid-gut cells.

    \ \ ' '2421' 'IPR005050' '\

    The expression of early nodulin (ENOD) genes has been well characterised in several legume species. Based on their biochemical attributes and expression\ patterns, they are postulated to have roles in cell structure, in the control of nodule ontogeny by the degradation of Nod factor, and in carbon metabolism PUBMED:10759502.

    \ ' '2422' 'IPR000941' '\

    Enolase (2-phospho-D-glycerate hydrolase) is an essential glycolytic enzyme that catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate PUBMED:1859865, PUBMED:1840492. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional \ enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer, in adult skeletal muscle, a beta homodimer, and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an \ alpha/gamma heterodimer PUBMED:3390159. The tissue-specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown PUBMED:3589669 to be evolutionarily related to enolase.

    \

    Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.

    \ ' '2423' 'IPR000941' '\

    Enolase (2-phospho-D-glycerate hydrolase) is an essential glycolytic enzyme that catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate PUBMED:1859865, PUBMED:1840492. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional \ enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer, in adult skeletal muscle, a beta homodimer, and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an \ alpha/gamma heterodimer PUBMED:3390159. The tissue-specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown PUBMED:3589669 to be evolutionarily related to enolase.

    \

    Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.

    \ ' '2424' 'IPR005491' '\ Emsy protein is amplified in breast cancer and interacts with BRCA2. The Emsy N-terminal (ENT) domain is found in other vertebrate and plant proteins of unknown function, and has a completely conserved histidine residue that may be functionally important.\ ' '2425' 'IPR001144' '\

    Escherichia coli heat-labile enterotoxin is a bacterial protein toxin with an AB5 multimer structure, in which the B pentamer () has a membrane-binding function and the A chain is needed for enzymatic activity PUBMED:8478941. The B subunits are arranged as a donut-shaped pentamer, each subunit participating in ~30 hydrogen bonds and 6 salt bridges with its two neighbours PUBMED:8478941.

    \

    The A subunit has a less well-defined secondary structure. It predominantly interacts with the pentamer via the C-terminal A2 fragment, which runs through the charged central pore of the B subunits. A putative catalytic residue in the A1 fragment (Glu112) lies close to a hydrophobic region, which packs two loops together. It is thought that this region might be important for catalysis and membrane translocation PUBMED:8478941.

    \ ' '2426' 'IPR001835' '\

    Escherichia coli heat-labile enterotoxin is a bacterial protein toxin with an AB5 multimer structure, in which the B pentamer has a membrane-binding function and the A chain () is needed for enzymatic activity PUBMED:8478941. The B subunits are arranged as a donut-shaped pentamer, each subunit participating in ~30 hydrogen bonds and 6 salt bridges with its two neighbours PUBMED:8478941.

    \

    The A subunit has a less well-defined secondary structure. It predominantly interacts with the pentamer via the C-terminal A2 fragment, which runs through the charged central pore of the B subunits. A putative catalytic residue in the A1 fragment (Glu112) lies close to a hydrophobic region, which packs two loops together. It is thought that this region might be important for catalysis and membrane translocation PUBMED:8478941.

    \ ' '2427' 'IPR001489' '\

    This entry represents a group of heat-stable enterotoxins, such as STa from Escherichia coli, which is the cause of acute diarrhoea in infants and travellers in developing countries. The mature STa protein is a 19-residue peptide containing three disulphide bridges that are functionally important. STa contains an N-terminal signal peptide composed of two domains, Pre and Pro, involved in extracellular toxin release, and a core enterotoxigenic domain PUBMED:15049831. STa binds to and activates the guanylate cyclase C intestinal receptor, causing an increase in the intracellular levels of cyclic guanosine monophosphate (cGMP) PUBMED:10799798, PUBMED:17094787, PUBMED:12813912.

    \ ' '2428' 'IPR001026' '\

    The ENTH (Epsin N-terminal homology) domain is approximately 150 amino acids in length and is always found located at the N-termini of proteins. The domain forms a compact globular structure, composed of 9 alpha-helices connected by loops of varying length. The general topology is determined by three helical hairpins that are stacked consecutively with a right hand twist PUBMED:11911874. An N-terminal helix folds back, forming a deep basic groove that\ forms the binding pocket for the Ins(1,4,5)P3 ligand PUBMED:12353027. The ligand is coordinated by residues from surrounding alpha-helices and all three phosphates are multiply coordinated. The coordination of Ins(1,4,5)P3 suggests that ENTH is specific for particular head groups.

    \

    Proteins containing this domain have been found to bind PtdIns(4,5)P2 and PtdIns(1,4,5)P3, suggesting that the domain may be a membrane interacting module. The main function of proteins containing this domain appears to be to act as accessory clathrin adaptors in endocytosis. Epsin is able to recruit clathrin and promote its polymerisation on\ a lipid monolayer, but may have additional roles in signalling and actin regulation PUBMED:10048338. Epsin causes a strong degree of membrane curvature and\ tubulation, and even fragmentation of membranes with a high PtdIns(4,5)P2 content. Epsin binding to\ membranes facilitates their deformation by insertion of the N-terminal helix into the outer leaflet of the bilayer, pushing the head groups\ apart. This would reduce the energy needed to curve the membrane into a vesicle, making it easier for the clathrin cage to\ fix and stabilise the curved membrane. This points to a pioneering role for epsin in vesicle\ budding as it provides both a driving force and a link between membrane invagination and clathrin polymerisation.

    \ ' '2429' 'IPR018154' '\

    Enveloped viruses such as Human immunodeficiency virus 1, influenza virus, and Ebola virus sp. express a surface glycoprotein that mediates both cell attachment and fusion of viral and cellular membranes. The ENV polyprotein (coat polyprotein) usually contains two coat proteins which differ depending on the source.

    \ \

    The structures of a number of the ENV polyprotein domains have been determined:

    \ \ ' '2430' 'IPR001299' '\

    Ependymins are secretory proteins found predominantly in the cerebrospinal fluid of teleost fish PUBMED:1831964, PUBMED:8350351. A bound form of the glycoproteins is associated \ with the extracellular matrix, probably with collagen fibrils, and may be the functional \ form of ependymins PUBMED:8005346. The proteins bind calcium via N-linked sialic acid \ residues. The molecular function of ependymins appears to be related to cell contact\ phenomena involving the extracellular matrix PUBMED:8005346.

    \ ' '2431' 'IPR001090' '\

    Interactions between the Eph receptor tyrosine kinases and their membrane-bound ligands, the ephrins, are promiscuous, but largely fall into two groups: EphA receptors bind to GPI-anchored ephrin-A ligands, while EphB receptors bind to ephrin-B proteins that have a transmembrane and cytoplasmic domain PUBMED:10072375. Remarkably, ephrin-B proteins transduce signals, such that bidirectional signalling can occur upon interaction with an Eph receptor. An important role of Eph receptors and ephrins is to mediate cell-contact-dependent repulsion. Eph receptors and ephrins also act at boundaries to channel neuronal growth cones along specific pathways, restrict the migration \ of neural crest cells, and via bidirectional signalling prevent intermingling between hindbrain segments. Intriguingly, Eph receptors and ephrins can also trigger an adhesive response of endothelial cells and are required for the remodelling of blood vessels PUBMED:10730216.

    \ \

    Biochemical studies suggest that the extent of multimerisation of Eph receptors modulates the cellular response and that the actin cytoskeleton is one major target of the intracellular pathways activated by Eph receptors PUBMED:10207129. Eph receptors and ephrins have thus emerged as key regulators of the repulsion and adhesion of cells that underlie the establishment, maintenance, and remodelling of patterns of cellular \ organization PUBMED:10730216.

    \ ' '2432' 'IPR001509' '\

    This family of proteins utilise NAD as a cofactor. The proteins in this family use nucleotide-sugar substrates for a variety of chemical reactions PUBMED:9174344. It contains the NAD(P)-binding domain () which is a commonly found domain with a core Rossmann-type fold. One of the best studied of these proteins is UDP-galactose 4-epimerase, which catalyses the conversion of UDP-galactose to UDP-glucose during galactose metabolism PUBMED:11279032, PUBMED:10801319.

    \ ' '2433' 'IPR003331' '\ UDP-N-acetylglucosamine 2-epimerase catalyses the production of UDP-ManNAc from UDP-GlcNAc. Some of the enzymes in this family are bifunctional. In microorganisms the epimerase is involved in the synthesis of the capsule precursor UDP-ManNAcA PUBMED:9515923, PUBMED:9440531. The protein from rat liver displays both epimerase and kinase activity PUBMED:9305888.\ ' '2434' 'IPR001323' '\ Erythropoietin, a plasma glycoprotein, is the primary physiological mediator of \ erythropoiesis PUBMED:3773894. It is involved in the regulation of the level of peripheral \ erythrocytes by stimulating the differentiation of erythroid progenitor cells, found in \ the spleen and bone marrow, into mature erythrocytes PUBMED:3346214. It is primarily \ produced in adult kidneys and foetal liver, acting by attachment to specific binding \ sites on erythroid progenitor cells, stimulating their differentiation PUBMED:2877922. \ Severe kidney dysfunction causes reduction in the plasma levels of erythropoietin,\ resulting in chronic anaemia; injection of purified erythropoietin into the blood stream \ can help to relieve this type of anaemia. Levels of erythropoietin in plasma fluctuate \ with varying oxygen tension of the blood, but androgens and prostaglandins also modulate \ the levels to some extent PUBMED:2877922. Erythropoietin glycoprotein sequences are well \ conserved, a consequence of which is that the hormones are cross-reactive among mammals,\ i.e. that from one species, say human, can stimulate erythropoiesis in\ other species, say mouse or rat PUBMED:1420369. \ \

    Thrombopoietin (TPO), a glycoprotein, is the mammalian hormone which functions as a \ megakaryocytic lineage-specific growth and differentiation factor, affecting the \ proliferation and maturation of megakaryocytes from their committed progenitor cells and acting at a late \ stage of megakaryocyte development. It acts as a circulating regulator of platelet \ numbers.

    \ ' '2435' 'IPR001986' '\

    5-enolpyruvylshikimate-3-phosphate (EPSP) synthase (also known as 3-phosphoshikimate 1-carboxyvinyltransferase), catalyses the sixth step in the biosynthesis from chorismate of the aromatic amino acids (the shikimate pathway) in bacteria (gene aroA), plants and fungi (where it is part of a multifunctional enzyme which catalyses five consecutive steps in this pathway) PUBMED:11607190. The sixth step is the formation of EPSP and inorganic phosphate from shikimate-3-phosphate (S3P) and phosphoenolpyruvate (PEP).

    \ \

    EPSP synthase can use shikimate or shikimate-3-phosphate as a substrate. By binding shikimate, the backbone of the active site is changed, which affects the binding of glyphosate and renders the reaction insensitive to inhibition by glyphosate PUBMED:16225867. On isolation of the discontinuous C-terminal domain, it was found that it binds neither its substrate nor its inhibitor but maintains structural integrity PUBMED:18051609.

    \ \

    Earlier studies suggested that the active site of the enzyme is in the cleft between its two globular domains. When the enzyme binds S3P, there is a conformational change in the isolated N-terminal domain PUBMED:11300775. The sequence of EPSP synthase from various biological sources shows that the structure of the enzyme has been well conserved throughout evolution. Two strongly conserved regions are well defined. The first one corresponds to a region that is part of the active site and which is also important for the resistance to glyphosate PUBMED:1939260. The second one is located in the C-terminal part of the protein and contains a conserved lysine which seems to be important for the activity of the enzyme.

    \ \

    Since the shikimate pathway is not present in vertebrates but is essential for the life of plants, fungi and bacteria, it is commonly viewed as a target for antimicrobial drug development.

    \

    This entry represents the core domain of 3-phosphoshikimate 1-carboxyvinyltransferase.

    \ ' '2436' 'IPR005492' '\

    Mutations in the LGI/Epitempin gene can result in a special form of epilepsy, autosomal dominant lateral temporal epilepsy. The Epitempin protein was seen to contain a 130 amino acid repeat in its C-terminal section, although a sub-domain of 50 amino acids has now been further defined within this. The domain is often repeated and each repeat forms a beta-sheet, suggesting the formation of a beta-sheet structure. This presumed domain has no known function, but might form an Ig like fold such as a beta propeller.

    \

    This domain has now been found in a number of proteins associated with neurological disorders suggesting that it may play a role in the development of epilepsy and other related conditions PUBMED:12217514.

    \ ' '2437' 'IPR000781' '\ The Drosophila protein \'enhancer of rudimentary\' (gene e(r)) is a small protein of 104\ residues whose function is not yet clear. From an evolutionary point of view, it is highly\ conserved PUBMED:9074495 and has been found to exist in probably all multicellular\ eukaryotic organisms. It has been proposed that this protein plays a role in the cell cycle.\ ' '2438' 'IPR000133' '\

    Proteins resident in the lumen of the endoplasmic reticulum (ER) contain a C-terminal\ tetrapeptide, commonly known as Lys-Asp-Glu-Leu (KDEL) in mammals and His-Asp-Glu-Leu\ (HDEL) in yeast (Saccharomyces cerevisiae) that acts as a signal for their retrieval from subsequent\ compartments of the secretory pathway. The receptor for this signal is a ~26 kDa Golgi\ membrane protein, initially identified as the ERD2 gene product in S. cerevisiae. The\ receptor molecule, known variously as the ER lumen protein retaining receptor or the\ \'KDEL receptor\', is believed to cycle between the cis side of the Golgi apparatus and\ the ER. It has also been characterised in a number of other species, including plants,\ Plasmodium, Drosophila and mammals. In mammals, 2 highly related forms of the\ receptor are known.

    \ \

    The KDEL receptor is a highly hydrophobic protein of 220 residues; its sequence\ exhibits 7 hydrophobic regions, all of which have been suggested to traverse the\ membrane PUBMED:8392934. More recently, however, it has been suggested that only 6 of these\ regions are transmembrane (TM), resulting in both N- and C-termini on the cytoplasmic\ side of the membrane.

    \ ' '2439' 'IPR006166' '\

    This entry represents a structural motif found in several DNA repair nucleases, such as Rad1/Mus81/XPF endonucleases () PUBMED:12679022, and in ATP-dependent helicases. The XPF/Rad1/Mus81-dependent nuclease family specifically cleaves branched structures generated during DNA repair, replication, and recombination, and is essential for maintaining genome stability. The nuclease domain architecture exhibits remarkable similarity to those of restriction endonucleases.

    \ ' '2440' 'IPR007499' '\ The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to ERF PUBMED:11914131.\ ' '2441' 'IPR005140' '\

    This domain is found in the release factor eRF1 which terminates protein biosynthesis by recognizing stop codons at the A site of the ribosome and stimulating\ peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known PUBMED:10676813. The overall\ shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop,\ aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip\ of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl\ transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site PUBMED:10676813.

    \ \

    This domain is also found in other proteins for which the precise molecular function is unknown. Many of them are from\ Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification.

    \ ' '2442' 'IPR005141' '\

    This domain is found in the release factor eRF1 which terminates protein biosynthesis by recognizing stop codons at the A site of the ribosome and stimulating\ peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known PUBMED:10676813. The overall\ shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop,\ aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip\ of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl\ transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site PUBMED:10676813.

    \ \

    This domain is also found in other proteins which may also be involved in translation termination.

    \ ' '2443' 'IPR005142' '\

    This domain is found in the release factor eRF1 which terminates protein biosynthesis by recognizing stop codons at the A site of the ribosome and stimulating\ peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known PUBMED:10676813. The overall\ shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop,\ aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip\ of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl\ transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site PUBMED:10676813.

    \ \

    This domain is also found in other proteins which may also be involved in translation termination but this awaits experimental verification.

    \ ' '2444' 'IPR005490' '\

    This family of proteins is found in a range of bacteria. The conserved region contains a conserved histidine and cysteine, suggesting that these proteins have an enzymatic activity. Several members of this family contain peptidoglycan binding domains, so these proteins may use peptidoglycan or a precursor as a substrate.

    \ \

    The molecular structure of YkuD protein shows this domain has a novel tertiary fold consisting of a beta-sandwich with two mixed sheets, one containing five strands and the other, six strands. The two beta-sheets form a cradle capped by an alpha-helix. This domain contains a putative catalytic site with a tetrad of invariant His123, Gly124, Cys139, and Arg141. The stereochemistry of this active site shows similarities to peptidotransferases and sortases, and suggests that the enzymes of this family may play an important role in cell wall biology. This family was formerly called the ErfK/YbiS/YcfS/YnhG family, but is now named after the first protein of known structure.

    \ ' '2445' 'IPR005352' '\

    This is a family of integral membrane proteins, which may contain four transmembrane helices. Members of this family are thought to be involved in sterol C-4 demethylation. In Saccharomyces cerevisiae (Baker\'s yeast) they may tether Erg26p (sterol dehydrogenase/decarboxylase) and Erg27p (3-ketoreductase) to the endoplasmic reticulum or may facilitate interaction between these proteins PUBMED:11160377. The family contains a conserved arginine and histidine that may be functionally important.

    \ ' '2446' 'IPR006716' '\ This family consists of the fungal C-8 sterol isomerase and mammalian sigma1 receptor. C-8 sterol isomerase (delta-8--delta-7 sterol isomerase), catalyses a reaction in ergosterol biosynthesis, which results in unsaturation at C-7 in the B ring of sterols PUBMED:8082205. Sigma 1 receptor is a low molecular mass mammalian protein located in the endoplasmic reticulum PUBMED:8755605, which interacts with endogenous steroid hormones, such as progesterone and testosterone PUBMED:9425306. It also binds the sigma ligands, which are a set of chemically unrelated drugs including haloperidol, pentazocine, and ditolylguanidine PUBMED:8755605. Sigma1 effectors are not well understood, but sigma1 agonists have been observed to affect NMDA receptor function, the alpha-adrenergic system and opioid analgesia.\ ' '2447' 'IPR001171' '\

    The two fungal enzymes, C-14 sterol reductase (gene ERG24 in budding yeast and erg3 in Neurospora crassa) and C-24(28) sterol reductase (gene ERG4 in budding yeast and sts1 in fission yeast), are involved in ergosterol biosynthesis. They act by reducing double bonds in precursors of ergosterol PUBMED:8125337. These proteins are highly hydrophobic and seem to contain seven or eight transmembrane regions. Chicken lamin B receptor that is thought to anchor the lamina to the inner nuclear membrane belongs to this family.

    \ ' '2448' 'IPR011259' '\ The ERM family consists of three closely-related proteins, ezrin, radixin and moesin PUBMED:9048483.\ Ezrin was first identified as a constituent of microvilli PUBMED:6885906, radixin as a barbed, \ end-capping actin-modulating protein from isolated junctional fractions PUBMED:2500445, and moesin as a heparin\ binding protein PUBMED:3046603. A tumour suppressor molecule responsible for neurofibromatosis type 2 (NF2)\ is highly similar to ERM proteins and has been designated merlin (moesin-ezrin-radixin-like protein).\ ERM molecules contain 3 domains, an N-terminal globular domain; an extended alpha-helical domain; and a\ charged C-terminal domain PUBMED:9048483. Ezrin, radixin and merlin also contain a polyproline region between\ the helical and C-terminal domains. The N-terminal domain is highly conserved, and is also found in merlin,\ band 4.1 proteins and members of the band 4.1 superfamily. ERM proteins crosslink actin filaments with\ plasma membranes. They co-localise with CD44 at actin filament-plasma membrane interaction sites,\ associating with CD44 via their N-terminal domains and with actin filaments via their C-terminal domains\ PUBMED:9048483.\ ' '2449' 'IPR007266' '\ Members of this family are required for the formation of disulphide bonds in the endoplasmic reticulum PUBMED:10754564, PUBMED:10982384.\ ' '2450' 'IPR004922' '\

    Trypanosoma brucei is the causative agent of sleeping sickness in humans and nagana in cattle. The parasite lives extracellularly in the blood and tissue fluids of the mammalian host, and is transmitted by the bite of infected tsetse flies. Each variant surface glycoprotein (Vsg) expression site (ES) in bloodstream-form T. brucei is a polycistronic transcription unit containing several distinct expression site-associated genes (esag), in addition to a single vsg gene. The esag genes are co-transcribed with the gene encoding the VSG protein, which forms the surface coat of the parasite.

    ESAG1 genes from different ESs encode a highly polymorphic family of membrane-associated glycoproteins, whose function is unknown PUBMED:8892306.

    \ ' '2451' 'IPR005095' '\

    EspA is the prototypical member of this family. EspA, together with EspB, EspD and Tir are exported by a type III secretion system. These proteins are essential for\ attaching and effacing lesion formation. EspA is a structural protein and a major component of a large, transiently expressed, filamentous surface organelle which\ forms a direct link between the bacterium and the host cell PUBMED:9545230, PUBMED:10760148.

    \ ' '2452' 'IPR006891' '\

    Enteropathogenic Escherichia coli O127:H6 attaches to the intestinal mucosa through actin pedestals that are created after it has injected the Type III secretion protein EspF (E. coli secreted protein F-like protein from prophage U) into the cells. EspF recruits the actin machinery by activating the WASP (Wiskott-Aldrich syndrome protein) family of actin nucleating factors PUBMED:18650806. Subsequent cell death (apoptosis) is caused by EspF being targeted to the mitochondria as a consequence of its mitochondrial targeting sequence. Import into mitochondria leads to a loss of membrane potential, leakage of cytochrome c and activation of the apoptotic caspase cascade. Mutation of leucine to glutamic acid at position 16 of EspF (L16E) resulted in the failure of EspF import into mitochondria; mitochondrial membrane potential was not affected and cell death was abolished. This suggests that the targeting of EspF to mitochondria is essential for bacterial pathogenesis and apoptosis PUBMED:11298644, PUBMED:15533930.

    \ ' '2453' 'IPR000801' '\ This family contains several seemingly unrelated proteins, including human esterase D; \ mycobacterial antigen 85, which is responsible for the high affinity of mycobacteria to \ fibronectin; Corynebacterium glutamicum major secreted protein PS1; and hypothetical proteins \ from Escherichia coli, yeast, mycobacteria and Haemophilus influenzae.\ ' '2454' 'IPR002603' '\ The proteins in this entry have no known function, and are found in Caenorhabditis elegans and in Caenorhabditis briggsae. Each repeat contains 8-10 conserved cysteines that probably form 4-5 disulphide bridges. By\ inspection of the conservation of cysteines it looks like\ cysteines 1, 2, 3, 4, 9 and 10 are always present and that\ sometimes the pair 5 and 8 or the pair 6 and 7 are missing.\ This suggests that cysteines 5/8 and 6/7 make disulphide\ bridges.\ ' '2455' 'IPR014731' '\

    Electron transfer flavoproteins (ETFs) serve as specific electron acceptors for primary dehydrogenases, transferring the electrons to terminal respiratory systems. They can be functionally classified into constitutive, "housekeeping" ETFs, mainly involved in the oxidation of fatty acids (Group I), and ETFs produced by some prokaryotes under specific growth conditions, receiving electrons only from the oxidation of specific substrates (Group II) PUBMED:8599534.

    \ \

    ETFs are heterodimeric proteins composed of an alpha and beta subunit, and contain an FAD cofactor and AMP PUBMED:2326318, PUBMED:8525056, PUBMED:8962055, PUBMED:10026281, PUBMED:12567183. ETF consists of three domains: domains I and II are formed by the N- and C-terminal portions of the alpha subunit, respectively, while domain III is formed by the beta subunit. Domains I and III share an almost identical alpha-beta-alpha sandwich fold, while domain II forms an alpha-beta-alpha sandwich similar to that of bacterial flavodoxins. FAD is bound in a cleft between domains II and III, while domain III binds the AMP molecule. Interactions between domains I and III stabilise the protein, forming a shallow bowl where domain II resides.

    \ \

    This entry represents the C-terminal domain of the alpha subunit of both Group I and Group II ETFs.

    \ ' '2456' 'IPR014730' '\

    Electron transfer flavoproteins (ETFs) serve as specific electron acceptors for primary dehydrogenases, transferring the electrons to terminal respiratory systems. They can be functionally classified into constitutive, "housekeeping" ETFs, mainly involved in the oxidation of fatty acids (Group I), and ETFs produced by some prokaryotes under specific growth conditions, receiving electrons only from the oxidation of specific substrates (Group II) PUBMED:8599534.

    \ \

    ETFs are heterodimeric proteins composed of an alpha and beta subunit, and contain an FAD cofactor and AMP PUBMED:2326318, PUBMED:8525056, PUBMED:8962055, PUBMED:10026281, PUBMED:12567183. ETF consists of three domains: domains I and II are formed by the N- and C-terminal portions of the alpha subunit, respectively, while domain III is formed by the beta subunit. Domains I and III share an almost identical alpha-beta-alpha sandwich fold, while domain II forms an alpha-beta-alpha sandwich similar to that of bacterial flavodoxins. FAD is bound in a cleft between domains II and III, while domain III binds the AMP molecule. Interactions between domains I and III stabilise the protein, forming a shallow bowl where domain II resides.

    \ \

    This entry represents the N-terminal domain of both the alpha and beta subunits from Group I and Group II ETFs.

    \ \ \ ' '2457' 'IPR007859' '\ Electron-transfer flavoprotein-ubiquinone oxidoreductase (ETF-QO) in the inner mitochondrial membrane accepts electrons from electron-transfer flavoprotein which is located in the mitochondrial matrix and reduces ubiquinone in the mitochondrial membrane. The two redox centres in the protein, FAD and a [4Fe4S] cluster, are present in a 64 kDa monomer PUBMED:8306995.\ ' '2458' 'IPR000418' '\

    Transcription factors are protein molecules that bind to specific DNA\ sequences in the genome, resulting in the induction or inhibition of gene\ transcription PUBMED:2163347. The ets oncogene is such a factor, possessing a region \ of 85-90 amino acids known as the ETS (erythroblast transformation specific) domain PUBMED:2163347, PUBMED:2253872, PUBMED:14693367. This domain is rich in\ positively-charged and aromatic residues, and binds to purine-rich segments\ of DNA. The ETS domain has been identified in other transcription factors\ such as PU.1, human erg, human elf-1, human elk-1, GA binding protein, and\ a number of others PUBMED:2163347, PUBMED:2253872, PUBMED:8425553.\ It is generally localized at the C-terminus of the protein,\ with the exception of ELF-1, ELK-1, ELK-3, ELK-4 and ERF where it is found at\ the N-terminus.

    \

    NMR analysis of the structure of the Ets domain revealed that it contains three alpha-helices (1-3)\ and a four-stranded beta-sheet (1-4) arranged in the order alpha1-beta1-beta2-alpha2-alpha3-beta3-beta4, forming a\ winged helix-turn-helix (wHTH) topology PUBMED:12559563. The third alpha-helix is\ responsible for contact with the major groove of the DNA. Different members of the Ets family of proteins\ display distinct DNA binding specificities. The Ets domains and the flanking amino acid sequences\ of the proteins influence the binding affinity, and the alteration of a\ single amino acid in the Ets domain can change its DNA binding specificity.

    \

    Avian leukemia virus E26 is a replication defective retrovirus that induces a\ mixed erythroid/myeloid leukemia in chickens. This virus carries two distinct\ oncogenes: v-myb and v-ets. The ets portion of this oncogene is required for\ the induction of erythroblastosis. V-ets and c-ets-1, its cellular progenitor,\ have been shown PUBMED:2165853 to be nuclear DNA-binding proteins. Ets-1 differs slightly\ from v-ets at its carboxy-terminal region. In most species where it has been\ sequenced, c-ets-1 exists in various isoforms generated by alternative\ splicing and differential phosphorylation.

    \ ' '2459' 'IPR004991' '\

    This is a family of related bacterial toxins.

    \ ' '2460' 'IPR001925' '\

    The major protein of the outer mitochondrial membrane of eukaryotes is a porin that forms a voltage-dependent anion-selective channel (VDAC) that behaves as a general diffusion pore for small hydrophilic molecules PUBMED:8031826, PUBMED:1384178, PUBMED:1689252, PUBMED:2442148. The channel adopts an open conformation at low or zero membrane potential and a closed conformation at potentials above 30-40 mV.

    \

    This protein contains about 280 amino acids and its sequence is composed of between 12 and 16 beta-strands that span the mitochondrial outer membrane. Yeast contains two members of this family (genes POR1 and POR2); vertebrates have at least three members (genes VDAC1, VDAC2 and VDAC3) PUBMED:8812436.

    \ ' '2461' 'IPR007441' '\ EutH is a bacterial membrane protein whose molecular function is unknown. It has been suggested that it may act as an ethanolamine transporter, responsible for carrying ethanolamine from the periplasm to the cytoplasm PUBMED:10464203.\ ' '2462' 'IPR004992' '\

    This is a family of related bacterial proteins with roles in ethanolamine and carbon dioxide metabolism.

    \ ' '2463' 'IPR003400' '\ This group of proteins are membrane bound transport proteins essential for ferric ion uptake in bacteria PUBMED:9371459. The family consists of ExbD, and TolR which are involved in TonB-dependent transport of various receptor bound substrates including colicins PUBMED:3294803.\ ' '2464' 'IPR000305' '\

    During the process of Escherichia coli nucleotide excision repair, DNA damage recognition and processing are achieved by the action of the uvrA, uvrB, and uvrC gene products PUBMED:12034838. The UvrC proteins contain 4 conserved regions: a central region which interacts with UvrB (Uvr domain), a Helix hairpin Helix (HhH) domain important for 5 prime incision of damaged DNA, and the homology regions 1 and 2 of unknown function. UvrC homology region 2 is specific for UvrC proteins, whereas UvrC homology region 1 is also shared by a few other nucleases.

    \

    It is found in the amino terminal region of excinuclease abc subunit c (uvrC), Bacteriophage T4 endonucleases segA, segB, segC, segD and segE; it is also found in putative endonucleases encoded by group I introns of fungi and phage.

    \ ' '2465' 'IPR004140' '\ The Exo70 protein forms one subunit of the exocyst complex. First discovered in Saccharomyces cerevisiae PUBMED:8978675, Exo70 and other exocyst proteins have been observed in several other eukaryotes, including humans. In S. cerevisiae, the exocyst complex is involved in the late stages of exocytosis, and is localized at the tip of the bud, the major site of exocytosis in yeast PUBMED:8978675. Exo70 interacts with the Rho3 GTPase PUBMED:10207081. This interaction mediates one of the three known functions of Rho3 in cell polarity: vesicle docking and fusion with the plasma membrane (the other two functions are regulation of actin polarity and transport of exocytic vesicles from the mother cell to the bud) PUBMED:10588647. In humans, the functions of Exo70 and the exocyst complex are less well characterised: Exo70 is expressed in several tissues and is thought to also be involved in exocytosis PUBMED:9405631.\ ' '2466' 'IPR006697' '\

    The exodeoxyribonuclease V enzyme is a multisubunit enzyme comprised of the proteins RecB (), RecC (this family) and RecD (). This enzyme plays an important role in homologous genetic recombination, repair of double-strand DNA breaks, and resistance to UV irradiation and chemical DNA damage. The enzyme () catalyzes hydrolysis of single-stranded (ss) DNA or double-stranded (ds) DNA and unwinding of the ends of dsDNA PUBMED:7746848. Its nuclease activity is controlled by Chi sites (5\' G-C-T-G-G-T-G-G 3\') in such a way that the enzyme produces a potent single-stranded DNA substrate for homologous pairing by RecA and single-stranded DNA binding proteins.

    \ ' '2467' 'IPR020579' '\

    Exonuclease VII is composed of two nonidentical subunits; one large subunit and 4 small ones PUBMED:6284744.\ Exonuclease VII catalyses exonucleolytic cleavage in\ either 5\'-3\' or 3\'-5\' direction to yield 5\'-phosphomononucleotides. The large subunit also contains the OB-fold domains () that bind to nucleic acids at the N-terminus.

    \

    This entry represents Exonuclease VII, large subunit, C-terminal.

    \ ' '2468' 'IPR003883' '\ Extensins are plant cell-wall proteins; they can account for up to 20% of the dry weight of the cell wall. They are highly-glycosylated, possibly reflecting their interactions with cell-wall carbohydrates. Amongst their functions is cell\ wall strengthening in response to mechanical stress (e.g., during attack by pests, plant-bending in the wind, etc.). This repeat occurs within extensin-like proteins.\ ' '2469' 'IPR018315' '\

    The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis PUBMED:2341404, as well as muscle contraction.

    \

    The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta (see ). Neither of the subunits shows sequence similarity to other filament-capping proteins PUBMED:2341404.

    \

    The alpha subunit is a protein of about 268 to 286 amino acid residues whose sequence is well conserved in eukaryotic species PUBMED:1711931.

    \ ' '2470' 'IPR001558' '\

    Human immunodeficiency virus 1\ (HIV-1) negative factor (Nef protein) accelerates virulent\ progression of acquired immunodeficiency syndrome (AIDS) by its interaction with specific\ cellular proteins involved in signal transduction and host cell activation. Nef has been shown\ to bind specifically to a subset of the Src family of kinases PUBMED:9351809.

    \ ' '2471' 'IPR004455' '\ The function of F420-dependent NADP reductase is the transfer of electrons from reduced coenzyme F420 into an electron transport chain. It catalyses the reduction of F420 with NADP(+) and the reduction of NADP(+) with F420H(2).\ ' '2472' 'IPR001698' '\

    The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis PUBMED:2341404, as well as muscle contraction.

    \

    The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends. Unlike gelsolin (see ) and severin this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins PUBMED:2341404.

    \

    The beta subunit is a protein of about 280 amino acid residues whose sequence is well conserved in eukaryotic species PUBMED:2179733.

    \ ' '2473' 'IPR000771' '\

    Fructose-bisphosphate aldolase PUBMED:2199259, PUBMED:1412694 is a glycolytic enzyme that catalyses the reversible aldol cleavage or condensation of fructose-1,6-bisphosphate into \ dihydroxyacetone-phosphate and glyceraldehyde 3-phosphate. There are two classes of fructose-bisphosphate aldolases with different catalytic mechanisms. Class-II aldolases PUBMED:1412694, mainly found in prokaryotes and fungi, are homodimeric enzymes, which require a divalent metal ion, generally zinc, for their activity. This family also includes the Escherichia coli galactitol operon protein, gatY, which catalyses the transformation of tagatose 1,6-bisphosphate into glycerone phosphate and D-glyceraldehyde 3-phosphate; and E. coli N-acetyl galactosamine operon protein, agaY, which catalyses the same reaction. There are two histidine residues in the first half of the sequence of these enzymes that have been shown to be involved in binding a zinc ion PUBMED:8436219.

    \ ' '2474' 'IPR005067' '\

    Fatty acid desaturases are enzymes that catalyze the insertion\ of a double bond at the delta position of fatty acids.

    \ \

    There seem to be two distinct families of fatty acid desaturases which do not\ seem to be evolutionarily related.

    \ \

    Family 1 is composed of:

    \ \ \

    Family 2 is composed of:

    \ \ \

    This entry contains fatty acid desaturases belonging to Family 2.

    \ ' '2475' 'IPR006694' '\

    This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways PUBMED:8718622. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis of plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins.

    \ ' '2476' 'IPR003664' '\ The plsX gene is part of the bacterial fab gene cluster which encodes several key fatty acid biosynthetic enzymes PUBMED:9642179.\ The plsX gene encodes a poorly understood enzyme of phospholipid\ metabolism PUBMED:10464226.\ ' '2477' 'IPR002529' '\

    Fumarylacetoacetase (; also known as fumarylacetoacetate hydrolase or FAH) catalyses the hydrolytic cleavage of a carbon-carbon bond in fumarylacetoacetate to yield fumarate and acetoacetate as the final step in phenylalanine and tyrosine degradation PUBMED:11154690. This is an essential metabolic function in humans, the lack of FAH causing type I tyrosinaemia, which is associated with liver and kidney abnormalities and neurological disorders PUBMED:9101289, PUBMED:16602095. The enzyme mechanism involves a catalytic metal ion, a Glu/His catalytic dyad, and a charged oxyanion hole PUBMED:10508789. FAH folds into two domains: an N-terminal SH3-like beta-barrel domain, and a C-terminal domain with an unusual fold consisting of three layers of beta-sheet structures PUBMED:10508789.

    \ \

    This entry represents the C-terminal domain of fumarylacetoacetase, as well as other domains that share a homologous sequence, including:

    \ \ ' '2478' 'IPR003097' '\

    This domain is found in sulphite reductase, NADPH cytochrome P450 reductase, nitric oxide synthase and methionine synthase reductase. Flavoprotein pyridine nucleotide cytochrome reductases PUBMED:1748631 (FPNCR) catalyse the interchange of reducing equivalents between one-electron carriers and the two-electron-carrying nicotinamide dinucleotides. The enzymes include ferredoxin:NADP+reductases (FNR) PUBMED:8027025, plant and fungal NAD(P)H:nitrate reductases PUBMED:1748631, PUBMED:12165428, NADH:cytochrome b5 reductases PUBMED:3700359, NADPH:P450 reductases PUBMED:1908607, NADPH:sulphite reductases PUBMED:2550423, nitric oxide synthases PUBMED:1712077, phthalate dioxygenase reductase PUBMED:8298460, and various other flavoproteins.

    \ ' '2479' 'IPR003953' '\

    In bacteria two distinct, membrane-bound, enzyme complexes are responsible for\ the interconversion of fumarate and succinate (): fumarate\ reductase (Frd) is used in anaerobic growth, and succinate dehydrogenase (Sdh)\ is used in aerobic growth. Both complexes consist of two main components: a\ membrane-extrinsic component composed of a FAD-binding flavoprotein and an\ iron-sulphur protein; and a hydrophobic component composed of a membrane\ anchor protein and/or a cytochrome B.

    \

    In eukaryotes mitochondrial succinate dehydrogenase (ubiquinone) ()\ is an enzyme composed of two subunits: a FAD flavoprotein and an iron-sulphur\ protein.

    \

    The flavoprotein subunit is a protein of about 60 to 70 Kd to which FAD is\ covalently bound to a histidine residue which is located in the N-terminal\ section of the protein PUBMED:2668268. The sequence around that histidine is well\ conserved in Frd and Sdh from various bacterial and eukaryotic species PUBMED:1375942.

    \

    This family includes members that bind FAD such as the flavoprotein subunits from\ succinate and fumarate dehydrogenase, aspartate oxidase and the alpha subunit of adenylylsulphate\ reductase.

    \ ' '2480' 'IPR002938' '\ Monooxygenases incorporate one hydroxyl group into substrates and are found in many metabolic pathways. In this reaction, two atoms of dioxygen are reduced to one hydroxyl group and one H2O molecule by the concomitant oxidation of NAD(P)H PUBMED:1444267. P-hydroxybenzoate hydroxylase from Pseudomonas fluorescens contains this sequence motif (present in flavoprotein hydroxylases) with a putative dual function in FAD and NADPH binding PUBMED:10025942.\ ' '2481' 'IPR002346' '\

    Oxidoreductases, that also bind molybdopterin, have essentially no similarity outside this common domain. \ They include aldehyde oxidase (), that converts an aldehyde and water to an acid and hydrogen peroxide, and xanthine dehydrogenase (), that converts xanthine to urate. These enzymes require molybdopterin and FAD as cofactors and have two 2Fe-2S clusters. Another enzyme that contains this domain is the Pseudomonas thermocarboxydovorans carbon monoxide oxygenase.

    \ ' '2482' 'IPR015865' '\

    Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase (), which converts it into FMN, and FAD synthetase (), which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme PUBMED:14580199, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family PUBMED:17049878. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases PUBMED:12517446.

    \

    This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme.

    \ ' '2483' 'IPR006793' '\ This family represents a number of fimbrial protein transcription regulators found in Gram-negative bacteria. These proteins are thought to facilitate binding of the leucine-rich regulatory protein to regulatory elements, possibly by inhibiting deoxyadenosine methylation of these elements by deoxyadenosine methylase PUBMED:7476191, PUBMED:8846772.\ ' '2484' 'IPR000686' '\ Fanconi anaemia (FA) PUBMED:8490620, PUBMED:7929819, PUBMED:1574115 is a recessive inherited disease characterised\ by defective DNA repair. FA cells are sensitive to DNA cross-linking agents that cause chromosomal instability\ and cell death. The disease is manifested clinically by progressive pancytopenia, variable physical anomalies,\ and predisposition to malignancy. Four complementation groups have been identified, designated A to D. The\ gene for group C (FACC) has been cloned. Expression of the FACC cDNA corrects the phenotypic defect of FA(C)\ cells, resulting in normalized cell growth in the presence of DNA cross-linking agents such as mitomycin C\ (MMC). Gene transfer of the FACC gene should provide a survival advantage to transduced hematopoietic cells,\ suggesting that FA might be an ideal candidate for gene therapy PUBMED:7929819. The function of the FACC gene\ is not known. Immunofluorescence and sub-cellular fractionation studies of human cell lines, and COS-7 cells\ transiently expressing human FACC, showed the protein to be located primarily in the cytoplasm. Yet, placement\ of a nuclear localisation signal at the N-terminus of FACC directed the hybrid protein to the nuclei of\ transfected COS-7 cells. Such findings suggest an indirect role for FACC in regulating DNA repair in this\ group of Fanconi anaemia PUBMED:8058745.\ ' '2485' 'IPR003516' '\ Fanconi anaemia (FA) PUBMED:1641028, PUBMED:8490620, PUBMED:7929819 is a recessive inherited disease characterised by\ defective DNA repair. 
FA cells are sensitive to DNA cross-linking agents\ that cause chromosomal instability and cell death. The disease is manifested\ clinically by progressive pancytopenia, variable physical anomalies, and\ predisposition to malignancy PUBMED:7929819. Four complementation groups have been\ identified, designated A to D. The FA group A gene (FAA) has been\ cloned PUBMED:9169126, but its function remains to be elucidated.\ ' '2486' 'IPR012319' '\

    This entry represents the catalytic domain of DNA glycosylase/AP lyase enzymes, which are involved in base excision repair of DNA damaged by oxidation or by mutagenic agents. Most damage to bases in DNA is repaired by the base excision repair pathway PUBMED:15588838. These enzymes are primarily from bacteria, and have both DNA glycosylase activity () and AP lyase activity (). Examples include formamidopyrimidine-DNA glycosylases (Fpg; MutM) and endonuclease VIII (Nei).

    \

    Formamidopyrimidine-DNA glycosylase (Fpg, MutM) is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidation-damaged bases (N-glycosylase activity; ) and cleaves both the 3\'- and 5\'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity; ). Fpg has a preference for oxidised purines, excising oxidized purine bases such as 7,8-dihydro-8-oxoguanine (8-oxoG). Its AP (apurinic/apyrimidinic) lyase activity introduces nicks in the DNA strand, cleaving the DNA backbone by beta-delta elimination to generate a single-strand break at the site of the removed base with both 3\'- and 5\'-phosphates. Fpg is a monomer composed of 2 domains connected by a flexible hinge PUBMED:10921868. The two DNA-binding motifs (a zinc finger and the helix-two-turns-helix motifs) suggest that the oxidized base is flipped out from double-stranded DNA in the binding mode and excised by a catalytic mechanism similar to that of bifunctional base excision repair enzymes PUBMED:10921868. Fpg binds one ion of zinc at the C-terminus, which contains four conserved and essential cysteines PUBMED:8473347, PUBMED:7704272.

    \

    Endonuclease VIII (Nei) has the same enzyme activities as Fpg above (, ), but with a preference for oxidized pyrimidines, such as thymine glycol, 5,6-dihydrouracil and 5,6-dihydrothymine PUBMED:15232006.

    \

    These proteins contain three structural domains: an N-terminal catalytic core domain, a central helix-two turn-helix (H2TH) module and a C-terminal zinc finger (see PDB:1K82) PUBMED:11912217. The N-terminal catalytic domain and the C-terminal zinc finger straddle the DNA with the long axis of the protein oriented roughly orthogonal to the helical axis of the DNA. Residues that contact DNA are located in the catalytic domain and in a beta-hairpin loop formed by the zinc finger PUBMED:12055620.

    \ ' '2487' 'IPR006838' '\ This family includes the hamster androgen-induced FAR-17a protein () PUBMED:2045681, and its human homologue, the AIG1 protein () PUBMED:11266118. The function of these proteins is unknown. This family also includes homologous regions from a number of other metazoan proteins.\ ' '2488' 'IPR002544' '\ The neuropeptide Phe-Met-Arg-Phe-NH2 (FMRFamide) is a potent cardioactive neuropeptide in Lymnaea stagnalis PUBMED:1968092. FMRFamide (Phe-Met-Arg-Phe-NH2) was first demonstrated to be cardioactive in several molluscan species. FMRFamide is now known to be cardioexcitatory in mammals, to inhibit morphine-induced antinociception, and to block morphine-, defeat-, and deprivation-induced feeding PUBMED:3067224. \

    Thirteen neuropeptides varying in length from 7 to 11 residues and ending C-terminally in -Phe-Met-Arg-Phe-NH2 (calliFMRFamides 1-13) and one dodecapeptide ending in -Met-Ile-Arg-Phe-NH2 (calliMIRFamide 1) have been isolated from thoracic ganglia of the blowfly Calliphora vomitoria. Results indicate that the N terminus (in addition to the C terminus as previously found for FMRFamides of other organisms) is crucial for at least some biological activities PUBMED:1549595.

    \ ' '2489' 'IPR000782' '\

    The FAS1 (fasciclin-like) domain is an extracellular module of about 140 amino acid residues. It has been suggested that the FAS1 domain represents an ancient cell adhesion domain common to plants and animals PUBMED:7925267; related FAS1 domains are also found in bacteria PUBMED:7822037.

    \ \

    The crystal structure of FAS1 domains 3 and 4 of fasciclin I from Drosophila melanogaster (Fruit fly) has been determined, revealing a novel domain fold consisting of a seven-stranded beta wedge and at least five alpha helices; two well-ordered N-acetylglucosamine groups attached to a conserved asparagine are located in the interface region between the two FAS1 domains PUBMED:12575939. Fasciclin I is an insect neural cell adhesion molecule involved in axonal guidance that is attached to the membrane by a GPI-anchored protein.

    \ \

    FAS1 domains are present in many secreted and membrane-anchored proteins. These proteins are usually GPI anchored and consist of: (i) a single FAS1 domain, (ii) a tandem array of FAS1 domains, or (iii) FAS1 domain(s) interspersed with other domains.

    \ \

    Proteins known to contain a FAS1 domain include:

    \ \

    \ \

    The FAS1 domains of both human periostin () and BIgH3 () proteins were found to contain vitamin K-dependent gamma-carboxyglutamate residues PUBMED:18450759. Gamma-carboxyglutamate residues are more commonly associated with GLA domains (), where they occur through post-translational modification catalysed by the vitamin K-dependent enzyme gamma-glutamylcarboxylase.

    \ ' '2490' 'IPR003151' '\ The FAT domain is a domain present in the PIK-related kinases. Members of the family of PIK-related kinases may act as intracellular sensors that govern radial and horizontal pathways PUBMED:10782091.\ ' '2491' 'IPR003152' '\ The FATC domain is found at the C-terminal end of the PIK-related kinases. Members of the family of PIK-related kinases may act as intracellular sensors that govern radial and horizontal pathways PUBMED:10782091.\ ' '2492' 'IPR000146' '\

    This entry represents fructose-1,6-bisphosphatase (FBPase), a critical regulatory enzyme in gluconeogenesis that catalyses the removal of 1-phosphate from fructose 1,6-bis-phosphate to form fructose 6-phosphate PUBMED:2159755, PUBMED:3008716. It is involved in many different metabolic pathways and found in most organisms. FBPase requires metal ions for catalysis (Mg2+ and Mn2+ being preferred) and the enzyme is potently inhibited by Li+. The fold of fructose-1,6-bisphosphatase was noted to be identical to that of inositol-1-phosphatase (IMPase) PUBMED:8382485. Inositol polyphosphate 1-phosphatase (IPPase), IMPase and FBPase share a sequence motif (Asp-Pro-Ile/Leu-Asp-Gly/Ser-Thr/Ser) which has been shown to bind metal ions and participate in catalysis. This motif is also found in the distantly-related fungal, bacterial and yeast IMPase homologues. It has been suggested that these proteins define an ancient structurally conserved family involved in diverse metabolic pathways, including inositol signalling, gluconeogenesis, sulphate assimilation and possibly quinone metabolism PUBMED:7761465.

    \ ' '2493' 'IPR004464' '\ The GlpX protein is involved in glycerol metabolism but its exact function is unknown. It is induced by but not required for growth in glycerol.\ ' '2494' 'IPR003786' '\

    Formate dehydrogenase is required for nitrate inducible formate dehydrogenase activity. In Wolinella succinogenes it is a membranous molybdo-enzyme which is involved in phosphorylative electron transport. The functional formate dehydrogenase may be made up of three or four different subunits PUBMED:1781728. In Escherichia coli, FdhD is required for the formation of active formate dehydrogenases.

    \ ' '2495' 'IPR006452' '\

    This family of sequences describes an accessory protein required for the assembly of formate dehydrogenase of certain proteobacteria although not present in the final complex PUBMED:2170340. The exact nature of the function of FdhE in the assembly of the complex is unknown, but considering the presence of selenocysteine, molybdopterin, iron-sulphur clusters and cytochrome b556, it is likely to be involved in the insertion of cofactors.

    \ ' '2496' 'IPR005121' '\

    This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2) PUBMED:10447505, PUBMED:9016717.

    \ ' '2497' 'IPR001670' '\

    Alcohol dehydrogenase () (ADH) catalyzes the reversible oxidation of ethanol to acetaldehyde with the concomitant reduction of NAD PUBMED:. Currently, three structurally and catalytically different types of alcohol dehydrogenases are known:\

    \

    Iron-containing ADH\'s have been found in yeast (gene ADH4) PUBMED:3584063, as well as in Zymomonas mobilis (gene adhB) PUBMED:2823079. These two iron-containing ADH\'s are closely related to the following enzymes:\

    \

    \ ' '2498' 'IPR001367' '\ The diphtheria toxin repressor protein (DTXR) is a member of this group PUBMED:7568230. In \ Corynebacterium diphtheriae where it has been studied in some detail this protein acts\ as an iron-binding repressor of diphtheria toxin gene expression and may serve as a \ global regulator of gene expression. The N-terminus may be involved in iron binding and\ may associate with the Tox operator. Binding of DTXR to Tox operator requires a divalent\ metal ion such as cobalt, ferric, manganese and nickel whereas zinc shows weak \ activation PUBMED:7743135.\ ' '2499' 'IPR001367' '\ The diphtheria toxin repressor protein (DTXR) is a member of this group PUBMED:7568230. In \ Corynebacterium diphtheriae where it has been studied in some detail this protein acts\ as an iron-binding repressor of diphtheria toxin gene expression and may serve as a \ global regulator of gene expression. The N-terminus may be involved in iron binding and\ may associate with the Tox operator. Binding of DTXR to Tox operator requires a divalent\ metal ion such as cobalt, ferric, manganese and nickel whereas zinc shows weak \ activation PUBMED:7743135.\ ' '2500' 'IPR004108' '\ Proteins containing this domain may be involved in the mechanism of biological hydrogen activation and contain 4Fe-4S clusters. They can use molecular hydrogen for the reduction of a variety of substances.\ ' '2501' 'IPR003149' '\

    Many microorganisms, such as methanogenic, acetogenic, nitrogen-fixing, photosynthetic, or sulphate-reducing bacteria, metabolise hydrogen. Hydrogen activation is mediated by a family of enzymes, termed hydrogenases, which either provide these organisms with reducing power from hydrogen oxidation, or act as electron sinks. There are two hydrogenase families that differ functionally from each other: NiFe hydrogenases tend to be more involved in hydrogen oxidation, while FeFe (Fe only) hydrogenases tend to be more involved in hydrogen production.

    \

    Fe only hydrogenases () show a common core structure, which contains a moiety, deeply buried inside the protein, with an Fe-Fe dinuclear centre, nonproteic bridging, terminal CO and CN- ligands attached to each of the iron atoms, and a dithio moiety, which also bridges the two iron atoms and has been tentatively assigned as a di(thiomethyl)amine. This common core also harbours three [4Fe-4S] iron-sulphur clusters PUBMED:11921392. In FeFe hydrogenases, as in NiFe hydrogenases, the set of iron-sulphur clusters is dispersed regularly between the dinuclear Fe-Fe centre and the molecular surface. These clusters are distant by about 1.2 nm from each other but the [4Fe-4S] cluster closest to the dinuclear centre is covalently bound to one of the iron atoms through a thiolate bridging ligand. The moiety including the dinuclear centre, the thiolate bridging ligand, and the proximal [4Fe-4S] cluster is known as the H-cluster. A channel, lined with hydrophobic amino acid side chains, nearly connects the dinuclear centre and the molecular surface. Furthermore, hydrogen-bonded water molecule sites have been identified at the interior and at the surface of the protein.

    \

    The small subunit is comprised of alternating random coil and alpha helical structures that encompass the large subunit in a novel protein fold PUBMED:10368269.

    \ ' '2502' 'IPR000522' '\ This is a subfamily of bacterial binding-protein-dependent transport systems family, and includes transport system permease proteins involved in the transport across the membrane of several compounds. This entry contains the inner components of this multicomponent transport system.\ \ ' '2503' 'IPR006860' '\ FecR is involved in regulation of iron dicitrate transport. In the absence of citrate FecR inactivates FecI. FecR is probably a sensor that recognises iron dicitrate in the periplasm.\ ' '2504' 'IPR003447' '\ The femAB operon codes for two nearly identical approximately 50-kDa proteins involved in the formation of the Staphylococcal pentaglycine interpeptide bridge in peptidoglycan PUBMED:9393725. These proteins are also considered as a factor influencing the level of methicillin resistance PUBMED:10209768.\ ' '2505' 'IPR007167' '\

    This entry represents the core domain of the ferrous iron (Fe2+) transport protein FeoA found in bacteria. This domain also occurs at the C-terminus in related proteins. The transporter Feo is composed of three proteins: FeoA, a small, soluble SH3-domain protein probably located in the cytosol; FeoB, a large protein with a cytosolic N-terminal G-protein domain and a C-terminal integral inner-membrane domain containing two \'Gate\' motifs which likely functions as the Fe2+ permease; and FeoC, a small protein apparently functioning as an [Fe-S]-dependent transcriptional repressor PUBMED:, PUBMED:. Feo allows the bacterial cell to acquire iron from its environment.

    \ ' '2506' 'IPR011619' '\

    Escherichia coli has an iron(II) transport system (feo) which may make an important contribution to the iron supply of the cell under anaerobic conditions. FeoB has been identified as part of this transport system and may play a role in the transport of ferrous iron. FeoB is a large 700-800 amino acid integral membrane protein. The N terminus contains a P-loop motif suggesting that iron transport may be ATP dependent PUBMED:8407793.

    \ ' '2507' 'IPR000392' '\

    This entry represents members of the NifH/BchL/ChlL family.

    \ \

    Nitrogen fixing bacteria possess a nitrogenase enzyme complex that catalyses the reduction of molecular nitrogen to ammonia PUBMED:2672439, PUBMED:6327620, PUBMED:. The nitrogenase enzyme complex consists of two components:

    \

    \

    Component II has 2 ATP-binding domains and one 4Fe-4S cluster per homodimer: it supplies energy by ATP hydrolysis, and transfers electrons from reduced ferredoxin or flavodoxin to component I for the reduction of molecular nitrogen to ammonia PUBMED:2491672. There are a number of conserved regions in the sequence of these proteins: in the N-terminal section there is an ATP-binding site motif \'A\' (P-loop) and in the central section there are two conserved cysteines which have been shown, in nifH, to be the ligands of the 4Fe-4S cluster.

    \ \

    Protochlorophyllide reductase is involved in light-independent chlorophyll biosynthesis. The light-independent reaction uses Mg-ATP and reduced ferredoxin to reduce ring D of protochlorophyllide (Pchlide) to form chlorophyllide a (Chlide). This enzyme complex is composed of three subunits: ChlL, ChlN and ChlB. ChlL is present as a homodimer, and binds one 4Fe-4S cluster per dimer. The\ conserved domains, including the ATP-binding motif and the Fe-S binding motif found in the three subunits, are similar to those in nitrogenases PUBMED:16889380.

    \ \ ' '2508' 'IPR013130' '\ This family includes a common region in the transmembrane proteins mammalian cytochrome b-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from Arabidopsis thaliana.\ This may be a family of flavocytochromes capable of moving electrons across the plasma membrane PUBMED:8321236 that include a potential FAD binding domain.\ Mutations in the sequence of cytochrome b-245 heavy chain (gp91-phox)\ lead to the X-linked chronic granulomatous disease. The bactericidal\ ability of phagocytic cells is reduced and is characterised by the\ absence of a functional plasma membrane associated NADPH oxidase PUBMED:3600768.\ \ ' '2509' 'IPR001015' '\ Synonym(s): Protohaem ferro-lyase, Iron chelatase, etc.\

    Ferrochelatase catalyses the last step in haem biosynthesis: the chelation of a ferrous ion to proto-porphyrin IX, to form protohaem PUBMED:2185242, PUBMED:1704134. In eukaryotic cells, it binds to the mitochondrial inner membrane with its active site on the matrix side of the membrane.

    \

    The X-ray structures of Bacillus subtilis and human ferrochelatase have been solved PUBMED:9384565, PUBMED:11175906.\ The human enzyme exists as a homodimer. Each\ subunit contains one [2Fe-2S] cluster. The monomer is folded into two\ similar domains, each with a four-stranded parallel\ beta-sheet flanked by an alpha-helix in a beta-alpha-beta motif that is\ reminiscent of the fold found in the periplasmic binding\ proteins. The topological similarity between the domains suggests that\ they have arisen from a gene duplication event. However,\ significant differences exist between the two domains, including an\ N-terminal section (residues 80-130) that forms part of the\ active site pocket, and a C-terminal extension (residues 390-423) that\ is involved in coordination of the [2Fe-2S] cluster and in\ stabilisation of the homodimer.

    \

    Ferrochelatase seems to have a structurally conserved core region that is common to the enzyme from bacteria, plants and mammals. Porphyrin binds in the identified cleft; this cleft also includes the metal-binding site of the enzyme. It is likely that the structure of the cleft region will have different conformations upon substrate binding and release PUBMED:9384565.

    \ \ \ \ \ ' '2510' 'IPR007202' '\

    These proteins contain a domain with four conserved cysteines that probably form an Fe-S redox cluster.

    \ ' '2511' 'IPR004207' '\ Ferredoxin thioredoxin reductase is a [4FE-4S] protein which plays an important role in the ferredoxin/thioredoxin regulatory chain. It converts an electron signal (photoreduced ferredoxin) to a thiol signal (reduced thioredoxin), regulating enzymes by reduction of specific disulphide groups. It catalyses the light-dependent activation of several photosynthetic enzymes. Ferredoxin thioredoxin reductase is a heterodimer of subunit a and subunit b. Subunit a is the variable subunit, and b is the catalytic chain. This family is the alpha chain.\ ' '2512' 'IPR004209' '\

    Ferredoxin thioredoxin reductase is a [4Fe-4S] protein which plays an important role in the ferredoxin/thioredoxin regulatory chain. It converts an electron signal (photoreduced ferredoxin) to a thiol signal (reduced thioredoxin), regulating enzymes by reduction of specific disulphide groups. It catalyses the light-dependent activation of several photosynthetic enzymes. Ferredoxin thioredoxin reductase is a heterodimer of subunit alpha and subunit beta. Subunit alpha is the variable subunit, and beta is the catalytic chain. The structure of the beta subunit has been determined and found to fold around the FeS cluster PUBMED:10649999.

    \

    This entry represents the beta subunit of ferredoxin thioredoxin reductase.

    \ ' '2513' 'IPR002713' '\ The FF domain may be involved in protein-protein interaction PUBMED:10390614. It often occurs as multiple copies and often accompanies WW domains. PRP40 from yeast encodes a novel, essential splicing component that associates with the yeast U1 small nuclear ribonucleoprotein particle PUBMED:8622699.\ ' '2514' 'IPR007709' '\ Formylglutamate amidohydrolase (FGase) catalyzes the terminal reaction in the five-step pathway for histidine utilization in Pseudomonas putida. By this action, N-formyl-L-glutamate (FG) is hydrolyzed to produce L-glutamate plus formate PUBMED:3308850.\ ' '2515' 'IPR002348' '\ The interleukin-1 (IL1) and heparin-binding growth factor (HBGF) families\ share low sequence similarity (about 25% PUBMED:1849658) but have very similar\ structures. Coupled with the Kunitz-type soybean trypsin inhibitors (STI),\ they form a structural superfamily. Despite their structural correspondence, however, they show no sequence similarity to the STI family.\ \ The crystal structures of interleukin-1 beta and HBGF1 have been solved, \ showing both families to have the same 12-stranded beta-sheet structure \ PUBMED:1738162; the beta-sheets are arranged in 3 similar lobes around a central \ axis, 6 strands forming an anti-parallel beta-barrel PUBMED:1707542, PUBMED:4071057. The beta-sheets \ are generally well preserved and the crystal structures superimpose in\ these areas. The intervening loops are less well conserved - the loop \ between beta-strands 6 and 7 is slightly longer in interleukin-1 beta.\ ' '2516' 'IPR018484' '\ It has been shown PUBMED:1659648 that four different types of carbohydrate kinases seem to be evolutionarily related.\ These enzymes include L-fuculokinase () (gene fucK); gluconokinase () (gene gntK); glycerol\ kinase () (gene glpK); xylulokinase () (gene xylB); and L-xylulose kinase ()\ (gene lyxK). These enzymes are proteins of 480 to 520 amino acid residues.\

    This entry represents the N-terminal domain of these proteins. It adopts a ribonuclease H-like fold and is structurally related to the C-terminal domain PUBMED:8430315, PUBMED:9843423.

    \ ' '2517' 'IPR018485' '\ It has been shown PUBMED:1659648 that four different types of carbohydrate kinases seem to be evolutionarily related.\ These enzymes include L-fuculokinase () (gene fucK); gluconokinase () (gene gntK); glycerol\ kinase () (gene glpK); xylulokinase () (gene xylB); and L-xylulose kinase ()\ (gene lyxK). These enzymes are proteins of 480 to 520 amino acid residues.\

    This entry represents the C-terminal domain of these proteins. It adopts a ribonuclease H-like fold and is structurally related to the N-terminal domain PUBMED:8430315, PUBMED:9843423.

    \ ' '2518' 'IPR000253' '\

    The forkhead-associated (FHA) domain PUBMED:7482699 is a phosphopeptide recognition domain found in many regulatory proteins. It displays specificity for phosphothreonine-containing epitopes but will also recognise phosphotyrosine with relatively high affinity. It spans approximately 80-100 amino acid residues folded into an 11-stranded beta sandwich, which sometimes contains small helical insertions between the loops connecting the strands PUBMED:11911881.

    \ \

    To date, genes encoding FHA-containing proteins have been identified in eubacterial and eukaryotic but not archaeal genomes. The domain is present in a diverse range of proteins, such as kinases, phosphatases, kinesins, transcription factors, RNA-binding proteins and metabolic enzymes which partake in many different cellular processes - DNA repair, signal transduction, vesicular transport and protein degradation are just a few examples.

    \ ' '2519' 'IPR001712' '\

    The Flagellar/Hr/Invasion Proteins Export Pore (FHIPEP) family PUBMED:8253684, PUBMED:8316211 consists of a number of proteins that constitute the type III secretion (or signal peptide-independent) pathway apparatus PUBMED:1365398, PUBMED:1592799. This mechanism translocates proteins lacking an N-terminal signal peptide across the cell membrane in one step, as it does not require an intermediate periplasmic process to cleave the signal peptide. It is a common pathway amongst Gram-negative bacteria for secreting toxic and flagellar proteins.

    \

    The pathway apparatus comprises three components: two within the inner membrane and one within the outer PUBMED:8316211. An FHIPEP protein is located within the inner membrane, although it is unknown which component it constitutes. FHIPEP proteins all have about 700 amino-acid residues. Within the sequence, the N terminus is highly conserved and hydrophobic, suggesting that this terminus is embedded within the membrane, with 6-8 transmembrane (TM) domains, while the C terminus is less conserved and appears to be devoid of TM regions. It is possible that members of the FHIPEP family serve as pores for the export of specific proteins.

    \ ' '2520' 'IPR000692' '\ Fibrillarin is a component of a nucleolar small nuclear ribonucleoprotein (SnRNP), functioning in vivo\ in ribosomal RNA processing PUBMED:2026646, PUBMED:8493104. It is associated with U3, U8 and U13 small nuclear\ RNAs in mammals PUBMED:2026646 and is similar to the yeast NOP1 protein PUBMED:2686980. Fibrillarin has a\ well conserved sequence of around 320 amino acids, and contains 3 domains, an N-terminal Gly/Arg-rich\ region; a central domain resembling other RNA-binding proteins and containing an RNP-2-like consensus\ sequence; and a C-terminal alpha-helical domain. An evolutionarily related pre-rRNA processing protein,\ which lacks the Gly/Arg-rich domain, has been found in various archaebacteria.\ ' '2521' 'IPR002181' '\

    Fibrinogen plays key roles in both blood clotting and platelet aggregation. During blood clot formation, the conversion of soluble fibrinogen to insoluble fibrin is triggered by thrombin, resulting in the polymerisation of fibrin, which forms a soft clot; this is then converted to a hard clot by factor XIIIA, which cross-links fibrin molecules. Platelet aggregation involves the binding of the platelet protein receptor integrin alpha(IIb)-beta(3) to the C-terminal D domain of fibrinogen PUBMED:12799374. In addition to platelet aggregation, platelet-fibrinogen interaction mediates both adhesion and fibrin clot retraction.

    \

    Fibrinogen occurs as a dimer, where each monomer is composed of three non-identical chains, alpha, beta and gamma, linked together by several disulphide bonds PUBMED:11460466. The N-terminals of all six chains come together to form the centre of the molecule (E domain), from which the monomers extend in opposite directions as coiled coils, followed by C-terminal globular domains (D domains). Therefore, the domain composition is: D-coil-E-coil-D. At each end, the C-terminal of the alpha chain extends beyond the D domain as a protuberance that is important for cross-linking the molecule.

    \

    During clot formation, the N-terminal fragments of the alpha and beta chains (within the E domain) in fibrinogen are cleaved by thrombin, releasing fibrinopeptides A and B, respectively, and producing fibrin. This cleavage results in the exposure of four binding sites on the E domain, each of which can bind to a D domain from different fibrin molecules. The binding of fibrin molecules produces a polymer consisting of a lattice network of fibrins that form a long, branching, flexible fibre PUBMED:11593005, PUBMED:15837518. Fibrin fibres interact with platelets to increase the size of the clot, as well as with several different proteins and cells, thereby promoting the inflammatory response and concentrating the cells required for wound repair at the site of damage.

    \ \

    This entry represents the C-terminal globular D domain of the alpha, beta and gamma chains. These domains are related to domains in other proteins: in the Parastichopus parvimensis (Sea cucumber) fibrinogen-like FreP-A and FreP-B proteins; in the C-terminus of the Drosophila scabrous protein that is involved in the regulation of neurogenesis, possibly through the inhibition of R8 cell differentiation; and in ficolin proteins, which display lectin activity towards N-acetylglucosamine through their fibrinogen-like domains PUBMED:12396010.

    \

    More information about these proteins can be found at Protein of the Month: Fibrinogen.

    \ ' '2522' 'IPR003303' '\ Filaggrins are filament-associated proteins that interact with keratin\ intermediate filaments of terminally differentiating mammalian epidermis\ via disulphide bond formation PUBMED:2740331. They show wide species variations and\ their aberrant expression has been implicated in a number of keratinising\ disorders. The proteins are synthesised as large, insoluble, highly-\ phosphorylated precursors, containing multiple tandem repeats of 324 amino\ acids, which are not separated by a large linker. The precursor is\ deposited as keratohyalin granules. During terminal differentiation, it\ is dephosphorylated and proteolytically cleaved.\ ' '2523' 'IPR016044' '\

    Intermediate filaments (IF) PUBMED:8771189, PUBMED:3052284, PUBMED:2183847 are proteins which are primordial components of the cytoskeleton and the nuclear envelope. They generally form filamentous structures 8 to 14 nm wide.

    \

    IF proteins are members of a very large multigene family of proteins which has been subdivided into five major subgroups:\

    \

    All IF proteins are structurally similar in that they consist of: a central rod domain comprising some 300 to 350 residues which is arranged in coiled-coil alpha-helices, with at least two short characteristic interruptions; an N-terminal non-helical domain (head) of variable length; and a C-terminal domain (tail) which is also non-helical, and which shows extreme length variation between different IF proteins.

    \

    While IF proteins are evolutionarily and structurally related, they have limited sequence homology except in several regions of the rod domain.

    \ \

    This entry represents the central rod domain found in IF proteins.

    \ \ ' '2524' 'IPR006821' '\ This domain represents the N-terminal head region of intermediate filaments. Intermediate filament heads bind DNA PUBMED:11513613. Vimentin heads are able to alter nuclear architecture and chromatin distribution, and the liberation of heads by HIV-1 protease may play an important role in HIV-1 associated cytopathogenesis and carcinogenesis PUBMED:11160829. Phosphorylation of the head region can affect filament stability PUBMED:12177195. The head has been shown to interact with the rod domain of the same protein PUBMED:12064937.\ ' '2525' 'IPR017868' '\

    The many different actin cross-linking proteins share a common architecture, consisting of a globular actin-binding domain and an extended rod. Whereas their actin-binding domains consist of two calponin homology domains (see ), their rods fall into three families.

    \ \

    The rod domain of the family including the Dictyostelium discoideum (Slime mould) gelation factor (ABP120) and human filamin (ABP280) is constructed from tandem repeats of a 100-residue motif that is glycine and proline rich PUBMED:9164464. The gelation factor\'s rod contains 6 copies of the repeat, whereas filamin has a rod constructed from 24 repeats. The resolution of the 3D structure of rod repeats from the gelation factor has shown that they consist of a beta-sandwich, formed by two beta-sheets arranged in an immunoglobulin-like fold PUBMED:9164464, PUBMED:10467095. Because conserved residues that form the core of the repeats are preserved in filamin, the repeat structure should be common to the members of the gelation factor/filamin family.

    \ \

    The head to tail homodimerisation is crucial to the function of the ABP120 and ABP280 proteins. This interaction involves a small portion at the distal end of the rod domains. For the gelation factor it has been shown that the carboxy-terminal repeat 6 dimerises through a double edge-to-edge extension of the beta-sheet and that repeat 5 contributes to dimerisation to some extent PUBMED:9417983, PUBMED:10467095, PUBMED:2668299.

    \ ' '2526' 'IPR002561' '\

    This entry represents an extracellular region from the envelope glycoprotein of Ebola virus sp. and Lake Victoria marburgvirus. This region is also produced as a separate transcript that gives rise to a non-structural, secreted glycoprotein,\ which is produced in large amounts and has an unknown function PUBMED:9576958. Processing of this protein may be involved in viral\ pathogenicity PUBMED:8622982.

    \ ' '2527' 'IPR002953' '\

    The Filoviridae are a group of viruses that cause haemorrhagic fevers with\ a high mortality rate. The family currently contains three viruses: Ebola virus sp., Lake Victoria marburgvirus and Reston ebolavirus, named after their corresponding outbreak regions. They possess negative-stranded RNA genomes, which encode at least 7 proteins. The VP35 protein is found in the genomes of all filoviruses. Its function is presently unknown, but it is thought to share the function of the phosphorylated proteins (polymerase subunits) of rhabdoviruses and paramyxoviruses due to its position in the genome. There is no evidence, however, to suggest that VP35 is phosphorylated PUBMED:8482365.

    \ ' '2528' 'IPR014779' '\ Members of this family of bacterial proteins are involved in regulation of length and mediation of adhesion of fimbriae. Fimbriae (also called pili) are polar filaments radiating from the surface of the bacterium to a length of 0.5-1.5 micrometers, that enable bacteria to colonize the epithelium of specific host organs. Fimbriae also promote virulence PUBMED:10066469, PUBMED:1681580, PUBMED:2890081.\ ' '2529' 'IPR007540' '\ Fimbriae, also known as pili, form filaments radiating from the surface of the bacterium to a length of 0.5-1.5 micrometres. They enable the cell to colonise host epithelia. This family constitutes the major subunits of CS1 like pili, including CS2 and CFA1 from Escherichia coli, and also the Cable type II pilin major subunit from Burkholderia cepacia PUBMED:10094617. The major subunit of CS1 pili is called CooA. Periplasmic CooA is mostly complexed with the assembly protein CooB. In addition, a small pool of CooA multimers and CooA-CooD complexes exists, but the functional significance is unknown PUBMED:10094617. A member of this family has also been identified in Salmonella typhi and Salmonella enterica PUBMED:10417651.\ ' '2530' 'IPR003467' '\ Fimbriae (also known as pili) are polar filaments radiating from the surface of the bacterium to a length of 0.5-1.5 micrometers, that enable bacteria to colonize the epithelium of specific host organs. This family consists of the minor and major fimbrial subunits.\ ' '2531' 'IPR007854' '\ This short motif is about 40 amino acids in length. It is found in the Fip1 protein, a component of a Saccharomyces cerevisiae pre-mRNA polyadenylation factor that directly interacts with poly(A) polymerase PUBMED:7736590. This region of Fip1 is needed for the interaction with the Yth1 subunit of the complex and for specific polyadenylation of the cleaved mRNA precursor PUBMED:11238938.\ ' '2532' 'IPR003468' '\

    Cytochrome cbb3 oxidases are found almost exclusively in Proteobacteria, and represent a distinctive class of proton-pumping respiratory haem-copper oxidases (HCO) that lack many of the key structural features that contribute to the reaction cycle of the intensely studied mitochondrial cytochrome c oxidase (CcO) PUBMED:15100055. Cytochrome cbb3 oxidases are required to support symbiotic nitrogen fixation whilst ensuring that the oxygen-labile nitrogenase is not compromised. Cytochrome cbb3 oxidases consist of four subunits: FixN (or CcoN), FixO (or CcoO), FixP (or CcoP) and FixQ (or CcoQ). The catalytic core comprises subunits FixN, FixO and FixP, where FixN acts as the catalytic subunit, and FixO and FixP are membrane-bound mono- and di-haem cytochromes c, respectively. The FixQ subunit protects the core complex from proteolytic degradation in the presence of oxygen PUBMED:11864982. This entry represents the mono-haem FixO subunit.

    \ ' '2533' 'IPR000774' '\

    Peptidyl-prolyl cis-trans isomerase (PPIase) catalyses the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides PUBMED:2644542. This domain is only found at the amino terminus of proteins belonging to the family.

    \

    Peptidyl-prolyl cis-trans isomerase has also been shown to accelerate the refolding of several proteins in vitro PUBMED:6395866, PUBMED:3306408, PUBMED:3277061; the FKBP-type enzymes probably act in the folding of extracytoplasmic proteins.

    \ ' '2534' 'IPR006714' '\

    Periplasmic flagella are the organelles of spirochete motility, and are structurally different from the flagella of other motile bacteria. They reside inside the cell within the periplasmic space, and confer motility in viscous gel-like media such as connective tissue PUBMED:2194955. The flagella are composed of an outer sheath of FlaA proteins and a core filament of FlaB proteins. Each species usually has several FlaA protein species PUBMED:8990312.

    \ ' '2536' 'IPR005186' '\

    Although these proteins are known to be important for flagella, their exact function is unknown.

    \ ' '2537' 'IPR003223' '\ Flagellin is the subunit which polymerises to form the filaments of bacterial\ flagella. The proteins in this family are transcriptional repressors of phase-1 flagellin genes.\ ' '2539' 'IPR007824' '\ This family consists of several eukaryotic paraflagellar rod component proteins. The eukaryotic flagellum represents one of the most complex macromolecular structures found in any organism and contains more than 250 proteins PUBMED:11112698. In addition to its locomotive role, the flagellum is probably involved in nutrient uptake since receptors for host low-density lipoproteins are localised on the flagellar membrane as well as on the flagellar pocket membrane PUBMED:11163437.\ ' '2540' 'IPR001029' '\

    Flagellin is the subunit which polymerises to form the filaments of\ bacterial flagella. Two regions, one at the N terminus and the other, this one, at the C terminus seem always to occur \ together PUBMED:2190210.

    \ ' '2541' 'IPR001492' '\

    Flagellin is a subunit which polymerises to form the filaments of bacterial flagella. This N-terminal domain and the C-terminal () always occur together. Flagellin is recognised as a virulence factor by the innate immune systems of a variety of diverse organisms. Mammalian TLR5 (Toll like receptor 5) recognises bacterial flagellin from both Gram-positive and Gram-negative bacteria PUBMED:11323673.

    \ ' '2542' 'IPR000404' '\ Flaviviruses encode a single polyprotein. This is cleaved into\ three structural and seven non-structural proteins. The NS4A\ protein is small and poorly conserved among the Flaviviruses.\ NS4A contains multiple hydrophobic potential membrane spanning\ regions PUBMED:2174669. NS4A has only been found in cells infected by Kunjin virus PUBMED:2541547.\ ' '2543' 'IPR001528' '\ Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. The NS4B protein is small and poorly conserved among the Flaviviruses. NS4B contains multiple hydrophobic potential membrane spanning regions PUBMED:2174669. NS4B may form membrane components of the viral replication complex and could be involved in membrane localisation of NS3 and NS5 (see ) PUBMED:2174669.\ ' '2544' 'IPR000208' '\

    RNA-directed RNA polymerase (RdRp) () is an essential protein encoded in the genomes of all RNA containing viruses with no DNA stage PUBMED:2759231, PUBMED:8709232. It catalyses synthesis of the RNA strand complementary to a given RNA template, but the precise molecular mechanism remains unclear.\ The postulated RNA replication process is a two-step mechanism. First, the initiation step of RNA synthesis begins at or near the 3\' end of the RNA template by means of a primer-independent (de novo) mechanism. The de novo initiation consists in the addition of a nucleotide tri-phosphate (NTP) to the 3\'-OH of the first initiating NTP. During the following so-called elongation phase, this nucleotidyl transfer reaction is repeated with subsequent NTPs to generate the complementary RNA product PUBMED:11531403.

    \

    All the RNA-directed RNA polymerases, and many DNA-directed polymerases, employ a fold whose organisation has been likened to the shape of a right hand with three subdomains termed fingers, palm and thumb PUBMED:9309225. Only the palm subdomain, composed of a four-stranded antiparallel beta-sheet with two alpha-helices, is well conserved among all of these enzymes. In RdRp, the palm subdomain comprises three well conserved motifs (A, B and C). Motif A (D-x(4,5)-D) and motif C (GDD) are spatially juxtaposed; the Asp residues of these motifs are implicated in the binding of Mg2+ and/or Mn2+. The Asn residue of motif B is involved in selection of ribonucleoside triphosphates over dNTPs and thus determines whether RNA is synthesised rather than DNA PUBMED:10827187.\ The domain organisation PUBMED:9878607 and the 3D structure of the catalytic centre of a wide range of RdRp\'s, even those with a low overall sequence homology, are conserved. The catalytic centre is formed by several motifs containing a number of conserved amino acid residues.

    \

    There are 4 superfamilies of viruses that cover all RNA containing viruses with no DNA stage:

    \ The RNA-directed RNA polymerases in the first of the above superfamilies can be divided into the following three subgroups:\

    \ \ Flaviviruses produce a polyprotein from the ssRNA genome. The polyprotein is cleaved into a number of products, one of which is NS5. Recombinant dengue type 1 virus NS5 protein expressed in Escherichia coli exhibits RNA-dependent RNA polymerase activity.\ This RNA-directed RNA polymerase possesses a number of short\ regions and motifs homologous to other RNA-directed RNA \ polymerases PUBMED:8607261.\ ' '2545' 'IPR002535' '\ The Flaviviruses are small enveloped animal viruses containing a single\ positive strand genomic RNA PUBMED:2174669. The genome encodes one large ORF, a\ polyprotein which undergoes proteolytic processing into mature viral \ peptide chains.\ This entry consists of a propeptide region of approximately 90 amino\ acids in length.\ ' '2546' 'IPR003680' '\

    This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone). These enzymes catalyse the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species PUBMED:2168383. This enzyme uses an FAD cofactor. The equation for this reaction is NAD(P)H + acceptor = NAD(P)(+) + reduced acceptor. This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy PUBMED:2168383. The family also includes acyl carrier protein phosphodiesterase. This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP PUBMED:7568029. This family is related to FMN_red and Flavodoxin_1.

    \ ' '2547' 'IPR003382' '\ This entry contains a diverse range of flavoprotein enzymes, including epidermin biosynthesis protein, EpiD, which has been shown to be a flavoprotein that binds FMN PUBMED:1644762. This enzyme catalyzes the removal of two reducing equivalents from the cysteine residue of the C-terminal meso-lanthionine of epidermin to form a --C==C-- double bond. This family also includes the B chain of dipicolinate synthase. Dipicolinate is a small polar molecule that accumulates to high concentrations in bacterial endospores, and is thought to play a role in spore heat resistance, or the maintenance of heat resistance PUBMED:8345520. Dipicolinate synthase catalyses the formation of dipicolinic acid from dihydroxydipicolinic acid. This family also includes phenylacrylic acid decarboxylase PUBMED:8181743.\ ' '2548' 'IPR001444' '\

    Many bacterial species swim actively by means of flagella. The flagellar\ organelle is made of three parts: the basal body, the hook and the filament.\ The basal body consists of four rings (L, P, S and M) mounted on a central rod PUBMED:2129540.

    \

    In Salmonella typhimurium and related organisms the rod has been shown to\ consist of four different, yet evolutionarily related proteins: in the distal\ portion of the rod there are about 26 subunits of protein flgG and in the\ proximal portion there are about six subunits each of proteins flgB, flgC, and\ flgF.\ These four proteins contain a highly conserved\ asparagine-rich domain at their N terminus.

    \ ' '2549' 'IPR001635' '\

    During flagellar morphogenesis in Salmonella typhimurium and Escherichia coli, the fliK gene product is responsible for hook length control PUBMED:8631687. The deduced amino acid sequences of FliK proteins from S. typhimurium and E. coli have molecular masses of 41,748 and 39,246 Da, respectively, and are fairly hydrophilic PUBMED:8631687. Sequence comparison reveals around 50% identity, with greatest conservation in the C-terminal region, with 71% identity in the last 154 amino acids - mutagenesis of this conserved region completely abolishes motility. The central and C-terminal regions are rich in proline and glutamine respectively; it is thought that they may constitute distinct domains, separated by a linker region PUBMED:8631687.

    \

    FliK is considered unlikely to function as a molecular ruler for determining hook length; it is more likely to employ a novel mechanism PUBMED:8631687.

    \ ' '2551' 'IPR005648' '\ FlgD is known to be absolutely required for hook assembly, yet it has not been detected in the mature flagellum PUBMED:8157595. It appears to act as a hook-capping protein to enable assembly of hook protein subunits PUBMED:8157595.\ ' '2552' 'IPR000527' '\ The flgH, flgI and fliF genes of Salmonella typhimurium encode the major proteins for the L, P and M rings of the flagellar basal body PUBMED:2544561. In fact, the basal body consists of four rings (L,P,S and M) surrounding the flagellar rod, which is believed to transmit motor rotation to the filament PUBMED:2129540. The M ring is integral to the inner membrane of the cell, and may be connected to the rod via the S (supramembrane) ring, which lies just distal to it. The L and P rings reside in the outer membrane and periplasmic space, respectively. FlgH and FlgI, which are exported across the cell membrane to their destinations in the outer membrane and periplasmic space, have typical N-terminal cleaved signal-peptide sequences. FlgH is predicted to have an extensive beta-sheet structure, in keeping with other outer membrane proteins PUBMED:2544561.\ ' '2553' 'IPR001782' '\

    The flgH, flgI and fliF genes of Salmonella typhimurium encode the major proteins for the L, P and M rings of the flagellar basal body PUBMED:2544561. In fact, the basal body consists of four rings (L,P,S and M) surrounding the flagellar rod, which is believed to transmit motor rotation to the filament PUBMED:2129540. The M ring is integral to the inner membrane of the cell, and may be connected to the rod via the S (supramembrane) ring, which lies just distal to it. The L and P rings reside in the outer membrane and periplasmic space, respectively.

    \

    The sequences of the FlgH, FlgI and FliF gene products have been determined PUBMED:2544561. FlgH and FlgI, which are exported across the cell membrane to their destinations in the outer membrane and periplasmic space, have typical N-terminal cleaved signal-peptide sequences PUBMED:2544561, PUBMED:3549691. FlgH is predicted to have an extensive beta-sheet structure, in keeping with other outer membrane proteins, and FlgI is thought to have even more beta-structure content PUBMED:2544561. Several aspects of the DNA sequence of these genes and their surrounds suggest complex regulation of the flagellar gene system.

    \ ' '2554' 'IPR007412' '\ FlgM binds and inhibits the activity of the transcription factor sigma 28. Inhibition of sigma 28 prevents the expression of genes from flagellar transcriptional class 3, which include genes for the filament and chemotaxis. Correctly assembled basal body-hook structures export FlgM, relieving inhibition of sigma 28 and allowing expression of class 3 genes. NMR studies show that free FlgM is mostly unfolded, which may facilitate its export. The C-terminal half of FlgM adopts a tertiary structure when it binds to sigma 28. All mutations in FlgM that prevent sigma 28 inhibition affect the C-terminal domain, which is the region thought to constitute the binding domain. A minimal binding domain has been identified between Glu 64 and Arg 88 in Salmonella typhimurium. The N-terminal portion remains unstructured and may be necessary for recognition by the export machinery PUBMED:9095196.\ ' '2555' 'IPR007809' '\ This family includes the FlgN protein, an export chaperone involved in flagellar synthesis PUBMED:11169117.\ ' '2556' 'IPR003481' '\ The flagellar hook-associated protein 2 (HAP2 or FliD) is the capping protein for the flagellum and forms its distal end. The protein plays a role in mucin specific adhesion of the bacteria PUBMED:9488388.\ ' '2557' 'IPR001624' '\

    Four genes from the major Bacillus subtilis chemotaxis locus have been shown to encode proteins that are similar to the Salmonella typhimurium FlgB, FlgC, FlgG and FliF proteins; a further gene product is similar to the Escherichia coli FliE protein PUBMED:1905667. All of these proteins are thought to form part of the hook-basal body complex of the bacterial flagella PUBMED:1905667. The FlgB, FlgC and FlgG proteins are components of the proximal and distal rods; FliF forms the M-ring that anchors the rod assembly to the membrane; but the role of FliE has not yet been determined PUBMED:1905667. The similarity between the proteins in these two organisms suggests that the structures of the M-ring and the rod may be similar PUBMED:1905667. Nevertheless, some differences in size and amino acid composition between some of the homologues suggest the basal body proteins may be organised slightly differently within B. subtilis PUBMED:1905667.

    \

    From gel electrophoresis and autoradiography of 35S-labelled S. typhimurium hook-basal body complexes and the deduced number of sulphur-containing residues in FliE, the stoichiometry of the protein in the hook-basal body complex has been estimated to be about nine subunits PUBMED:1551848. FliE does not undergo cleavage of a signal peptide, nor does it show any similarity to the axial components like the rod or hook proteins, which are thought to be exported by the flagellum-specific export pathway PUBMED:1551848. On this evidence, it has been suggested that FliE may be in the vicinity of the MS ring, perhaps acting as an adaptor protein between ring and rod substructures PUBMED:1551848.

    \ ' '2558' 'IPR000090' '\

    The flagellar motor switch in Escherichia coli and Salmonella typhimurium regulates the \ direction of flagellar rotation and hence controls swimming behaviour PUBMED:8224881.\ The switch is a complex apparatus that responds to signals transduced by the\ chemotaxis sensory signalling system during chemotactic behaviour PUBMED:8224881. CheY,\ the chemotaxis response regulator, is believed to act directly on the switch\ to induce tumbles in the swimming pattern, but no physical interactions of \ CheY and switch proteins have yet been demonstrated.

    \

    The switch complex comprises at least three proteins - FliG, FliM and FliN.\ It has been shown that FliG interacts with FliM, FliM interacts with itself,\ and FliM interacts with FliN PUBMED:8631704. Several residues within the middle third\ of FliG appear to be strongly involved in the FliG-FliM interaction, with\ residues near the N- or C-termini being less important PUBMED:8631704. Such clustering\ suggests that FliG-FliM interaction plays a central role in switching.

    \

    Analysis of the FliG, FliM and FliN sequences shows that none are especially\ hydrophobic or appear to be integral membrane proteins PUBMED:2656645. This result is\ consistent with other evidence suggesting that the proteins may be \ peripheral to the membrane, possibly mounted on the basal body M ring PUBMED:2656645, PUBMED:1631122. FliG is present in about 25 copies per flagellum. The structure of the\ C-terminal domain is known; this domain functions\ specifically in motor rotation PUBMED:10440379.

    \ ' '2559' 'IPR018035' '\

    This entry represents a region found in the flagellar assembly protein FliH, as well as in type III secretion system protein HrpE.

    \ \

    Many flagellar proteins are exported by a flagellum-specific export pathway. Attempts have been made to characterise the apparatus responsible for this process, by designing assays to screen for mutants with export defects.\ Experiments involving filament removal from temperature-sensitive flagellar mutants of Salmonella typhimurium have shown that, while most mutants were able to regrow filaments, flhA, fliH, fliI and fliN mutants showed no or greatly reduced regrowth. This suggests that the corresponding gene products are involved in the process of flagellum-specific export PUBMED:1646201. The sequence of fliH has been deduced and shown to encode a protein with a molecular mass of 25,782 Da.

    \ \

    Bacterial HrpE proteins are believed to function in the type III secretion system, specifically in the secretion of HrpZ (harpinPss) PUBMED:9045830.

    \ ' '2560' 'IPR000809' '\

    Many flagellar proteins are exported by a flagellum-specific export pathway. Attempts have been \ made to characterise the apparatus responsible for this process, by designing assays to screen \ for mutants with export defects PUBMED:1646201. Experiments involving filament removal from \ temperature-sensitive flagellar mutants of Salmonella typhimurium have shown that, while most \ mutants were able to regrow filaments, flhA, fliH, fliI and fliN mutants showed no or greatly \ reduced regrowth. This suggests that the corresponding gene products are involved in the process \ of flagellum-specific export. The sequences of fliH, fliI and the adjacent gene, fliJ, have been \ deduced. The fliJ gene was shown to encode a protein of molecular mass 17,302 Da PUBMED:1646201. FliJ is a \ membrane-associated protein that affects chemotactic events; mutations in FliJ result in failure \ to respond to chemotactic stimuli.

    \ ' '2561' 'IPR005503' '\ This FliL protein controls the rotational direction of the flagella during chemotaxis PUBMED:3519573. FliL is a cytoplasmic membrane protein associated with the basal body PUBMED:10439416.\ ' '2562' 'IPR001689' '\

    The flagellar motor switch in Escherichia coli and Salmonella typhimurium regulates the direction of flagellar rotation and hence controls swimming behaviour PUBMED:8224881. The switch is a complex apparatus that responds to signals transduced by the chemotaxis sensory signalling system during chemotactic behaviour PUBMED:8224881. CheY, the chemotaxis response regulator, is believed to act directly on the switch to induce tumbles in the swimming pattern, but no physical interactions of CheY and switch proteins have yet been demonstrated.

    \

    The switch complex comprises at least three proteins - FliG, FliM and FliN. It has been shown that FliG interacts with FliM, FliM interacts with itself, and FliM interacts with FliN PUBMED:8631704. Several residues within the middle third of FliG appear to be strongly involved in the FliG-FliM interaction, with residues near the N or C termini being less important PUBMED:8631704. Such clustering suggests that FliG-FliM interaction plays a central role in switching.

    \

    Analysis of the FliG, FliM and FliN sequences shows that none are especially hydrophobic or appear to be integral membrane proteins PUBMED:2656645. This result is consistent with other evidence suggesting that the proteins may be peripheral to the membrane, possibly mounted on the basal body M ring PUBMED:2656645, PUBMED:1631122.

    \ ' '2563' 'IPR007442' '\

    FliO is an essential component of the flagellum-specific protein export apparatus PUBMED:10049367. It is an integral membrane protein. Its precise molecular function is unknown.

    \ \

    FliO is a short protein found in flagellar biosynthesis operons that contains a highly hydrophobic N-terminal sequence, generally followed by two basic amino acids. This region is reminiscent of, but distinct from, the twin-arginine translocation signal sequence. Some instances of this gene have been named "FliZ", but phylogenetic tree building supports a single FliO family.

    \ ' '2564' 'IPR005838' '\

    Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior PUBMED:8969244. There have been four secretion systems described in animal enteropathogens such as Salmonella and Yersinia, with further sequence similarities in plant pathogens like Ralstonia and Erwinia. The type III secretion system is of great interest as it is used to transport virulence factors from the pathogen directly into the host cell PUBMED:10334981 and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis PUBMED:10564516. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself PUBMED:10564516, type III subunits in the outer membrane translocate secreted proteins through a channel-like structure. It is believed that the family of type III inner membrane proteins are used as structural moieties in a complex with several other subunits PUBMED:9618447, including the ATPase necessary for driving the secretion system.

    \ \

    One such set of inner membrane proteins, termed "P" here for nomenclature purposes, includes the Salmonella and Shigella SpaP, the Yersinia YscR, the Erwinia HrcR, and the Xanthomonas Pro2 genes PUBMED:9618447, as well as several FliP flagellar biosynthesis genes PUBMED:10564516. FliP is an ~30 kDa protein containing three or four transmembrane (TM) regions.

    \ \ ' '2565' 'IPR003713' '\ The fliD operon of several bacteria consists of three flagellar genes, fliD, fliS, and fliT, and is transcribed in this order PUBMED:8550529. In Bacillus subtilis the operon encoding the flagellar proteins FliD, FliS, and FliT is sigma D-dependent PUBMED:8195064.\ ' '2566' 'IPR002910' '\ This family consists of various plant development proteins which are homologues of Floricaula (FLO) and leafy (LFY) proteins which are floral meristem\ identity proteins.\ Mutations in the sequences of these proteins affect flower and leaf development.\ ' '2567' 'IPR001389' '\

    Yeast flocculation protein may be directly involved in the flocculation process PUBMED:7502576. The extensively O-glycosylated protein is probably attached to the membrane by a GPI-anchor.

    \ ' '2569' 'IPR005626' '\

    This is a family of FLP proteins that catalyse recombination between large inverted repetitions of the plasmid.

    \ \ \ ' '2570' 'IPR007047' '\

    This entry is for the fimbriae associated protein Flp/Fap pilin component.

    \ ' '2571' 'IPR005626' '\

    This is a family of FLP proteins that catalyse recombination between large inverted repetitions of the plasmid.

    \ \ \ ' '2572' 'IPR003813' '\

    Methyl-viologen-reducing hydrogenase (MVH) is one of the enzymes involved in methanogenesis and is encoded in the mth-flp-mvh-mrt cluster of methane genes in Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum) PUBMED:7730278. No specific functions have been assigned to the delta subunit.

    \ ' '2573' 'IPR004213' '\ The flt3 (fms-related tyrosine kinase 3) ligand is a short chain cytokine with a four-helix bundle fold. It is a type I membrane protein which stimulates the proliferation of early hematopoietic cells, and synergises well with other colony stimulating factors and interleukins.\ ' '2574' 'IPR006859' '\ BM2 is synthesised in the late phase of infection and incorporated into the virion. It may be phosphorylated in vivo. The function of BM2 is unknown PUBMED:10573149.\ ' '2575' 'IPR004208' '\ A specific region of the Influenza B virus NS1 protein, which includes part of its effector domain, blocks the covalent linkage of mouse ISG15 to its target proteins both in vitro and in infected cells. Of the several hundred proteins induced by interferon (IFN) alpha/beta, the ubiquitin-like ISG15 protein is one of the most predominant. Influenza A virus employs a different strategy: its NS1 protein does not bind the ISG15 protein, but little or no ISG15 protein is produced during infection PUBMED:11157743.\ ' '2576' 'IPR005187' '\

    The influenza C virus genome consists of seven single-stranded RNA segments. The shortest RNA segment encodes a 286 amino acid non-structural protein NS1 PUBMED:10900030. This protein contains 6 conserved cysteines that may be functionally important, perhaps binding to a metal ion.

    \ ' '2577' 'IPR005188' '\

    The influenza C virus genome consists of seven single-stranded RNA segments. The shortest RNA segment encodes a 286 amino acid non-structural protein NS1 as well as the NS2 protein. The NS2 protein is only about 60 amino acids in length and of unknown function.

    \ ' '2578' 'IPR001561' '\

    Matrix protein (M1) of Influenza virus is a bifunctional membrane/RNA-binding protein that mediates the encapsidation of RNA-nucleoprotein cores into the membrane envelope. M1 must therefore bind both membrane and RNA simultaneously PUBMED:9164466. M1 is composed of two domains connected by a linker sequence. The N-terminal domain has a multi-helical structure that can be divided into two subdomains PUBMED:11162800. The C-terminal domain also contains alpha-helical structure.

    \ ' '2579' 'IPR002089' '\ This entry contains Influenza virus matrix protein 2. It is an integral membrane protein that is expressed on the infected cell surface and incorporated into virions where it is a minor component. The protein spans the viral membrane with an extracellular amino-terminus and a cytoplasmic carboxy-terminus. The transmembrane domain of the M2 protein forms the channel pore. The M2 protein, which forms a homotetramer, has H+ ion channel activity, which was found to be regulated by pH PUBMED:9360376 and may have a pivotal role in the biology of Influenza virus infection PUBMED:1374685.\ ' '2580' 'IPR002141' '\ Influenza virus nucleoprotein is a structural protein which encapsidates the negative strand viral RNA. NP is one of the main determinants of species specificity. The question of how far the NP gene can cross the species barrier by reassortment and become adapted by mutation to the new host has been discussed PUBMED:4024728.\ ' '2581' 'IPR000256' '\ NS1 is a homodimeric RNA-binding protein found in influenza virus that is required for viral replication. NS1 binds polyA tails of mRNA keeping them in the nucleus. NS1 inhibits pre-mRNA splicing by tightly binding to a specific stem-bulge of U6 snRNA PUBMED:9360601.\ ' '2582' 'IPR000968' '\ The Influenza A virus belongs to the class of ssRNA negative-strand viruses. Nonstructural protein 2 (NS2) may play a role in promoting normal replication of the genomic RNAs by preventing the replication of short-length RNA species PUBMED:8113739. NS1 and NS2 proteins are produced from the same gene by \ alternative splicing.\ ' '2583' 'IPR001009' '\

    RNA-directed RNA polymerase (RdRp) () is an essential protein encoded in the genomes of all RNA containing viruses with no DNA stage PUBMED:2759231, PUBMED:8709232. It catalyses synthesis of the RNA strand complementary to a given RNA template, but the precise molecular mechanism remains unclear.\ The postulated RNA replication process is a two-step mechanism. First, the initiation step of RNA synthesis begins at or near the 3\' end of the RNA template by means of a primer-independent (de novo) mechanism. The de novo initiation consists in the addition of a nucleotide tri-phosphate (NTP) to the 3\'-OH of the first initiating NTP. During the following so-called elongation phase, this nucleotidyl transfer reaction is repeated with subsequent NTPs to generate the complementary RNA product PUBMED:11531403.

    \

    All the RNA-directed RNA polymerases, and many DNA-directed polymerases, employ a fold whose organisation has been likened to the shape of a right hand with three subdomains termed fingers, palm and thumb PUBMED:9309225. Only the palm subdomain, composed of a four-stranded antiparallel beta-sheet with two alpha-helices, is well conserved among all of these enzymes. In RdRp, the palm subdomain comprises three well conserved motifs (A, B and C). Motif A (D-x(4,5)-D) and motif C (GDD) are spatially juxtaposed; the Asp residues of these motifs are implicated in the binding of Mg2+ and/or Mn2+. The Asn residue of motif B is involved in selection of ribonucleoside triphosphates over dNTPs and thus determines whether RNA is synthesised rather than DNA PUBMED:10827187.\ The domain organisation PUBMED:9878607 and the 3D structure of the catalytic centre of a wide range of RdRps, even those with a low overall sequence homology, are conserved. The catalytic centre is formed by several motifs containing a number of conserved amino acid residues.

    \

    There are 4 superfamilies of viruses that cover all RNA containing viruses with no DNA stage:

    \ The RNA-directed RNA polymerases in the first of the above superfamilies can be divided into the following three subgroups:\

    \ \

    The pattern describes the P2 subunit of Influenza RNA polymerase, an enzyme which is composed of three subunits: P1 (or PB1), P2 (or PA), and P3 (or PB2). The P2 subunit, in addition to the P1 subunit, is required for viral RNA synthesis in replication of the Influenza virus genome PUBMED:8709268.

    \ ' '2584' 'IPR001407' '\

    Influenza RNA-dependent RNA polymerase is composed of three subunits:\ P1 (or PB1), P2 (or PA), and P3 (or PB2).\ There are two separate domains in the influenza virus PB1 protein involved in the interaction with the PB2 and PA subunits PUBMED:9348094, PUBMED:8948635. PB1 has two GTP binding sites.

    \ ' '2585' 'IPR001591' '\ Orthomyxoviridae RNA polymerase with the subunit composition of PB1-PB2-PA is a unique multifunctional enzyme with the activities of both synthesis and cleavage of RNA, and is involved in both transcription and replication of the RNA genome. Transcription is initiated by using capped RNA fragments, which are generated after cleavage of host cell mRNA by the RNA polymerase-associated capped RNA endonuclease PUBMED:8806170. It would appear that two separate sequences, one N-(242-282) and the other C-terminal (538-577)\ proximal segments of PB2 subunit, constitute the RNA cap-binding site of the Influenza virus RNA polymerase PUBMED:10526235.\ ' '2586' 'IPR004304' '\ This family includes amidohydrolases of formamide and acetamide . The formamidase from Methylophilus methylotrophus (Bacterium W3A1) forms a homotrimer suggesting that this may be a common property of other members of this family.\ ' '2587' 'IPR000262' '\

    A number of oxidoreductases that act on alpha-hydroxy acids and which are FMN-containing flavoproteins have been shown PUBMED:2324094, PUBMED:2271624, PUBMED:1939137 to be structurally related. These enzymes are:

    \

    \ \

    The first step in the reaction mechanism of these enzymes is the abstraction of the proton from the alpha-carbon of the substrate producing a carbanion which can subsequently attach to the N5 atom of FMN. A conserved histidine has been shown PUBMED:2644287 to be involved in the removal of the proton. The region around this active site residue is highly conserved and contains an arginine residue which is involved in substrate binding.

    \ ' '2588' 'IPR000960' '\

    Flavin-containing monooxygenases (FMOs) constitute a family of xenobiotic-metabolising enzymes PUBMED:8311461. Using an NADPH cofactor and FAD prosthetic group, these microsomal proteins catalyse the oxygenation of nucleophilic nitrogen, sulphur, phosphorus and selenium atoms in a range of structurally diverse compounds. FMOs have been implicated in the metabolism of a number of pharmaceuticals, pesticides and toxicants. In man, lack of hepatic FMO-catalysed trimethylamine metabolism results in trimethylaminuria (fish odour syndrome). Five mammalian forms of FMO are now known and have been designated FMO1-FMO5 PUBMED:1712018, PUBMED:2318837,\ PUBMED:1542660, PUBMED:1417778, PUBMED:8486656. This is a recent nomenclature based on comparison of amino acid sequences, and has been introduced in an attempt to eliminate confusion inherent in multiple, laboratory-specific designations and tissue-based classifications PUBMED:8311461. Following the determination of the complete nucleotide sequence of Saccharomyces cerevisiae (Baker\'s yeast) PUBMED:8091229, a novel gene was found to encode a protein with similarity to mammalian monooxygenases.

    \ \ ' '2589' 'IPR000083' '\

    Fibronectin type I repeats are one of the three repeats found in the fibronectin protein.\ Fibronectin is a plasma protein that binds cell surfaces and various compounds\ including collagen, fibrin, heparin, DNA, and actin. Type I domain (FN1) is approximately\ 40 residues in length. Four conserved cysteines are involved in disulphide bonds. The 3D\ structure of the FN1 domain has been determined PUBMED:2112232, PUBMED:1602484, PUBMED:7582899. It consists of two antiparallel\ beta-sheets, first a double-stranded one, that is linked by a disulphide bond to a\ triple-stranded beta-sheet. The second conserved disulphide bridge links the C-terminal\ adjacent strands of the domain.

    \

    In human tissue plasminogen activator chain A the FN1 domain together with the\ following epidermal growth factor (EGF)-like domain are involved in\ fibrin-binding PUBMED:1900516. It has been suggested that these two modules form a single structural\ and functional unit PUBMED:7582899. The two domains keep their specific tertiary structure, but interact\ intimately to bury a hydrophobic core; the inter-module linker makes up the third strand of\ the EGF-module\'s major beta-sheet.

    \ ' '2590' 'IPR000562' '\ Fibronectin is a multi-domain glycoprotein, found in a soluble form in plasma, and in an insoluble form in loose\ connective tissue and basement membranes, that binds cell surfaces and various compounds including collagen,\ fibrin, heparin, DNA, and actin. Fibronectins are involved in a number of important functions e.g., wound\ healing; cell adhesion; blood coagulation; cell differentiation and migration; maintenance of the cellular\ cytoskeleton; and tumour metastasis PUBMED:3031656. The major part of the sequence of fibronectin consists of the\ repetition of three types of domains, which are called type I, II, and III PUBMED:3780752. Type II domain is\ approximately forty residues long, contains four conserved cysteines involved in disulphide bonds and is part of\ the collagen-binding region of fibronectin. In fibronectin the type II domain is duplicated. Type II domains have\ also been found in a range of proteins including blood coagulation factor XII; bovine seminal plasma proteins\ PDC-109 (BSP-A1/A2) and BSP-A3 PUBMED:3606570; cation-independent mannose-6-phosphate receptor PUBMED:1323236;\ mannose receptor of macrophages PUBMED:2373685; 180 kDa secretory phospholipase A2 receptor PUBMED:8294398; DEC-205\ receptor PUBMED:7753172; 72 kDa and 92 kDa type IV collagenase () PUBMED:2834383; and hepatocyte\ growth factor activator PUBMED:7683665.\ ' '2591' 'IPR003961' '\

    Fibronectins are multi-domain glycoproteins found in a soluble form in plasma, and in an insoluble form in loose connective tissue and basement membranes PUBMED:3780752. They contain multiple copies of 3 repeat regions (types I, II and III), which bind to a variety of substances including heparin, collagen, DNA, actin, fibrin and fibronectin receptors on cell surfaces. The wide variety of these substances means that fibronectins are involved in a number of important functions: e.g., wound healing; cell adhesion; blood coagulation; cell differentiation and migration; maintenance of the cellular cytoskeleton; and tumour metastasis PUBMED:3031656. The role of fibronectin in cell differentiation is demonstrated by the marked reduction in the expression of its gene when neoplastic transformation occurs. Cell attachment has been found to be mediated by the binding of the tetrapeptide RGDS to integrins on the cell surface PUBMED:2466295, although related sequences can also display cell adhesion activity.

    \

    Plasma fibronectin occurs as a dimer of 2 different subunits, linked together by 2 disulphide bonds near the C-terminus. The difference in the 2 chains occurs in the type III repeat region and is caused by alternative splicing of the mRNA from one gene PUBMED:3780752. The observation that, in a given protein, an individual repeat of one of the 3 types (e.g., the first FnIII repeat) shows much less similarity to its subsequent tandem repeats within that protein than to its equivalent repeat between fibronectins from other species, has suggested that the repeating structure of fibronectin arose at an early stage of evolution. It also seems to suggest that the structure is subject to high selective pressure PUBMED:6317187.

    \

    The fibronectin type III repeat region is an approximately 100 amino acid domain, different tandem repeats of which contain binding sites for DNA, heparin and the cell surface PUBMED:3780752. The superfamily of sequences believed to contain FnIII repeats represents 45 different families, the majority of which are involved in cell surface binding in some manner, or are receptor protein tyrosine kinases, or cytokine receptors.

    \ ' '2592' 'IPR004237' '\ The ability of bacteria to bind fibronectin is thought to enable the colonisation of wound tissue and blood clots. The fibronectin-binding protein is directly involved in the fibronectin-mediated adherence of the bacteria to epithelial cells PUBMED:1386839. The fibronectin binding repeat is found in bacterial fibronectin binding proteins and serum opacity factor.\ ' '2593' 'IPR004956' '\

    Foamy virus (FV) gene expression is strictly dependent on the viral transactivator protein, called Bel1/Tas. The presence of a functionally active, internal promoter, besides the conventional LTR promoters, is unique to FVs. The nuclear Bel1/Tas protein of primate prototype FV binds DNA target sites directly and consists of at least two functional domains, an N-terminal/central DNA-binding domain and a C-terminal activation domain PUBMED:14972532.

    \ ' '2594' 'IPR005070' '\ Expression of the envelope (Env) glycoprotein is essential for viral particle egress. This feature is unique to the Spumavirinae, a subfamily of the Retroviridae. \ ' '2595' 'IPR005189' '\

    Focal adhesion kinase (FAK) is a tyrosine kinase found in focal adhesions, intracellular signalling complexes that are formed following engagement of the extracellular matrix by integrins. The C-terminal "focal adhesion targeting" (FAT) region is necessary and sufficient for localizing FAK to focal adhesions. The crystal structure of FAT shows it forms a four-helix bundle that resembles those found in two other proteins involved in cell adhesion, alpha-catenin and vinculin PUBMED:11799401. The binding of FAT to the focal adhesion protein, paxillin, requires the integrity of the helical bundle, whereas binding to another focal adhesion protein, talin, does not.

    \ ' '2596' 'IPR004233' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the type IIS restriction endonuclease FokI (), which is a member of an unusual class of bipartite restriction enzymes that recognise a specific DNA sequence and cleave DNA nonspecifically a short distance away from that sequence PUBMED:9724744. FokI contains amino- and carboxy-terminal domains corresponding to the DNA-recognition () and cleavage functions, respectively.

    \

    The catalytic domain contains only a single catalytic centre, raising the question of how monomeric FokI manages to cleave both DNA strands. The catalytic domain is sequestered in a \'piggyback\' fashion by the recognition domain PUBMED:9214510.

    \ ' '2597' 'IPR004234' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the type IIS restriction endonuclease FokI (), which is a member of an unusual class of bipartite restriction enzymes that recognise a specific DNA sequence and cleave DNA nonspecifically a short distance away from that sequence PUBMED:9724744. FokI contains amino- and carboxy-terminal domains corresponding to the DNA-recognition and cleavage functions (), respectively.

    \

    The recognition domain is made of three smaller subdomains (D1, D2 and D3) which are evolutionarily related to the helix-turn-helix-containing DNA-binding domain of the catabolite gene activator protein CAP PUBMED:9214510.

    \ ' '2598' 'IPR002666' '\

    The reduced folate carrier (a transmembrane glycoprotein) transports reduced folate into mammalian cells via the carrier-mediated mechanism (as opposed to the receptor-mediated mechanism). It also transports cytotoxic folate analogues used in chemotherapy PUBMED:9161403, such as methotrexate (MTX). Mammalian cells have an absolute requirement for exogenous folates, which are needed for growth and for the biosynthesis of macromolecules PUBMED:9161403.

    \ ' '2599' 'IPR018143' '\ This family includes the folate receptor which binds to folate and reduced folic acid derivatives and mediates delivery of\ 5-methyltetrahydrofolate to the interior of cells. These proteins are attached to the membrane by a GPI-anchor. A riboflavin-binding protein required for the transport of riboflavin to the developing oocyte in chicken also belongs to this family.\ ' '2600' 'IPR006157' '\

    Dihydroneopterin aldolase catalyses the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate. In the opportunistic pathogen Pneumocystis carinii, dihydroneopterin aldolase function is expressed as the N-terminal portion of the multifunctional folic acid synthesis protein (Fas). This region encompasses two domains, FasA and FasB, which share 27% amino acid identity. FasA and FasB also share significant amino acid sequence similarity with bacterial dihydroneopterin aldolases.

    \

    This region consists of two tandem sequences, each homologous to folB, which form tetramers PUBMED:9709001.

    \ ' '2601' 'IPR001766' '\ The fork head protein of Drosophila melanogaster, a transcription factor that promotes terminal rather than segmental development, contains neither homeodomains nor zinc-fingers characteristic of other transcription factors PUBMED:2566386. Instead, it contains a distinct type of DNA-binding region, containing around 100 amino acids, which has since been identified in a number of transcription factors (including D. melanogaster FD1-5, mammalian HNF-3, human HTLF, Saccharomyces cerevisiae HCM1, etc.). This is referred to as the fork head domain but is also known as a \'winged helix\' PUBMED:2566386, PUBMED:8332212, PUBMED:1356269.\ The fork head domain binds B-DNA as a monomer PUBMED:8332212, but shows no similarity to previously identified DNA-binding motifs. Although the domain is found in several different transcription factors, a common function is their involvement in early developmental decisions of cell fates during embryogenesis PUBMED:1356269.\ ' '2602' 'IPR000292' '\

    A number of bacterial and archaebacterial proteins involved in transporting\ formate or nitrite have been shown PUBMED:8022272 to be related:\

    \

    These transporters are proteins of about 280 residues and seem to contain six\ transmembrane regions.

    \ ' '2603' 'IPR013802' '\

    The formiminotransferase (FT) domain of formiminotransferase-cyclodeaminase (FTCD) forms a homodimer, with each protomer being comprised of two subdomains. The formiminotransferase domain has an N-terminal subdomain that is made up of a six-stranded mixed beta-pleated sheet and five alpha helices, which are arranged on the external surface of the beta sheet. This, in turn, faces the beta-sheet of the C-terminal subdomain to form a double beta-sheet layer. The two subdomains are separated by a short linker sequence, which is not thought to be any more flexible than the remainder of the molecule. The substrate is predicted to form a number of contacts with residues found in both the N-terminal and C-terminal subdomains PUBMED:10673422. In humans, deficiency of this enzyme results in a disease phenotype PUBMED:12815595.

    \

    This entry represents the C-terminal subdomain of the formiminotransferase domain.

    \ ' '2604' 'IPR005793' '\

    Methionyl-tRNA formyltransferase () transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. The formyl group appears to play a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EFTU-GTP. This family also includes formyltetrahydrofolate dehydrogenases, which produce formate from formyl-tetrahydrofolate. These enzymes contain an N-terminal domain in common with other formyl transferase enzymes (). The C-terminal domain has an open beta-barrel fold PUBMED:8887566.

    \ ' '2605' 'IPR002376' '\ A number of formyl transferases belong to this group.\ Methionyl-tRNA formyltransferase transfers a formyl group onto\ the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. The formyl group appears to play a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EFTU-GTP.\ Formyltetrahydrofolate dehydrogenase produces formate from formyl-\ tetrahydrofolate. This is the N-terminal domain of these enzymes and is found upstream of the C-terminal domain ().\ \

    The trifunctional glycinamide ribonucleotide synthetase-aminoimidazole ribonucleotide synthetase-glycinamide ribonucleotide transformylase catalyses the second, third and fifth steps in de novo purine biosynthesis. The glycinamide ribonucleotide transformylase belongs to this group.

    \ ' '2607' 'IPR002908' '\

    The eukaryotic proteins in this entry include frataxin, the protein that is mutated in Friedreich\'s ataxia PUBMED:8931268, and related sequences. Friedreich\'s ataxia is a progressive neurodegenerative disorder caused by loss of function mutations in the gene encoding frataxin (FRDA). Frataxin mRNA is predominantly expressed in tissues with a high metabolic rate (including liver, kidney, brown fat and heart). Mouse and yeast frataxin homologues contain a potential N-terminal mitochondrial targeting sequence, and human frataxin has been observed to co-localise with a mitochondrial protein. Furthermore, disruption of the yeast gene has been shown to result in mitochondrial dysfunction. Friedreich\'s ataxia is thus believed to be a mitochondrial disease caused by a mutation in the nuclear genome (specifically, expansion of an intronic GAA triplet repeat) PUBMED:8596916, PUBMED:8815938, PUBMED:9241270.

    \ \

    The bacterial proteins in this entry are iron-sulphur cluster (FeS) metabolism CyaY proteins homologous to eukaryotic frataxin. Partial Phylogenetic Profiling PUBMED:16930487 suggests that CyaY most likely functions as part of the ISC system for FeS cluster biosynthesis, and this is supported by experimental data in some species PUBMED:16603772, PUBMED:16428423.

    \ ' '2608' 'IPR004885' '\

    This is a group of proteins of unknown function from bacteriophages.

    \ ' '2609' 'IPR007516' '\

    Coenzyme F420 hydrogenase () reduces the low-potential two-electron acceptor coenzyme F420. This entry contains the N termini of F420 hydrogenase and dehydrogenase beta subunits PUBMED:2207102, PUBMED:10751389. The N terminus of Methanobacterium formicicum formate dehydrogenase beta chain (, ) is also represented in this entry PUBMED:3531194. This region is often found in association with the 4Fe-4S binding domain, fer4 (), and the C terminus .

    \ ' '2610' 'IPR016477' '\

    Ketosamines derive from a non-enzymatic reaction between a sugar and a protein PUBMED:3319287. Ketosamine-3-kinases (KT3K), of which fructosamine-3-kinase (FN3K) is the best-known example, catalyse the phosphorylation of the ketosamine moiety of glycated proteins. The instability of a phosphorylated ketosamine leads to its degradation, and KT3K is thus thought to be involved in protein repair PUBMED:14633848.

    The function of the prokaryotic members of this group has not been established. However, several lines of evidence indicate that they may function as fructosamine-3-kinases (FN3K). First, they are similar to characterised FN3K from mouse and human. Second, the Escherichia coli members are found in close proximity on the genome to fructose-6-phosphate kinase (PfkB). Last, FN3K activity has been found in Anacystis montana (Gloeocapsa montana Kutzing 1843) PUBMED:214181, indicating that such activity, directly demonstrated in eukaryotes, is nonetheless not confined to eukaryotes.

    \

    This family includes eukaryotic fructosamine-3-kinase enzymes PUBMED:11016445 which may initiate a process leading to the deglycation of fructoselysine and of glycated proteins and in the phosphorylation of 1-deoxy-1-morpholinofructose, fructoselysine, fructoseglycine, fructose and\ glycated lysozyme. The family also includes bacterial members that have not been characterised but probably have a similar or identical function.

    \ \

    For additional information please see PUBMED:11016445.

    \ ' '2611' 'IPR007044' '\

    Enzymes containing the cyclodeaminase domain function in channeling one-carbon units to the folate pool. In most cases, this domain catalyses the cyclisation of formimidoyltetrahydrofolate to methenyltetrahydrofolate as shown in reaction (1). In the methylotrophic bacterium Methylobacterium extorquens, however, it catalyses the interconversion of formyltetrahydrofolate and methylenetetrahydrofolate PUBMED:10215859, as shown in reaction (2).

    \ \ \ \ \

    In prokaryotes, this domain mostly occurs on its own, while in eukaryotes it is fused to a glutamate formiminotransferase domain (which catalyses the previous step in the pathway) to form the bifunctional enzyme formiminotransferase-cyclodeaminase PUBMED:7654689. The eukaryotic enzyme is a circular tetramer of homodimers PUBMED:7410436, while the prokaryotic enzyme is a dimer PUBMED:10215859, PUBMED:15651027.

    \ \

    The crystal structure of the cyclodeaminase enzyme () from Thermotoga maritima has been studied PUBMED:15651027. It is a homodimer, where each monomer is composed of six alpha helices arranged in an up and down helical bundle, forming a novel fold. The location of the active site is not known, but sequence alignments revealed two clusters of conserved residues located in a deep pocket within the dimer interface. This pocket was large enough to accommodate the reaction product and it was postulated that this is the active site.

    \ ' '2612' 'IPR000559' '\

    Formate--tetrahydrofolate ligase () (formyltetrahydrofolate synthetase) (FTHFS) is one of the enzymes\ participating in the transfer of one-carbon units, an essential element of various biosynthetic pathways. In many of these processes the transfers of one-carbon units are mediated by the coenzyme tetrahydrofolate (THF). In eukaryotes the FTHFS activity is expressed by a multifunctional enzyme, C-1-tetrahydrofolate synthase (C1-THF synthase), which also catalyses the dehydrogenase and cyclohydrolase activities. Two forms of C1-THF synthase are known PUBMED:2836393: one is located in the mitochondrial matrix, while the second one is cytoplasmic. In both forms the FTHFS domain\ consists of about 600 amino acid residues and is located in the C-terminal section of C1-THF synthase. In prokaryotes FTHFS activity is expressed by a monofunctional homotetrameric enzyme of about 560 amino acid residues PUBMED:2200509.

    \ \

    The crystal structure of N(10)-formyltetrahydrofolate synthetase from Moorella thermoacetica shows that the subunit is composed of three domains organised around three mixed beta-sheets. There are two cavities between adjacent domains. One of them was identified as the nucleotide binding site by homology modelling. The large domain contains a seven-stranded beta-sheet surrounded by helices on both sides. The second domain contains a five-stranded beta-sheet with two alpha-helices packed on one side while the other two are a wall of the active site cavity. The third domain contains a four-stranded beta-sheet forming a half-barrel. The concave side is covered by two helices while the convex side is another wall of the large cavity. Arg 97 is likely involved in formyl phosphate binding. The tetrameric molecule is relatively flat with the shape of the letter X, and the active sites are located at the end of the subunits far from the subunit interface PUBMED:10747779.

    \ ' '2613' 'IPR004923' '\ The Saccharomyces cerevisiae (Baker\'s yeast) iron permease FTR1 is a plasma membrane permease for high-affinity iron uptake. Also included in this family are bacterial hypothetical integral membrane proteins.\ ' '2614' 'IPR003494' '\ FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. It has been suggested that the interaction\ of FtsA-FtsZ has arisen through coevolution in different bacterial strains PUBMED:9352931.\ ' '2615' 'IPR002543' '\

    The FtsK/SpoIIIE domain is found extensively in a wide variety of proteins \ from prokaryotes and plasmids PUBMED:7592387, some of which contain up to three copies. The domain contains a putative ATP binding P-loop motif. A mutation in FtsK causes a temperature sensitive block in cell\ division and it is involved in peptidoglycan synthesis or modification PUBMED:7592387. The SpoIIIE protein is implicated in intercellular chromosomal DNA transfer PUBMED:7592387.

    \ ' '2616' 'IPR007082' '\ In Escherichia coli, nine gene products are known to be essential for assembly of the division septum. One of these, FtsL, is a bitopic membrane protein whose precise function is not understood. It has been proposed that FtsL interacts with the DivIC protein PUBMED:10844672; however, this interaction may be indirect PUBMED:11994149.\ ' '2617' 'IPR005548' '\

    FtsQ is one of several cell division proteins. FtsQ interacts with other Fts proteins, reviewed in PUBMED:9864306. The precise function of FtsQ is unknown.

    \ ' '2618' 'IPR001182' '\ A number of prokaryotic integral membrane proteins involved in cell cycle processes\ have been found to be structurally related PUBMED:2509435, PUBMED:2113157. These proteins include the\ Escherichia coli and related bacteria cell division protein ftsW and the rod\ shape-determining protein rodA (or mrdB), the Bacillus subtilis stage V sporulation\ protein E (spoVE), the B. subtilis hypothetical proteins ywcF and ylaO and the\ Cyanophora paradoxa cyanelle ftsW homolog.\ ' '2619' 'IPR005567' '\

    This region contains the important motif (LXXLL) necessary for the interaction of FTZ with the nuclear receptor FTZ-F1. FTZ is thought to represent a category of LXXLL motif-dependent co-activators for nuclear receptors.

    \ ' '2620' 'IPR015888' '\

    L-fucose isomerase () converts the aldose L-fucose into the corresponding ketose L-fuculose during the first step in fucose\ metabolism using Mn2+ as a cofactor. The enzyme is a hexamer, forming the largest structurally known ketol isomerase, and has no sequence or structural similarity with other ketol isomerases. The structure was determined by X-ray crystallography at 2.5 A resolution PUBMED:9367760.

    \

    This entry represents L-fucose isomerase enzymes as well as several hypothetical proteins.

    \ ' '2622' 'IPR003510' '\ Fumarate reductase is a membrane-bound flavoenzyme consisting of four subunits, A-D. A and B comprise the membrane-extrinsic catalytic domain and C and D link the catalytic centres to the electron-transport chain. This family consists of the 15kDa hydrophobic subunit C.\ ' '2623' 'IPR003418' '\ Fumarate reductase is a membrane-bound flavoenzyme consisting of four subunits, A-D. A and B comprise the membrane-extrinsic catalytic domain and C and D link the catalytic centres to the electron-transport chain. This family consists of the 13kDa hydrophobic subunit D. This component may be required to anchor the catalytic components of the fumarate reductase complex to the cytoplasmic membrane.\ ' '2624' 'IPR011602' '\

    Fumble is required for cell division in Drosophila. Mutants lacking fumble exhibit abnormalities in bipolar spindle organisation, chromosome segregation, and contractile ring formation. Analyses have demonstrated that it encodes three protein isoforms, all of which contain a domain with high similarity to the pantothenate kinases of Emericella nidulans and mouse PUBMED:11238410. A role of fumble in membrane synthesis has been proposed PUBMED:11238410.

    \ ' '2625' 'IPR007014' '\

    This is a family of short proteins found in eukaryotes and some archaea. Although the function of these proteins is not known they may contain transmembrane helices.

    \ ' '2626' 'IPR002481' '\

    The Ferric uptake regulator (FUR) family includes metal ion uptake regulator proteins. These are responsible for controlling the intracellular concentration of iron in many bacteria. Although iron is essential for most organisms, high concentrations can be toxic because of the formation of hydroxyl radicals PUBMED:12581348. FURs can also control zinc homeostasis PUBMED:18452427 and are the subject of research on the pathogenesis of mycobacteria.

    \ ' '2627' 'IPR006211' '\ The furin-like cysteine rich region has been found in a variety of proteins from eukaryotes that are involved in the mechanism of signal transduction by receptor tyrosine kinases, which involves receptor aggregation PUBMED:1936959.\ ' '2629' 'IPR006726' '\ This domain includes a conserved region found in two proteins associated with fusaric acid resistance, from Burkholderia cepacia PUBMED:1370369 and from Klebsiella oxytoca. The function of this region is unknown.\ ' '2630' 'IPR000776' '\

    The fusion glycoproteins from this family are found in ssRNA negative-strand viruses.\ This protein directs fusion of viral and cellular membranes, resulting in viral penetration,\ and can direct fusion of infected cells with adjoining cells, resulting in the formation of\ syncytia. The mature form is a dimer of polypeptides F1 and F2 linked by a disulphide\ bond PUBMED:3776349.

    \ ' '2631' 'IPR002567' '\ Human herpesvirus 1 (HHV-1) (Human herpes simplex virus 1) glycoprotein K (gK) plays an essential role in viral replication and cell fusion. gK is a very hydrophobic membrane protein that contains a signal sequence and several hydrophobic regions. gK contains three transmembrane domains (amino acids 125-139, 226-239, and 311-325) and another hydrophobic domain (amino acids 241-265), which is relatively less hydrophobic and much longer compared with the transmembrane sequences located in the extracellular loop. The domains may interact with each other to form a complex tertiary structure that is critical for the biological function of gK PUBMED:9407122.\ ' '2632' 'IPR003814' '\

    Formylmethanofuran dehydrogenase () is found in methanogenic and sulphate-reducing archaea. The enzyme contains molybdenum or tungsten, a molybdopterin guanine dinucleotide cofactor (MGD) and iron-sulphur clusters PUBMED:8125106. It catalyses the reversible reduction of CO2 and methanofuran via N-carboxymethanofuran (carbamate) to N-formylmethanofuran, the first and second steps in methanogenesis from CO2 PUBMED:8575452, PUBMED:9342247. This reaction is important for the reduction of CO2 to methane, in autotrophic CO2 fixation, and in CO2 formation from reduced C1 units PUBMED:8954165. The synthesis of formylmethanofuran is crucial for the energy metabolism of archaea. Methanogenic archaea derive the energy for autotrophic growth from the reduction of CO2 with molecular hydrogen as the electron donor PUBMED:12492476. The process of methanogenesis consists of a series of reduction reactions in which the one-carbon unit derived from CO2 is bound to C1 carriers.

    \

    There are two isoenzymes of formylmethanofuran dehydrogenase: a tungsten-containing isoenzyme (Fwd) and a molybdenum-containing isoenzyme (Fmd). The tungsten isoenzyme is constitutively transcribed, whereas transcription of the molybdenum operon is induced by molybdate PUBMED:9818358. The archaeon Methanobacterium thermoautotrophicum contains a 4-subunit (FwdA, FwdB, FwdC, FwdD) tungsten formylmethanofuran dehydrogenase and a 3-subunit (FmdA, FmdB, FmdC) molybdenum formylmethanofuran dehydrogenase PUBMED:8954165.

    \ \

    This entry represents subunit E of formylmethanofuran dehydrogenase enzymes. The enzyme from Methanosarcina barkeri is a molybdenum iron-sulphur protein involved in methanogenesis. Subunit E protein is co-expressed with the enzyme but fails to co-purify and thus its function is unknown PUBMED:8617280.

    \ ' '2633' 'IPR007313' '\ This is a bacterial family of cytoplasmic membrane proteins. It includes two transmembrane regions. The molecular function of FxsA is unknown, but in Escherichia coli its overexpression has been shown to alleviate the exclusion of phage T7 in those cells with an F plasmid.\ ' '2634' 'IPR006884' '\ This conserved C-terminal region is found in a number of putative transmembrane GTPases. The Fzo protein is a mediator of mitochondrial fusion PUBMED:9230308. This conserved region is also found in the human mitofusin protein PUBMED:11181170.\ ' '2635' 'IPR001748' '\ A Xenopus protein known as G10 PUBMED:2568313 has been found to be highly conserved in a wide range of eukaryotic species. The function of G10 is still unknown. G10 is a protein of about 17 to 18 kDa (143 to 157 residues) which is hydrophilic and whose C-terminal half is rich in cysteines and could be involved in metal-binding.\ ' '2636' 'IPR001019' '\

    Guanine nucleotide binding proteins (G proteins) are membrane-associated, heterotrimeric proteins composed of three subunits: alpha (), beta () and gamma () PUBMED:14762218. G proteins and their receptors (GPCRs) form one of the most prevalent signalling systems in mammalian cells, regulating systems as diverse as sensory perception, cell growth and hormonal regulation PUBMED:15294442. At the cell surface, the binding of ligands such as hormones and neurotransmitters to a GPCR activates the receptor by causing a conformational change, which in turn activates the bound G protein on the intracellular-side of the membrane. The activated receptor promotes the exchange of bound GDP for GTP on the G protein alpha subunit. GTP binding changes the conformation of switch regions within the alpha subunit, which allows the bound trimeric G protein (inactive) to be released from the receptor, and to dissociate into active alpha subunit (GTP-bound) and beta/gamma dimer. The alpha subunit and the beta/gamma dimer go on to activate distinct downstream effectors, such as adenylyl cyclase, phosphodiesterases, phospholipase C, and ion channels. These effectors in turn regulate the intracellular concentrations of secondary messengers, such as cAMP, diacylglycerol, sodium or calcium cations, which ultimately lead to a physiological response, usually via the downstream regulation of gene transcription. The cycle is completed by the hydrolysis of alpha subunit-bound GTP to GDP, resulting in the re-association of the alpha and beta/gamma subunits and their binding to the receptor, which terminates the signal PUBMED:15119945. The length of the G protein signal is controlled by the duration of the GTP-bound alpha subunit, which can be regulated by RGS (regulator of G protein signalling) proteins () or by covalent modifications PUBMED:11313912.

    \

    There are several isoforms of each subunit, many of which have splice variants, which together can make up hundreds of combinations of G proteins. The specific combination of subunits in heterotrimeric G proteins affects not only which receptor it can bind to, but also which downstream target is affected, providing the means to target specific physiological processes in response to specific external stimuli PUBMED:9278091, PUBMED:11882385. G proteins carry lipid modifications on one or more of their subunits to target them to the plasma membrane and to contribute to protein interactions.

    \ \

    This family consists of the G protein alpha subunit, which acts as a weak GTPase. G protein classes are defined based on the sequence and function of their alpha subunits, which in mammals fall into four main categories: G(S)alpha, G(Q)alpha, G(I)alpha and G(12)alpha; there are also fungal and plant classes of alpha subunits. The alpha subunit consists of two domains: a GTP-binding domain and a helical insertion domain (). The GTP-binding domain is homologous to Ras-like small GTPases, and includes switch regions I and II, which change conformation during activation. The switch regions are loops of alpha-helices with conformations sensitive to guanine nucleotides. The helical insertion domain is inserted into the GTP-binding domain before switch region I and is unique to heterotrimeric G proteins. This helical insertion domain functions to sequester the guanine nucleotide at the interface with the GTP-binding domain and must be displaced to enable nucleotide dissociation.

    \ ' '2637' 'IPR015898' '\

    This entry represents the G protein gamma subunit and the GGL (G protein gamma-like) domain, which are related in sequence and are comprised of an extended alpha-helical polypeptide. The G protein gamma subunit forms a stable dimer with the beta subunit, but it does not make any contact with the alpha subunit, which contacts the opposite face of the beta subunit. The GGL domain is found in several RGS proteins. GGL domains can interact with beta subunits to form novel dimers that prevent gamma subunit binding, and may prevent heterotrimer formation by inhibiting alpha subunit binding. The interaction between G protein beta-5 neuro-specific isoforms and RGS GGL domains may represent a general mode of binding between beta-propeller proteins and their partners PUBMED:11331068.

    \ ' '2638' 'IPR006701' '\ Initiation of packaging of double-stranded viral DNA involves the specific interaction of the prohead with viral DNA in a process mediated by a phage-encoded terminase protein. The terminase enzymes are usually hetero-oligomers composed of a small and a large subunit. This region is found on the large subunit and possesses an endonuclease and ATPase activity that requires Mg2+ and a neutral or slightly basic reaction. This region is also found in bacterial sequences PUBMED:10930407, PUBMED:1548711.\ ' '2639' 'IPR006699' '\

    Glycerol enters bacterial cells via facilitated diffusion, an\ energy-independent transport process catalysed by the glycerol transport\ facilitator GlpF, an integral membrane\ protein of the aquaporin family. Intracellular\ glycerol is usually converted to glycerol-3-P in an ATP-requiring\ phosphorylation reaction catalysed by glycerol kinase (GlpK).\ Glycerol-3-P, the inducer of the glpFK operon, is not a substrate for GlpF\ and hence remains entrapped in the cell where it is metabolized further. In\ some bacterial species, for example Bacillus firmus, glycerol-3-P activates the antiterminator GlpP PUBMED:1809833. In B. subtilis, glpF and glpK are organised in an operon followed by the\ glycerol-3-P dehydrogenase-encoding glpD gene and preceded by glpP\ coding for an antiterminator regulating the expression of glpFK, glpD and\ glpTQ. Their induction\ requires the inducer glycerol-3-P, which activates the antiterminator GlpP\ by allowing it to bind to the leader region\ of glpD and presumably also of glpFK and glpTQ\ mRNAs.

    \ ' '2640' 'IPR001282' '\

    Glucose-6-phosphate dehydrogenase () (G6PDH) is a ubiquitous protein, present\ in bacteria and all eukaryotic cell types PUBMED:2838391. The enzyme catalyses\ the first step in the pentose pathway, i.e. the conversion of glucose-6-phosphate to \ gluconolactone 6-phosphate in the presence of NADP, producing NADPH. The ubiquitous \ expression of the enzyme gives it a major role in the production of NADPH for the many \ NADPH-mediated reductive processes in all cells PUBMED:3393536. Deficiency of G6PDH is \ a common genetic abnormality affecting millions of people worldwide. Many sequence variants, most caused by single point mutations, are known, exhibiting a wide variety of \ phenotypes PUBMED:3393536.

    \ ' '2641' 'IPR001282' '\

    Glucose-6-phosphate dehydrogenase () (G6PDH) is a ubiquitous protein, present\ in bacteria and all eukaryotic cell types PUBMED:2838391. The enzyme catalyses\ the first step in the pentose pathway, i.e. the conversion of glucose-6-phosphate to \ gluconolactone 6-phosphate in the presence of NADP, producing NADPH. The ubiquitous \ expression of the enzyme gives it a major role in the production of NADPH for the many \ NADPH-mediated reductive processes in all cells PUBMED:3393536. Deficiency of G6PDH is \ a common genetic abnormality affecting millions of people worldwide. Many sequence variants, most caused by single point mutations, are known, exhibiting a wide variety of \ phenotypes PUBMED:3393536.

    \ ' '2642' 'IPR002988' '\

    Protein G-related albumin-binding (GA) modules occur on the surface of numerous Gram-positive bacterial pathogens. Protein G of group C and G Streptococci interacts with the constant region of IgG and with human serum albumin. The GA module is composed of a left-handed three-helix bundle and is found in a range of bacterial cell surface proteins PUBMED:15269208, PUBMED:9086265. GA modules may promote bacterial growth and virulence in mammalian hosts by scavenging albumin-bound nutrients and camouflaging the bacteria. Variations in sequence give rise to differences in structure and function between GA modules in different proteins, which could alter pathogenesis and host specificity due to their varied affinities for different species of albumin PUBMED:16906768. Proteins containing a GA module include PAB from Peptostreptococcus magnus PUBMED:7589548.

    \ ' '2643' 'IPR004115' '\

    This domain is found in some members of the GatB and aspartyl tRNA synthetases, which are both involved in protein biosynthesis.

    \ ' '2644' 'IPR003322' '\

    Retroviral matrix proteins (or major core proteins) are components of envelope-associated capsids, which line the inner surface of virus envelopes and are associated with viral membranes PUBMED:9657938. Matrix proteins are produced as part of Gag precursor polyproteins. During viral maturation, the Gag polyprotein is cleaved into major structural proteins by the viral protease, yielding the matrix (MA), capsid (CA), nucleocapsid (NC), and some smaller peptides. Gag-derived proteins govern the entire assembly and release of the virus particles, with matrix proteins playing key roles in Gag stability, capsid assembly, transport and budding. Although matrix proteins from different retroviruses appear to perform similar functions and can have similar structural folds, their primary sequences can be very different.

    \

    This entry represents matrix proteins from beta-retroviruses such as Mason-Pfizer monkey virus (MPMV) (Simian Mason-Pfizer virus) and Mouse mammary tumor virus (MMTV) PUBMED:15113883, PUBMED:9499052. This entry also identifies matrix proteins from several eukaryotic endogenous retroviruses, which arise when one or more copies of the retroviral genome becomes integrated into the host genome PUBMED:12876457.

    \ ' '2645' 'IPR000071' '\

    Retroviral matrix proteins (or major core proteins) are components of envelope-associated capsids, which line the inner surface of virus envelopes and are associated with viral membranes PUBMED:9657938. Matrix proteins are produced as part of Gag precursor polyproteins. During viral maturation, the Gag polyprotein is cleaved into major structural proteins by the viral protease, yielding the matrix (MA), capsid (CA), nucleocapsid (NC), and some smaller peptides. Gag-derived proteins govern the entire assembly and release of the virus particles, with matrix proteins playing key roles in Gag stability, capsid assembly, transport and budding. Although matrix proteins from different retroviruses appear to perform similar functions and can have similar structural folds, their primary sequences can be very different.

    \

    This entry represents matrix proteins from immunodeficiency lentiviruses, such as Human immunodeficiency virus (HIV) and Simian immunodeficiency virus (SIV-cpz) PUBMED:12465460. The structure of the HIV protein consists of 5 alpha helices, a short 3.10 helix and a 3-stranded mixed beta-sheet PUBMED:7966331.

    \ \ ' '2646' 'IPR003139' '\

    Retroviral matrix proteins (or major core proteins) are components of envelope-associated capsids, which line the inner surface of virus envelopes and are associated with viral membranes PUBMED:9657938. Matrix proteins are produced as part of Gag precursor polyproteins. During viral maturation, the Gag polyprotein is cleaved into major structural proteins by the viral protease, yielding the matrix (MA), capsid (CA), nucleocapsid (NC), and some smaller peptides. Gag-derived proteins govern the entire assembly and release of the virus particles, with matrix proteins playing key roles in Gag stability, capsid assembly, transport and budding. Although matrix proteins from different retroviruses appear to perform similar functions and can have similar structural folds, their primary sequences can be very different.

    \

    This entry represents matrix proteins from delta-retroviruses such as Human T-lymphotropic virus 1 and Human T-cell leukemia virus 2 (HTLV-2), both members of the human oncovirus subclass of retroviruses PUBMED:11752179, PUBMED:9000634.

    \ ' '2647' 'IPR000721' '\ The Gag protein from retroviruses, also known as p24, forms the inner protein layer of the\ nucleocapsid. This protein performs highly complex orchestrated tasks during the assembly,\ budding, maturation and infection stages of the viral replication cycle. During viral assembly,\ the proteins form membrane associations and self-associations that ultimately result in\ budding of an immature virion from the infected cell. Gag precursors also function during\ viral assembly to selectively bind and package two plus strands of genomic RNA. ELISA testing\ for p24 is the most commonly used method to demonstrate virus replication both in vivo and in\ vitro.\ ' '2648' 'IPR003036' '\ P30 is essential for viral assembly PUBMED:2414902.\ Cleavage of P70 in vitro can be accompanied by a shift from a concentrically coiled internal strand ("immature") to a collapsed ("mature") form of the virus core PUBMED:410020.\ ' '2649' 'IPR004957' '\ The Spumavirus gag protein is a core viral polyprotein that undergoes specific enzymatic cleavages in vivo to yield the mature protein.\ ' '2650' 'IPR001079' '\

    Galectins (also known as galaptins or S-lectin) are a family of proteins defined by having at least one characteristic carbohydrate recognition domain (CRD) with an affinity for beta-galactosides and sharing certain sequence elements. Members of the galectin family are found in mammals, birds, amphibians, fish, nematodes, sponges, and some fungi. Galectins are known to carry out intra- and extracellular functions through glycoconjugate-mediated recognition. From the cytosol they may be secreted by non-classical pathways, but they may also be targeted to the nucleus or specific sub-cytosolic sites. Within the same peptide chain some galectins have a CRD with only a few additional amino acids, whereas others have two CRDs joined by a link peptide, and one (galectin-3) has one CRD joined to a different type of domain PUBMED:16051274, PUBMED:14758066.

    \ \

    The galectin carbohydrate recognition domain (CRD) is a beta-sandwich of about 135 amino acids. The two sheets are slightly bent, with 6 strands forming the concave side and 5 strands forming the convex side. The concave side forms a groove in which carbohydrate is bound, and which is long enough to hold approximately one linear tetrasaccharide PUBMED:8262940, PUBMED:8747464.

    \ ' '2651' 'IPR005600' '\

    The DNA binding domain (residues 1 to 147) of the yeast transcriptional activator GAL4 exists in\ solution in dimeric form, with the region responsible for dimerisation somewhere between residues 74 and 147. Experimental studies confirmed that the\ \'hydrophobic region\' of the protein (residues 54-97, which contains a larger proportion of alpha-helix), is essential for dimerisation PUBMED:8765712.

    \ ' '2652' 'IPR002659' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 31 () comprises\ enzymes with a number of known activities; N-acetyllactosaminide beta-1,3-N-acetylglucosaminyltransferase ();\ beta-1,3-galactosyltransferase (); fucose-specific beta-1,3-N-acetylglucosaminyltransferase (); globotriosylceramide beta-1,3-GalNAc transferase () PUBMED:9417100, PUBMED:9417047.

    \ ' '2653' 'IPR003859' '\ This is a family of galactosyltransferases from a wide range of metazoa with three related galactosyltransferase activities, all three of which are possessed by one sequence in some cases. The three functions are N-acetyllactosamine synthase (); beta-N-acetylglucosaminyl-glycopeptide beta-1,4-galactosyltransferase (); and lactose synthase (). Note that N-acetyllactosamine synthase is a component of lactose synthase along with alpha-lactalbumin; in the absence of alpha-lactalbumin, N-acetyllactosamine synthase is used.\ ' '2654' 'IPR008174' '\

    Galanin is a peptide hormone that controls various biological activities PUBMED:1710578. Galanin-like immuno-reactivity has been found in the central and peripheral nervous systems of mammals, with high concentrations demonstrated in discrete regions of the central nervous system, including the median eminence, hypothalamus, arcuate nucleus, septum, neuro-intermediate lobe of the pituitary, and the spinal cord. Its localisation within neurosecretory granules suggests that galanin may function as a neurotransmitter, and it has been shown to coexist with a variety of other peptide and amine neurotransmitters within individual neurons PUBMED:2448788.

    \

    Although the precise physiological role of galanin is uncertain, it has a number of pharmacological properties: it stimulates food intake, when injected into the third ventricle of rats; it increases levels of plasma growth hormone and prolactin, and decreases dopamine levels in the median eminence PUBMED:2448788; and infusion into humans results in hyperglycemia and glucose intolerance, and inhibits pancreatic release of insulin, somatostatin and pancreatic peptide. Galanin also modulates smooth muscle contractility within the gastro-intestinal and genito-urinary tracts, all such activities suggesting that the hormone may play an important role in the nervous modulation of endocrine and smooth muscle function PUBMED:2448788.

    \

    Galanin is a 29 amino acid peptide processed from a larger precursor protein. Except in human, galanin is C-terminally amidated. Its sequence is highly conserved and the first 14 residues are identical in all currently known sequences.

    \ \ ' '2655' 'IPR006079' '\

    Lantibiotics are heavily modified bacteriocin-like peptides from Gram-positive bacteria. They contain alpha,beta-unsaturated amino acids (dehydroalanine and dehydrobutyrine) and lanthionine or 3-methyllanthionine rings\ (collectively known as thioether rings). There are 2 types of lantibiotic:

    1. Type A (which include nisin, subtilin, epidermin,\ gallidermin and Pep5) are strongly cationic and bactericidal - nisin, subtilin and Pep5 inhibit the growth of Gram-positive\ bacteria, probably by voltage-dependent pore formation in the cytoplasmic membrane, resulting in cellular efflux of\ electrolytes, amino acids and ATP;
    2. Type B lantibiotics possess at most one positive charge and are not bactericidal.

    \ \

    This family contains both type A and type B molecules.

    \ ' '2656' 'IPR005850' '\

    Galactose-1-phosphate uridyl transferase catalyses the conversion of UDP-glucose and alpha-D-galactose 1-phosphate to alpha-D-glucose 1-phosphate and UDP-galactose during galactose metabolism. The enzyme is present \ in prokaryotes and eukaryotes. Defects in GalT in humans are the cause of galactosemia, an \ inherited disorder of galactose metabolism that leads to jaundice, cataracts and mental retardation.

    \

    This domain describes the C-terminal of Galactose-1-phosphate uridyl transferase. SCOP reports fold duplication of the C-terminal with the N-terminal domain. Both are involved in Zn and Fe binding.

    \ ' '2657' 'IPR005849' '\

    Galactose-1-phosphate uridyl transferase catalyses the conversion of UDP-glucose and alpha-D-galactose 1-phosphate to alpha-D-glucose 1-phosphate and UDP-galactose during galactose metabolism. The enzyme is present \ in prokaryotes and eukaryotes. Defects in GalT in humans are the cause of galactosemia, an \ inherited disorder of galactose metabolism that leads to jaundice, cataracts and mental retardation.

    \

    This domain describes the C-terminal of Galactose-1-phosphate uridyl transferase. SCOP reports fold duplication of the C-terminal with the N-terminal domain. Both are involved in Zn and Fe binding.

    \ ' '2658' 'IPR008176' '\

    The following small plant proteins are evolutionarily related:

    \ \ \ \

    In their mature form, these proteins generally consist of about 45 to 50 amino-acid residues. As shown in the following schematic representation, these peptides contain eight conserved cysteines involved in disulphide bonds.

    \
    \
              +-------------------------------------------+\
              |          +-------------------+            |\
              |          |                   |            |\
            xxCxxxxxxxxxxCxxxxxCxxxCxxxxxxxxxCxxxxxxCxCxxxC\
                               |   |                | |\
                               +---|----------------+ |\
                                   +------------------+\
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    
    \

    The folded structure of Gamma-purothionin is characterised by a well-defined 3-stranded anti-parallel beta-sheet and a short alpha-helix PUBMED:8380707. Three disulphide bridges are located in the hydrophobic core between the helix and sheet, forming a cysteine-stabilised alpha-helical motif. This structure differs from that of the plant alpha- and beta- thionins, but is analogous to scorpion toxins and insect defensins.

    \ ' '2659' 'IPR007504' '\ Gar1 is a small nucleolar RNP that is required for pre-mRNA processing and pseudouridylation PUBMED:10690410. It is co-immunoprecipitated with the H/ACA families of snoRNAs. This family represents the conserved central region of Gar1. This region is necessary and sufficient for normal cell growth, and specifically binds two snoRNAs, snR10 and snR30. This region is also necessary for nucleolar targeting, and it is thought that the protein is co-transported to the nucleolus as part of a nucleoprotein complex PUBMED:9556561. In humans, Gar1 is also a component of telomerase in vivo PUBMED:10757788.\ ' '2660' 'IPR020561' '\

    Phosphoribosylglycinamide synthetase () (GARS) (phosphoribosylamine glycine ligase) PUBMED:2687276 catalyses the second step in the de novo biosynthesis of purine. The reaction catalysed by phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5\'phosphoribosylglycinamide:\ \ In bacteria, GARS is a monofunctional enzyme (encoded by the purD gene). In\ yeast, GARS is part of a bifunctional enzyme (encoded by the ADE5/7 gene) in conjunction with phosphoribosylformylglycinamidine cyclo-ligase (AIRS) (). In higher eukaryotes, GARS\ is part of a trifunctional enzyme in conjunction with AIRS () and with phosphoribosylglycinamide formyltransferase (GART) (), forming GARS-AIRS-GART.

    \

    This entry represents the A-domain of the enzyme, and is related to the ATP-grasp domain of biotin carboxylase/carbamoyl phosphate synthetase.

    \ ' '2662' 'IPR020560' '\

    Phosphoribosylglycinamide synthetase () (GARS) (phosphoribosylamine glycine ligase) PUBMED:2687276 catalyses the second step in the de novo biosynthesis of purine. The reaction catalysed by phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5\'phosphoribosylglycinamide:\ \ In bacteria, GARS is a monofunctional enzyme (encoded by the purD gene). In\ yeast, GARS is part of a bifunctional enzyme (encoded by the ADE5/7 gene) in conjunction with phosphoribosylformylglycinamidine cyclo-ligase (AIRS) (). In higher eukaryotes, GARS\ is part of a trifunctional enzyme in conjunction with AIRS () and with phosphoribosylglycinamide formyltransferase (GART) (), forming GARS-AIRS-GART.

    \

    This entry represents the C-domain, which is related to the C-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase ().

    \ ' '2663' 'IPR020562' '\

    Phosphoribosylglycinamide synthetase () (GARS) (phosphoribosylamine glycine ligase) PUBMED:2687276 catalyses the second step in the de novo biosynthesis of purine. The reaction catalysed by phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5\'phosphoribosylglycinamide:\ \ In bacteria, GARS is a monofunctional enzyme (encoded by the purD gene). In\ yeast, GARS is part of a bifunctional enzyme (encoded by the ADE5/7 gene) in conjunction with phosphoribosylformylglycinamidine cyclo-ligase (AIRS) (). In higher eukaryotes, GARS\ is part of a trifunctional enzyme in conjunction with AIRS () and with phosphoribosylglycinamide formyltransferase (GART) (), forming GARS-AIRS-GART.

    \

    This entry represents the N-domain, which is related to the N-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase ().

    \ ' '2664' 'IPR004886' '\ This family is a group of yeast glycolipid proteins anchored to the membrane. It includes Candida albicans (Yeast) pH-regulated protein, which is required for apical growth and plays a role in morphogenesis and Saccharomyces cerevisiae glycolipid anchored surface protein.\ ' '2665' 'IPR003108' '\

    The growth-arrest-specific protein 2 domain is found associated with the spectrin repeat, calponin homology domain and EF hand in many proteins.

    \ ' '2666' 'IPR000638' '\ Gas vesicles are small, hollow, gas-filled protein structures found in several cyanobacterial and archaebacterial\ microorganisms PUBMED:2513809. They allow the positioning of the bacteria at the favourable depth for growth.\ Gas vesicles are hollow cylindrical tubes, closed by a hollow, conical cap at each end. Both the conical end\ caps and central cylinder are made up of 4-5 nm wide ribs that run at right angles to the long axis of the\ structure. Gas vesicles seem to be constituted of two different protein components, GVPa and GVPc. GVPa, a\ small protein of about 70 amino acid residues, is the main constituent of gas vesicles and forms the essential\ core of the structure. The sequence of GVPa is extremely well conserved. GvpJ and GvpM, two proteins encoded\ in the cluster of genes required for gas vesicle synthesis in the archaebacteria Halobacterium salinarium and\ Halobacterium mediterranei (Haloferax mediterranei), have been found PUBMED:1864501 to be evolutionarily related to GVPa. The exact function\ of these two proteins is not known, although they could be important for determining the shape of\ gas vesicles. The N-terminal domain of Aphanizomenon flos-aquae protein gvpA/J is also related to GVPa.\ ' '2667' 'IPR002003' '\ Gas vesicles are small, hollow, gas-filled protein structures found in several cyanobacterial and archaebacterial microorganisms PUBMED:2513809. They allow the\ positioning of the bacteria at the favorable depth for growth. Gas vesicles\ are hollow cylindrical tubes, closed by a hollow, conical cap at each end.\ Both the conical end caps and central cylinder are made up of 4-5 nm wide\ ribs that run at right angles to the long axis of the structure. Gas vesicles\ seem to be constituted of two different protein components: GVPa and GVPc.\ GVPc is a minor constituent of gas vesicles and seems to be located on the\ outer surface. 
Structurally, cyanobacterial GVPc consists of four or five\ tandem repeats of a 33 residue sequence flanked by sequences of 18 and 10\ residues at the N- and C-termini, respectively.\ ' '2668' 'IPR003854' '\ This is the GASA gibberellin regulated cysteine rich protein family. The expression of these proteins is up-regulated by the plant hormone gibberellin, and most of these proteins have some role in plant development. There are 12 cysteine residues conserved within the alignment, giving the potential for these proteins to possess 6 disulphide bonds.\ ' '2669' 'IPR001651' '\ Gastrin and cholecystokinin (CCK) PUBMED: are structurally and functionally related peptide hormones that function as hormonal regulators of various digestive processes and feeding behaviors. They are known to induce gastric secretion, stimulate pancreatic secretion, increase blood circulation and water secretion in the stomach and intestine, and stimulate smooth muscle contraction. Originally found in the gut, these hormones have since been shown to be present in various parts of the nervous system. Like many other active peptides they are synthesized as larger protein precursors that are enzymatically converted to their mature forms. They are found in several molecular forms due to tissue-specific post-translational processing. The biological activity of gastrin and CCK is associated with the last five C-terminal residues. One or two positions downstream, there is a conserved sulphated tyrosine residue. The amphibian caerulein skin peptide, the cockroach leukosulphakinin I and II (LSK) peptides, Drosophila melanogaster (Fruit fly) putative CCK-homologs Drosulphakinins I and II, a Gallus gallus (Chicken) gastrin/cholecystokinin-like peptide and cionin, a neuropeptide from the protochordate Ciona intestinalis, belong to the same family.\ ' '2670' 'IPR000583' '\

    A large group of biosynthetic enzymes are able to catalyse the removal of the ammonia group from glutamine and\ then to transfer this group to a substrate to form a new carbon-nitrogen group. This catalytic activity is known as\ glutamine amidotransferase (GATase) () PUBMED:4355768. The GATase domain exists either as a separate polypeptidic\ subunit or as part of a larger polypeptide fused in different ways to a synthase domain. On the basis of sequence\ similarities two classes of GATase domains have been identified PUBMED:3298209, PUBMED:6086650: class-I (also known as\ trpG-type) and class-II (also known as purF-type). Enzymes containing class-II GATase domains include amido\ phosphoribosyltransferase (glutamine phosphoribosylpyrophosphate amidotransferase) (), which catalyses the\ first step in purine biosynthesis (gene purF in bacteria, ADE4 in yeast); glucosamine--fructose-6-phosphate aminotransferase\ (), which catalyses the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine\ (gene glmS in Escherichia coli, nodM in Rhizobium, GFA1 in yeast); and asparagine synthetase (glutamine-hydrolysing) (), which is responsible for the synthesis of asparagine from aspartate and glutamine. A cysteine is present at the N-terminal extremity of the mature form of all these enzymes.

    \ \

    This domain is found in a number of cysteine peptidases belonging to MEROPS peptidase family C44 and their non-peptidase homologs.

    \ ' '2671' 'IPR007652' '\ The glycosphingolipids (GSL) form part of eukaryotic cell membranes. They consist of a hydrophilic carbohydrate moiety linked to a hydrophobic ceramide tail embedded within the lipid bilayer of the membrane. Lactosylceramide, Gal1,4Glc1Cer (LacCer), is the common synthetic precursor to the majority of GSL found in vertebrates. Alpha-1,4-glycosyltransferases utilise UDP donors and transfer the sugar to a beta-linked acceptor. This region appears to be confined to higher eukaryotes. No function has yet been assigned to this region PUBMED:10854428.\ ' '2672' 'IPR015894' '\

    Guanylate-binding protein is a GTPase that is induced by interferon (IFN)-gamma. GTPases induced by IFN-gamma are key to the protective immunity against microbial and viral pathogens. These GTPases are classified into three groups: the small 47-kd GTPases, the Mx proteins, and the large 65- to 67-kd GTPases. Guanylate-binding proteins (GBP) fall into the last class. In humans, there are seven GBPs (hGBP1-7) PUBMED:17266443. Structurally, hGBP1 consists of two domains: a compact globular N-terminal domain harbouring the GTPase function, and an alpha-helical finger-like C-terminal domain (). Human GBP1 is secreted from cells without the need of a leader peptide, and has been shown to exhibit antiviral activity against Vesicular stomatitis virus and Encephalomyocarditis virus, as well as being able to regulate the inhibition of proliferation and invasion of endothelial cells in response to IFN-gamma PUBMED:16936281.

    \ ' '2673' 'IPR003191' '\

    Guanylate-binding protein is a GTPase that is induced by interferon (IFN)-gamma. GTPases induced by IFN-gamma are key to the protective immunity against microbial and viral pathogens. These GTPases are classified into three groups: the small 47-kd GTPases, the Mx proteins, and the large 65- to 67-kd GTPases. Guanylate-binding proteins (GBP) fall into the last class. In humans, there are seven GBPs (hGBP1-7) PUBMED:17266443. Structurally, hGBP1 consists of two domains: a compact globular N-terminal domain harbouring the GTPase function (), and an alpha-helical finger-like C-terminal domain. Human GBP1 is secreted from cells without the need of a leader peptide, and has been shown to exhibit antiviral activity against Vesicular stomatitis virus and Encephalomyocarditis virus, as well as being able to regulate the inhibition of proliferation and invasion of endothelial cells in response to IFN-gamma PUBMED:16936281.

    \ ' '2674' 'IPR003463' '\ This family includes insect peptides that are short (23 amino acids) and contain 1 disulphide bridge. The family includes growth-blocking peptide (GBP) of Pseudaletia separata (Oriental armyworm) and the paralytic peptides from Manduca sexta (Tobacco hawkmoth), Heliothis virescens (Noctuid moth), and Spodoptera exigua (Beet armyworm) PUBMED:2071576 as well as plasmatocyte-spreading peptide (PSP1) PUBMED:9988679. These peptides function to halt metamorphosis from larvae to pupae.\ ' '2675' 'IPR003681' '\

    The glycophorin-binding protein contains a tandem repeat. The repeated sequence determines the binding domain for an erythrocyte receptor binding protein of Plasmodium falciparum, the malarial parasite PUBMED:7891744. Erythrocyte invasion by the malarial merozoite is a receptor-mediated process, an obligatory step in the development of the parasite. The P. falciparum protein binds to the erythrocyte receptor glycophorin.

    \ ' '2676' 'IPR004588' '\

    This protein previously of unknown biochemical function is essential in Escherichia coli. It has now been characterised as 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase, which converts 2C-methyl-D-erythritol 2,4-cyclodiphosphate (ME-2,4CPP) into 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate in the sixth step of nonmevalonate terpenoid biosynthesis. The family is restricted to bacteria, where it is widely but not universally distributed. No homology can be detected between this family and other proteins.

    \ ' '2677' 'IPR001409' '\ Steroid or nuclear hormone receptors (NRs) constitute an important super-\ family of transcription regulators that are involved in widely diverse \ physiological functions, including control of embryonic development, cell\ differentiation and homeostasis. Members of the superfamily include the\ steroid hormone receptors and receptors for thyroid hormone, retinoids, \ 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins \ function as dimeric molecules in nuclei to regulate the transcription of \ target genes in a ligand-responsive manner PUBMED:7899080, PUBMED:8165128. In addition to C-terminal\ ligand-binding domains, these nuclear receptors contain a highly-conserved,\ N-terminal zinc-finger that mediates specific binding to target DNA \ sequences, termed ligand-responsive elements. In the absence of ligand,\ steroid hormone receptors are thought to be weakly associated with nuclear\ components; hormone binding greatly increases receptor affinity.\ \

    NRs are extremely important in medical research, a large number of them\ being implicated in diseases such as cancer, diabetes, hormone resistance\ syndromes, etc. While several NRs act as ligand-inducible transcription\ factors, many do not yet have a defined ligand and are accordingly termed \ "orphan" receptors. During the last decade, more than 300 NRs have been\ described, many of which are orphans, which cannot easily be named due to \ current nomenclature confusions in the literature. However, a new system \ has recently been introduced in an attempt to rationalise the increasingly \ complex set of names used to describe superfamily members.

    \

    \ The glucocorticoid receptor consists of 3 functional and structural\ domains: an N-terminal (modulatory) domain; a DNA binding domain that\ mediates specific binding to target DNA sequences (ligand-responsive\ elements); and a hormone binding domain. The N-terminal domain is unique\ to the glucocorticoid receptors; it spans the first 440 residues, and is\ primarily responsible for transcriptional activation. The smaller (around\ 65 residues), highly-conserved central portion of the protein is the DNA \ binding domain, which plays a role in DNA binding specificity, homo-\ dimerisation and in interactions with other proteins. The hormone binding \ domain comprises approximately 250 residues at the C-terminus of the\ receptor. This domain mediates receptor activity via interaction with heat\ shock proteins and cyclophilins, or with hormone.

    \ ' '2678' 'IPR006336' '\

    Also known as gamma-glutamylcysteine synthetase and gamma-ECS (). This enzyme catalyses the first and rate limiting step in de novo glutathione biosynthesis. Members of this family are found in archaea, bacteria and plants. May and Leaver PUBMED:7937837 discuss the possible evolutionary origins of glutamate-cysteine ligase enzymes in different organisms and suggest that it evolved independently in different eukaryotes, from an ancestral bacterial enzyme. They also state that Arabidopsis thaliana (Mouse-ear cress) gamma-glutamylcysteine synthetase is structurally unrelated to mammalian, yeast and Escherichia coli homologues. In plants, there are separate cytosolic and chloroplast forms of the enzyme.

    \ ' '2679' 'IPR002930' '\

    This is a family of glycine cleavage H-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes.\ A lipoyl group is attached to a completely conserved lysine residue.\ The H protein shuttles the methylamine group of glycine from the P protein to the T protein PUBMED:8197146.

    \ ' '2680' 'IPR006222' '\ This is a family of glycine cleavage T-proteins, part of the glycine \ cleavage multienzyme complex (GCV) found in bacteria and the mitochondria\ of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes.\ The T-protein is an aminomethyl transferase \ that catalyses the following reaction:\ \ ' '2681' 'IPR007392' '\

    This domain is found at the C-terminus of D-galactarate dehydratase () which is thought to catalyse the reaction PUBMED:9772162 and altronate hydrolase (altronic acid hydratase, ), which catalyses PUBMED:9579062. As purified, both enzymes are catalytically inactive in the absence of added Fe2+, Mn2+, and beta-mercaptoethanol. Synergistic activation of altronate hydrolase activity is seen in the presence of both iron and manganese ions, suggesting that the enzyme may have two ion binding sites. Mn2+ appears to be part of the enzyme active centre, but the function of the single bound Fe2+ ion is unknown. The hydratase has no Fe-S core PUBMED:3038546. The N-terminal is represented by .

    \ ' '2683' 'IPR000407' '\

    A number of nucleoside diphosphate and triphosphate hydrolases as well as some\ yet uncharacterised proteins have been found to belong to the same family PUBMED:8579614, PUBMED:8703025. The uncharacterised proteins all seem to be membrane-bound.

    \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).\

    \ \ \ ' '2684' 'IPR020580' '\

    This family consists of glycine cleavage system P-proteins () from bacterial, mammalian and plant sources. The P protein is part of the glycine decarboxylase multienzyme complex (GDC), also annotated as glycine cleavage system or glycine synthase. The P protein binds the alpha-amino group of glycine through its pyridoxal phosphate cofactor; carbon dioxide is released and the remaining methylamine moiety is then transferred to the lipoamide cofactor of the H protein. GDC consists of four proteins: P, H, L and T PUBMED:8181752. The reaction catalysed by this protein is:

    \

    Glycine + lipoylprotein = S-aminomethyldihydrolipoylprotein + CO2

    \ ' '2685' 'IPR018203' '\ Rab proteins constitute a family of small GTPases that serve a regulatory\ role in vesicular membrane traffic PUBMED:7957092, PUBMED:7585614; C-terminal geranylgeranylation is\ crucial for their membrane association and function. This post-translational\ modification is catalysed by Rab geranylgeranyl transferase (Rab-GGTase), a \ multi-subunit enzyme that contains a catalytic heterodimer and an accessory\ component, termed Rab escort protein (REP)-1 PUBMED:7957092. REP-1 presents newly-\ synthesised Rab proteins to the catalytic component, and forms a stable\ complex with the prenylated proteins following the transfer reaction. \

    The mechanism of REP-1-mediated membrane association of Rab5 is similar\ to that mediated by Rab GDP dissociation inhibitor (GDI). REP-1 and Rab GDI \ also share other functional properties, including the ability to inhibit the\ release of GDP and to remove Rab proteins from membranes.

    \

    The crystal structure of the bovine alpha-isoform of Rab GDI has been\ determined to a resolution of 1.81 Å PUBMED:8609986. The protein is composed of two\ main structural units: a large complex multi-sheet domain I, and a smaller\ alpha-helical domain II.

    \

    The structural organisation of domain I is closely related to FAD-containing\ monooxygenases and oxidases PUBMED:8609986. Conserved regions common to GDI and the\ choroideraemia gene product, which delivers Rab to catalytic subunits of\ Rab geranylgeranyltransferase II, are clustered on one face of the domain\ PUBMED:7585614. The two most conserved regions form a compact structure at the apex of\ the molecule; site-directed mutagenesis has shown these regions to play a\ critical role in the binding of Rab proteins PUBMED:8609986.

    \ ' '2686' 'IPR016017' '\

    This cysteine rich domain is found in multiple copies in GDNF and GAS1 proteins. GDNF and neurturin (NTN) receptors are potent survival factors for sympathetic, sensory and central nervous system neurons PUBMED:16551639, PUBMED:9192899. GDNF and neurturin promote neuronal survival by signalling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity PUBMED:9192898.

    \ ' '2687' 'IPR003130' '\

    Dynamin GTPase effector domain found in proteins related to dynamin.

    \ \

    Dynamin is a GTP-hydrolysing protein that is an essential participant in clathrin-mediated endocytosis by cells. It self-assembles into \'collars\' in vivo at the necks of invaginated coated pits; the self-assembly of dynamin being coordinated by the GTPase domain. Mutation studies indicate that dynamin functions as a molecular regulator of receptor-mediated endocytosis PUBMED:10206643.

    \ \ ' '2688' 'IPR006892' '\

    This family consists mostly of geminivirus AC4 and AC5 proteins PUBMED:7844539.

    \ ' '2689' 'IPR001191' '\ Geminiviruses are characterised by a genome of circular single-stranded DNA encapsidated in twinned (geminate) quasi-isometric particles, from which the group derives its name PUBMED:16453696. Most geminiviruses can be divided into two subgroups on the basis of host range and/or insect vector: i.e. those that infect dicotyledonous plants and are transmitted \ by the same whitefly species, and those that infect monocotyledonous plants and are transmitted by different leafhopper vectors. The whitefly-transmitted African cassava mosaic virus, Tomato golden mosaic virus (TGMV) and Bean golden mosaic virus (BGMV) possess a bipartite genome. By contrast, only a single DNA component has been identified for the leafhopper-transmitted Maize streak virus (MSV) and Wheat dwarf virus (WDV) PUBMED:6526009, PUBMED:2829117. Beet curly top virus (BCTV) and Tobacco yellow \ dwarf virus belong to a third possible subgroup. Like MSV and WDV, BCTV is transmitted by a specific leafhopper species, yet like the whitefly-transmitted geminiviruses it has a host range confined to dicotyledonous plants.\ \

    Sequence comparison of the whitefly-transmitted Squash leaf curl virus (SqLCV) and Tomato yellow leaf curl virus (TYLCV) with the genomic components of TGMV and BGMV reveals a close evolutionary relationship PUBMED:1840676, PUBMED:1984668, PUBMED:1926771. Amino acid sequence alignments of Potato yellow mosaic virus (PYMV) \ proteins with those encoded by other geminiviruses show that PYMV is closely related to geminiviruses isolated from the New World, especially in the putative \ coat protein gene regions PUBMED:1926771. Comparison of MSV DNA-encoded proteins with those of other geminiviruses infecting monocotyledonous plants, including Panicum streak virus PUBMED:1588314 and Miscanthus streak virus (MiSV) PUBMED:1919519, reveals high levels of similarity.

    \ ' '2690' 'IPR000942' '\ Geminiviruses are characterised by a genome of circular single-stranded DNA encapsidated in twinned (geminate) quasi-isometric particles, from which the group derives its name PUBMED:16453696. Most geminiviruses can be divided into two subgroups on the basis of host range and/or insect vector: i.e.those that infect dicotyledenous plants and are transmitted \ by the same whitefly species, and those that infect monocotyledenous plants and are transmitted by different leafhopper vectors. The genomes of the whitefly-transmitted African cassava mosaic virus, Tomato golden mosaic virus (TGMV) and Bean golden mosaic virus (BGMV) possess a bipartite genome. By contrast, only a single DNA component has been identified for the leafhopper-transmitted Maize streak virus (MSV) and Wheat dwarf virus (WDV) PUBMED:6526009, PUBMED:2829117. Beet curly top virus (BCTV), and Tobacco yellow \ dwarf virus belong to a third possible subgroup. Like MSV and WDV, BCTV is transmitted by a specific leafhopper species, yet like the whitefly-transmitted geminiviruses it has a host range confined to dicotyledenous plants.\ \

    Sequence comparison of the whitefly-transmitted Squash leaf curl virus (SqLCV) and Tomato yellow leaf curl virus (TYLCV) with the genomic components of TGMV and BGMV reveals a close evolutionary relationship PUBMED:1840676, PUBMED:1984668, PUBMED:1926771. Amino acid sequence alignments of Potato yellow mosaic virus (PYMV) \ proteins with those encoded by other geminiviruses show that PYMV is closely related to geminiviruses isolated from the New World, especially in the putative \ coat protein gene regions PUBMED:1926771. Comparison of MSV DNA-encoded proteins with those of other geminiviruses infecting monocotyledonous plants, including Panicum streak virus PUBMED:1588314 and Miscanthus streak virus (MiSV) PUBMED:1919519, reveals high levels of similarity.

    \ ' '2691' 'IPR000657' '\ Geminiviruses are characterised by a genome of circular single-stranded DNA encapsidated in twinned (geminate) quasi-isometric particles, from which the group derives its name PUBMED:16453696. Most geminiviruses can be divided into two subgroups on the basis of host range and/or insect vector: those that infect dicotyledonous plants and are transmitted \ by the same whitefly species, and those that infect monocotyledonous plants and are transmitted by different leafhopper vectors. The whitefly-transmitted African cassava mosaic virus, Tomato golden mosaic virus (TGMV) and Bean golden mosaic virus (BGMV) possess a bipartite genome. By contrast, only a single DNA component has been identified for the leafhopper-transmitted Maize streak virus (MSV) and Wheat dwarf virus (WDV) PUBMED:6526009, PUBMED:2829117. Beet curly top virus (BCTV) and Tobacco yellow \ dwarf virus belong to a third possible subgroup. Like MSV and WDV, BCTV is transmitted by a specific leafhopper species, yet like the whitefly-transmitted geminiviruses it has a host range confined to dicotyledonous plants.\ \

    Sequence comparison of the whitefly-transmitted Squash leaf curl virus (SqLCV) and Tomato yellow leaf curl virus (TYLCV) with the genomic components of TGMV and BGMV reveals a close evolutionary relationship PUBMED:1840676, PUBMED:1984668, PUBMED:1926771. Amino acid sequence alignments of Potato yellow mosaic virus (PYMV) \ proteins with those encoded by other geminiviruses show that PYMV is closely related to geminiviruses isolated from the New World, especially in the putative \ coat protein gene regions PUBMED:1926771. Comparison of MSV DNA-encoded proteins with those of other geminiviruses infecting monocotyledonous plants, including Panicum streak virus PUBMED:1588314 and Miscanthus streak virus (MiSV) PUBMED:1919519, reveals high levels of similarity.

    \

    Geminiviruses contain three ORFs (designated AL1, AL2, and AL3) that \ overlap and are specified by multiple polycistronic mRNAs. The AL3 protein comprises approximately 0.05% \ of the cellular proteins and is present in the soluble and organelle fractions PUBMED:8030214. \ AL3 may form oligomers PUBMED:8794317. Immunoprecipitation of AL3 from \ baculovirus expression system extracts expressing both AL1 and AL3 showed \ that the two proteins also form a complex with each other. The AL3 protein is involved in viral replication. \

    \ ' '2692' 'IPR000211' '\

    The movement of bipartite geminiviruses such as squash leaf curl virus (SqLCV) requires the cooperative\ interaction of two essential virus-encoded movement proteins, BR1 and BL1. Recent studies of SqLCV and bean dwarf mosaic virus have shown that BR1 and BL1 act in a cooperative manner to move the viral genome intracellularly from the nucleus to the cytoplasm and across the cell wall from cell to cell. BR1 is a nuclear shuttle protein, and it has been proposed to bind newly replicated viral ssDNA genomes and move these between the nucleus and cytoplasm. These BR1-genome complexes are then directed to the cell periphery through interactions between BR1 and\ BL1, where, as the result of BL1 action, the complexes are moved to adjacent uninfected cells. The precise\ mechanism by which BL1 acts to transport these genome complexes across the cell wall, and whether this may differ in different cell\ types, remain unresolved PUBMED:9765472.

    \ ' '2694' 'IPR002488' '\ This family consists of the N-terminal region of geminivirus\ C4 or AC4 proteins. In Tomato yellow leaf curl virus the C4 protein is necessary for efficient spreading of\ the virus in Solanum lycopersicum (Tomato) (Lycopersicon esculentum) PUBMED:8091687.\ ' '2695' 'IPR000263' '\ Geminiviruses are characterised by a genome of circular single-stranded DNA encapsidated in twinned (geminate) quasi-isometric particles, from which the group derives its name PUBMED:16453696. Most geminiviruses can be divided into 2 subgroups on the basis of host range and/or insect vector: those that infect dicotyledonous plants and are transmitted by the same whitefly species, and those that infect monocotyledonous plants and are transmitted by different leafhopper vectors. \ It has been shown that the 104 N-terminal amino acids of the Maize streak virus coat protein bind DNA non-specifically PUBMED:9191917.\ ' '2696' 'IPR002621' '\

    This family consists of putative movement proteins from Maize streak virus and Wheat dwarf virus PUBMED:3947330.

    \ ' '2697' 'IPR002511' '\ Disruption of the V1 gene in Tomato yellow leaf curl virus (TYLCV) stopped its ability to systemically infect Solanum lycopersicum (Tomato) (Lycopersicon esculentum) plants, suggesting that the V1 gene product is required for successful infection\ of the host PUBMED:9123819.\ ' '2698' 'IPR000714' '\ The IR5 open reading frame (ORF) of the Equid herpesvirus 1 (EHV-1) genome maps within the \ inverted repeat segments. Sequence analyses of the gene region revealed an ORF of 236 amino acids that showed a high degree of similarity to ORF64 of Human herpesvirus 3 (HHV-3) and ORF3 of Equid herpesvirus 4 (EHV-4), both of which map within the inverted repeats, and to the US10 ORF of Human herpesvirus 1 (HHV-1), which maps within the unique short segment. The IR5 ORF houses a sequence of 13 residues (CAYWCCLGHAFAC) that matches perfectly the consensus zinc finger motif (C-X2-4-C-X2-15-C/H-X2-4-C/H) PUBMED:1316680. Putative cis-acting elements flanking the IR5 ORF include a TATA box, a CAAT box, and a polyadenylation signal. Coupled with various experimental data, the IR5 gene of EHV-1 thus exhibits characteristics representative of a late gene of the gamma-1 class. The DNA sequence covering ~70% of the short unique region (Us) and part of the short inverted repeat of the Meleagrid herpesvirus 1 (MeHV-1) GA strain has been determined. Sequence analysis showed the presence of nine potential ORFs in the Us region, four of which were found to be similar to US10 (minor virion protein) PUBMED:1282282.\ ' '2699' 'IPR004995' '\

    Dormant Bacillus subtilis spores germinate in the presence of particular nutrients called germinants. The spores are thought to\ recognise germinants through receptor proteins encoded by the gerA family of operons, which includes gerA, gerB, and gerK. The GerA proteins are predicted to be membrane associated.

    \ ' '2700' 'IPR000792' '\

    This domain is a DNA-binding, helix-turn-helix (HTH) domain of about 65 amino acids, present in transcription regulators of the LuxR/FixJ family of response regulators. The domain is named after Vibrio fischeri luxR, a transcriptional activator for quorum-sensing control of luminescence. LuxR-type HTH domain proteins occur in a variety of organisms. The DNA-binding HTH domain is usually located in the C-terminal region; the N-terminal region often containing an autoinducer-binding domain or a response regulatory domain. Most luxR-type regulators act as transcription activators, but some can be repressors or have a dual role for different sites. LuxR-type HTH regulators control a wide variety of activities in various biological processes.

    \ \

    The luxR-type, DNA-binding HTH domain forms a four-helical bundle structure. The HTH motif comprises the second and third helices, known as the scaffold and recognition helix, respectively. The HTH binds DNA in the major groove, where the N-terminal part of the recognition helix makes most of the DNA contacts. The fourth helix is involved in dimerisation of gerE and traR. Signalling events by one of the four activation mechanisms described below lead to multimerisation of the regulator. The regulators bind DNA as multimers PUBMED:11243786, PUBMED:12740396, PUBMED:12087407.

    \ \

    LuxR-type HTH proteins can be activated by one of four different mechanisms:

    \ \

    1) Regulators which belong to a two-component sensory transduction system where the protein is activated by its phosphorylation, generally on an aspartate residue, by a transmembrane kinase PUBMED:12352954, PUBMED:12162958. Some proteins that belong to this category are:

    \
  • Rhizobiaceae fixJ (global regulator inducing expression of nitrogen-fixation genes in microaerobiosis)
  • \
  • Escherichia coli and Salmonella typhimurium uhpA (activates hexose phosphate transport gene uhpT)
  • \
  • E. coli narL and narP (activate nitrate reductase operon)
  • \
  • Enterobacteria rcsB (regulation of exopolysaccharide biosynthesis in enteric and plant pathogenesis)
  • \
  • Bordetella pertussis bvgA (virulence factor)
  • \
  • Bacillus subtilis comA (involved in expression of late-expressing competence genes)
  • \ \

    2) Regulators which are activated, or in very rare cases repressed, when bound to N-acyl homoserine lactones, which are used as quorum sensing molecules in a variety of Gram-negative bacteria PUBMED:15255890:

    \
  • V. fischeri luxR (activates bioluminescence operon)
  • \
  • Agrobacterium tumefaciens traR (regulation of Ti plasmid transfer)
  • \
  • Erwinia carotovora carR (control of carbapenem antibiotics biosynthesis)
  • \
  • E. carotovora expR (virulence factor for soft rot disease; activates plant tissue macerating enzyme genes)
  • \
  • Pseudomonas aeruginosa lasR (activates elastase gene lasB)
  • \
  • Erwinia chrysanthemi echR and Erwinia stewartii esaR
  • \
  • Pseudomonas chlororaphis phzR (positive regulator of phenazine antibiotic production)
  • \
  • Pseudomonas aeruginosa rhlR (activates rhlAB operon and lasB gene)
  • \ \

    3) Autonomous effector domain regulators, without a regulatory domain, represented by gerE PUBMED:11243786.

    \
  • B. subtilis gerE (transcription activator and repressor for the regulation of spore formation)
  • \ \

    4) Multiple ligand-binding regulators, exemplified by malT PUBMED:11931562.

    \
  • E. coli malT (activates maltose operon; MalT binds ATP and maltotriose)
  • \ \ \ \ ' '2701' 'IPR011584' '\

    The green fluorescent protein (GFP) is found in the jellyfish (Aequorea victoria), and functions as an energy-transfer acceptor. It fluoresces in vivo upon receiving energy from the \ Ca2+-activated photoprotein aequorin. The protein absorbs light maximally at 395 nm and exhibits a smaller absorbance peak at 470 nm. The fluorescence emission spectrum peaks at 509 nm with a shoulder at 540 nm. The protein is produced in the photocytes and contains a chromophore, which is composed of modified amino acid residues. The chromophore is formed upon cyclization of the residues ser-dehydrotyr-gly. There are several other members of the GFP family that fluoresce in different colours, and several that are non-fluorescent PUBMED:10852900. These proteins are all essentially encoded by single genes, since both the substrate and the catalytic enzyme for pigment biosynthesis are provided within a single polypeptide chain PUBMED:12325128.

    \

    More information about this protein can be found at Protein of the Month: Green Fluorescent Protein.

    \ \ ' '2702' 'IPR007839' '\

    GTP cyclohydrolase III catalyses the formation of 2-amino-5-formylamino-6-ribofuranosylamino-4(3H)-pyrimidinone ribonucleotide monophosphate and inorganic phosphate from GTP. The enzyme also has an independent pyrophosphate phosphohydrolase activity. The proteins are 200-270 amino acids in length.

    \ ' '2703' 'IPR002218' '\

    GidA is a tRNA modification enzyme found in bacteria and mitochondria. Though its precise molecular function is not known, it is involved in the 5-carboxymethylaminomethyl modification of the wobble uridine base in some tRNAs PUBMED:15509579, PUBMED:11544186. Sequence variations in the human mitochondrial protein may influence the severity of aminoglycoside-induced deafness PUBMED:15542390.

    \ \

    This entry is found in GidA and related proteins, such as the methylenetetrahydrofolate--tRNA-(uracil-5-)-methyltransferase enzyme TrmFO.

    \ ' '2704' 'IPR003682' '\

    GidB (glucose-inhibited division protein B) appears to be present, in a single copy, in all complete eubacterial genomes sequenced so far. Its mode of action is unknown, but a methyltransferase fold is reported from the crystal structure. It may be a family of bacterial glucose-inhibited division proteins that are involved in the regulation of cell division PUBMED:9795152.

    \ ' '2705' 'IPR004911' '\ This family includes the two characterised human gamma-interferon-inducible lysosomal thiol reductase (GILT) sequences PUBMED:3136170, PUBMED:10639150. It also contains several other eukaryotic putative proteins with similarity to GILT PUBMED:11491538. The\ aligned region contains three conserved cysteine residues. In addition, the two GILT sequences possess a C-X(2)-C motif that is\ shared by some of the other sequences in the family. This motif is thought to be associated with disulphide bond reduction. \ \ ' '2706' 'IPR005026' '\ The protein called postsynaptic density (PSD) is a specialised\ submembranous structure within which synaptic membrane proteins are\ linked to cytoskeleton and signalling proteins. Guanylate-kinase-associated protein (PSD-95/synapse-associated protein 90) is one of the major\ components of PSD, and functions as a scaffold protein for various ion\ channels and associated signalling molecules.\ ' '2707' 'IPR015899' '\ UDP-galactopyranose mutase () is involved in the conversion of UDP-GALP into UDP-GALF through a 2-keto intermediate, and contains FAD as a cofactor. The gene is known as glf, ceoA, and rfbD. It is known experimentally in Escherichia coli, Mycobacterium tuberculosis, and Klebsiella pneumoniae.\ ' '2708' 'IPR006097' '\

    Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction.

    \ \

    Glutamate dehydrogenases (, , and ) (GluDH) are enzymes that catalyse the NAD- and/or NADP-dependent reversible deamination of L-glutamate into alpha-ketoglutarate PUBMED:1358610, PUBMED:8315654. GluDH isozymes are generally involved with either ammonia assimilation or glutamate catabolism. Two separate enzymes are present in yeasts: the NADP-dependent enzyme, which catalyses the amination of alpha-ketoglutarate to L-glutamate; and the NAD-dependent enzyme, which catalyses the reverse reaction PUBMED:2989290. The latter form links the L-amino acids with the Krebs cycle, which provides a major pathway for metabolic interconversion of alpha-amino acids and alpha-keto acids PUBMED:3368458.

    \ \

    Leucine dehydrogenase () (LeuDH) is an NAD-dependent enzyme that catalyses the reversible deamination of leucine and several other aliphatic amino acids to their keto analogues PUBMED:3069133. Each subunit of this octameric enzyme from Bacillus sphaericus contains 364 amino acids and folds into two domains, separated by a deep cleft. The nicotinamide ring of the NAD+ cofactor binds deep in this cleft, which is thought to close during the hydride transfer step of the catalytic cycle.

    \ \

    Phenylalanine dehydrogenase () (PheDH) is an NAD-dependent enzyme that catalyses the reversible deamination of L-phenylalanine into phenyl-pyruvate PUBMED:1880121.

    \ \

    Valine dehydrogenase () (ValDH) is an NADP-dependent enzyme that catalyses the reversible deamination of L-valine into 3-methyl-2-oxobutanoate PUBMED:8320231.

    \

    This entry represents the dimerisation region of these enzymes.

    \ ' '2709' 'IPR008146' '\

    Glutamine synthetase () (GS) PUBMED:2900091 plays an essential role in the metabolism of nitrogen by catalyzing the condensation of glutamate and ammonia to form glutamine.

    \

    There seem to be three different classes of GS PUBMED:8096645, PUBMED:2575672, PUBMED:7916055:\

    \

    While the three classes of GS\'s are clearly structurally related, the sequence similarities are not so extensive.

    \ \ ' '2710' 'IPR008147' '\

    Glutamine synthetase () (GS) PUBMED:2900091 plays an essential role in the metabolism of nitrogen by catalyzing the condensation of glutamate and ammonia to form glutamine.

    \

    There seem to be three different classes of GS PUBMED:8096645, PUBMED:2575672, PUBMED:7916055:\

    \

    While the three classes of GS\'s are clearly structurally related, the sequence similarities are not so extensive.

    \ ' '2711' 'IPR005190' '\

    This is a conserved repeated domain found in GlnE proteins. These proteins adenylate and deadenylate glutamine synthetases. The domain is related to the nucleotidyltransferase domain .

    \ ' '2712' 'IPR004445' '\ This is a family of sodium/glutamate symporters (glutamate permeases), which catalyse the sodium-dependent uptake of extracellular glutamate. The protein is located in the inner membrane.\ ' '2713' 'IPR008164' '\ This short repeat of unknown function is found in multiple copies in several Caenorhabditis elegans proteins. The repeat is five residues long and consists of XGLTT where X can be any amino acid.\ ' '2714' 'IPR003837' '\

    Glu-tRNAGln amidotransferase is a heterotrimeric enzyme that is required for correct decoding of glutamine codons during translation. The Glu-tRNAGln amidotransferase enzyme is an important translational fidelity mechanism, replacing incorrectly charged Glu-tRNAGln with the correct Gln-tRNAGln via transamidation of the misacylated Glu-tRNAGln PUBMED:9342321. This activity compensates for the lack of glutaminyl-tRNA synthetase activity in Gram-positive eubacteria, cyanobacteria, archaea, and organelles PUBMED:9342321.

    \ ' '2715' 'IPR007788' '\ This family of enzymes catalyse the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively PUBMED:11035947. This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes.\ ' '2716' 'IPR007370' '\

    This is a group of bacterial glutamate-cysteine ligases that carry out the first step of the glutathione biosynthesis pathway according to the following equation: (L-aminohexanoate can replace glutamate).

    \ ' '2717' 'IPR002932' '\

    Ferredoxin-dependent glutamate synthases have been implicated in a number of functions including photorespiration in Arabidopsis, where they may also play a role in primary nitrogen assimilation in roots PUBMED:9596633. This region is expressed as a separate subunit in the glutamate synthase alpha subunit from archaebacteria, or as part of a large multidomain enzyme in other organisms.

    \

    The aligned region of these proteins contains a putative FMN binding site and Fe-S cluster.

    \ ' '2718' 'IPR003440' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    This is the glycosyltransferase 48 family , which consists of various 1,3-beta-glucan synthase components including Gls1, Gls2 and Gls3 from yeast. 1,3-beta-glucan synthase () also known as callose synthase catalyses the formation of a beta-1,3-glucan polymer that is a major component of the fungal cell wall PUBMED:9209021. The reaction catalysed is:-

    UDP-glucose + {(1,3)-beta-D-glucosyl}(N)\ = UDP + {(1,3)-beta-D-glucosyl}(N+1).

    \ ' '2719' 'IPR003836' '\

    Glucokinases are found in invertebrates and microorganisms and are highly specific for glucose. These enzymes phosphorylate glucose using ATP as a donor to give glucose-6-phosphate and ADP PUBMED:9023215.

    \ ' '2720' 'IPR006148' '\ This entry contains 6-phosphogluconolactonase (), Glucosamine-6-phosphate isomerase (), and Galactosamine-6-phosphate isomerase. 6-phosphogluconolactonase is the enzyme responsible for the hydrolysis of 6-phosphogluconolactone to 6-phosphogluconate, the second step in the pentose phosphate pathway. Glucosamine-6-phosphate isomerase (or Glucosamine 6-phosphate deaminase) is the enzyme responsible for the conversion of D-glucosamine 6-phosphate into D-fructose 6-phosphate PUBMED:8747459. It is the last specific step in the pathway for N-acetylglucosamine (GlcNAC) utilization in bacteria such as Escherichia coli (gene nagB) or in fungi such as Candida albicans (gene NAG1).\ A region located in the central part of Glucosamine-6-phosphate isomerase contains a conserved histidine which has been shown PUBMED:8747459, in nagB, to be important for the pyranose ring-opening step of the catalytic mechanism.\ ' '2721' 'IPR002109' '\

    Glutaredoxins PUBMED:3152490, PUBMED:3286320, PUBMED:2668278, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system PUBMED:14713336.

    \

    Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin, which functions in a similar way, glutaredoxin possesses an active centre disulphide bond PUBMED:14962389. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond.

    \

    Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed PUBMED:1994586 that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro.

    \ \

    This entry represents Glutaredoxin.

    \ ' '2722' 'IPR007494' '\

    Glutaredoxins PUBMED:3152490, PUBMED:3286320, PUBMED:2668278, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system PUBMED:14713336.

    \

    Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin, which functions in a similar way, glutaredoxin possesses an active centre disulphide bond PUBMED:14962389. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond.

    \

    Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed PUBMED:1994586 that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro.

    \ \

    Unlike other glutaredoxins, glutaredoxin 2 (Grx2) cannot reduce ribonucleotide reductase. Grx2 has significantly higher catalytic activity in the reduction of mixed disulphides with glutathione (GSH) compared with other glutaredoxins. The active site residues (Cys9-Pro10-Tyr11-Cys12, in Escherichia coli Grx2, ), which are found at the interface between the N- and C-terminal domains are identical to other glutaredoxins, but there is no other similarity between glutaredoxin 2 and other glutaredoxins. Grx2 is structurally similar to glutathione-S-transferases (GST), but there is no obvious sequence similarity. The inter-domain contacts are mainly hydrophobic, suggesting that the two domains are unlikely to be stable on their own. Both domains are needed for correct folding and activity of Grx2. It is thought that the primary function of Grx2 is to catalyse reversible glutathionylation of proteins with GSH in cellular redox regulation including the response to oxidative stress. The N-terminal domain is .

    \ ' '2723' 'IPR001419' '\

    Gluten is the protein component of wheat flour. It consists of numerous\ proteins, which are of two different types responsible for different physical\ properties of dough: the glutenins, which are primarily responsible for\ the elasticity, and the gliadins, which contribute to the extensibility.

    \

    The glutenins are of two different types, termed low (LMW) and high \ molecular weight (HMW) subunits PUBMED:3840588. The glutenin high molecular weight subunits are classified as\ elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the\ original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all\ polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate\ cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic\ residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic\ domains that form disulphide cross-links. The central elastomeric domain is characterised by the following three repeated motifs:\ PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a\ regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm PUBMED:11084370.

    \ ' '2724' 'IPR015896' '\

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway PUBMED:16564539. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin PUBMED:17227226.

    \

    The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase (), or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase () to charge a tRNA with glutamate, glutamyl-tRNA reductase () to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase () to catalyse a transamination reaction to produce ALA.

    \

    The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase, ) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase, ) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase () to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.

    \

    Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase (). To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase () PUBMED:11215515.

    \ \ \

    This entry represents the C-terminal domain of glutamyl-tRNA reductase (), which reduces glutamyl-tRNA to glutamate-1-semialdehyde during the first stage of tetrapyrrole biosynthesis by the C5 pathway PUBMED:11215515, PUBMED:1502723. The enzyme requires NADPH as a cofactor.

    \ ' '2725' 'IPR015895' '\

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway PUBMED:16564539. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin PUBMED:17227226.

    \

    The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase (), or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase () to charge a tRNA with glutamate, glutamyl-tRNA reductase () to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase () to catalyse a transamination reaction to produce ALA.

    \

    The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase, ) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase, ) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase () to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.

    \

    Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase (). To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase () PUBMED:11215515.

    \ \ \

    This entry represents the N-terminal domain of glutamyl-tRNA reductase (), which reduces glutamyl-tRNA to glutamate-1-semialdehyde during the first stage of tetrapyrrole biosynthesis by the C5 pathway PUBMED:11215515, PUBMED:1502723. The enzyme requires NADPH as a cofactor.

    \ ' '2726' 'IPR004381' '\

    This family includes glycerate kinase 2 (), which catalyses the phosphorylation of (R)-glycerate to 3-phospho-(R)-glycerate in the presence of ATP. These proteins consist of two different alpha/beta domains: domain 1 has a flavodoxin-like fold, while domain 2 has a restriction enzyme-like fold (domain 2 is inserted into domain 1).

    \ ' '2727' 'IPR001150' '\

    Synonym(s): Pyruvate formate-lyase

    \ \

    Escherichia coli Formate C-acetyltransferase () (genes pflB and pflD) is a key enzyme of anaerobic glucose metabolism; it converts pyruvate and CoA into acetyl-CoA and formate. This enzyme is posttranslationally interconverted, under anaerobic conditions, from an inactive to an active form that carries a stable radical localized to a specific glycine at the C-terminus PUBMED:1310545. \ Such a glycine radical also seems PUBMED:8421692 to be present\ in E. coli (gene nrdD) and Bacteriophage T4 (gene nrdD or sunY) anaerobic ribonucleoside-triphosphate reductase ().

    \ ' '2728' 'IPR001360' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 1 comprises enzymes with a number of known activities; beta-glucosidase (); beta-galactosidase (); 6-phospho-beta-galactosidase (); 6-phospho-beta-glucosidase (); lactase-phlorizin hydrolase (), (); beta-mannosidase (); myrosinase ().

    \ ' '2729' 'IPR001000' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 10 \ comprises enzymes with a number of known activities; xylanase (); endo-1,3-beta-xylanase (); cellobiohydrolase (). These enzymes were formerly known as cellulase family F.

    \ \

    The microbial degradation of cellulose and xylans requires several types of\ enzymes such as endoglucanases (), cellobiohydrolases ()\ (exoglucanases), or xylanases () PUBMED:2252383, PUBMED:1886523. Fungi and bacteria produce\ a spectrum of cellulolytic enzymes (cellulases) and xylanases which, on the\ basis of sequence similarities, can be classified into families. One of these\ families is known as the cellulase family F PUBMED:2806912 or as the glycosyl hydrolases\ family 10 PUBMED:1747104.

    \ ' '2730' 'IPR001137' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 11 \ comprises enzymes with only one known activity, xylanase (). These enzymes were formerly known as cellulase family G.

    \ ' '2731' 'IPR000773' '\ Granulocyte-macrophage colony-stimulating factor (GMCSF) is a cytokine that acts in\ hematopoiesis to stimulate growth and differentiation of hematopoietic precursor cells\ from various lineages including granulocytes, macrophages, eosinophils and erythrocytes\ PUBMED:2458827, PUBMED:1569568. GMCSF is a glycoprotein of ~120 residues that contains 4 conserved\ cysteines that participate in disulphide bond formation. The crystal structure of recombinant\ human GMCSF has been determined PUBMED:1569568. There are two molecules in the asymmetric\ unit, which are related by an approximate non-crystallographic 2-fold axis. The overall\ structure, which is highly compact and globular with a predominantly hydrophobic core, is\ characterised by a 4-alpha-helix bundle. The helices are arranged in a left-handed anti-parallel\ fashion, with two overhand connections. Within the connections is a two-stranded \ anti-parallel beta-sheet. The tertiary structure has a topology similar to that of Sus scrofa (pig) growth\ factor and interferon-beta. Most of the proposed critical regions for receptor binding are\ located on a continuous surface at one end of the molecule that includes the C terminus\ PUBMED:1569568.\ ' '2732' 'IPR002594' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 12 comprises enzymes with two known activities: endoglucanase () and xyloglucan hydrolase (EC not defined). These enzymes were formerly known as cellulase family H.

    \ ' '2733' 'IPR001554' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 14 \ comprises enzymes with only one known activity; beta-amylase (). A Glu residue has been proposed as a catalytic residue, but it is not known if it is the nucleophile or the proton donor.

    \ \

    Beta-amylase PUBMED:2457058, PUBMED:2464171 is an enzyme that hydrolyses 1,4-alpha-glucosidic linkages in starch-type polysaccharide substrates so as to remove\ successive maltose units from the non-reducing ends of the chains. Beta-amylase is present in certain bacteria as well as in plants.

    \

    Three highly conserved sequence regions are found in all known beta-amylases.\ The first of these regions is located in the N-terminal section of the enzymes\ and contains an aspartate which is known PUBMED:2474529 to be involved in the catalytic\ mechanism. The second, located in a more central location, is centred around\ a glutamate which is also involved PUBMED:8174545 in the catalytic mechanism.

    \

    The 3D structure of a complex of soybean beta-amylase with an inhibitor\ (alpha-cyclodextrin) has been determined to 3.0A resolution by X-ray\ diffraction PUBMED:1491009. The enzyme folds into large and small domains: the large\ domain has a (beta alpha)8 super-secondary structural core, while the smaller\ is formed from two long loops extending from the beta-3 and beta-4 strands\ of the (beta alpha)8 fold PUBMED:1491009. The interface of the two domains, together\ with shorter loops from the (beta alpha)8 core, forms a deep cleft, in which\ the inhibitor binds PUBMED:1491009. Two maltose molecules also bind in the cleft,\ one sharing a binding site with alpha-cyclodextrin, and the other sitting\ more deeply in the cleft PUBMED:1491009.

    \ ' '2734' 'IPR011613' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 15 comprises enzymes with several known activities; glucoamylase (); alpha-glucosidase (); glucodextranase ().

    \ \ \

    Glucoamylase (GA) catalyses the release of\ D-glucose from the non-reducing ends of starch and other oligo- or poly-saccharides. Studies of fungal GA have indicated 3 closely-clustered acidic\ residues that play a role in the catalytic mechanism PUBMED:1970434. This region is also conserved in a recently sequenced bacterial GA PUBMED:1633799.

    \

    The 3D structure of the pseudo-tetrasaccharide acarbose complexed with\ glucoamylase II(471) from Aspergillus awamori var. X100 has been determined\ to 2.4A resolution PUBMED:8195212. The protein belongs to the mainly-alpha class, and contains 19 helices and 9 strands.

    \ ' '2735' 'IPR000490' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 17 comprises enzymes with several known activities; endo-1,3-beta-glucosidase (); lichenase (); exo-1,3-glucanase (). Currently these enzymes have only been found in plants and in fungi.

    \ ' '2736' 'IPR000726' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 19 comprises enzymes with only one known activity; chitinase ().

    \ \

    Chitinases PUBMED:1516675 are enzymes that catalyse the hydrolysis of the beta-1,4-N-acetyl-D-glucosamine linkages in chitin polymers. Chitinases belong to glycoside hydrolase families 18 or 19 PUBMED:1747104. Chitinases of family 19 (also known as classes IA or I and IB or II) are enzymes from plants that function in the defence against fungal and insect pathogens by destroying their chitin-containing cell wall. Class IA/I and IB/II enzymes differ in the presence (IA/I) or absence (IB/II) of an N-terminal chitin-binding domain. The catalytic domain of these enzymes consists of about 220 to 230 amino acid residues.

    \ ' '2737' 'IPR006102' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 2 \ comprises enzymes with several known activities: beta-galactosidase (); beta-mannosidase (); beta-glucuronidase ().

    \ \

    These enzymes contain a conserved glutamic acid residue which has been shown PUBMED:1350782, in Escherichia coli lacZ (), to be the general acid/base catalyst in the active site of the enzyme.

    \

    This entry describes the immunoglobulin-like beta-sandwich domain PUBMED:8008071.

    \ ' '2738' 'IPR015883' '\

    Glycoside hydrolase family 20 comprises enzymes with several known activities; beta-hexosaminidase (); lacto-N-biosidase (). The carbonyl oxygen of the C-2 acetamido group of the substrate acts as the catalytic nucleophile/base in this family of enzymes.

    \ \

    In the brain and other tissues, beta-hexosaminidase A degrades GM2 gangliosides; specifically, the enzyme hydrolyses terminal non-reducing N-acetyl-D-hexosamine residues in N-acetyl-beta-D-hexosaminides. There are 3 forms of beta-hexosaminidase: hexosaminidase A is a trimer, with one alpha, one beta-A and one beta-B chain; hexosaminidase B is a tetramer of two beta-A and two beta-B chains; and hexosaminidase S is a homodimer of alpha chains. The two beta chains are derived from the cleavage of a precursor. Mutations in the beta-chain lead to Sandhoff disease, a lysosomal storage disorder characterised by accumulation of GM2 ganglioside PUBMED:8357844.

    \ ' '2739' 'IPR015882' '\

    Glycoside hydrolase family 20 comprises enzymes with several known activities; beta-hexosaminidase (); lacto-N-biosidase (). The carbonyl oxygen of the C-2 acetamido group of the substrate acts as the catalytic nucleophile/base in this family of enzymes.

    \ \

    In the brain and other tissues, beta-hexosaminidase A degrades GM2 gangliosides; specifically, the enzyme hydrolyses terminal non-reducing N-acetyl-D-hexosamine residues in N-acetyl-beta-D-hexosaminides. There are 3 forms of beta-hexosaminidase: hexosaminidase A is a trimer, with one alpha, one beta-A and one beta-B chain; hexosaminidase B is a tetramer of two beta-A and two beta-B chains; and hexosaminidase S is a homodimer of alpha chains. The two beta chains are derived from the cleavage of a precursor. Mutations in the beta-chain lead to Sandhoff disease, a lysosomal storage disorder characterised by accumulation of GM2 ganglioside PUBMED:8357844.

    \ \

    This entry represents the beta-N-acetylhexosaminidase-like domain. It contains a similar fold but lacks the catalytic centre.

    \ ' '2740' 'IPR002053' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 25 comprises enzymes with only one known activity; lysozyme ().

    \

    It has been shown PUBMED:1916274, PUBMED:1747104 that a number of cell-wall lytic enzymes are evolutionarily related and can be classified into a single family.\ Two residues, an aspartate and a glutamate, have been shown PUBMED:567645 to be\ important for the catalytic activity of the Chalaropsis enzyme. These residues\ as well as some others in their vicinity are conserved in all proteins from\ this family.

    \ ' '2741' 'IPR000805' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 26 comprises enzymes with only one known activity; mannanase ().

    \ \

    Family 26 encompasses mainly mannan endo-1,4-beta-mannosidases.\ Mannan endo-1,4-beta-mannosidase hydrolyses mannan and galactomannan, but\ displays little activity towards other plant cell wall polysaccharides PUBMED:7848261. The enzyme randomly hydrolyses 1,4-beta-D-linkages in mannans, galacto-mannans, glucomannans and galactoglucomannans.

    \ ' '2742' 'IPR000743' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 28 comprises enzymes with several known activities; polygalacturonase (); exo-polygalacturonase (); exo-polygalacturonase (); rhamnogalacturonase (EC not defined).

    \ \

    Polygalacturonase (PG) (pectinase) PUBMED:2400785, PUBMED:2193922 catalyses the random hydrolysis of 1,4-alpha-D-galactosiduronic linkages in pectate and other galacturonans. In fruit, polygalacturonase plays an important role in cell wall metabolism during ripening. In plant bacterial pathogens such as Erwinia carotovora or Ralstonia solanacearum (Pseudomonas solanacearum) and fungal pathogens such as Aspergillus niger, polygalacturonase is involved in maceration and soft-rotting of plant tissue. Exo-poly-alpha-D-galacturonosidase () (exoPG) PUBMED:2168372 hydrolyses pectic acid from the non-reducing end, releasing digalacturonate. PG and exoPG share a few regions of sequence similarity, and belong to family 28 of the glycosyl hydrolases.

    \ ' '2743' 'IPR006103' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 2 \ comprises enzymes with several known activities; beta-galactosidase (); beta-mannosidase (); beta-glucuronidase ().

    \ \

    These enzymes contain a conserved glutamic acid residue which has been shown PUBMED:1350782, in Escherichia coli lacZ (), to be the general acid/base catalyst in the active site of the enzyme.

    \

    Beta-galactosidase from E. coli has a TIM-barrel-like core surrounded by four other largely beta domains PUBMED:8008071.

    \ \ ' '2744' 'IPR006104' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 2 \ comprises enzymes with several known activities; beta-galactosidase (); beta-mannosidase (); beta-glucuronidase ().

    \ \

    These enzymes contain a conserved glutamic acid residue which has been shown PUBMED:1350782, in Escherichia coli lacZ (), to be the general acid/base catalyst in the active site of the enzyme.

    The sugar binding domain has a jelly-roll fold PUBMED:8008071.

    \ ' '2745' 'IPR001764' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ Glycoside hydrolase family 3 comprises enzymes with a number of known activities; beta-glucosidase (); beta-xylosidase (); N-acetyl beta-glucosaminidase (); glucan\ beta-1,3-glucosidase (); cellodextrinase (); exo-1,3-1,4-glucanase ().\ \ These enzymes are two-domain globular proteins that are N-glycosylated at three sites PUBMED:10368285. This domain is often\ N-terminal to the glycoside hydrolase family 3, C-terminal domain .\ ' '2746' 'IPR013148' '\

    This domain corresponds to the N-terminal domain of glycosyl hydrolase family 32, which forms a five-bladed beta-propeller structure PUBMED:14973124.

    \ ' '2747' 'IPR001944' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 35 comprises enzymes with only one known activity; beta-galactosidase ().

    \ \

    Mammalian beta-galactosidase is a lysosomal enzyme (gene GLB1) which cleaves the terminal galactose from gangliosides, glycoproteins, and glycosaminoglycans and whose deficiency is the cause of the genetic disease Gm(1) gangliosidosis (Morquio disease type B).

    \ ' '2748' 'IPR000602' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 38 comprises enzymes with only one known activity; alpha-mannosidase () ().

    \ \

    Lysosomal alpha-mannosidase is necessary for the catabolism of N-linked carbohydrates released during glycoprotein turnover. The enzyme catalyses the hydrolysis of terminal, non-reducing alpha-D-mannose residues in alpha-D-mannosides, and can cleave all known types of alpha-mannosidic linkages. Defects in the gene cause lysosomal alpha-mannosidosis (AM), a lysosomal storage disease characterised by the accumulation of unbranched oligo-saccharide chains.

    \ ' '2749' 'IPR000514' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 39 comprises enzymes with several known activities; alpha-L-iduronidase (); beta-xylosidase ().

    \ \ \

    The most highly conserved regions in these enzymes are located in their N-terminal\ sections. These contain a glutamic acid residue which, on the basis of\ similarities with other families of glycosyl hydrolases PUBMED:7624375, probably acts as\ the proton donor in their catalytic mechanism.

    \ ' '2750' 'IPR002772' '\

    Glycoside hydrolase family 3 comprises enzymes with a number of known activities; beta-glucosidase (); beta-xylosidase (); N-acetyl beta-glucosaminidase (); glucan beta-1,3-glucosidase (); cellodextrinase (); exo-1,3-1,4-glucanase ().

    \ \

    These enzymes are two-domain globular proteins that are N-glycosylated at three sites PUBMED:10368285. This domain is often C-terminal to the glycoside hydrolase family 3, N-terminal domain .

    \ ' '2751' 'IPR001088' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 4 \ comprises enzymes with several known activities; 6-phospho-beta-glucosidase (); 6-phospho-alpha-glucosidase (); alpha-galactosidase ().

    \ \ \

    6-phospho-alpha-glucosidase requires both NAD(H) and divalent metal (Mn2+, Fe2+, Co2+, or Ni2+) for activity PUBMED:9765262.

    \ ' '2752' 'IPR013529' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ This group of beta-galactosidase enzymes () belongs to the glycosyl hydrolase 42 family . The enzyme catalyses the hydrolysis of terminal, non-reducing beta-D-galactose residues.\ ' '2753' 'IPR006710' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 43 includes enzymes with the following activities: beta-xylosidase (); alpha-L-arabinofuranosidase (); arabinanase (); and xylanase ().

    \ ' '2754' 'IPR000334' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 45 comprises enzymes with only one known activity; endoglucanase ().

    \ \

    The microbial degradation of cellulose and xylans requires several types of\ enzymes such as endoglucanases, cellobiohydrolases ()\ (exoglucanases), or xylanases () PUBMED:2252383, PUBMED:1886523.\ Fungi and bacteria produce\ a spectrum of cellulolytic enzymes (cellulases) and xylanases which, on the\ basis of sequence similarities, can be classified into families. One of these\ families is known as the cellulase family K or as the glycosyl hydrolases\ family 45 PUBMED:8352747.\ The best conserved region in these enzymes is located in the N-terminal\ section. It contains an aspartic acid residue which has been shown PUBMED:8377830 to act\ as a nucleophile in the catalytic mechanism.\ This region also has several cysteines that are involved in forming disulphide bridges.

    \ ' '2755' 'IPR000400' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 46 comprises enzymes with only one known activity; chitosanase ().

    \ \

    Chitosanase enzymes catalyse the endohydrolysis\ of beta-1,4-linkages between N-acetyl-D-glucosamine and D-glucosamine\ residues in a partly acetylated chitosan.

    \ ' '2756' 'IPR000556' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 48 comprises enzymes with several known activities; endoglucanase (); cellobiohydrolase ().

    \ \

    The largest cellulase gene sequenced to\ date is one of the cellulases (celA) from the genome of the thermophilic anaerobic bacterium Caldocellum saccharolyticum. The celA gene product is a polypeptide of 1751 amino acids; this has a multidomain structure\ comprising two catalytic domains and two cellulose-binding domains, linked by Pro-Thr-rich regions. The\ N-terminal domain encodes an endoglucanase activity on carboxymethylcellulose, consistent with its similarity\ to several endo-1,4-beta-D-glucanase sequences. The C-terminal domain shows similarity to a cellulase from\ Clostridium thermocellum (CelS), which acts synergistically with a second component to hydrolyse crystalline\ cellulose PUBMED:7612247.

    \ ' '2757' 'IPR005192' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This is a family of dextranase () and isopullulanase () which are all members of glycoside hydrolase family 49 (). Dextranase hydrolyses alpha-1,6-glycosidic bonds in dextran polymers.

    \ ' '2758' 'IPR000852' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 52 \ comprises enzymes with only one known activity; beta-xylosidase ().

    \ \

    Proteins harboring beta-xylosidase and xylanase activities PUBMED:8074507 have been\ identified in the Gram-positive, facultative thermophilic aerobe Bacillus stearothermophilus 21 PUBMED:8074507. This microbe, which functions in xylan\ degradation, can utilise xylan as a sole source of carbon. The enzyme\ hydrolyses 1,4-beta-D-xylans, removing successive D-xylose residues from\ the non-reducing termini. It also hydrolyses xylobiose.

    \ ' '2759' 'IPR001968' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 56 comprises enzymes with only one known activity; hyaluronidase .

    \ \

    The venom of Apis mellifera (Honeybee) contains several biologically-active peptides and\ two enzymes, one of which is a hyaluronidase PUBMED:7682712. The amino acid sequence\ of bee venom hyaluronidase contains 349 amino acids, and includes four\ cysteines and a number of potential glycosylation sites PUBMED:7682712. The sequence\ shows a high degree of similarity to PH-20, a membrane protein of mammalian\ sperm involved in sperm-egg adhesion, supporting the view that hyaluronidases\ play a role in fertilisation PUBMED:7682712.

    \

    PH-20 is required for sperm adhesion to the egg zona pellucida; it is\ located on both the sperm plasma membrane and acrosomal membrane PUBMED:2269661. The\ amino acid sequence of the mature protein contains 468 amino acids, and\ includes six potential N-linked glycosylation sites and twelve cysteines,\ eight of which are tightly clustered near the C-terminus PUBMED:2269661.

    \ ' '2760' 'IPR004300' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 57 comprises enzymes with two known activities; alpha-amylase () and 4-alpha-glucanotransferase ().

    \ ' '2761' 'IPR001286' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 59 comprises enzymes with only one known activity; galactocerebrosidase ().

    \ \

    Globoid cell leukodystrophy (Krabbe disease) is a severe, autosomal\ recessive disorder that results from deficiency of galactocerebrosidase\ (GALC) activity PUBMED:8661004, PUBMED:7601472, PUBMED:9434153. GALC is responsible for the lysosomal catabolism of\ certain galactolipids, including galactosylceramide and psychosine PUBMED:8661004.

    \ ' '2762' 'IPR016288' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    The 1,4-beta cellobiohydrolase family plays a central role in the recycling of plant biomass. The biological conversion of cellulose to glucose generally requires three types of hydrolytic enzymes: endoglucanases, which cut internal beta-1,4-glucosidic bonds; exocellobiohydrolases, which cut the disaccharide cellobiose from the non-reducing end of the cellulose polymer chain; and beta-1,4-glucosidases, which hydrolyze cellobiose and other short cello-oligosaccharides to glucose.

    \ ' '2763' 'IPR005103' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ The only known activity within this family is that of endoglucanase () \ ' '2764' 'IPR005193' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This is a family of alpha-L-arabinofuranosidases () which are all members of glycoside hydrolase family 62 (). These enzymes hydrolyse aryl alpha-L-arabinofuranosides and cleave arabinosyl side chains from arabinoxylan and arabinan.

    \ ' '2765' 'IPR005194' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This family of glycosyl hydrolases () contains this domain and includes vacuolar acid trehalase and maltose phosphorylases. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. The C-terminal domain forms a two-layered jelly roll motif. This domain is situated at the base of the catalytic domain; however, its function remains unknown PUBMED:11587643.

    \ ' '2766' 'IPR005195' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    The family of glycosyl hydrolases () which contains this domain includes vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. The central domain is the catalytic domain, which binds a phosphate ion that is proximal to the highly conserved Glu. The arrangement of the phosphate and the glutamate is thought to cause nucleophilic attack on the anomeric carbon atom PUBMED:11587643. The catalytic domain also forms the majority of the dimerisation interface.

    \ ' '2767' 'IPR005196' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    The family of glycosyl hydrolases () containing this domain includes vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. This domain is believed to be essential for catalytic activity PUBMED:11587643 although its precise function remains unknown.

    \ ' '2768' 'IPR003469' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ This family consists of the glycosyl hydrolase 68 family (), including several bacterial levansucrase enzymes, and invertase from Zymomonas. Levansucrase (), also known as beta-D-fructofuranosyl transferase, catalyses the conversion of sucrose and (2,6-beta-D-fructosyl)(N) to glucose and (2,6-beta-D-fructosyl)(N+1), where other sugars can also act as fructosyl acceptors. Invertase, or extracellular sucrase (), catalyses the hydrolysis of terminal non-reducing beta-D-fructofuranoside residues in beta-D-fructofuranosides.\ ' '2769' 'IPR001722' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 7 comprises enzymes with several known activities; endoglucanase (); cellobiohydrolase (). These enzymes were formerly known as cellulase family C.

    \ \

    Exoglucanases and cellobiohydrolases PUBMED:1886523 play a role in the conversion of cellulose to glucose by cutting the disaccharide\ cellobiose from the nonreducing end of the cellulose polymer chain.\ Structurally, cellulases and xylanases generally consist of a catalytic\ domain joined to a cellulose-binding domain (CBD) via a linker region that\ is rich in proline and/or hydroxy-amino acids. In type I exoglucanases, the\ CBD domain is found at the C-terminal extremity of these enzymes (this short\ domain forms a hairpin loop structure stabilised by 2 disulphide bridges).

    \ ' '2770' 'IPR003318' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ Glucosyltransferases or sucrose 6-glycosyl transferases (GTF-S) (, ) catalyse the transfer of D-glucopyranosyl units from sucrose onto acceptor molecules PUBMED:8982063. This signature roughly corresponds to the N-terminal catalytic domain of the enzyme. Members of this group also contain the putative cell wall binding repeat ().\ ' '2771' 'IPR005197' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This is a family of alpha-1,3-glucanases belonging to glycoside hydrolase family 71 ().

    \ ' '2772' 'IPR005199' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This is a family of endo-beta-N-glucuronidase, or heparanase, belonging to glycoside hydrolase family 79 (). Heparan sulphate proteoglycans (HSPGs) play a key role in the self-assembly, insolubility and barrier properties of basement membranes and extracellular matrices. Hence, cleavage of heparan sulphate (HS) affects the integrity and functional state of tissues and thereby fundamental normal and pathological phenomena involving cell migration and response to changes in the extracellular microenvironment. Heparanase degrades HS at specific intrachain sites. The enzyme is synthesized as a latent approximately 65 kDa protein that is processed at the N-terminus into a highly active approximately 50 kDa form. Experimental evidence suggests that heparanase may facilitate both tumor cell invasion and neovascularization, both critical steps in cancer progression. The enzyme is also involved in cell migration associated with inflammation and autoimmunity PUBMED:11530216.

    \ ' '2773' 'IPR002037' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 8 comprises enzymes with several known activities; endoglucanase (); lichenase (); chitosanase (). These enzymes were formerly known as cellulase family D PUBMED:2806912.

    \ ' '2774' 'IPR005200' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This is a family of eukaryotic beta-1,3-glucanases belonging to glycoside hydrolase family 81 ().

    \ ' '2775' 'IPR001701' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 9 comprises enzymes with several known activities; endoglucanase (); cellobiohydrolase (). These enzymes were formerly known as cellulase family E.

    \ ' '2776' 'IPR004629' '\

    The WecG member of this superfamily, believed to be UDP-N-acetyl-D-mannosaminuronic acid transferase, plays a role in Enterobacterial common antigen (eca) synthesis in Escherichia coli. Another family member, the Bacillus subtilis TagA protein, is involved in the biosynthesis of the cell wall polymer poly(glycerol phosphate). The third family member, CpsF, CMP-N-acetylneuraminic acid synthetase has a role in the capsular polysaccharide biosynthesis pathway.

    \ ' '2777' 'IPR002516' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 11 comprises enzymes with only one known activity; galactoside 2-L-fucosyltransferase ().

    \ \

    Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Galactoside 2-L-fucosyltransferase 1 () and Galactoside 2-L-fucosyltransferase 2 () belong to the Hh blood group system and are associated with H/h and Se/se antigens.

    \ ' '2778' 'IPR002685' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 15 comprises enzymes with only one known activity; glycolipid 2-alpha-mannosyltransferase .

    \ ' '2779' 'IPR004310' '\ This entry contains proteins encoded by ORF3 of Equine arteritis virus. They are possible envelope glycoproteins.\ ' '2780' 'IPR006813' '\ This family represents beta-1,4-mannosyl-glycoprotein beta-1,4-N-acetylglucosaminyltransferase (). This enzyme transfers the bisecting GlcNAc to the core mannose of complex N-glycans. The addition of this residue is regulated during development and has functional consequences for receptor signalling, cell adhesion, and tumour progression PUBMED:11986323, PUBMED:11784313.\ ' '2781' 'IPR004276' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 28 comprises enzymes with a number of known activities; 1,2-diacylglycerol 3-beta-galactosyltransferase (); 1,2-diacylglycerol 3-beta-glucosyltransferase (); beta-N-acetylglucosamine transferase ().

    \ ' '2782' 'IPR006759' '\

    Complex-type oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains.

    In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyse the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV () catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms a GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very N-terminal, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminal end of the protein PUBMED:9278430.

    \ ' '2783' 'IPR005076' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 6 comprises enzymes with three known activities; \ alpha-1,3-galactosyltransferase (); alpha-1,3 N-acetylgalactosaminyltransferase ();\ alpha-galactosyltransferase ().

    \ ' '2784' 'IPR002201' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 9 comprises enzymes with two known activities; lipopolysaccharide N-acetylglucosaminyltransferase (), heptosyltransferase ().

    \ \

    Heptosyltransferase I is thought to add L-glycero-D-manno-heptose to the inner\ 3-deoxy-D-manno-octulosonic acid (Kdo) residue of the lipopolysaccharide core PUBMED:9446588.\ Heptosyltransferase II is a glycosyltransferase involved in the synthesis of the inner core region of lipopolysaccharide PUBMED:11054112. Lipopolysaccharide is a major component of the outer leaflet of the outer membrane in Gram-negative bacteria. It is composed of three domains; lipid A, Core oligosaccharide and the O-antigen. These enzymes transfer heptose to the lipopolysaccharide core PUBMED:9446588.

    \ ' '2785' 'IPR000741' '\

    Fructose-bisphosphate aldolase () PUBMED:2199259, PUBMED:1412694 is a glycolytic \ enzyme that catalyses the reversible aldol cleavage or condensation of fructose-1,6-bisphosphate into dihydroxyacetone-phosphate and glyceraldehyde 3-phosphate. There are two classes of fructose-bisphosphate aldolases with different catalytic mechanisms: class I enzymes PUBMED:3355497 are found in animals, do not require a metal ion, and are characterised by the formation of a Schiff base intermediate between a highly conserved active site lysine and a substrate carbonyl group, while the class II enzymes are produced in bacteria and fungi, and require an active-site divalent metal ion. This entry represents the class I enzymes.

    \

    In vertebrates, three forms of this enzyme are found: aldolase A is expressed in muscle, aldolase B in liver, kidney, stomach and intestine, and aldolase C in brain, heart and ovary. The different isozymes have different catalytic functions: aldolases A and C are mainly involved in glycolysis, while aldolase B is involved in both glycolysis and gluconeogenesis. Defects in aldolase B result in hereditary fructose intolerance.

    \ \ ' '2786' 'IPR001195' '\

    Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others.

    \

    Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane PUBMED:2605264. Structurally, glycophorin A consists of\ an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues,\ followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.

    \ ' '2787' 'IPR000234' '\ The proteins in this family are surface glycoproteins of various herpesviruses.\ The glycoprotein is anchored to the lipid envelope of the virus by a transmembrane region.\ ' '2788' 'IPR000925' '\

    This family includes attachment proteins from respiratory syncytial virus. Glycoprotein G has not been shown to have any neuraminidase or haemagglutinin activity. The amino terminus is thought to be cytoplasmic, and the carboxyl terminus extracellular. The extracellular region contains four completely conserved cysteine residues.

    \ ' '2789' 'IPR017459' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    The glycosyl transferase family includes anthranilate phosphoribosyltransferase (TrpD, ) and thymidine phosphorylase ().\ All these proteins can transfer a phosphorylated ribose substrate. Thymidine phosphorylase () catalyses the reversible phosphorolysis\ of thymidine, deoxyuridine and their analogues to their respective bases and\ 2-deoxyribose 1-phosphate. This enzyme regulates the availability of thymidine\ and is therefore essential to nucleic acid metabolism.

    \ \ \ \

    This N-terminal domain is found in various family 3 glycosyl transferases, including anthranilate phosphoribosyltransferase (TrpD, ) and thymidine phosphorylase ().\ All these proteins can transfer a phosphorylated ribose substrate. Thymidine phosphorylase catalyses the reversible phosphorolysis of thymidine, deoxyuridine and their analogues to their respective bases and 2-deoxyribose 1-phosphate. This enzyme regulates the availability of thymidine and is therefore essential to nucleic acid metabolism.

    \ ' '2790' 'IPR001296' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Proteins containing this domain transfer UDP, ADP, GDP or CMP linked sugars to a variety of \ substrates, including glycogen, fructose-6-phosphate and lipopolysaccharides. The \ bacterial enzymes are involved in various biosynthetic processes that include\ exopolysaccharide biosynthesis, lipopolysaccharide core biosynthesis and the biosynthesis\ of the slime polysaccharide colanic acid. Mutations in this domain of the human\ N-acetylglucosaminyl-phosphatidylinositol biosynthetic protein are the cause of \ paroxysmal nocturnal hemoglobinuria (PNH), an acquired hemolytic blood disorder\ characterised by venous thrombosis, erythrocyte hemolysis, infections and defective \ hematopoiesis.

    \ ' '2791' 'IPR000312' '\

    The glycosyl transferase family includes anthranilate phosphoribosyltransferase (TrpD, ) and thymidine phosphorylase ().\ All these proteins can transfer a phosphorylated ribose substrate. Thymidine phosphorylase () catalyses the reversible phosphorolysis\ of thymidine, deoxyuridine and their analogues to their respective bases and\ 2-deoxyribose 1-phosphate. This enzyme regulates the availability of thymidine\ and is therefore essential to nucleic acid metabolism.

    \ \ \ \ ' '2792' 'IPR007507' '\

    This is a domain found in proteins that transfer activated sugars to a variety of substrates, including glycogen, fructose-6-phosphate and lipopolysaccharides. Proteins bearing this domain transfer UDP, ADP, GDP or CMP linked sugars. This region is flanked at the N terminus by a signal peptide and at the C terminus by a glycosyl transferase group 1 domain (). The eukaryotic glycogen synthases may be distant members of this bacterial family PUBMED:10952982.

    \ ' '2793' 'IPR001863' '\

    Glypicans PUBMED:8589707, PUBMED:7657705 are a family of heparan sulphate proteoglycans which are anchored to cell membranes by a glycosylphosphatidylinositol (GPI) linkage. Six members (GPC1-6) are known in vertebrates PUBMED:11474185. Structurally, these proteins consist of three separate domains:

    \ \ ' '2794' 'IPR007867' '\

    The glucose-methanol-choline (GMC) oxidoreductases are FAD flavoprotein oxidoreductases PUBMED:1542121, PUBMED:8218217. These enzymes include a variety of proteins, such as choline dehydrogenase (CHD), methanol oxidase (MOX) and cellobiose dehydrogenase () PUBMED:10725534, which share a number of regions of sequence similarity. The function of this C-terminal conserved domain is not yet known.

    \ ' '2795' 'IPR000172' '\

    The glucose-methanol-choline (GMC) oxidoreductases are FAD\ flavoprotein oxidoreductases PUBMED:1542121, PUBMED:8218217.\ These enzymes include a variety of proteins, such as choline dehydrogenase (CHD), methanol oxidase (MOX) and cellobiose dehydrogenase () PUBMED:10725534, which share a number of regions of sequence similarity. One of\ these regions, located in the N-terminal section, corresponds to the FAD ADP-\ binding domain. The function of the other conserved domains is not yet known.

    \ ' '2796' 'IPR002012' '\ The gonadotropin-releasing hormones (GnRH) (gonadoliberin) PUBMED: are a family\ of peptides that play a pivotal role in reproduction. The main function of\ GnRH is to act on the pituitary to stimulate the synthesis and secretion of\ luteinizing and follicle-stimulating hormones, but GnRH also acts on the\ brain, retina, sympathetic nervous system, gonads and placenta in certain\ species. There seem to be at least three forms of GnRH. The second form is\ expressed in the midbrain and seems to be widespread. The third form has so far\ been found only in fish.\ GnRH is a C-terminal amidated decapeptide processed from a larger precursor\ protein. Four of the ten residues are perfectly conserved in all species\ where GnRH has been sequenced.\ ' '2797' 'IPR004139' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \ Alpha-1,3-mannosyl-glycoprotein beta-1,2-N-acetylglucosaminyltransferase (GNT-I, GLCNAC-T I) transfers N-acetyl-D-glucosamine from UDP to high-mannose glycoprotein N-oligosaccharide. This is an essential step in the synthesis of complex or hybrid-type N-linked oligosaccharides. The enzyme is an integral membrane protein localized to the Golgi apparatus, and is probably distributed in all tissues. The catalytic domain is located at the C-terminus PUBMED:10406843. These proteins are members of the glycosyl transferase family 13 ()\ ' '2798' 'IPR003474' '\ This is a family of integral membrane permeases that are involved in gluconate uptake. Escherichia coli contains several members of this family including GntU, a low affinity transporter PUBMED:9135111 and GntT, a high affinity transporter PUBMED:9045817.\ ' '2799' 'IPR000524' '\

    Many bacterial transcription regulation proteins bind DNA through a helix-turn-helix (HTH) motif, which can be classified into subfamilies on the basis of sequence similarities. The HTH GntR family has many members distributed among diverse bacterial groups that regulate various biological processes. It was named GntR after the Bacillus subtilis repressor of the gluconate operon PUBMED:2060763. Family members include GntR, HutC, KorA, NtaR, FadR, ExuR, FarR, DgoR and PhnF. The crystal structure of the FadR protein has been determined PUBMED:11013219. In general, these proteins contain a DNA-binding HTH domain at the N terminus, and an effector-binding or oligomerisation domain at the C terminus (). The DNA-binding domain is well conserved in structure for the whole of the GntR family, consisting of a 3-helical bundle core with a small beta-sheet (wing); the GntR winged helix structure is similar to that found in several other transcriptional regulator families. The regions outside the DNA-binding domain are more variable and are consequently used to define GntR subfamilies PUBMED:11756427. This entry represents the N-terminal DNA-binding domain of the GntR family.

    \ ' '2800' 'IPR003109' '\

    In heterotrimeric G-protein signalling, cell surface receptors (GPCRs) are\ coupled to membrane-associated heterotrimers comprising a GTP-hydrolysing subunit G-alpha and a G-beta/G-gamma dimer. The inactive form contains the alpha subunit bound to GDP and complexed with the beta and gamma subunits. When the ligand binds to the receptor, GDP is displaced from G-alpha and GTP is bound. The GTP/G-alpha complex dissociates from the trimer and associates with an effector until the intrinsic GTPase activity of G-alpha returns the protein to the GDP-bound form. Reassociation of GDP-bound G-alpha with the G-beta/G-gamma dimer terminates the signal. Several mechanisms regulate the signal output at different stages of the G-protein cascade. Two classes of intracellular proteins act as inhibitors of G protein activation: GTPase activating proteins (GAPs), which enhance GTP hydrolysis (see ),\ and guanine nucleotide dissociation inhibitors (GDIs), which inhibit GDP dissociation.\ The GoLoco or G-protein regulatory (GPR) motif found in various G-protein\ regulators PUBMED:10470031, PUBMED:10606204 acts as a GDI on G-alpha(i) PUBMED:11121039, PUBMED:11024022.

    \ \

    The crystal structure of the GoLoco motif in complex with G-alpha(i) has been solved PUBMED:11976690. It consists of three small alpha helices. The highly conserved Asp-Gln-Arg triad within the GoLoco motif participates directly in GDP binding by extending the arginine side chain into the nucleotide binding pocket, highly reminiscent of the catalytic arginine finger employed in GTPase-activating protein (see ). This addition of an arginine in the binding pocket affects the interaction of GDP with G-alpha and therefore is certainly important for the GoLoco GDI activity PUBMED:11976690.

    \ \

    Some proteins known to contain a GoLoco motif are listed below:

    \ \

    \ ' '2801' 'IPR007305' '\ Traffic through the yeast Golgi complex depends on a member of the syntaxin family of SNARE proteins, Sed5, present in early Golgi cisternae. Got1 is thought to facilitate Sed5-dependent fusion events PUBMED:10406798.\ ' '2802' 'IPR000777' '\

    The entry of HIV requires interaction of viral GP120, an envelope glycoprotein, with human T-cell surface glycoprotein CD4 and a chemokine receptor on the cell surface. These envelope glycoproteins are found in HIV types 1 and 2, and Simian Immunodeficiency virus (SIV).

    \ ' '2804' 'IPR004257' '\ This family contains a predicted structural envelope protein GP4 from Equine arteritis virus (EAV).\ ' '2805' 'IPR000328' '\

    The gp41 subunit of the envelope protein complex from Human immunodeficiency virus (HIV) and Simian immunodeficiency virus (SIV-cpz) mediates membrane fusion during viral entry PUBMED:9689046. It has a core composed of a six-helix bundle and is folded by its trimeric N- and C-terminal heptad-repeats (NHR and CHR) PUBMED:18417584. Derivatives of this protein prevent HIV-1 from entering cell lines and primary human CD4+ cells in vitro PUBMED:18449216, making it an attractive subject of gene therapy studies against HIV and related retroviruses.

    \ ' '2806' 'IPR004196' '\

    The assembly of a macromolecular structure proceeds via a specific pathway of ordered events and occurs by changing of protein conformations as they join the assembly. The assembly process is aided by scaffolding proteins, which act as chaperones. In bacteriophages, scaffolding proteins B and D are responsible for procapsid formation. 240 copies of protein D form the external scaffold, while 60 copies of protein B form the internal scaffold PUBMED:9305849. The role of scaffolding protein D is in the production of viral single-stranded DNA.

    \ \ ' '2807' 'IPR000173' '\

    Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) plays an important role in glycolysis and gluconeogenesis PUBMED:2716055 by reversibly catalysing the oxidation and phosphorylation of D-glyceraldehyde-3-phosphate to 1,3-diphospho-glycerate. The enzyme exists as a tetramer of identical subunits, each containing 2 conserved functional domains: an NAD-binding domain, and a highly conserved catalytic domain PUBMED:6303388. The enzyme has been found to bind to actin and tropomyosin, and may thus have a role in cytoskeleton assembly. Alternatively, the cytoskeleton may provide a framework for precise positioning of the glycolytic enzymes, thus permitting efficient passage of metabolites from enzyme to enzyme PUBMED:6303388.

    \

    GAPDH displays diverse non-glycolytic functions as well, its role depending upon its subcellular location. For instance, the translocation of GAPDH to the nucleus acts as a signalling mechanism for programmed cell death, or apoptosis PUBMED:10740219. The accumulation of GAPDH within the nucleus is involved in the induction of apoptosis, where GAPDH functions in the activation of transcription. The presence of GAPDH is associated with the synthesis of pro-apoptotic proteins like BAX, c-JUN and GAPDH itself.

    \

    GAPDH has been implicated in certain neurological diseases: GAPDH is able to bind to the gene products from neurodegenerative disorders such as Huntington\'s disease, Alzheimer\'s disease, Parkinson\'s disease and Machado-Joseph disease through stretches encoded by their CAG repeats. Abnormal neuronal apoptosis is associated with these diseases. Propargylamines such as deprenyl increase neuronal survival by interfering with apoptosis signalling pathways via their binding to GAPDH, which decreases the synthesis of pro-apoptotic proteins PUBMED:12721812.

    \ ' '2808' 'IPR000173' '\

    Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) plays an important role in glycolysis and gluconeogenesis PUBMED:2716055 by reversibly catalysing the oxidation and phosphorylation of D-glyceraldehyde-3-phosphate to 1,3-diphospho-glycerate. The enzyme exists as a tetramer of identical subunits, each containing 2 conserved functional domains: an NAD-binding domain, and a highly conserved catalytic domain PUBMED:6303388. The enzyme has been found to bind to actin and tropomyosin, and may thus have a role in cytoskeleton assembly. Alternatively, the cytoskeleton may provide a framework for precise positioning of the glycolytic enzymes, thus permitting efficient passage of metabolites from enzyme to enzyme PUBMED:6303388.

    \

    GAPDH displays diverse non-glycolytic functions as well, its role depending upon its subcellular location. For instance, the translocation of GAPDH to the nucleus acts as a signalling mechanism for programmed cell death, or apoptosis PUBMED:10740219. The accumulation of GAPDH within the nucleus is involved in the induction of apoptosis, where GAPDH functions in the activation of transcription. The presence of GAPDH is associated with the synthesis of pro-apoptotic proteins like BAX, c-JUN and GAPDH itself.

    \

    GAPDH has been implicated in certain neurological diseases: GAPDH is able to bind to the gene products from neurodegenerative disorders such as Huntington\'s disease, Alzheimer\'s disease, Parkinson\'s disease and Machado-Joseph disease through stretches encoded by their CAG repeats. Abnormal neuronal apoptosis is associated with these diseases. Propargylamines such as deprenyl increase neuronal survival by interfering with apoptosis signalling pathways via their binding to GAPDH, which decreases the synthesis of pro-apoptotic proteins PUBMED:12721812.

    \ ' '2809' 'IPR007720' '\ Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This chemically simple step is genetically complex because three or four genes are required in Saccharomyces cerevisiae (GPI1, GPI2 and GPI3) and mammals (GPI1, PIG A, PIG H and PIG C), respectively PUBMED:11849707.\ ' '2810' 'IPR007245' '\ GPI (glycosyl phosphatidyl inositol) transamidase is a multiprotein complex. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase. GPI transamidase adds glycosylphosphatidylinositols (GPIs) to newly synthesized proteins. Gpi16 is an essential N-glycosylated transmembrane glycoprotein. Gpi16 is largely found on the lumenal side of the ER. It has a single C-terminal transmembrane domain and a small C-terminal, cytosolic extension with an ER retrieval motif PUBMED:11598210.\ ' '2811' 'IPR004174' '\ gpW is a 68 residue protein known to be present in phage particles. Extracts of phage-infected cells lacking gpW contain DNA-filled heads, and active tails, but no infectious virions. gpW is required for the addition of gpFII to the head, which is, in turn, required for the attachment of tails. Since gpFII and tails are known to be attached at the connector, gpW is also likely to assemble at this site. The addition of gpW to filled heads increases the DNase resistance of the packaged DNA, suggesting that gpW either forms a plug at the connector to prevent ejection of the DNA, or binds directly to the DNA. 
The large number of positively charged residues in gpW (its calculated pI is 10.8) is consistent with a role in DNA interaction PUBMED:11302702.\ ' '2812' 'IPR007048' '\ These proteins from Bacteriophage T4 and related phage may be a structural component of the outer wedge of the baseplate that has acidic lysozyme activity PUBMED:3186452, PUBMED:3520236.\ ' '2813' 'IPR008119' '\

    Toxoplasma gondii is an obligate intracellular apicomplexan protozoan \ parasite, with a complex lifestyle involving varied hosts PUBMED:11269320. It has two \ phases of growth: an intestinal phase in feline hosts, and an extra-intestinal phase in other mammals. Oocysts from infected cats develop \ into tachyzoites, and eventually, bradyzoites and zoitocysts in the \ extraintestinal host PUBMED:11269320. Transmission of the parasite occurs through \ contact with infected cats or raw/undercooked meat; in immunocompromised \ individuals, it can cause severe and often lethal toxoplasmosis. Acute \ infection in healthy humans can sometimes also cause tissue damage PUBMED:11269320.\

    \

    The protozoan utilises a variety of secretory and antigenic proteins to \ invade a host and gain access to the intracellular environment PUBMED:11269320. These \ originate from distinct organelles in the T. gondii cell termed micronemes, \ rhoptries, and dense granules. They are released at specific times during \ invasion to ensure the proteins are allocated to their correct target \ destinations PUBMED:11269320. \ Dense granule antigens (GRAs) are released from the T. gondii tachyzoite\ while still encapsulated in a host vacuole.

    \

    Gra6, one of these moieties, is associated with the parasitophorous vacuole PUBMED:10498186. It possesses a hydrophobic\ central region flanked by two hydrophilic domains, and is present as a\ single copy gene in the Toxoplasma gondii genome PUBMED:10498186. Gra6\ shares a similar function with Gra2, in that it is rapidly targeted to a network of membranous tubules that connect with the vacuolar membrane PUBMED:10498186. Indeed, these two proteins, together with Gra4, form a multimeric complex that stabilises the parasite within the vacuole.

    \ ' '2814' 'IPR001702' '\

    The outer membrane of Gram-negative bacteria acts as a molecular filter for hydrophilic compounds. Proteins, known as porins PUBMED:2901351, are responsible for the \'molecular sieve\' properties of the outer membrane. Porins form large water-filled channels which allow the diffusion of hydrophilic molecules into the periplasmic space. Some porins form general diffusion channels that allow any solute up to a certain size (that size is known as the exclusion limit) to cross the membrane, while other porins are specific for a solute and contain a binding site for that solute inside the pores (these are known as selective porins). As porins are the major outer membrane proteins, they also serve as receptor sites for the binding of phages and bacteriocins.

    \

    General diffusion porins generally assemble as trimers in the membrane, and the transmembrane core of these proteins is composed exclusively of beta strands PUBMED:2178269. It has been shown PUBMED:1662760 that a number of general porins are evolutionarily related; these porins are:\

    \ ' '2815' 'IPR019948' '\

    Viruses, parasites and bacteria are covered in protein and sugar molecules that help them gain entry into a host by counteracting the host\'s defences. One such molecule is the M protein produced by certain streptococcal bacteria. M proteins embody a motif that is now known to be shared by many Gram-positive bacterial surface proteins. The motif includes a conserved hexapeptide, which precedes a hydrophobic C-terminal membrane anchor, which itself precedes a cluster of basic residues PUBMED:2188957, PUBMED:2287281.\ This structure is represented in the following schematic representation:

    \
    \
      +--------------------------------------------+-+--------+-+\
      |    Variable length extracellular domain    |H| Anchor |B|\
      +--------------------------------------------+-+--------+-+\
    \
      \'H\': conserved hexapeptide.\
      \'B\': cluster of basic residues.\
    
    \

    It has been proposed that this hexapeptide sequence is responsible for a post-\ translational modification necessary for the proper anchoring of the proteins\ which bear it, to the cell wall.

    \ \ ' '2816' 'IPR001990' '\

    Granins (chromogranins or secretogranins) PUBMED:2053134 are a family of acidic proteins present in the secretory granules of a wide variety of endocrine and neuro-endocrine cells. The exact function(s) of these proteins is not yet known but they seem to be the precursors of biologically active peptides and/or they may act as helper proteins in the packaging of peptide hormones and neuropeptides. Apart from their subcellular location and the abundance of acidic residues (Asp and Glu), these proteins do not share many structural similarities. Only one short region, located in the C-terminal section, is conserved in all these proteins.

    \

    Chromogranins and secretogranins together share a C-terminal motif, whereas chromogranins A and B share a region of high similarity in their N-terminal section; this region includes two cysteine residues involved in a disulphide bond.

    \ ' '2817' 'IPR000118' '\

    Metazoan granulins PUBMED:1542665 are a family of cysteine-rich peptides of about 6 kDa which may\ have multiple biological activities. A precursor protein (known as acrogranin)\ potentially encodes seven different forms of granulin (grnA to grnG) which are\ probably released by post-translational proteolytic processing. \ Granulins are evolutionarily related to PMP-D1, a peptide extracted from the\ pars intercerebralis of migratory locusts PUBMED:1740125.\ A schematic representation of the structure of a granulin is shown below:\ \

    \
           xxxCxxxxxCxxxxxCCxxxxxxxxCCxxxxxxCCxxxxxCCxxxxxCxxxxxxCx\
    \
    \'C\': conserved cysteine probably involved in a disulphide bond.\
    
    \

    \

    In plants a granulin domain is often associated with the C terminus of cysteine proteases belonging to the MEROPS peptidase family C1, subfamily C1A (papain).

    \ ' '2818' 'IPR007583' '\ GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system PUBMED:10487747.\ ' '2819' 'IPR006812' '\ Found in clostridia, this protein contains one active site selenocysteine and catalyses the reductive deamination of glycine, which is coupled to the esterification of orthophosphate resulting in the formation of ATP PUBMED:2963330. A member of this family may also exist in Treponema denticola PUBMED:11797052.\ ' '2820' 'IPR001437' '\ Bacterial proteins greA and greB are necessary for efficient RNA\ polymerase transcription elongation past template-encoded arresting sites.\ Arresting sites in DNA have the property of trapping a certain fraction of\ elongating RNA polymerases that pass through, resulting in locked DNA/RNA/\ polymerase ternary complexes. Cleavage of the nascent transcript by cleavage\ factors, such as greA or greB, allows the resumption of elongation from the\ new 3\' terminus PUBMED:8431948, PUBMED:7854424.

    Escherichia coli GreA and GreB are sequence homologues and have homologues in\ every known bacterial genome PUBMED:12914698. GreA induces cleavage two or three nucleotides behind the terminus\ and can only prevent\ the formation of arrested complexes while greB releases longer sequences up to eighteen nucleotides in length and can\ rescue preexisting arrested complexes. These functional differences correlate with a\ distinctive structural feature, the distribution of positively charged residues on one face of the N-terminal coiled\ coil. Remarkably, despite close functional similarity, the prokaryotic Gre factors have no\ sequence or structural similarity with eukaryotic TFIIS.

    \ ' '2821' 'IPR001437' '\ Bacterial proteins greA and greB are necessary for efficient RNA\ polymerase transcription elongation past template-encoded arresting sites.\ Arresting sites in DNA have the property of trapping a certain fraction of\ elongating RNA polymerases that pass through, resulting in locked DNA/RNA/\ polymerase ternary complexes. Cleavage of the nascent transcript by cleavage\ factors, such as greA or greB, allows the resumption of elongation from the\ new 3\' terminus PUBMED:8431948, PUBMED:7854424.

    Escherichia coli GreA and GreB are sequence homologues and have homologues in\ every known bacterial genome PUBMED:12914698. GreA induces cleavage two or three nucleotides behind the terminus\ and can only prevent\ the formation of arrested complexes while greB releases longer sequences up to eighteen nucleotides in length and can\ rescue preexisting arrested complexes. These functional differences correlate with a\ distinctive structural feature, the distribution of positively charged residues on one face of the N-terminal coiled\ coil. Remarkably, despite close functional similarity, the prokaryotic Gre factors have no\ sequence or structural similarity with eukaryotic TFIIS.

    \ ' '2822' 'IPR000791' '\

    Several uncharacterised proteins are evolutionarily related, including Yarrowia lipolytica (Candida lipolytica)\ glyoxylate pathway regulator GPR1; yeast protein FUN34 and hypothetical proteins YCR10c and YDR384c; fission yeast hypothetical protein SpAC5D6.09c; Escherichia coli hypothetical protein yaaH; and Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum) hypothetical protein Mth215. They are hydrophobic proteins that seem to contain six transmembrane regions and which could therefore be involved in transport. They have from 188 to 283 amino acids.

    \ ' '2823' 'IPR000740' '\

    Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. In prokaryotes, dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold PUBMED:8280473. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.

    The X-ray crystal structure of GrpE in complex with the ATPase domain of DnaK revealed that GrpE is an asymmetric homodimer, bent in a manner that favours extensive contacts with only one DnaK ATPase monomer PUBMED:15136046. GrpE does not actively compete for the atomic positions occupied by the nucleotide. GrpE and ADP mutually reduce one another\'s affinity for DnaK 200-fold, and ATP instantly dissociates GrpE from DnaK.

    \ ' '2824' 'IPR004218' '\ Prokaryotic glutathione synthetase (glutathione synthase) catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to orthophosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis. The enzyme is inhibited by 7,8-dihydrofolate, methotrexate and trimethoprim. This is the ATP-binding domain of the enzyme.\ ' '2825' 'IPR004215' '\ Prokaryotic glutathione synthetase (glutathione synthase) catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to orthophosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis. The enzyme is inhibited by 7,8-dihydrofolate, methotrexate and trimethoprim. This domain is the N-terminus of the enzyme.\ ' '2826' 'IPR005615' '\

    This entry represents eukaryotic glutathione synthetase () (GSS), a homodimeric enzyme that catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to phosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis, the first step being catalysed by gamma-glutamylcysteine synthetase PUBMED:15981742. In humans, defects in GSS are inherited in an autosomal recessive way and are the cause of severe metabolic acidosis, 5-oxoprolinuria, and increased rate of haemolysis and defective function of the central nervous system.

    \ ' '2827' 'IPR004887' '\

    This entry represents the substrate-binding domain of glutathione synthetase () (GSS), a homodimeric enzyme that catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to phosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis, the first step being catalysed by gamma-glutamylcysteine synthetase PUBMED:15981742. In humans, defects in GSS are inherited in an autosomal recessive way and are the cause of severe metabolic acidosis, 5-oxoprolinuria, and increased rate of haemolysis and defective function of the central nervous system. The substrate-binding domain has a 3-layer alpha/beta/alpha structure PUBMED:10369661.

    \ ' '2828' 'IPR000889' '\

    Glutathione peroxidase (GSHPx) () is an enzyme that catalyses the reduction of hydroperoxides by glutathione PUBMED:, PUBMED:7565867. Its main function is to protect against the damaging effect of endogenously formed hydroperoxides. In higher vertebrates, several forms of GSHPx are known, including a ubiquitous cytosolic form (GSHPx-1), a gastrointestinal cytosolic form (GSHPx-GI), a plasma secreted form (GSHPx-P), and an epididymal secretory form (GSHPx-EP). In addition to these characterised forms, the sequence of a protein of unknown function PUBMED:2771650 has been shown to be evolutionarily related to those of GSHPxs.

    \ \

    In filarial nematode parasites, the major soluble cuticular protein (gp29) is a secreted GSHPx, which may provide a mechanism of resistance to the immune reaction of the mammalian host by neutralising the products of the oxidative burst of leukocytes PUBMED:1631065. The Escherichia coli protein btuE, a periplasmic protein involved in vitamin B12 transport, is evolutionarily related to GSHPxs, although the significance of this relationship is unclear. The structure of bovine seleno-glutathione peroxidase has been determined PUBMED:6852035. The protein belongs to the alpha-beta class, with a 3-layer (aba) sandwich architecture. The catalytic site of GSHPx contains a conserved residue which is either a cysteine or, in many eukaryotic GSHPx, a selenocysteine PUBMED:2142875.

    \ ' '2829' 'IPR005494' '\ This region contains the Glutathionylspermidine synthase enzymatic activity . This is the C-terminal region in bienzymes such as . Glutathionylspermidine (GSP) synthetases of Trypanosomatidae and Escherichia coli couple hydrolysis of ATP (to ADP and Pi) with formation of an amide bond between spermidine and the glycine carboxylate of glutathione (gamma-Glu-Cys-Gly). In the pathogenic trypanosomatids, this reaction is the penultimate step in the biosynthesis of the antioxidant metabolite, trypanothione (N1,N8-bis-(glutathionyl)spermidine), and is a target for drug design PUBMED:7775463.\ ' '2830' 'IPR001482' '\ A number of bacterial proteins, some of which are involved in a general\ secretion pathway (GSP) for the export of proteins (also called the type II\ pathway) belong to this group PUBMED:8438237, PUBMED:7934814. These proteins\ are probably located in the cytoplasm and, on the basis of the presence of a\ conserved P-loop region , bind ATP.\ ' '2831' 'IPR007831' '\

    This domain is found at the N terminus of members of the general secretory system II protein E. Proteins in this subfamily are typically involved in Type IV pilus biogenesis (e.g. ), though some are involved in other processes; for instance aggregation in Myxococcus xanthus (e.g. ) PUBMED:11073903.

    \ ' '2832' 'IPR018076' '\ A number of bacterial proteins, some of which are involved in a general secretion pathway (GSP) for the export of proteins (also called the type II pathway) PUBMED:8438237, have been found to be evolutionarily related. These are proteins of about 400 amino acids that are highly hydrophobic and which are thought to be integral proteins of the inner membrane. Proteins with this domain form a platform for the type II secretion machinery, as well as the type IV pili and the archaeal flagella PUBMED:11266368.\ ' '2833' 'IPR004846' '\

    This family includes protein D, which is involved in the general (type II) secretion pathway (GSP) within Gram-negative bacteria, a signal sequence-dependent process responsible for protein export PUBMED:8438237, PUBMED:1365398, PUBMED:1592799, PUBMED:8326859, PUBMED:7901733, PUBMED:7934814, PUBMED:8190064, and protein G from the type III secretion system.

    \

    A number of proteins are involved in the GSP; one of these is known as protein D (GSPD protein), the most probable location of which is the outer membrane PUBMED:2677007. This suggests that protein D constitutes the apparatus of the accessory mechanism, and is thus involved in transporting exoproteins from the periplasm, across the outer membrane, to the extracellular environment.

    \

    The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself PUBMED:10564516, type III subunits in the outer membrane translocate secreted proteins through a channel-like structure. Protein G aids in the structural assembly of the invasion complex PUBMED:8733226.

    \ ' '2834' 'IPR000645' '\ The secretion pathway (GSP) for the export of proteins (also called the type II pathway) PUBMED:8438237\ requires a number of protein components. One of them is known as the \'N\' protein and has been sequenced\ in a variety of bacteria such as Aeromonas hydrophila (gene exeN); Erwinia carotovora (gene outN); Klebsiella pneumoniae (gene pulN); or Vibrio cholerae (gene epsN). The size of the \'N\' protein is around 250 amino acids. It apparently\ contains a single transmembrane domain located in the N-terminal section. The short N-terminal domain is\ predicted to be cytoplasmic and the large C-terminal domain periplasmic.\ ' '2835' 'IPR005644' '\

    This is a group of NolW-like proteins, which are closely related to bacterial type II and III secretion system protein ().

    \ ' '2836' 'IPR003413' '\ The bacterial general secretion pathway (GSP) is involved in the export of proteins (also called the type II pathway). This family includes GSPI and GSPJ, which contain the pre-pilin signal sequence PUBMED:8407845.\ ' '2837' 'IPR005628' '\ Members of this family are involved in the general secretion pathway. The family includes proteins such as ExeK, PulK, OutX and XcpX.\ ' '2838' 'IPR007812' '\ This family consists of general secretion pathway protein L sequences from several Gram-negative bacteria. The general secretion pathway of Gram-negative bacteria is responsible for extracellular secretion of a number of different proteins, including proteases and toxins. This pathway supports secretion of proteins across the cell envelope in two distinct steps, in which the second step, involving translocation through the outer membrane, is assisted by at least 13 different gene products. GspL is predicted to contain a large cytoplasmic domain and has been shown to interact with the autophosphorylating cytoplasmic membrane protein GspE. It is thought that the tri-molecular complex of GspL, GspE and GspM might be involved in regulating the opening and closing of the secretion pore and/or transducing energy to the site of outer membrane translocation PUBMED:10322014.\ ' '2839' 'IPR007690' '\ This is a family of membrane proteins involved in the secretion of a number of molecules in Gram-negative bacteria. The precise function of these proteins is unknown, though in Vibrio cholerae, the EpsM protein interacts with the EpsL protein, and also forms homodimers PUBMED:10322014.\ ' '2840' 'IPR004046' '\

    In eukaryotes, glutathione S-transferases (GSTs) participate in the\ detoxification of reactive electrophilic compounds by catalysing their\ conjugation to glutathione. The GST domain is also found in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of cephalopods is also a GST PUBMED:9074797, PUBMED:10783391, PUBMED:11035031, PUBMED:10416260.

    \

    Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family PUBMED:11327815, PUBMED:9045797. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.

    \

    Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural\ fold. Each monomer is composed of a distinct N-terminal sub-domain,\ which adopts the thioredoxin fold, and a C-terminal all-helical\ sub-domain. This entry is the C-terminal domain.

    \ ' '2841' 'IPR004045' '\

    In eukaryotes, glutathione S-transferases (GSTs) participate in the\ detoxification of reactive electrophilic compounds by catalysing their\ conjugation to glutathione. The GST domain is also found in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of Cephalopoda is also a GST PUBMED:9074797, PUBMED:10783391, PUBMED:11035031, PUBMED:10416260.

    \

    Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family PUBMED:11327815, PUBMED:9045797. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.

    \ \

    Soluble GSTs activate glutathione (GSH) to GS-. In many GSTs, this is accomplished by a Tyr at H-bonding distance from the sulphur of GSH. These enzymes catalyse nucleophilic attack by reduced glutathione (GSH) on nonpolar compounds that contain an electrophilic carbon, nitrogen, or sulphur atom PUBMED:16399376.

    \ \

    Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold, with each monomer composed of two distinct domains PUBMED:12211029. The N-terminal domain forms a thioredoxin-like fold that binds the glutathione moiety, while the C-terminal domain contains several hydrophobic alpha-helices that specifically bind hydrophobic substrates.

    \ \

    This entry represents the N-terminal domain of GST.

    \ ' '2842' 'IPR000038' '\

    Septins constitute a eukaryotic family of guanine nucleotide-binding proteins, most of which polymerise to form filaments PUBMED:14611653. Members of the family were first identified by genetic screening for Saccharomyces cerevisiae (Baker\'s yeast) mutants defective in cytokinesis PUBMED:4950437. Temperature-sensitive mutations in four genes, CDC3, CDC10, CDC11 and CDC12, were found to cause cell-cycle arrest and defects in bud growth and cytokinesis. The protein products of these genes localise at the division plane between mother and daughter cells, indicating a role in mother-daughter separation during cytokinesis PUBMED:3316985. Members of the family were therefore termed septins to reflect their role in septation and cell division. The identification of septin homologues in higher eukaryotes, which localise to the cleavage furrow in dividing cells, supports an orthologous function in cytokinesis. Septins have since been identified in most eukaryotes, except plants PUBMED:10805747.

    \ \

    Septins are approximately 40-50 kDa in molecular mass, and typically comprise a conserved central core domain (more than 35% sequence identity between mammalian and yeast homologues) flanked by more divergent N- and C-termini. Most septins possess a P-loop motif in their N-terminal domain (which is characteristic of GTP-binding proteins), and a predicted C-terminal coiled-coil domain PUBMED:10481176.

    \ \

    A number of septin interaction partners have been identified in yeast, many of which are components of the budding site selection machinery, kinase cascades or of the ubiquitination pathway. It has been proposed that septins may act as a scaffold that provides an interaction matrix for other proteins PUBMED:10805747, PUBMED:10481176. In mammals, septins have been shown to regulate vesicle dynamics PUBMED:11942624. Mammalian septins have also been implicated in a variety of other cellular processes, including apoptosis, carcinogenesis and neurodegeneration PUBMED:9203580.

    \

    This entry represents a variety of septins and homologous sequences involved in the cell division process.

    \ ' '2843' 'IPR000926' '\

    GTP cyclohydrolase II catalyses the first committed step in the biosynthesis of riboflavin. The enzyme\ converts GTP and water to formate, 2,5-diamino-6-hydroxy-4-(5-phosphoribosylamino)- pyrimidine and\ pyrophosphate, and requires magnesium as a cofactor. It is sometimes found as a bifunctional enzyme with 3,4-dihydroxy-2-butanone 4-phosphate synthase (DHBP_synthase) .

    \ ' '2844' 'IPR020602' '\

    GTP cyclohydrolase I () catalyses the biosynthesis of formic acid and dihydroneopterin triphosphate from GTP. This reaction is the first step in the biosynthesis of tetrahydrofolate in prokaryotes, of tetrahydrobiopterin in vertebrates, and of pteridine-containing pigments in insects. The comparison of the sequence of the enzyme from bacterial and eukaryotic sources shows that the structure of this enzyme has been extremely well conserved throughout evolution PUBMED:7542887.

    \

    This entry represents GTP cyclohydrolase I and NADPH-dependent nitrile oxidoreductases. These enzymes display a common fold PUBMED:15767583.

    \

    NADPH-dependent nitrile oxidoreductases are involved in the biosynthesis of queuosine, a 7-deazaguanine-modified nucleoside found in tRNA(GUN) of bacteria and eukaryotes PUBMED:15767583.

    \ ' '2845' 'IPR006762' '\

    GTR1 was first identified in Saccharomyces cerevisiae (Baker\'s yeast) as a suppressor of a mutation in RCC1. RCC1 catalyzes guanine nucleotide exchange on Ran, a well characterised nuclear Ras-like small G protein that plays an essential role in the import and export of proteins and RNAs across the nuclear membrane through the nuclear pore complex. RCC1 is located inside the nucleus, bound to chromatin. The concentration of GTP within the cell is ~30 times higher than the concentration of GDP, thus resulting in the preferential production of the GTP form of Ran by RCC1 within the nucleus.

    Gtr1p is located within both the cytoplasm and the nucleus and has been reported to play a role in cell growth. Biochemical analysis revealed that Gtr1 is in fact a G protein of the Ras family. The RagA/B proteins are the human homologues of Gtr1; RagA and Gtr1p belong to the sixth subfamily of the Ras-like small GTPase superfamily PUBMED:11073942.

    \ ' '2846' 'IPR007267' '\

    Members of this entry are predicted to be integral membrane proteins with three or four transmembrane spans. They are involved in the synthesis of cell surface polysaccharides. The GtrA family is a subset of this family. GtrA is predicted to be an integral membrane protein with 4 transmembrane spans. It is involved in O antigen modification by Shigella flexneri bacteriophage X (SfX), but does not determine the specificity of glucosylation. Its function remains unknown, but it may play a role in translocation of undecaprenyl phosphate linked glucose (UndP-Glc) across the cytoplasmic membrane PUBMED:10376843. Another member of this family is a DTDP-glucose-4-keto-6-deoxy-D-glucose reductase, which catalyses the conversion of dTDP-4-keto-6-deoxy-D-glucose to dTDP-D-fucose, which is involved in the biosynthesis of the serotype-specific polysaccharide antigen of Actinobacillus actinomycetemcomitans Y4 (serotype b) PUBMED:10358040. This family also includes the teichoic acid glycosylation protein, GtcA, which is a serotype-specific protein in some Listeria innocua and Listeria monocytogenes strains. Its exact function is not known, but it is essential for decoration of cell wall teichoic acids with glucose and galactose PUBMED:11029438.

    \ ' '2847' 'IPR001054' '\

    Guanylate cyclases () catalyse the formation of cyclic GMP (cGMP) \ from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases \ and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in \ vascular smooth muscle relaxation and retinal photo-transduction is well established. \ Guanylate cyclase is found both in the\ soluble and particulate fractions of eukaryotic cells. The soluble and plasma\ membrane-bound forms differ in structure, regulation and other properties PUBMED:1349465,\ PUBMED:1356629, PUBMED:1680765, PUBMED:1982420. \ Most currently known plasma membrane-bound\ forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are\ cytoplasmic heterodimers having alpha and beta subunits.

    \

    In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.

    \ ' '2848' 'IPR008144' '\

    Guanylate kinase () (GK) PUBMED:1314905 catalyzes the ATP-dependent phosphorylation of GMP into GDP. It is essential for recycling GMP and indirectly, cGMP. In prokaryotes (such as Escherichia coli), lower eukaryotes\ (such as yeast) and in vertebrates, GK is a highly conserved monomeric protein of about 200 amino acids. GK has been shown PUBMED:1310897, PUBMED:8097461, PUBMED:1329277 to be structurally similar to protein A57R (or SalG2R) from various strains of Vaccinia virus.

    \

    Proteins containing one or more copies of the DHR domain, an SH3 domain as well as a C-terminal GK-like domain, are collectively termed MAGUKs (membrane-associated guanylate kinase homologs) PUBMED:8155583, and\ include Drosophila lethal(1)discs large-1 tumor suppressor protein (gene dlg1); mammalian tight junction protein Zo-1; a family of mammalian synaptic proteins that seem to interact with the cytoplasmic tail of NMDA receptor subunits (SAP90/PSD-95, CHAPSYN-110/PSD-93, SAP97/DLG1 and SAP102); vertebrate 55kDa erythrocyte membrane protein (p55); Caenorhabditis elegans protein lin-2; rat protein CASK; and human proteins DLG2 and DLG3. There is an ATP-binding site (P-loop) in the N-terminal section of GK, which is not conserved in the GK-like domain of the above proteins. However, these proteins retain the residues known, in GK, to be involved in the binding of GMP.

    \ ' '2849' 'IPR000879' '\ Guanylin, a 15-amino-acid peptide, is an endogenous ligand of the intestinal receptor guanylate \ cyclase-C, known as STaR PUBMED:7713512, PUBMED:1409606. Upon receptor binding, guanylin increases the \ intracellular concentration of cGMP, induces chloride secretion and decreases intestinal fluid \ absorption, ultimately causing diarrhoea PUBMED:1346555. The peptide stimulates the enzyme through \ the same receptor binding region as the heat-stable enterotoxins PUBMED:1409606.\ ' '2850' 'IPR007804' '\

    Gas vesicles are intracellular, protein-coated, and hollow organelles found in cyanobacteria and halophilic archaea. They are\ permeable to ambient gases by diffusion and provide buoyancy, enabling cells to move upwards in water to access oxygen and/or light. Proteins containing this family are involved in the formation of gas vesicles PUBMED:9573198.\

    \ ' '2851' 'IPR007805' '\

    Gas vesicles are intracellular, protein-coated, and hollow organelles found in cyanobacteria and halophilic archaea. They are permeable to ambient gases by diffusion and provide buoyancy, enabling cells to move upwards in liquid to access oxygen and/or light. Proteins containing this domain are involved in the formation of gas vesicles PUBMED:1404376.

    \ ' '2852' 'IPR003169' '\

    The glycine-tyrosine-phenylalanine (GYF) domain is an around 60-amino acid\ domain which contains a conserved GP[YF]xxxx[MV]xxWxxx[GN]YF motif. It was\ identified in the human intracellular protein termed CD2 binding protein 2\ (CD2BP2), which binds to a site containing two tandem PPPGHR segments within\ the cytoplasmic region of CD2. Binding experiments and mutational analyses\ have demonstrated the critical importance of the GYF tripeptide in ligand\ binding. A GYF domain is also found in several other eukaryotic proteins of\ unknown function PUBMED:9843987. It has been proposed that the GYF domain found in these\ proteins could also be involved in proline-rich sequence recognition PUBMED:10404223.\ \ Resolution of the structure of the CD2BP2 GYF domain by NMR spectroscopy\ revealed a compact domain with a beta-beta-alpha-beta-beta topology, where the\ single alpha-helix is tilted away from the twisted, anti-parallel beta-sheet.\ The conserved residues of the GYF domain create a contiguous\ patch of predominantly hydrophobic nature which forms an integral part of the\ ligand-binding site PUBMED:10404223. There is limited homology within the C-terminal 20-30\ amino acids of various GYF domains, supporting the idea that this part of the\ domain is structurally but not functionally important PUBMED:12426371.

    \ \ ' '2853' 'IPR004105' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems consist of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347, PUBMED:11406410. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.

    \

    HKs can be roughly divided into two classes: orthodox and hybrid kinases PUBMED:8029829, PUBMED:1482126. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK PUBMED:10966457. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain.

    \

    This helical bundle domain is the homodimer interface of the signal transducing histidine kinase family PUBMED:9989504.

    \ ' '2854' 'IPR004131' '\

    Two types of proteins that hydrolyse inorganic pyrophosphate (PPi), very different in both amino acid sequence and structure, have been characterised to date: soluble and membrane-bound proton-pumping pyrophosphatases (sPPases and H(+)-PPases, respectively). sPPases are ubiquitous proteins that hydrolyse PPi to release heat, whereas H+-PPases, so far unidentified in animal and fungal cells, couple the energy of PPi hydrolysis to proton movement across biological membranes PUBMED:12451180, PUBMED:10523139. The latter type is represented by this group of proteins. H+-PPases () are also called vacuolar-type inorganic pyrophosphatases (V-PPase) or pyrophosphate-energised vacuolar membrane proton pumps PUBMED:11343697. In plants, vacuoles contain two enzymes for acidifying the interior of the vacuole, the V-ATPase and the V-PPase (V is for vacuolar) PUBMED:10523139.

    Two distinct biochemical subclasses of H+-PPases have been characterised to date: K+-stimulated and K+-insensitive PUBMED:12451180, PUBMED:11343697.

    For additional information please see PUBMED:1311852, PUBMED:10556526.

    \ ' '2855' 'IPR002637' '\

    This family contains the Saccharomyces cerevisiae (Baker\'s yeast) HAM1 protein and other hypothetical archaeal, bacterial and Caenorhabditis elegans proteins. \ S. cerevisiae HAM1 protects against the mutagenic effects of the base analog 6-N-hydroxylaminopurine (HAP) which can be a natural product of monooxygenase activity on adenine. HAM1 protein protects the cell from HAP, either on the level of deoxynucleoside triphosphate or the DNA level by a yet unidentified set of reactions PUBMED:8789257.

    \ ' '2856' 'IPR007483' '\

    This family includes the hamartin protein which is thought to function as a tumour suppressor. The hamartin protein interacts with the tuberin protein. Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterised by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation in either TSC1 or TSC2 tumour suppressor genes. TSC1 encodes a protein, hamartin, containing two coiled-coil regions, which have been shown to mediate binding to tuberin. The TSC2 gene codes for tuberin. These two proteins function within the same pathway(s) regulating cell cycle, cell growth, adhesion, and vesicular trafficking PUBMED:12167664.

    \ ' '2857' 'IPR002534' '\

    The medium (M) genome segment of Hantaviruses (family Bunyaviridae) encodes the two virion glycoproteins PUBMED:3114716 as a polyprotein precursor. This entry represents the G1 glycoprotein.

    \ ' '2858' 'IPR002532' '\

    The medium (M) genome segment of Hantaviruses (family Bunyaviridae) encodes the two virion glycoproteins PUBMED:3114716, G1 and G2, as a polyprotein precursor. This entry represents the polyprotein region which forms the G2 glycoprotein.

    \ ' '2859' 'IPR002214' '\

    Hantaviruses are ssRNA negative-strand viruses. The nucleocapsid protein is an internal protein of the virus particle PUBMED:9208453, PUBMED:8578853.

    \ \ ' '2860' 'IPR005566' '\

    Expression of Hydrophobic Abundant protein is thought to be developmentally regulated and possibly involved in spherule cell wall formation PUBMED:3170484.

    \ ' '2861' 'IPR002522' '\

    The Hepatitis C virus has a ssRNA genome. The virion is a nucleocapsid covered by a lipoprotein envelope consisting of two proteins, protein M and glycoprotein E. The nucleocapsid is a complex of protein C and mRNA.

    \ ' '2862' 'IPR002521' '\ The viral core protein forms the internal viral coat that\ encapsidates the genomic RNA and is enveloped in a host\ cell-derived lipid membrane. The core protein has been shown,\ by yeast two-hybrid assay to interact with cellular DEAD box\ helicases PUBMED:10329544. The N terminus of the core protein is\ involved in transcriptional repression PUBMED:10082392.\ ' '2863' 'IPR002519' '\

    Poliovirus infection leads to drastic alterations in membrane permeability late during infection. Proteins 2B and 2BC enhance membrane permeability PUBMED:9218794, PUBMED:8798506.

    \ ' '2864' 'IPR002531' '\ The hypervariable region of the E2/NS1 region of Hepatitis C virus\ varies greatly between viral isolates. E2 is thought to encode a\ structurally unconstrained envelope protein PUBMED:9425941.\ ' '2865' 'IPR002518' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    The group of proteins, non-structural protein 2 (NS2) of hepatitis C virus, are peptidases belonging to MEROPS peptidase family C18 (hepatitis C virus endopeptidase 2, clan C-).

    \ \

    The viral genome is translated into a single polyprotein of\ about 3000 amino acids. Generation of the mature non-structural proteins\ relies on the activity of viral proteases. NS2 is a zinc-dependent autocatalytic endopeptidase which cleaves at the NS2/NS3 junction PUBMED:9224925, PUBMED:9261354. The action of NS3 proteinase (NS3P, ), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins.

    \ ' '2866' 'IPR004109' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (), a zinc-dependent enzyme, performs a single proteolytic cut to release the N-terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme PUBMED:11264729.

    \ ' '2867' 'IPR000745' '\ NS4a (non-structural protein) forms an integral part of the NS3 serine protease in Hepatitis C virus, \ as it is required in a number of cases as a cofactor of cleavage PUBMED:9568891, PUBMED:9261364. It has \ also been reported that NS4a interacts with NS4b and NS3 to form a multi-subunit replicase complex \ PUBMED:9261364.\ ' '2868' 'IPR001490' '\ The genome polyprotein contains: capsid protein C, envelope glycoproteins E1 and E2, protein P7, nonstructural protein NS2, protease/helicase NS3, nonstructural proteins NS4A and NS4B (this family), NS5A and NS5B.\ \

    The small proteins NS2A, NS2B, NS4A and NS4B are hydrophobic, suggesting a possible membrane-related function PUBMED:9224925.\ It is known that NS4B interacts with NS4A and NS3 to form a large\ replicase complex to direct the viral RNA replication PUBMED:9261364. NS3 and NS5 may also play a role in the viral RNA replication.

    \ ' '2869' 'IPR002868' '\ The molecular function of the non-structural 5a viral protein is uncertain.\ The NS5a protein is phosphorylated when expressed in mammalian cells.\ It is thought to interact with the dsRNA-dependent (interferon\ inducible) kinase PKR, PUBMED:, PUBMED:.\ ' '2870' 'IPR006712' '\

    Homeodomain leucine zipper (HDZip) genes encode putative transcription factors that are unique to plants. This observation suggests that homeobox-leucine zipper genes evolved after the\ divergence of plants and animals, perhaps to mediate specific regulatory events PUBMED:7915839.

    \ \ This domain is the N-terminal of plant homeobox-leucine zipper proteins. Its function is unknown.

    \ ' '2871' 'IPR003427' '\

    Histidine decarboxylase catalyses the formation of histamine from histidine PUBMED:11243783. It requires a pyruvoyl group for its activity. Cleavage of the proenzyme PI chain by nonhydrolytic self-catalysis yields two subunits, alpha and beta, which arrange as a hexamer (alpha beta)6.

    \ ' '2872' 'IPR004195' '\ Bacteriophage lambda head decoration protein D stabilises the head shell after the rearrangement of GP7 subunits of the head shell lattice that accompanies expansion of the head. There are approximately 420 copies of protein D per mature phage.\ ' '2873' 'IPR002506' '\ The Hepatitis delta virus (HDV) encodes a single protein, the\ hepatitis delta antigen (HDAg). The central region of this protein\ has been shown to bind RNA PUBMED:8245865. Several interactions are also\ mediated by a coiled-coil region at the N terminus of the protein PUBMED:9687364.\ ' '2874' 'IPR000357' '\

    The HEAT repeat is a tandemly repeated, 37-47 amino acid long module\ occurring in a number of cytoplasmic proteins, including the four\ name-giving proteins huntingtin, elongation factor 3 (EF3), the 65 kDa\ alpha regulatory subunit of protein phosphatase 2A (PP2A) and the\ yeast PI3-kinase TOR1 PUBMED:7550332. Arrays of HEAT repeats consist of 3 to 36\ units forming a rod-like helical structure and appear to function as \ protein-protein interaction surfaces. It has been noted that many\ HEAT repeat-containing proteins are involved in intracellular \ transport processes.

    \ \

    In the crystal structure of PP2A PR65/A PUBMED:9989501, the HEAT repeats consist\ of pairs of antiparallel alpha helices, as predicted in PUBMED:7550332.

    \ ' '2875' 'IPR004155' '\

    These proteins contain a short bi-helical repeat that is related to HEAT. Cyanobacteria and red algae harvest light energy using macromolecular complexes known as phycobilisomes (PBS), peripherally attached to the photosynthetic membrane. The major components of PBS are the phycobiliproteins. These heterodimeric proteins are covalently attached to phycobilins: open-chain tetrapyrrole chromophores, which function as the photosynthetic light-harvesting pigments. Phycobiliproteins differ in sequence and in the nature and number of\ attached phycobilins to each of their subunits. These proteins include the lyase enzymes that specifically attach particular phycobilins to apophycobiliprotein subunits. The most comprehensively studied of these is the CpcE/F lyase, which attaches phycocyanobilin (PCB) to the alpha subunit of apophycocyanin PUBMED:8132596. Similarly, MpeU/V attaches phycoerythrobilin to phycoerythrin II, while CpeY/Z is thought to be involved in phycoerythrobilin (PEB) attachment to phycoerythrin (PE) I (PEs I and II differ in sequence and in the number of attached molecules of PEB: PE I has five, PE II has six) PUBMED:9023176.

    \

    All the reactions of the above lyases involve an apoprotein cysteine SH addition to a terminal delta 3,3\'-double bond. Such a reaction is not possible in the case of phycoviolobilin (PVB), the phycobilin of alpha-phycoerythrocyanin (alpha-PEC). It is thought that in this case, PCB, not PVB, is first added to apo-alpha-PEC, and is then isomerized to PVB. The addition reaction has been shown to occur in the presence of either of the components of alpha-PEC-PVB lyase PecE or PecF (or both). The isomerisation reaction occurs only when both PecE and PecF components are present, i.e. the PecE/F phycobiliprotein lyase is also a phycobilin isomerase PUBMED:10708746. Another member of this family is the NblB protein, whose similarity to the phycobiliprotein lyases was previously noted PUBMED:9882677. This constitutively expressed protein is not known to have any lyase activity. It is thought to be involved in the coordination of PBS degradation with environmental nutrient limitation. It has been suggested that the similarity of NblB to the phycobiliprotein lyases is due to the ability to bind tetrapyrrole phycobilins via the common repeated motif PUBMED:9882677.

    \ ' '2876' 'IPR000569' '\

    The name HECT comes from \'Homologous to the E6-AP Carboxyl Terminus\' PUBMED:7708685. Proteins containing this domain at the C-terminus include\ ubiquitin-protein ligase, which regulates ubiquitination of CDC25. Ubiquitin-protein ligase accepts ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester, and then directly transfers the ubiquitin to targeted substrates. A cysteine residue is required for ubiquitin-thiolester formation. Human thyroid receptor interacting protein 12, which also contains this domain, is a component of an ATP-dependent multisubunit protein that interacts with the ligand binding domain of the thyroid hormone receptor. It could be an E3 ubiquitin-protein ligase. Human ubiquitin-protein ligase E3A interacts with the E6 protein of the cancer-associated Human papillomavirus type 16 and Human papillomavirus type 18. The E6/E6-AP complex binds to and targets the P53 tumour-suppressor protein for ubiquitin-mediated proteolysis.

    \ ' '2877' 'IPR005108' '\

    The HELP (Hydrophobic ELP) domain is found in EMAP and EMAP-like proteins (ELPs) PUBMED:11694528, PUBMED:7989351. Although called a domain, it contains a predicted transmembrane helix and may not form a globular domain. It is also not clear whether these proteins localize to membranes.

    \ ' '2878' 'IPR007142' '\

    Haemagglutinin-esterase fusion glycoprotein (HEF) is a multi-functional protein embedded in the viral envelope of several viruses, including influenza C virus, coronaviruses and toroviruses PUBMED:16575523, PUBMED:9817207. HEF is required for infectivity, and functions to recognise the host cell surface receptor, to fuse the viral and host cell membranes, and to destroy the receptor upon host cell infection. The haemagglutinin region of HEF is responsible for receptor recognition and membrane fusion, and bears a strong resemblance to the sialic acid-binding haemagglutinin found in influenza A and B viruses, except that it binds 9-O-acetylsialic acid. The esterase region of HEF is responsible for the destruction of the receptor, an action that is carried out by neuraminidase in influenza A and B viruses. The esterase domain is similar in structure to Streptomyces scabies esterase, and to acetylhydrolase, thioesterase I and rhamnogalacturonan acetylesterase.

    \

    The haemagglutinin-esterase glycoprotein HEF must be cleaved by the host\'s trypsin-like proteases to produce two peptides (HEF1 and HEF2) in order for the virus to be infectious. Once HEF is cleaved, the newly exposed N-terminal of the HEF2 peptide then acts to fuse the viral envelope to the cellular membrane of the host cell, which allows the virus to infect the host cell.

    \

    The haemagglutinin-esterase glycoprotein is a trimer, where each monomer is composed of three domains: an elongated stem active in membrane fusion, an esterase domain, and a receptor-binding domain, where the stem and receptor-binding domains together resemble influenza A virus haemagglutinin. Two of these domains are composed of non-contiguous sequence: the receptor-binding haemagglutinin domain is inserted into a surface loop of the esterase domain, and the esterase domain is inserted into a surface loop of the haemagglutinin stem.

    \

    This entry represents the core of the haemagglutinin-esterase glycoprotein, including the haemagglutinin receptor-binding domain and the esterase domain.

    \

    More information about haemagglutinin proteins can be found at Protein of the Month: Bird Flu, Haemagglutinin PUBMED:.

    \ ' '2879' 'IPR003860' '\

    Haemagglutinin-esterase fusion glycoprotein (HEF) is a multi-functional protein embedded in the viral envelope of several viruses, including influenza C virus, coronaviruses and toroviruses PUBMED:16575523, PUBMED:9817207. HEF is required for infectivity, and functions to recognise the host cell surface receptor, to fuse the viral and host cell membranes, and to destroy the receptor upon host cell infection. The haemagglutinin region of HEF is responsible for receptor recognition and membrane fusion, and bears a strong resemblance to the sialic acid-binding haemagglutinin found in influenza A and B viruses, except that it binds 9-O-acetylsialic acid. The esterase region of HEF is responsible for the destruction of the receptor, an action that is carried out by neuraminidase in influenza A and B viruses. The esterase domain is similar in structure to Streptomyces scabies esterase, and to acetylhydrolase, thioesterase I and rhamnogalacturonan acetylesterase.

    \

    The haemagglutinin-esterase glycoprotein HEF must be cleaved by the host\'s trypsin-like proteases to produce two peptides (HEF1 and HEF2) in order for the virus to be infectious. Once HEF is cleaved, the newly exposed N-terminal of the HEF2 peptide then acts to fuse the viral envelope to the cellular membrane of the host cell, which allows the virus to infect the host cell.

    \

    The haemagglutinin-esterase glycoprotein is a trimer, where each monomer is composed of three domains: an elongated stem active in membrane fusion, an esterase domain, and a receptor-binding domain, where the stem and receptor-binding domains together resemble influenza A virus haemagglutinin. Two of these domains are composed of non-contiguous sequence: the receptor-binding haemagglutinin domain is inserted into a surface loop of the esterase domain, and the esterase domain is inserted into a surface loop of the haemagglutinin stem.

    \

    This entry represents the receptor-binding haemagglutinin domain of the haemagglutinin-esterase glycoprotein.

    \

    More information on haemagglutinin proteins can be found at Protein of the Month: Bird Flu, Haemagglutinin PUBMED:.

    \ ' '2880' 'IPR001364' '\

    Haemagglutinin (HA) is one of two main surface fusion glycoproteins embedded in the envelope of influenza viruses, the other being neuraminidase (NA). There are sixteen known HA subtypes (H1-H16) and nine NA subtypes (N1-N9), which together are used to classify influenza viruses (e.g. H5N1). The antigenic variations in HA and NA enable the virus to evade host antibodies made to previous influenza strains, accounting for recurrent influenza epidemics PUBMED:16178512. The HA glycoprotein is present in the viral membrane as a single polypeptide (HA0), which must be cleaved by the host\'s trypsin-like proteases to produce two peptides (HA1 and HA2) in order for the virus to be infectious. Once HA0 is cleaved, the newly exposed N-terminal of the HA2 peptide then acts to fuse the viral envelope to the cellular membrane of the host cell, which allows the viral negative-stranded RNA to infect the host cell. The type of host protease can influence the infectivity and pathogenicity of the virus.

    \

    The haemagglutinin glycoprotein is a trimer containing three structurally distinct regions: a globular head consisting of anti-parallel beta-sheets that form a beta-sandwich with a jelly-roll fold (contains the receptor binding site and the HA1/HA2 cleavage site); a triple-stranded, coiled-coil, alpha-helical stalk; and a globular foot composed of anti-parallel beta-sheets PUBMED:16543414, PUBMED:15475582. Each monomer consists of an intact HA0 polypeptide with the HA1 and HA2 regions linked by disulphide bonds. The N-terminus of HA1 provides the central strand in the 5-stranded globular foot, while the rest of the HA1 chain makes its way to the 8-stranded globular head. HA2 provides two alpha helices, which form part of the triple-stranded coiled-coil that stabilises the trimer, its C-terminus providing the remaining strands of the 5-stranded globular foot.

    \ \

    This entry represents the entire haemagglutinin protein (HA0) consisting of both the HA1 and HA2 regions, as found in influenza A and B viruses.

    \

    More information about these proteins can be found at Protein of the Month: Bird Flu, Haemagglutinin PUBMED:.

    \ ' '2881' 'IPR016053' '\

    Haem oxygenase (HO) PUBMED:3290025 is the microsomal enzyme that, in animals, carries out the oxidation of haem; it cleaves the haem ring at the alpha-methene bridge to form biliverdin and carbon monoxide PUBMED:3032976. Biliverdin is subsequently converted to bilirubin by biliverdin reductase. In mammals there are three isozymes of haem oxygenase: HO-1 to HO-3. The first two isozymes differ in their tissue expression and their inducibility: HO-1 is highly inducible by its substrate haem and by various non-haem substances, while HO-2 is non-inducible. It has been suggested PUBMED:8093563 that HO-2 could be implicated in the production of carbon monoxide in the brain, where it is said to act as a neurotransmitter. In the genome of the chloroplast of red algae as well as in cyanobacteria, there is a haem oxygenase (gene pbsA) that is the key enzyme in the synthesis of the chromophoric part of the photosynthetic antennae PUBMED:9326680. A haem oxygenase is also present in the bacterium Corynebacterium diphtheriae (gene hmuO), where it is involved in the acquisition of iron from the host haem PUBMED:9006041. There is, in the central section of these enzymes, a well-conserved region centred on a histidine residue.

    \ ' '2882' 'IPR000896' '\

    Haemocyanins are copper-containing oxygen transport proteins found in the haemolymph of many invertebrates. They are divided into 2 main groups, arthropodan and molluscan. These have structurally similar oxygen-binding centres, which are similar to the oxygen-binding centre of tyrosinases PUBMED:, but their quaternary structures are arranged differently. The arthropodan proteins exist \ as hexamers comprising 3 heterogeneous subunits (a, b and c) and possess 1 oxygen-binding centre per subunit; and the molluscan proteins exist as cylindrical oligomers of 10 to 20 subunits and possess 7 or 8 oxygen-binding centres per subunit PUBMED:3207675. Although the proteins have similar amino acid compositions, the only real similarity in their primary sequences is in the region corresponding to the second copper-binding domain, which also shows similarity to the copper-binding domain of tyrosinases PUBMED:.

    \

    Larval storage proteins (LSPs) PUBMED:2808410 are proteins from the hemolymph of insects, which may serve as a store of amino acids for synthesis of adult proteins. There are two classes of LSPs: arylphorins, which are rich in aromatic amino acids, and methionine-rich LSPs. LSPs form hexameric complexes. LSPs are structurally related to arthropod hemocyanins.

    \

    This entry represents the copper-containing domain that usually occurs in the centre of haemocyanin proteins.

    \ ' '2883' 'IPR005203' '\

    Haemocyanins are copper-containing oxygen transport proteins found in the haemolymph of many invertebrates. They are divided into 2 main groups, arthropodan and molluscan. These have structurally similar oxygen-binding centres, which are similar to the oxygen-binding centre of tyrosinases \ PUBMED:, but their quaternary structures are arranged differently. The arthropodan proteins exist as hexamers comprising 3 heterogeneous subunits (a, b and c) and possess 1 oxygen-binding centre per \ subunit; and the molluscan proteins exist as cylindrical oligomers of 10 to 20 subunits and possess 7 or 8 oxygen-binding centres per subunit PUBMED:3207675. Although the proteins have similar amino acid \ compositions, the only real similarity in their primary sequences is in the region corresponding to the second copper-binding domain, which also shows similarity to the copper-binding domain of tyrosinases PUBMED:.

    \

    Larval storage proteins (LSPs) PUBMED:2808410 are proteins from the hemolymph of insects, which may serve as a store of amino acids for synthesis of adult proteins. There are two classes of LSPs: arylphorins, which are rich in aromatic amino acids, and methionine-rich LSPs. LSPs form \ hexameric complexes. LSPs are structurally related to arthropod hemocyanins.

    \ ' '2884' 'IPR005204' '\

    Haemocyanins are copper-containing oxygen transport proteins found in the haemolymph of many invertebrates. They are divided into 2 main groups, arthropodan and molluscan. These have structurally similar oxygen-binding centres, which are similar to the oxygen-binding centre of tyrosinases \ PUBMED:, but their quaternary structures are arranged differently. The arthropodan proteins exist as hexamers comprising 3 heterogeneous subunits (a, b and c) and possess 1 oxygen-binding centre per \ subunit; and the molluscan proteins exist as cylindrical oligomers of 10 to 20 subunits and possess 7 or 8 oxygen-binding centres per subunit PUBMED:3207675. Although the proteins have similar amino acid \ compositions, the only real similarity in their primary sequences is in the region corresponding to the second copper-binding domain, which also shows similarity to the copper-binding domain of tyrosinases PUBMED:.

    \

    Larval storage proteins (LSPs) PUBMED:2808410 are proteins from the hemolymph of insects, which may serve as a store of amino acids for synthesis of adult proteins. There are two classes of LSPs: arylphorins, which are rich in aromatic amino acids, and methionine-rich LSPs. LSPs form \ hexameric complexes. LSPs are structurally related to arthropod hemocyanins.

    \ ' '2885' 'IPR018487' '\

    Hemopexin is a serum glycoprotein that binds haem and transports it to the liver for breakdown and iron recovery, after which the free hemopexin returns to the circulation PUBMED:12042069. Hemopexin prevents haem-mediated oxidative stress. Structurally, hemopexin consists of two similar halves of approximately two hundred amino acid residues connected by a histidine-rich hinge region. Each half is itself formed by the repetition of a basic unit of some 35 to 45 residues. Hemopexin-like domains have been found in two other types of proteins: vitronectin PUBMED:9572850, a cell adhesion and spreading factor found in plasma and tissues, and matrixins MMP-1, MMP-2, MMP-3, MMP-9, MMP-10, MMP-11, MMP-12, MMP-14, MMP-15 and MMP-16, members of the matrix metalloproteinase family that cleave extracellular matrix constituents PUBMED:14619953. These zinc endopeptidases, which belong to MEROPS peptidase subfamily M10A, have a single hemopexin-like domain in their C-terminal section. It is suggested that the hemopexin domain facilitates binding to a variety of molecules and proteins; for example, the HX repeats of some matrixins bind tissue inhibitor of metallopeptidases (TIMPs).

    \

    This entry represents the repeat found in hemopexin and related domains.

    \ ' '2886' 'IPR007845' '\ The Yersinia enterocolitica O:8 periplasmic binding protein-dependent transport system consists of four proteins: the periplasmic haemin-binding protein HemT, the haemin permease protein HemU, the ATP-binding hydrophilic protein HemV and the haemin-degrading protein HemS (this family).\ ' '2887' 'IPR002006' '\

    This entry represents the core domain of the viral capsid (HBcAg) from various Hepatitis B virus (HBV), which is a major human pathogen. The virus is composed of an outer envelope of host-derived lipid containing the surface proteins, and an inner protein capsid that contains genomic DNA. The capsid is composed of a single polypeptide, HBcAg, also known as the core antigen. The capsid has a 5-helical fold, where two long helices form a hairpin that dimerises into a 4-helical bundle PUBMED:10394365; this fold is unusual for icosahedral viruses. The monomer fold is stabilised by a hydrophobic core that is highly conserved among human viral variants. The capsid is assembled from dimers via interactions involving a highly conserved arginine-rich region near the C terminus. This viral capsid acts as a core antigen, the major immunodominant region lying at the tips of the alpha-helical hairpins that form spikes on the capsid surface.

    \ ' '2888' 'IPR001616' '\

    Equid herpesvirus 1 (Equine herpesvirus 1) is a respiratory virus capable of causing abortion and neurological disease. Its complete DNA sequence has been determined PUBMED:1318606 and the constituent genes found to be arranged co-linearly with those in the genomes of other alphaherpesviruses, namely Human herpesvirus 3 (HHV-3) and Human herpesvirus 1 (HHV-1) PUBMED:1318606. Comparisons of the predicted amino acid sequences have allowed functions of many EHV-1 proteins to be inferred.

    \

    For example, detailed analysis of HHV-1 and Human herpesvirus 2 (HHV-2) DNA has revealed an open reading frame sufficient to encode 626 amino acids for the HHV-1 alkaline exonuclease (620 amino acids for HHV-2) PUBMED:3005609. Comparison of the predicted amino acid sequences of the viral enzymes has revealed significant differences in the N-terminal portions of the proteins; nevertheless, their three-dimensional structures are believed to be similar.

    \ ' '2889' 'IPR006878' '\ Most proteins in this entry are uncharacterised viral proteins translated from gene 49. The UL6 locus of pseudorabies virus (PRV) has a gene cluster with homology to herpes simplex virus UL5, UL6, UL7 and UL8, Epstein-Barr virus BBRF1 and BBRF2, and Kaposi sarcoma-associated herpes virus ORF43 and ORF42 PUBMED:9458304.\ ' '2891' 'IPR007796' '\

    This family consists of the viral late glycoprotein BLLF1, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo. The binding of the viral major glycoprotein BLLF1 to the CD21 cellular receptor is thought to play an essential role during infection of B lymphocytes by the Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) PUBMED:11024143.

    \ ' '2892' 'IPR006727' '\

    This is a family of unknown function found in the Herpes viruses.

    \ ' '2893' 'IPR006772' '\

    This is a family of Herpesvirus proteins of unknown function.

    \ ' '2894' 'IPR007013' '\

    Replicative DNA polymerases are capable of polymerising tens of thousands of nucleotides without dissociating from their DNA templates. The high processivity of these polymerases is dependent upon accessory proteins that bind to the catalytic subunit of the polymerase or to the substrate. The Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) BMRF1 protein is an essential component of the viral DNA polymerase and is absolutely required for lytic virus replication PUBMED:9934686. BMRF1 is also a transactivator PUBMED:9934686. This family is predicted to have a UL42-like structure PUBMED:10882068.

    \ ' '2895' 'IPR002597' '\

    This family consists of probable major envelope glycoproteins\ from members of the herpesviridae including Human herpesvirus 1 (HHV-1), Human cytomegalovirus (HHV-5) and Human herpesvirus 3 (HHV-3). Members of the herpesviridae have a dsDNA genome and do not have an RNA stage during their replication.

    \ ' '2896' 'IPR003404' '\ Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation PUBMED:10881679.\ ' '2898' 'IPR002874' '\ This family consists of glycoprotein I from various members of the alphaherpesvirinae. These include Human herpesvirus 1 (HHV-1), Human herpesvirus 3 (HHV-3) and Suid herpesvirus 1 (Pseudorabies virus). Glycoprotein I (gI) is important during natural infection; mutants lacking gI produce smaller lesions at the site of infection and show reduced neuronal spread PUBMED:8764058. gI forms a heterodimeric complex with gE; this complex displays Fc receptor activity (binds to the Fc region of immunoglobulin) PUBMED:8764058. Glycoproteins are also important in the production of virus-neutralizing antibodies and cell mediated immunity PUBMED:8207390. The alphaherpesviridae have a dsDNA genome and have no RNA stage during viral replication.\ ' '2899' 'IPR000785' '\

    The Equid herpesvirus 1 (Equine herpesvirus 1, EHV-1) protein belongs to a family of sequences that groups together Human herpesvirus 1 (HHV-1) UL10, EHV-1 52, Human herpesvirus 3 (HHV-3) 50, Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) BBRF3, Human herpesvirus 1 (HHV-1) 39 and Human cytomegalovirus (HHV-5) UL100. Little is yet known about the properties \ of the protein. However, its amino acid sequence is highly hydrophobic, containing 8 putative membrane-spanning regions, and it is therefore believed to be either membrane-associated or transmembrane.

    \ ' '2900' 'IPR002896' '\ Herpesviruses are dsDNA viruses with no RNA stage. This family consists of glycoprotein-D (gD or gIV) which is common to Human herpesvirus 1 (HHV-1) and Human herpesvirus 2 (HHV-2), as well as Equid herpesvirus 1, Bovine herpesvirus 1 and Meleagrid herpesvirus 1 (MeHV-1). Glycoprotein-D has been found on the viral envelope and the plasma membrane of infected cells. gD immunisation can produce an immune response to bovine herpes virus (BHV-1). This response is stronger than that of the other major glycoproteins gB (gI) and gC (gIII) in BHV-1 PUBMED:7530392.\ ' '2901' 'IPR003493' '\

    This entry represents Herpesvirus glycoprotein H (gH), which is a virion associated envelope glycoprotein PUBMED:9526546. Heterodimer formation between gH and gL has been demonstrated in both virions and infected cells PUBMED:9267002. Heterodimer formation between gL and gH is important for the proper folding of gH and its insertion into the membrane because the anti-gH conformation-dependent monoclonal antibodies (mAbs) 53S and LP11 bind gH only when gL is present PUBMED:3016991, PUBMED:2552150.

    \

    Herpesviruses are enveloped by a lipid bilayer that contains at least a dozen glycoproteins. The virion surface glycoproteins mediate recognition of susceptible cells and promote fusion of the viral envelope with the cell membrane, leading to virus entry. No single glycoprotein associated with the virion membrane has been identified as the fusogen PUBMED:17299053.

    \ \

    Glycoprotein L (gL) forms a non-covalently linked heterodimer with glycoprotein H (gH). This heterodimer is essential for virus-cell and cell-cell fusion since the association of gH and gL is necessary for correct localisation of gH to the virion or cell surface. gH anchors the heterodimer to the plasma membrane through its transmembrane domain. gL lacks a transmembrane domain and is secreted from cells when expressed in the absence of gH PUBMED:7769724.

    \ ' '2902' 'IPR003840' '\ Helicases from the Herpesviridae are responsible for the unwinding of DNA and\ are essential for replication and completion of the viral life cycle.\ ' '2903' 'IPR004996' '\

    This is a family of proteins expressed by members of the Herpesviridae.

    \ ' '2904' 'IPR005205' '\

    The immediate-early protein ICP4 (infected-cell polypeptide 4) is required for efficient transcription of early and late viral genes and is thus essential for productive infection. ICP4 is a large phosphoprotein that binds DNA in a sequence-specific manner as a homodimer. ICP4 represses transcription from LAT, ICP4 and ORF-P, which have a high-affinity ICP4 binding site that spans the transcription initiation site. ICP4 proteins have two highly conserved regions; this family contains the C-terminal region that probably acts as an enhancer for the N-terminal region PUBMED:11739685.

    \ ' '2905' 'IPR005206' '\

    The immediate-early protein ICP4 (infected-cell polypeptide 4) is required for efficient transcription of early and late viral genes and is thus essential for productive infection. ICP4 is a large phosphoprotein that binds DNA in a sequence-specific manner as a homodimer. ICP4 represses transcription from LAT, ICP4 and ORF-P, which have a high-affinity ICP4 binding site that spans the transcription initiation site. ICP4 proteins have two highly conserved regions; this family contains the N-terminal region that contains sites for DNA binding and homodimerisation PUBMED:11739685.

    \ ' '2906' 'IPR005028' '\

    This domain of unknown function is found in the intermediate/early proteins of the Herpes virus. Many of these proteins play a role in transcriptional regulation.

    \ ' '2908' 'IPR005030' '\

    This entry contains the Epstein-Barr virus EBNA-LP protein. It is a protein involved in latency whose function is not fully understood. The protein contains three domains, each of which contains conserved serine residues within conserved regions (CR1 to CR3). These regions are essential for the EBNA2 cooperativity function. The domains have a bipartite nuclear localisation signal, and nuclear localisation of EBNA-LP is essential for the EBNA2 cooperativity function PUBMED:11024123.

    \ ' '2909' 'IPR000912' '\ The Herpesvirus major capsid protein (MCP) is the principal protein of the icosahedral capsid, forming\ the main component of the hexavalent and probably the pentavalent capsomeres. It shares similarity with\ all other Herpesvirus major capsid proteins.\ ' '2910' 'IPR006882' '\ This is a family of Herpesvirus proteins sharing a conserved region present in the ORF11 protein.\ ' '2911' 'IPR003450' '\ This family represents the Herpesviridae origin of replication binding protein, probably involved in DNA replication.\ ' '2912' 'IPR004997' '\

    This is an accessory subunit of Herpesvirus DNA polymerase that acts to increase the processivity of polymerisation.

    \ ' '2913' 'IPR006930' '\ Members of this family contain a conserved region found in most herpesvirus pp38 phosphoproteins.\ ' '2914' 'IPR006731' '\ This family includes UL25 proteins from HCMV, as well as U14 proteins from HHV 6 and HHV7. These 85 kDa phosphoproteins appear to act as structural antigens, but their precise function is otherwise unknown.\ ' '2915' 'IPR004998' '\

    This is a family of early or early-intermediate transcription factors. This family includes EBV BRLF1 and similar ORF 50 proteins from other herpesviruses.

    \ ' '2916' 'IPR006928' '\

    Herpesvirus UL36 encodes the largest herpes simplex virus protein, also designated VP1/2, which is a component of the virion tegument.\ The N-terminal domain, of approximately 500 aa, encodes a ubiquitin-specific protease (USP) that belongs to MEROPS peptidase family C76 (UL36 deubiquitinylating peptidase, clan CA). It is conserved across the herpesviridae and accumulates as a cleavage product of UL36 during late viral replication PUBMED:16109378. The conservation of UL36-USP across all members of the herpesviridae and the absolute conservation of the active site residues imply that UL36-USP plays an important role in the herpesvirus life cycle. \

    \ ' '2917' 'IPR007611' '\ This family is named after the human herpesvirus protein, but has been characterised in cytomegalovirus as UL47. Cytomegalovirus UL47 is a component of the tegument, which is a protein layer surrounding the viral capsid. UL47 co-precipitates with UL48 and UL69 tegument proteins, and the major capsid protein UL86. A UL47-containing complex is thought to be involved in the release of viral DNA from the disassembling virus particle PUBMED:11773380.\ ' '2918' 'IPR007626' '\

    During primary envelopment, Human herpesvirus 1 (HHV-1, HSV-1) nucleocapsids translocate from the nucleus to the cytoplasm. Lining the inside of the inner nuclear membrane (INM) is the nuclear lamina, which is composed of a meshwork of proteins with spaces too small for the capsid to move through without some disruption of the lamina. The lamina is mainly made up of lamin A/C and lamin B proteins, with smaller amounts of other proteins also present; this lamina must be disrupted before the nucleocapsids can egress.

    \ \ \

    The UL31, UL34 and US3 proteins of herpes simplex virus type 1 form a complex that accumulates at the nuclear rim and is required for envelopment and successful egress of the nucleocapsids PUBMED:11507225. Although UL34 has been shown to interact directly with lamin A, it cannot disrupt lamin structure by itself. Its interaction with UL31 and US3 appears to be crucial for lamin disruption, though the mechanism is not yet clear PUBMED:11507225, PUBMED:15140953.

    \ ' '2919' 'IPR007619' '\ In cytomegalovirus this protein is known as UL71. This family of proteins has no known function.\ ' '2920' 'IPR007616' '\ The proteins in this family have no known function. Cytomegalovirus UL88 is also a member of this family.\ ' '2921' 'IPR005207' '\

    This is a family of Herpesvirus proteins including UL14. UL14 protein is a minor component of the virion tegument PUBMED:10590088 and is expressed late in infection. UL14 protein can influence the intracellular localization patterns of a number of proteins belonging to the capsid or the DNA encapsidation machinery PUBMED:11161269.

    \ ' '2922' 'IPR007640' '\ UL17 protein is required for DNA cleavage and packaging in herpes viruses. It has been shown to associate with immature B-type capsids PUBMED:10752563, and is required for the localisation of capsids and capsid proteins to the intranuclear sites where viral DNA is cleaved and packaged PUBMED:9875322. In the virion, UL17 is a component of the tegument, which is a protein layer surrounding the viral capsid PUBMED:9557660.\ ' '2923' 'IPR007629' '\

    UL20 is predicted to be a transmembrane protein with multiple membrane spans. It is involved in the trans-cellular transport of enveloped virions, and is therefore important for viral egress. However, UL20 operates in different cellular compartments and different stages of egress in Suid herpesvirus 1 (Pseudorabies virus) and herpes simplex virus. This is thought to be due to differences in egress pathways between these two viruses PUBMED:9188641.

    \ ' '2924' 'IPR002580' '\

    This entry consists of the human herpes virus protein UL24 and its orthologues, which are universally present in avian, mammalian and reptilian herpes viruses. Though the functions of these proteins are not known, computational analysis suggests that they may belong to the restriction endonuclease-like fold superfamily, which contains a variety of endonucleases, DNA repair enzymes and exonucleases PUBMED:16474163. Proteins in this entry contain an absolutely conserved PD-(D/E)XK motif thought to be critical for nucleotide-cleaving activity.

    \ ' '2925' 'IPR004021' '\ This domain has no known function. It is found in one or two copies per protein, and is found associated with the PAAD/DAPIN domain .\ ' '2926' 'IPR005035' '\

    Herpes simplex viruses are large DNA viruses, the genomes of which encode approximately 80 genes. The UL3 gene of Human herpesvirus 2 (HHV-2) is predicted to encode a 233 amino acid protein with a molecular mass of 26 kDa. Homologues of the UL3 protein are encoded only among alphaherpesviruses. The function of the UL3 protein of Herpes simplex viruses remains unknown, but it is a phosphoprotein known to localize to the nucleus PUBMED:10466815.

    \ ' '2927' 'IPR003868' '\ This is a family of Herpesvirus proteins including UL31, UL53, and the product of ORF 69 in some strains. The proteins in this family have no known function.\ ' '2928' 'IPR005208' '\

    This is a family of Herpesvirus proteins including UL33 and UL51. The proteins in this family are involved in packaging viral DNA.

    \ ' '2930' 'IPR005210' '\

    The UL36 open reading frame (ORF) encodes the largest Human herpesvirus 1 (HHV-1) protein, a 270 kDa polypeptide designated VP1/2, which is also a component of the virion tegument. A null mutation in the UL36 gene of herpes simplex virus type 1 results in accumulation of unenveloped DNA-filled capsids in the cytoplasm of infected cells PUBMED:1331541. The region which defines these sequences only covers a small central part of this large protein.

    \ ' '2931' 'IPR004958' '\

    This is a family of Herpes virus UL4 proteins, which are related to Human herpesvirus 1 (HHV-1), Human herpesvirus 2 (HHV-2), Equid herpesvirus 1 (Equine herpesvirus 1, EHV-1) 58, and Human herpesvirus 3 VZV-32 56 proteins.

    \ ' '2932' 'IPR007764' '\ UL43 genes are expressed with true-late (gamma2) kinetics and have been identified as a virion tegument component PUBMED:12029146. Studies suggest that the N-terminal sequences target UL43 to protein aggregates and that C-terminal sequences are important for incorporation into particles.\ ' '2933' 'IPR005051' '\ The UL46 protein (VP11/12) is\ produced in the late phase of Herpes virus infection in a manner highly dependent on viral DNA synthesis, and is mainly distributed at the edge of the nucleus in the cytoplasm. It is a tegument phosphoprotein reported to modulate the activity of UL48 (anti-TNF) protein.\ ' '2934' 'IPR005029' '\

    The herpes simplex virus type 1 gene UL47 encodes the tegument proteins referred to collectively as VP13/14, which are believed to be differentially modified forms of the same protein. These proteins have been shown to target to the nucleus. The function of this family is unknown, but it contains a number of Herpesviridae proteins.

    \ ' '2935' 'IPR006908' '\ This is a family of herpesvirus UL49 tegument proteins. It was shown that interactions between herpesvirus envelope and tegument proteins may play a role in secondary envelopment during herpesvirus virion maturation.\ \ \ ' '2936' 'IPR007625' '\

    UL51 protein is a virion protein. In Suid herpesvirus 1 (Pseudorabies virus), UL51 was identified as a component of the capsid PUBMED:9188640. In Human herpesvirus 1 (HHV-1) there is evidence for post-translational modification of UL51 PUBMED:9880018.

    \ ' '2937' 'IPR007622' '\ In infected cells, UL55 is associated with the nuclear matrix, and found adjacent to compartments containing the capsid protein ICP35. UL55 was not detected in assembled virions. It is thought that UL55 may play a role in virion assembly or maturation PUBMED:9714248.\ ' '2938' 'IPR007620' '\ In herpes simplex virus type 2, UL56 is thought to be a tail-anchored type II membrane protein involved in vesicular trafficking. The C-terminal hydrophobic region is required for association with the cytoplasmic membrane, and the N-terminal proline-rich region is important for the translocation of UL56 to the Golgi apparatus and cytoplasmic vesicles PUBMED:12050385.\ ' '2939' 'IPR002660' '\ This family consists of various proteins from the Herpesviridae that are similar to Human herpesvirus 1 (HHV-1) UL6 virion protein. UL6 is essential for cleavage and packaging of the viral genome PUBMED:8955060.\ ' '2940' 'IPR002600' '\ This family consists of various functionally undefined proteins\ from the herpesviridae and UL7 from Bovine herpesvirus 1 PUBMED:8551568, PUBMED:7793062.\ UL7 is not essential for virus replication in\ cell culture, and is found localized in the cytoplasm of\ infected cells accumulated around the nucleus\ but could not be detected in purified virions PUBMED:8551568. \ Members of the herpesviridae have a dsDNA genome and do\ not have an RNA stage during their replication.\ ' '2941' 'IPR005211' '\

    This family groups together a number of viral proteins: BLRF1, U46, 53, and UL73, collectively known as glycoprotein N. The UL73-like envelope glycoproteins, which associate in a high molecular mass complex with their counterpart, gM, induce neutralizing antibody responses in the host. These glycoproteins are highly polymorphic, particularly in the N-terminal region PUBMED:11602789.

    \ ' '2942' 'IPR002690' '\ This family consists of various capsid proteins from members of the Herpesviridae. The capsid protein VP23 in Human herpesvirus 1 (HHV-1) (Human herpes simplex virus 1) forms a triplex together with VP19C; these triplexes fit between and link together adjacent capsomers formed by VP5 and VP26 PUBMED:10400780. VP3 along with the scaffolding proteins helps to form normal capsids by defining the curvature of the shell and size of the particle PUBMED:10400780.\ ' '2943' 'IPR004999' '\

    This family represents the capsid assembly protein, which binds DNA and may be involved in anchoring DNA in the capsid.

    \ ' '2944' 'IPR000361' '\

    The proteins in this entry are variously annotated as iron-sulphur cluster insertion protein or Fe/S biogenesis protein. They appear to be involved in Fe-S cluster biogenesis. This family includes IscA, HesB, YadR and YfhF-like proteins. The hesB gene is expressed only under nitrogen fixation conditions PUBMED:10217509. IscA, an 11 kDa member of the hesB family of proteins, binds iron and [2Fe-2S] clusters, and participates in the biosynthesis of iron-sulphur proteins. IscA is able to bind at least 2 iron ions per dimer PUBMED:15050828. \ \ Other members of this family include various hypothetical proteins that also contain the NifU-like domain, suggesting that they too are able to bind iron and are involved in Fe-S cluster biogenesis. HesB family proteins are found in species as divergent as Homo sapiens (Human) and Haemophilus influenzae, suggesting that these proteins are involved in basic cellular functions PUBMED:8875867.

    \ ' '2945' 'IPR003384' '\ The Hepatitis E virus (HEV) genome is a single-stranded, positive-sense RNA molecule of approximately 7.5 kb PUBMED:10449466. Three open reading frames (ORF) were identified within the HEV genome: ORF1 encodes nonstructural proteins, ORF2 encodes the putative structural protein(s), and ORF3 encodes a protein of unknown function. ORF2 contains a consensus signal peptide sequence at its amino terminus and a capsid-like region with a high content of basic amino acids similar to that seen with other virus capsid proteins PUBMED:1926770.\ ' '2946' 'IPR003479' '\ The major capsid protein of the adenovirus strain is also known as a hexon. This is a family of hexon-associated proteins (protein IIIa).\ ' '2947' 'IPR001312' '\

    Hexokinase is an important enzyme that catalyses the ATP-dependent conversion of aldo- and keto-hexose sugars to the hexose-6-phosphate (H6P). The enzyme can catalyse this reaction on glucose, fructose, sorbitol and glucosamine, and as such is the first step in a number of metabolic pathways PUBMED:1783373. The addition of a phosphate group to the sugar acts to trap it in a cell, since the negatively charged phosphate cannot easily traverse the plasma membrane.

    \ \

    The enzyme is widely distributed in eukaryotes. There are three isozymes of hexokinase in yeast (PI, PII and glucokinase): isozymes PI and PII phosphorylate both aldo- and keto-sugars; glucokinase is specific for aldo-hexoses. All three isozymes contain two domains PUBMED:1783373. Structural studies of yeast hexokinase reveal a well-defined catalytic pocket that binds ATP and hexose, allowing easy transfer of the phosphate from ATP to the sugar PUBMED:10749890. Vertebrates contain four hexokinase isozymes, designated I to IV, where types I to III contain a duplication of the two-domain yeast-type hexokinases. Both the N- and C-terminal halves bind hexose and H6P, though in types I and III only the C-terminal half supports catalysis, while both halves support catalysis in type II. The N-terminal half is the regulatory region. Type IV hexokinase is similar to the yeast enzyme in containing only the two domains, and is sometimes incorrectly referred to as glucokinase.

    \ \

    The different vertebrate isozymes differ in their catalysis, localisation and regulation, thereby contributing to the different patterns of glucose metabolism in different tissues PUBMED:12756287. Whereas types I to III can phosphorylate a variety of hexose sugars and are inhibited by glucose-6-phosphate (G6P), type IV is specific for glucose and shows no G6P inhibition. Type I enzyme may have a catabolic function, producing H6P for energy production in glycolysis; it is bound to the mitochondrial membrane, which enables the coordination of glycolysis with the TCA cycle. Types II and III enzyme may have anabolic functions, providing H6P for glycogen or lipid synthesis. Type IV enzyme is found in the liver and pancreatic beta-cells, where it is controlled by insulin (activation) and glucagon (inhibition). In pancreatic beta-cells, type IV enzyme acts as a glucose sensor to modify insulin secretion. Mutations in type IV hexokinase have been associated with diabetes mellitus.

    \ ' '2948' 'IPR001312' '\

    Hexokinase is an important enzyme that catalyses the ATP-dependent conversion of aldo- and keto-hexose sugars to the hexose-6-phosphate (H6P). The enzyme can catalyse this reaction on glucose, fructose, sorbitol and glucosamine, and as such is the first step in a number of metabolic pathways PUBMED:1783373. The addition of a phosphate group to the sugar acts to trap it in a cell, since the negatively charged phosphate cannot easily traverse the plasma membrane.

    \ \

    The enzyme is widely distributed in eukaryotes. There are three isozymes of hexokinase in yeast (PI, PII and glucokinase): isozymes PI and PII phosphorylate both aldo- and keto-sugars; glucokinase is specific for aldo-hexoses. All three isozymes contain two domains PUBMED:1783373. Structural studies of yeast hexokinase reveal a well-defined catalytic pocket that binds ATP and hexose, allowing easy transfer of the phosphate from ATP to the sugar PUBMED:10749890. Vertebrates contain four hexokinase isozymes, designated I to IV, where types I to III contain a duplication of the two-domain yeast-type hexokinases. Both the N- and C-terminal halves bind hexose and H6P, though in types I and III only the C-terminal half supports catalysis, while both halves support catalysis in type II. The N-terminal half is the regulatory region. Type IV hexokinase is similar to the yeast enzyme in containing only the two domains, and is sometimes incorrectly referred to as glucokinase.

    \ \

    The different vertebrate isozymes differ in their catalysis, localisation and regulation, thereby contributing to the different patterns of glucose metabolism in different tissues PUBMED:12756287. Whereas types I to III can phosphorylate a variety of hexose sugars and are inhibited by glucose-6-phosphate (G6P), type IV is specific for glucose and shows no G6P inhibition. Type I enzyme may have a catabolic function, producing H6P for energy production in glycolysis; it is bound to the mitochondrial membrane, which enables the coordination of glycolysis with the TCA cycle. Types II and III enzyme may have anabolic functions, providing H6P for glycogen or lipid synthesis. Type IV enzyme is found in the liver and pancreatic beta-cells, where it is controlled by insulin (activation) and glucagon (inhibition). In pancreatic beta-cells, type IV enzyme acts as a glucose sensor to modify insulin secretion. Mutations in type IV hexokinase have been associated with diabetes mellitus.

    \ ' '2949' 'IPR005212' '\

    This domain occurs in a range of proteins from antibiotic production pathways. These include the gra-ORF27 product that probably functions at an early step, most likely as a dTDP-4-keto-6-deoxyglucose-2,3-dehydratase PUBMED:9831526. Its homologues include dnmT from the daunorubicin biosynthetic gene cluster in S. peucetius PUBMED:8955419, a similar gene from the daunomycin biosynthetic cluster in Streptomyces sp. (strain 5) PUBMED:8655529, eryBVI from the erythromycin cluster in S. erythraea and snoH from the nogalamycin cluster in S. nogalater. This domain is a 200 amino acid long region, which may be a structural unit, that occurs twice within the proteins that contain it.

    \ ' '2950' 'IPR005708' '\

    Alkaptonuria (AKU), a rare hereditary disorder, was the first disease to be interpreted as an inborn error of metabolism. The\ deficiency causes homogentisic aciduria, ochronosis, and arthritis. AKU patients are deficient in homogentisate 1,2-dioxygenase, the enzyme that mediates the conversion of homogentisate to maleylacetoacetate, a step in the catabolism of both tyrosine and phenylalanine. \

    \ \ ' '2951' 'IPR000320' '\

    This domain identifies a group of sequences which belong to the MEROPS peptidase family C46 (clan CH). The type example is the hedgehog protein from Drosophila melanogaster (Fruit fly), which self-processes by a one-time cysteine-dependent self-cleavage.

    \ \ \ \

    Hedgehog is a family of secreted signal molecules required\ for embryonic cell differentiation. Members of this family are\ composed of two domains. These proteins are autocatalytically cleaved by the\ C-terminal domain. This family\ represents the N-terminal domain that is responsible for both local and long-range\ signalling activities.

    \ \

    The structure of this domain is known PUBMED:7477329 and reveals a tetrahedrally coordinated zinc ion that appears to be structurally\ analogous to the zinc coordination sites of zinc hydrolases, such as\ thermolysin and carboxypeptidase A. This putative catalytic site\ represents a distinct activity from the autoprocessing activity that\ resides in the carboxy-terminal domain.

    \ ' '2952' 'IPR003265' '\

    Endonuclease III is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic/apyrimidinic sites via a beta-elimination mechanism PUBMED:7773744, PUBMED:9032058. The structurally related DNA glycosylase MutY\ recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair PUBMED:1328155. The 3-D structures of Escherichia coli endonuclease III PUBMED:1411536 and catalytic domain of MutY PUBMED:9846876 have been determined. The\ structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key,\ four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the\ [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern\ Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is\ referred to as a [Fe4S4] cluster loop (FCL) PUBMED:7664751. Two DNA-binding motifs have been proposed, one at either end of the\ interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs. The primary role of the iron-sulphur cluster appears to\ involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of\ the FCL motif PUBMED:7664751, PUBMED:10900127.

    \ \

    The HhH-GPD domain gets its name from its hallmark helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This domain is found in a diverse range of structurally related DNA repair proteins that include endonuclease III and DNA glycosylase MutY, an A/G-specific adenine glycosylase. Both of these enzymes have a C-terminal iron-sulphur cluster loop (FCL). The methyl-CpG binding protein (MBD4) also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II, 8-oxoguanine DNA glycosylases and other members of the AlkA family.

    \ ' '2953' 'IPR005507' '\ The proteins in this family are poorly characterised, but an investigation PUBMED:11596096 has indicated that the immediate early protein is required for the down-regulation of MHC class I expression in dendritic cells. Human herpesvirus 6 immediate early protein is also referred to as U90.\ ' '2954' 'IPR004792' '\ This is a family of conserved hypothetical proteins that may include proteins with a dinucleotide-binding motif (Rossmann fold), including oxidoreductases and dehydrogenases.\ ' '2955' 'IPR007667' '\ This is a family of proteins thought to be involved in the response to hypoxia. Family members mostly come from diverse eukaryotic organisms; however, eubacterial members have been identified. This region is found at the N terminus of the member proteins, which are predicted to be transmembrane PUBMED:11172064.\ ' '2956' 'IPR000170' '\

    High potential iron-sulphur proteins (HiPIP) PUBMED:1917989, PUBMED:1317860 are a specific class of high-redox potential 4Fe-4S ferredoxins that function in anaerobic electron transport and occur commonly in purple photosynthetic bacteria and in other bacteria, such as Paracoccus denitrificans and Thiobacillus ferrooxidans PUBMED:14562962.

    \ \

    HiPIPs seem to react by oxidation of [4Fe-4S]2+ to [4Fe-4S]3+.

    \ \

    The HiPIPs are small proteins which show significant variation in their sequences, their sizes (from 63 to 85 amino acids), and in their oxidation-reduction potentials. As shown in the following schematic representation, the iron-sulphur cluster is bound by four conserved cysteine residues.

    \ \
    \
                               [4Fe-4S cluster]\
                               | |       |     |\
            xxxxxxxxxxxxxxxxxxxCxCxxxxxxxCxxxxxCxxxx\
    \
    \'C\': conserved cysteine involved in the binding of the iron-sulphur cluster.\
    
    \ ' '2957' 'IPR000429' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    This group of proteins belongs to the hirudin family; they are proteinase inhibitors belonging to MEROPS inhibitor family I14, clan IM, and they inhibit serine peptidases of the S1 family PUBMED:14705960.

    \ \ \ \

    Hirudin is a potent thrombin inhibitor secreted by the salivary glands of\ Hirudinaria manillensis (Buffalo leech) and Hirudo medicinalis (Medicinal leech) PUBMED:3513162. \ It forms a stable non-covalent complex with alpha-thrombin, thereby abolishing its ability to cleave \ fibrinogen. \ The structure of hirudin has been solved by NMR PUBMED:2567183, and the structure \ of a recombinant hirudin-thrombin complex has been determined by X-ray\ crystallography to 2.3 Å PUBMED:2374926. Hirudin consists of an N-terminal globular\ domain and an extended C-terminal domain. Residues 1-3 form a parallel beta-strand\ with residues 214-217 of thrombin, the nitrogen atom of residue 1\ making a hydrogen bond with the Ser195 O gamma atom of the catalytic site.\ The C-terminal domain makes numerous electrostatic interactions with an\ anion-binding exosite of thrombin, while the last five residues are in\ a helical loop that forms many hydrophobic contacts PUBMED:2374926.

    \ ' '2958' 'IPR002970' '\ The lipocalins are a diverse, interesting, yet poorly understood family of \ proteins composed, in the main, of extracellular ligand-binding proteins\ displaying high specificity for small hydrophobic molecules PUBMED:2580349, PUBMED:8761444. Functions\ of these proteins include transport of nutrients, control of cell regulation, pheromone transport, cryptic colouration and the enzymatic synthesis\ of prostaglandins.\

    \ The crystal structures of several lipocalins have been solved and show a \ novel 8-stranded anti-parallel beta-barrel fold well conserved within the\ family. Sequence similarity within the family is at a much lower level and\ would seem to be restricted to conserved disulphides and 3 motifs, which\ form a juxtaposed cluster that may act as a common cell surface receptor\ site PUBMED:8761444. By contrast, at the more variable end of the fold are found an \ internal ligand binding site and a putative surface for the formation of \ macromolecular complexes PUBMED:8573354. The anti-parallel beta-barrel fold is also\ exploited by the fatty acid-binding proteins (which function similarly by\ binding small hydrophobic molecules), by avidin and the closely related\ metalloprotease inhibitors, and by triabin. Similarity at the sequence\ level, however, is less obvious, being confined to a single short \ N-terminal motif.\ The lipocalin family can be subdivided into kernel and outlier sets. The\ kernel lipocalins form the largest self-consistent group, comprising the subfamily of tick histamine-binding proteins. The outlier lipocalins form several smaller distinct subgroups: \ the OBPs, the von Ebner\'s gland proteins, alpha-1-acid glycoproteins, \ tick histamine binding proteins and the nitrophorins.

    \

    The tick histamine binding proteins are the most recently identified set of \ outlier lipocalins. The structure of one tick histamine binding protein has\ been solved PUBMED:10360182 and has shown the proteins to have the characteristic \ lipocalin fold but without any appreciable sequence similarity. The tick\ histamine binding proteins are secreted into the saliva of the ixodid tick \ Rhipicephalus appendiculatus and share functional similarity with the \ nitrophorins, sequestering histamine at the wound site. Because the tick\ histamine binding proteins outcompete histamine receptors, they are able to\ overcome host inflammatory and immune responses. This enables the ticks to\ feed for extended periods, lasting from days to several weeks, and \ to gorge themselves on large blood meals, increasing their body mass 100-fold.\ Unlike nitrophorins, the tick proteins do not bind haem (or other cofactors),\ but ligate histamine directly in two rigid orthogonally-arranged binding \ sites, at opposing ends of the lipocalin anti-parallel beta-barrel, which\ have an unusually polar character.

    \ ' '2959' 'IPR006062' '\ Histidine is formed by several complex and distinct biochemical reactions catalysed by eight enzymes. Proteins\ involved in steps 4 and 6 of the histidine biosynthesis pathway are contained in one family. These enzymes are called\ His6 and His7 in eukaryotes and HisA and HisF in prokaryotes. HisA is a phosphoribosylformimino-5-aminoimidazole\ carboxamide ribotide isomerase, involved in the fourth step of histidine biosynthesis. The bacterial HisF\ protein is a cyclase which catalyzes the cyclization reaction that produces D-erythro-imidazole glycerol phosphate during\ the sixth step of histidine biosynthesis. The yeast His7 protein is a bifunctional protein which catalyzes an \ amido-transferase reaction that generates imidazole-glycerol phosphate and 5-aminoimidazol-4-carboxamide. The latter is the\ ribonucleotide used for purine biosynthesis. The enzyme also catalyzes the cyclization reaction that produces \ D-erythro-imidazole glycerol phosphate, and is involved in the fifth and sixth steps in histidine biosynthesis.\ ' '2960' 'IPR012131' '\

    Histidinol dehydrogenase (HDH) catalyzes the terminal step in the biosynthesis of histidine in bacteria, fungi, and plants: the four-electron oxidation of L-histidinol to histidine.

    \

    In 4-electron dehydrogenases, a single active site catalyses 2 separate oxidation steps: oxidation of the substrate alcohol to an intermediate aldehyde; and oxidation of the aldehyde to the product acid, in this case His PUBMED:3533140. The reaction proceeds via a tightly- or covalently-bound intermediate, and requires the presence of 2 NAD molecules PUBMED:3533140. By contrast with most dehydrogenases, the substrate is bound before the NAD coenzyme PUBMED:3533140. A Cys residue has been implicated in the catalytic mechanism of the second oxidative step PUBMED:3533140.

    \

    In bacteria HDH is a single chain polypeptide; in fungi it is the C-terminal domain of a multifunctional enzyme which catalyzes three different steps of histidine biosynthesis; and in plants it is expressed as a nuclear-encoded protein precursor which is exported to the chloroplast PUBMED:2034659.

    \ \ \

    This group represents prokaryotic and plant histidinol dehydrogenases PUBMED:15299582, PUBMED:11842181.

    \ ' '2961' 'IPR007125' '\

    The core histones together with some other DNA binding proteins appear to form\ a superfamily defined by a common fold and distant sequence similarities PUBMED:7651829,\ PUBMED:9016552. Some proteins contain local\ homology domains related to the histone fold PUBMED:9305837.

    \ ' '2962' 'IPR001310' '\

    The Histidine Triad (HIT) motif, His-x-His-x-His-x-x (x, a\ hydrophobic amino acid), was identified as being highly conserved in a variety of organisms PUBMED:1472710. The crystal structure of rabbit Hint, purified as an adenosine and AMP-binding protein, showed that proteins in the HIT\ superfamily are conserved as nucleotide-binding proteins and that Hint homologues, which are found in all forms of life, are structurally related to Fhit homologues and GalT-related enzymes, which have more restricted phylogenetic profiles PUBMED:9164465. Hint homologues including rabbit Hint and yeast\ Hnt1 hydrolyse adenosine 5\' monophosphoramide substrates such as AMP-NH2 and\ AMP-lysine to AMP plus the amine product and function as positive regulators\ of Cdk7/Kin28 in vivo PUBMED:11805111. Fhit homologues are diadenosine polyphosphate hydrolases PUBMED:8794732 and function as tumour suppressors in human and mouse PUBMED:10758156, though the tumour suppressing function of Fhit does not depend on ApppA hydrolysis PUBMED:9576908. The third branch of the HIT superfamily, which includes\ GalT homologues, contains a related His-X-His-X-Gln motif and transfers\ nucleoside monophosphate moieties to phosphorylated second substrates rather\ than hydrolysing them PUBMED:12119013.

    \ ' '2963' 'IPR001801' '\

    The histone-like nucleoid-structuring (H-NS) protein belongs to a family of bacterial proteins that play a role in the\ formation of nucleoid structure and affect gene expression under certain conditions PUBMED:7875316.

    \ ' '2964' 'IPR002732' '\

    This entry represents Holliday junction resolvases (hjc gene) and related proteins, primarily from archaeal species PUBMED:10430863. The Holliday junction is an essential intermediate of homologous recombination. Holliday junctions are four-stranded DNA complexes that are formed during recombination and related DNA repair events. In the presence of divalent cations, these junctions exist predominantly as the stacked-X form in\ which the double-helical segments are coaxially stacked and twisted by 60 degrees in a right-handed direction across the junction cross-over. In this structure, the stacked arms resemble two adjacent double-helices, but are linked at the junction by two common strands that cross-over between the duplexes PUBMED:12126623. During homologous recombination, genetic information is physically exchanged between parental DNAs via crossing single strands of the same polarity within the four-way Holliday structure. This process is terminated by the endonucleolytic activity of resolvases, which convert the four-way DNA back to two double strands.

    \ ' '2965' 'IPR000417' '\ Thiamine pyrophosphate (TPP), a required cofactor for many enzymes in the \ cell, is synthesised de novo in Salmonella typhimurium PUBMED:. Five kinase \ activities have been implicated in TPP synthesis, which involves joining \ a 4-methyl-5-(beta-hydroxyethyl)thiazole (THZ) moiety and a 4-amino-5-\ hydroxymethyl-2-methylpyrimidine (HMP) moiety PUBMED:, PUBMED:7982968. \ THZ kinase () activity is involved in the salvage synthesis of \ TH-P from the thiazole: \ \ Hydroxyethylthiazole kinase expression is regulated at the mRNA level by\ intracellular thiamin pyrophosphate PUBMED:7982968.\ ' '2966' 'IPR005565' '\

    Haemolysin (HlyA) and related toxins are secreted across both the cytoplasmic and outer membranes of Gram-negative bacteria in a process which proceeds without a periplasmic intermediate. HlyA is directed by an uncleaved C-terminal targeting signal and the HlyD and HlyB translocator proteins PUBMED:1419114.

    \ ' '2967' 'IPR003996' '\

    Secretion of virulence factors in Gram-negative bacteria involves transport of the protein across two membranes to reach the cell exterior PUBMED:1558765. Four principal exotoxin secretion systems have been described. In the type II and IV secretion systems, toxins are first exported to the periplasm by way of a cleaved N-terminal signal sequence; a second set of proteins is used for extracellular transport (type II), or the C-terminus of the exotoxin itself is used (type IV). Type III secretion involves at least 20 molecules that assemble into a needle; effector proteins are then translocated through it without the need for a signal sequence. In the type I system, a complete channel is formed through both membranes, and the secretion signal is carried on the C-terminus of the exotoxin.

    \

    Members of the RTX (repeats in toxin) family of cytolytic toxins are secreted by the type I \ secretion system, and are important virulence factors in Gram-negative bacteria. As well as the C-terminal signal sequence, several glycine-rich\ repeats are also found. These are essential for binding calcium, and are critical for the biological activity of the secreted toxins PUBMED:8800842. All RTX toxin operons exist in the order rtxCABD: the RtxA protein is the structural\ component of the exotoxin, and both RtxB and RtxD are required for its export from the bacterial cell; RtxC is an acyl-carrier-protein-dependent acyl-modification enzyme, required to convert RtxA to its active form PUBMED:10470043.

    \

    Escherichia coli haemolysin (HlyA) is often quoted as the model for RTX \ toxins. Recent work on its relative rtxC gene product HlyC PUBMED:9521785 has revealed that it provides the acylation aspect for post-translational modification of two internal lysine residues in the HlyA protein. Other residues, including His23 and two conserved tyrosine residues, also appear to be important PUBMED:10413532.

    \ ' '2968' 'IPR006143' '\

    Gram-negative bacteria produce a number of proteins which are secreted into the growth medium by a mechanism that does not require a cleaved N-terminal signal sequence. These proteins, while having different functions, require the help of two or more proteins for their secretion across the cell envelope. These secretion proteins include members belonging to the ABC transporter family (see the relevant entry ) and a protein belonging to a family which includes the following members PUBMED:2249654, PUBMED:2184029, PUBMED:1622271, PUBMED:1427098, PUBMED:9301333:

    \ \

    The secretion proteins are evolutionarily related and consist of 390 to 480 amino acid residues. They seem to be anchored in the inner membrane by an N-terminal transmembrane region. Their exact role in the secretion process is not yet known.

    \ ' '2969' 'IPR004889' '\

    The N(5),N(10)-methylenetetrahydromethanopterin dehydrogenase system of methanogenic archaea is composed of H2-forming methylenetetrahydromethanopterin dehydrogenase (Hmd, represented by this entry) and F420-dependent methylenetetrahydromethanopterin dehydrogenase () PUBMED:8215796, PUBMED:9151968. Hmd is an iron-sulphur-cluster-free enzyme that contains an intrinsic CO ligand bound to iron PUBMED:15506791. This entry contains two distinct subgroups: one has been experimentally characterised as H2-forming N(5),N(10)-methenyltetrahydromethanopterin dehydrogenase (Hmd or HmdI), and the other one contains isozymes that have not been experimentally characterised (HmdII and HmdIII). Because all three isozyme forms are present in each of the corresponding sequenced genomes, it has been suggested that HmdII and HmdIII may not exhibit Hmd activity and may have a different biological function PUBMED:11081790.

    \ ' '2970' 'IPR000079' '\

    High mobility group (HMG) proteins constitute a family of relatively low molecular weight non-histone components in chromatin. HMG14 and HMG17 are highly similar proteins of about 100 amino acid residues; the sequence of chicken HMG14 is almost as similar to chicken HMG17 as it is to mammalian HMG14 polypeptides PUBMED:3384337. The proteins bind to the inner side of the nucleosomal DNA, altering the interaction between the DNA and the histone octamer. It is thought that they may be involved in the process that confers specific chromatin conformations to transcribable regions in the genome PUBMED:3754870.

    \

    The SMART signature describes a nucleosomal binding domain, which facilitates binding of proteins to nucleosomes in chromatin. The domain is most commonly found in the high mobility group (HMG) proteins HMG14 and HMG17; however, it is also found in other proteins which bind to nucleosomes, e.g. NBP-45. NBP-45 is a nucleosomal binding protein, first identified in mice PUBMED:10692437, which is related to HMG14 and HMG17. NBP-45 binds specifically to nucleosome core particles, and can function as a transcriptional activator. These findings led to the suggestion that this domain, common to NBP-45, HMG14 and HMG17, is responsible for binding of the proteins to nucleosomes in chromatin.

    \ ' '2971' 'IPR000910' '\

    High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair PUBMED:11497996.

    \

    The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes: the SOX family of transcription factors; SRY sex determining region Y protein and related proteins PUBMED:12920151; LEF1 lymphoid enhancer binding factor 1 PUBMED:10890911; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor PUBMED:11779632; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.

    \ ' '2972' 'IPR013528' '\

    Synonym(s): 3-hydroxy-3-methylglutaryl-coenzyme A synthase, HMG-CoA synthase.

    \ \

    Hydroxymethylglutaryl-CoA synthase () catalyses the condensation of acetyl-CoA with acetoacetyl-CoA to produce HMG-CoA and CoA, the second reaction in the mevalonate-dependent isoprenoid biosynthesis pathway. HMG-CoA synthase contains an important catalytic cysteine residue that acts as a nucleophile in the first step of the reaction: the acetylation of the enzyme by acetyl-CoA (its first substrate) to produce an acetyl-enzyme thioester, releasing the reduced coenzyme A. The subsequent nucleophilic attack on acetoacetyl-CoA (its second substrate) leads to the formation of HMG-CoA PUBMED:15498869.

    \

    HMG-CoA synthase occurs in eukaryotes, archaea and certain bacteria PUBMED:15546978. In vertebrates, there are two isozymes located in different subcellular compartments: a cytosolic form that is the starting point of the mevalonate pathway (which leads to cholesterol and other sterolic and isoprenoid compounds), and a mitochondrial form responsible for ketone body biosynthesis. HMG-CoA synthase is also found in other eukaryotes such as insects, plants and fungi PUBMED:16640729. In bacteria, isoprenoid precursors are generally synthesised via an alternative, non-mevalonate pathway; however, a number of Gram-positive pathogens utilise a mevalonate pathway involving HMG-CoA synthase that is parallel to that found in eukaryotes PUBMED:17128980, PUBMED:16245942.

    \ \

    This entry represents the N-terminal domain of HMG-CoA synthase enzymes from both eukaryotes and prokaryotes.

    \ ' '2973' 'IPR000891' '\

    \ Pyruvate carboxylase () (PC), a member of the biotin-dependent\ enzyme family, is involved in gluconeogenesis by mediating the\ carboxylation of pyruvate to oxaloacetate. Biotin-dependent carboxylase\ enzymes perform a two-step reaction. Enzyme-bound biotin is first carboxylated\ by bicarbonate and ATP, and the carboxyl group temporarily bound to biotin is\ subsequently transferred to an acceptor substrate such as pyruvate PUBMED:11851389. PC has\ three functional domains: a biotin carboxylase (BC) domain,\ a carboxyltransferase (CT) domain which performs the second part of the\ reaction, and a biotinyl domain PUBMED:7780827, PUBMED:10229653. The mechanism by which\ the carboxyl group is transferred from the carboxybiotin to the pyruvate is not\ well understood.\

    \

    \ The pyruvate carboxyltransferase domain is also found in other pyruvate\ binding enzymes and acetyl-CoA dependent enzymes suggesting that this domain\ can be associated with different enzymatic activities.

    \ \

    This domain is found towards the N-terminal region of various aldolase enzymes. This N-terminal TIM barrel domain PUBMED:12764229 interacts with the C-terminal domain. The C-terminal DmpG_comm domain () is thought to promote heterodimerisation with members of to form a bifunctional aldolase-dehydrogenase PUBMED:12764229.

    \ \ ' '2974' 'IPR000665' '\

    This entry represents the haemagglutinin-neuraminidase (HN) glycoprotein found in a variety of paramyxoviruses (negative-stranded RNA viruses), including Mumps virus, Human parainfluenza virus 3, and the avian pathogen Newcastle disease virus. The paramyxoviruses have two surface glycoproteins, HN and a fusion protein (F). HN is a multi-functional protein with three distinct functions: a receptor-binding (haemagglutinin) activity, a receptor-destroying (neuraminidase) activity, and a membrane fusion activity that fuses the viral envelope to the host cell membrane in order to infect the cell. The fusion activity involves an interaction between HN and the fusion protein. In other viruses, such as influenza A and B viruses, haemagglutinin and neuraminidase occur as separate glycoproteins.

    \

    The haemagglutinin-neuraminidase glycoprotein has a six-bladed beta-propeller structure, and bears structural similarity to influenza A and B virus neuraminidase, bacterial neuraminidase, trypanosomal neuraminidase and transialidase PUBMED:15016893, PUBMED:14729348.

    \


    \ ' '2975' 'IPR006899' '\ This domain consists of the N terminus of homeobox-containing transcription factor HNF-1. This region contains a dimerisation sequence PUBMED:1988016 and an acidic region that may be involved in transcription activation. Mutations and the common Ala/Val 98 polymorphism in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3) PUBMED:9133564.\ ' '2976' 'IPR006898' '\ This domain consists of an alternative C terminus of homeobox-containing transcription factor HNF-1, found in the HNF-1A isoform. Different isoforms of HNF-1 are generated by the differential use of polyadenylation sites and by alternative splicing. The C-terminal region of HNF-1 is responsible for the activation of transcription, and HNF-1A, which has this C-terminal extension, transactivates less well than the B and C isoforms PUBMED:7900999. Mutations and polymorphisms in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3) PUBMED:9133564.\ ' '2977' 'IPR006897' '\ This domain consists of a region found within the alpha isoform and at the C terminus of the beta isoform of the homeobox-containing transcription factor of HNF-1. Different isoforms of HNF-1 are generated by the differential use of polyadenylation sites and by alternative splicing. The C-terminal region of HNF-1 is responsible for the activation of transcription PUBMED:7900999. Mutations and polymorphisms in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3) PUBMED:9133564.\ ' '2978' 'IPR002711' '\ HNH endonuclease is found in bacteria and viruses PUBMED:9358175, PUBMED:7920259, PUBMED:7817395. This family includes pyocins, colicins and anaredoxins.\ ' '2979' 'IPR000021' '\ The hok/gef family of Gram-negative bacterial proteins are toxic to cells\ when over-expressed, killing the cells from within by interfering with a\ vital function in the cell membrane PUBMED:3070354. 
Some family members (flm) increase the stability of unstable RNA PUBMED:3070354, some (pnd) induce the degradation of stable RNA at higher than optimum growth temperatures PUBMED:2465777, while others affect the release of cellular magnesium by membrane alterations PUBMED:2465777. The\ proteins are short (50-70 residues), consisting of an N-terminal hydrophobic (possibly membrane spanning) domain, and a C-terminal periplasmic region, which contains the toxic domain. The C-terminal region contains a conserved cysteine residue that mediates homo-dimerisation in the gef protein, although dimerisation is not necessary for the toxic effect PUBMED:1943700.\ ' '2980' 'IPR006493' '\

    This family is represented by BlyA, a small holin found in Borrelia circular plasmids that prove to be temperate phage PUBMED:11073925. This protein was previously proposed to be a haemolysin. BlyA is small (67 residues) and contains two largely hydrophobic helices and a highly charged C terminus.

    \ ' '2981' 'IPR007869' '\ Homing endonucleases are encoded by mobile DNA elements that are found inserted within host genes in all domains of life. The crystal structure of the homing nuclease PI-Sce PUBMED:12219083 revealed two domains: an endonucleolytic centre resembling the C-terminal domain of Drosophila melanogaster Hedgehog protein, and a second domain containing the protein-splicing active site. This domain corresponds to the C-terminal domain, which has structural similarity to .\ ' '2982' 'IPR007868' '\ Homing endonucleases are encoded by mobile DNA elements that are found inserted within host genes in all domains of life. The crystal structure of the homing nuclease PI-Sce PUBMED:12219083 revealed two domains: an endonucleolytic centre resembling the C-terminal domain of Drosophila melanogaster Hedgehog protein, and a second domain containing the protein-splicing active site. This domain corresponds to the protein-splicing domain.\ ' '2983' 'IPR001356' '\ The homeobox domain was first identified in a number of Drosophila homeotic and \ segmentation proteins, but is now known to be well conserved in many other animals, \ including vertebrates PUBMED:2568852, PUBMED:1357790, PUBMED:. Hox genes encode homeodomain-containing transcriptional regulators that operate differential genetic programs along the anterior-posterior axis of animal bodies PUBMED:12445403. The domain binds DNA through a \ helix-turn-helix (HTH) structure. The HTH motif is characterised by two alpha-helices, \ which make intimate contacts with the DNA and are joined by a short turn. The second \ helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which \ occur between specific side chains and the exposed bases and thymine methyl groups within \ the major groove of the DNA PUBMED:. The first helix helps to stabilise the \ structure.

    The motif is very similar in sequence and structure in a wide range of \ DNA-binding proteins (e.g., cro and repressor proteins, homeotic proteins, etc.). One of \ the principal differences between HTH motifs in these different proteins arises from the \ stereo-chemical requirement for glycine in the turn which is needed to avoid steric \ interference of the beta-carbon with the main chain: for cro and repressor proteins the \ glycine appears to be mandatory, while for many of the homeotic and other DNA-binding \ proteins the requirement is relaxed.

    \ ' '2984' 'IPR001342' '\

    Bacteria, plants and fungi metabolise aspartic acid to produce four amino acids - lysine, threonine, methionine and isoleucine - in a series of reactions known as the aspartate pathway. Additionally, several important metabolic intermediates are produced by these reactions, such as diaminopimelic acid, an essential component of bacterial cell wall biosynthesis, and dipicolinic acid, which is involved in sporulation in Gram-positive bacteria. Members of the animal kingdom do not possess this pathway and must therefore acquire these essential amino acids through their diet. Research into improving the metabolic flux through this pathway has the potential to increase the yield of the essential amino acids in important crops, thus improving their nutritional value. Additionally, since the enzymes are not present in animals, they are promising targets for the development of novel antibiotics and herbicides. For more information see PUBMED:11352712.

    \

    Homoserine dehydrogenase () catalyses the third step in the aspartate pathway: the NAD(P)-dependent reduction of aspartate beta-semialdehyde into homoserine PUBMED:8500624, PUBMED:8395899. Homoserine is an intermediate in the biosynthesis of threonine, isoleucine, and methionine. The enzyme can be found in a monofunctional form, in some bacteria and yeast, or a bifunctional form consisting of an N-terminal aspartokinase domain and a C-terminal homoserine dehydrogenase domain, as found in bacteria such as Escherichia coli and in plants. Structural analysis of the yeast monofunctional enzyme () indicates that the enzyme is a dimer composed of three distinct regions: an N-terminal nucleotide-binding domain, a short central dimerisation region, and a C-terminal catalytic domain PUBMED:10700284. The N-terminal domain forms a modified Rossmann fold, while the catalytic domain forms a novel alpha-beta mixed sheet.

    \ \

    This entry represents the catalytic domain of homoserine dehydrogenase.

    \ ' '2985' 'IPR001400' '\

    Somatotropin is a hormone that plays an important role in growth control. It belongs to a family that includes choriomammotropin (lactogen), its placental analogue; prolactin, which promotes lactation in the mammary gland, and placental prolactin-related proteins; proliferin and proliferin related protein; and somatolactin from various fish PUBMED:, PUBMED:2765528, PUBMED:1993170, PUBMED:2790033.\ The 3D structure of bovine somatotropin has been predicted using a combination of heuristics and energy minimisation PUBMED:2021631.

    \ ' '2986' 'IPR000532' '\ A number of polypeptidic hormones, mainly expressed in the intestine or the pancreas, belong to a group of structurally related peptides PUBMED:3133967, PUBMED:3291691. One such hormone, glucagon, is widely distributed and produced in the alpha-cells of pancreatic islets PUBMED:4076759. It affects glucose metabolism in the liver PUBMED:6577439 by inhibiting glycogen synthesis, stimulating glycogenolysis and enhancing gluconeogenesis. It also increases mobilisation of glucose, free fatty acids and ketone bodies, which are metabolites produced in excess in diabetes mellitus. Glucagon is produced, like other peptide hormones, as part of a larger precursor (preproglucagon), which is cleaved to produce glucagon, glucagon-like protein I and glucagon-like protein II PUBMED:3260236. The structure of glucagon itself is fully conserved in all known mammalian species PUBMED:4076759. Other members of the structurally similar group include glicentin precursor, secretin, gastric inhibitory protein, vasoactive intestinal peptide (VIP), prealbumin, peptide HI-27 and growth hormone releasing factor.\ ' '2987' 'IPR001955' '\

    Pancreatic hormone (PP) PUBMED:6107857 is a peptide synthesized in the pancreatic islets of Langerhans, which acts as a regulator of pancreatic and gastrointestinal functions.

    \

    The hormone is produced as a larger propeptide, which is enzymatically cleaved to yield the mature active peptide: this is 36 amino acids in length PUBMED:3031687 and has an amidated C terminus PUBMED:2599092. The hormone has a globular structure, with residues 2-8 forming a left-handed poly-proline-II-like helix, residues 9-13 a beta turn, and residues 14-32 an alpha-helix, held close to the first helix by hydrophobic interactions PUBMED:3031687. Unlike glucagon, another peptide hormone, the structure of pancreatic peptide is preserved in aqueous solution PUBMED:2067973. Both N and C termini are required for activity: receptor binding and activation functions may reside in the N and C termini respectively PUBMED:3031687.

    \

    Pancreatic hormone is part of a wider family of active peptides that includes:\

    \ All these peptides are 36 to 39 amino acids long. Like most active peptides, they have an amidated C terminus and are synthesized as larger protein precursors.

    \ ' '2988' 'IPR000981' '\ Oxytocin and vasopressin are nine-residue, structurally and functionally related neurohypophysial peptide \ hormones. Oxytocin mediates contraction of the smooth muscle of the uterus and mammary gland, while \ vasopressin has antidiuretic action on the kidney, and mediates vasoconstriction of the peripheral vessels \ PUBMED:3147712. In common with most active peptides, both hormones are synthesised as larger protein \ precursors that are enzymatically converted to their mature forms. Members of this family are found in birds, fish, reptiles and amphibians (mesotocin, isotocin, valitocin, glumitocin, aspargtocin, vasotocin, seritocin, asvatocin, phasvatocin), in worms (annetocin), octopi (cephalotocin), Locusta migratoria (Migratory locust) (locupressin or neuropeptide\ F1/F2) and in molluscs (conopressins G and S) PUBMED:7591488.\ ' '2989' 'IPR000981' '\ Oxytocin and vasopressin are nine-residue, structurally and functionally related neurohypophysial peptide \ hormones. Oxytocin mediates contraction of the smooth muscle of the uterus and mammary gland, while \ vasopressin has antidiuretic action on the kidney, and mediates vasoconstriction of the peripheral vessels \ PUBMED:3147712. In common with most active peptides, both hormones are synthesised as larger protein \ precursors that are enzymatically converted to their mature forms. 
Members of this family are found in birds, fish, reptiles and amphibians (mesotocin, isotocin, valitocin, glumitocin, aspargtocin, vasotocin, seritocin, asvatocin, phasvatocin), in worms (annetocin), octopi (cephalotocin), Locusta migratoria (Migratory locust) (locupressin or neuropeptide\ F1/F2) and in molluscs (conopressins G and S) PUBMED:7591488.\ ' '2990' 'IPR000476' '\ Glycoprotein hormones PUBMED:6267989, PUBMED:1445230 (or gonadotropins) are a family of proteins, which include the mammalian hormones follitropin (FSH), lutropin (LH), thyrotropin\ (TSH), placental chorionic gonadotropins hCG and eCG PUBMED:6314263 and chorionic \ gonadotropin (CG), as well as at least two forms of fish\ gonadotropins. These hormones are central to the \ complex endocrine system that regulates normal growth, sexual development, \ and reproductive function PUBMED:6177696. The hormones LH, FSH and TSH are secreted\ by the anterior pituitary gland, while hCG and eCG are secreted by the \ placenta PUBMED:1713773. \ All these hormones consist of two glycosylated chains (alpha\ and beta). The alpha subunit is common to each protein dimer (well conserved within species, \ but differing between them PUBMED:6177696), while the unique beta subunit \ confers biological specificity PUBMED:6314263.\ The alpha chains are highly conserved proteins of about 100 amino acid\ residues which contain ten conserved cysteines all involved in disulphide\ bonds PUBMED:8202136, as shown in the following schematic representation.\
    \
                            +---------------------------+\
                +----------+|             +-------------|--+\
                |          ||             |             |  |\
            xxxxCxCxxxxxxCxCCxxxxxxxxxxxxxCCxxxxxxxxxxCxCxxCx\
                  |      |                 |          |\
                  +------|-----------------+          |\
                         |                            |\
                         +----------------------------+\
    \
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    
    \ Intracellular levels of free alpha subunits are greater than those of the\ mature glycoprotein, implying that hormone assembly is limited by the\ appearance of the specific beta subunits, and hence that synthesis of alpha\ and beta is independently regulated PUBMED:6314263.\ ' '2991' 'IPR006711' '\ This domain constitutes the N-terminal of the paralogous homeobox proteins HoxA9, HoxB9, HoxC9 and HoxD9. The N-terminal region is thought to act as a transcription activation region. Activation may be by interaction with proteins such as Btg proteins, which are thought to recruit a multi-protein Ccr4-like complex PUBMED:10617598.\ ' '2992' 'IPR002718' '\

    \ Gram-negative bacterial outer membranes constitute a semi-permeable, size-\ dependent permeability barrier, for example to hydrolytic enzymes, \ detergents, dyes and hydrophobic anti-microbials. The outer membrane\ protein (OMP) profile of Helicobacter pylori differs from that of other\ Gram-negative bacteria in that the highly non-selective porins are absent and\ a number of less abundant protein species are observed PUBMED:9252185. OMPs from H. pylori have been identified as porins, gastric epithelial cell adhesins and Lewis B\ binding adhesins PUBMED:9430586. Extensive C-terminal sequence similarity between\ these OMPs has been used to define a much larger paralogous family.\

    \

    \ H. pylori is the causative agent of gastritis and peptic\ ulceration in humans. Numerous subtypes of OMPs have also been identified. Attempts have been made to construct recombinant vectors that are able\ to express these OMPs in order to develop a vaccine protecting against\ H. pylori infection and a diagnostic reagent kit to quickly detect infection. OMPs were chosen as possible targets of vaccine development as they are\ H. pylori specific, surface exposed and highly antigenic.\

    \ ' '2993' 'IPR003678' '\

    Helicobacter pylori is a causative agent of gastritis and peptic ulceration in humans. As the first step towards development of a vaccine against H. pylori infection, many attempts have been made to identify protective antigens. Potential targets for vaccine development are H. pylori-specific proteins that are surface-exposed and highly antigenic.

    \

    This family consists of putative outer membrane proteins from H. pylori.

    \ ' '2994' 'IPR005000' '\

    This family includes 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase () and 4-hydroxy-2-oxovalerate aldolase ().

    \ ' '2995' 'IPR007065' '\ These proteins are integral membrane proteins with four transmembrane spanning helices. The most conserved region of an alignment of the proteins is a motif HPP. The function of these proteins is uncertain but they may be transporters.\ ' '2996' 'IPR000550' '\

    All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. Most microorganisms must synthesise folate de novo because they lack the active transport system of higher vertebrate cells, which allows vertebrates to use dietary folates. Enzymes involved in folate biosynthesis are therefore targets for a variety of antimicrobial agents such as trimethoprim or sulphonamides. 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase () (HPPK) catalyses the attachment of pyrophosphate to 6-hydroxymethyl-7,8-dihydropterin to form 6-hydroxymethyl-7,8-dihydropteridine pyrophosphate. This is the first step in a three-step pathway leading to 7,8-dihydrofolate. Bacterial HPPK (gene folK or sulD) PUBMED:1325970 is a protein of 160 to 270 amino acids. In the lower eukaryote Pneumocystis carinii, HPPK is the central domain of a multifunctional folate synthesis enzyme (gene fas) PUBMED:1313386.

    \ ' '2997' 'IPR011126' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    This entry represents the N-terminal region of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phosphorelay system in control of carbon catabolic repression in bacteria PUBMED:11904409. This kinase is unusual in that it recognises the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes PUBMED:11904409. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller. The blades are formed by two N-terminal domains each, and the compact central hub assembles the C-terminal kinase domains PUBMED:9570401.

    \ ' '2998' 'IPR008207' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347, PUBMED:11406410. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.

    \

    HKs can be roughly divided into two classes: orthodox and hybrid kinases PUBMED:8029829, PUBMED:1482126. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK PUBMED:10966457. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain.

    \

    This entry represents a domain present at the N terminus in proteins which undergo autophosphorylation. The group includes the gliding motility regulatory protein from Myxococcus xanthus and a number of bacterial chemotaxis proteins.

    \ ' '2999' 'IPR002571' '\ In response to elevated temperature, both prokaryotic and eukaryotic cells increase expression of a small family of chaperones. The regulatory network that functions to control the transcription of the heat shock genes in bacteria includes unique structural motifs in the promoter region of these genes and the expression of alternate sigma factors. One of the conserved structural motifs, the inverted repeat CIRCE element, is found in the 5\' region of many heat shock operons PUBMED:8606155.\

    For Bacillus subtilis three classes of heat shock genes regulated by different mechanisms have been described. Regulation of class I heat shock genes (dnaK and groE operons) involves an inverted repeat (CIRCE element) which most probably serves as an operator for a repressor PUBMED:8576042.

    \ ' '3000' 'IPR006961' '\

    HrpZ (harpin elicitor) from the plant pathogen Pseudomonas syringae binds to lipid bilayers and forms a cation-conducting pore in vivo. This pore-forming activity may allow nutrient release or delivery of virulence factors during bacterial colonisation of host plants PUBMED:11134504.

    \ ' '3001' 'IPR003134' '\

    The cortactin or HS1 repeat is a tandem repeat of 37-amino acid actin-binding domains. The repeat is named after human cortactin and HS1, proteins involved in cytoskeletal rearrangements implicated in cell migration and apoptosis, respectively. Cortactin contains 6.5 tandem copies of the repeat and is conserved among metazoans, although e.g. insect cortactin and splice variants contain fewer copies. Hematopoietic lineage cell specific protein 1 (HS1) contains 3.5 tandem copies of the cortactin repeat and is mainly expressed in hematopoietic cells. Both cortactin and HS1 contain a C-terminal SH3 domain (). The cortactin repeat domain binds filamentous actin (F-actin) in proteins that modulate the assembly of the actin cytoskeleton. Secondary structure predictions indicate that the cortactin repeat could exhibit a helix-turn-helix structure PUBMED:12534372, PUBMED:15186216.

    \ ' '3002' 'IPR000232' '\ Heat shock factor (HSF) is a transcriptional activator of heat shock genes\ PUBMED:2257625: it binds specifically to heat shock promoter elements, which are\ palindromic sequences rich with repetitive purine and pyrimidine motifs PUBMED:2257625.\ Under normal conditions, HSF is a homo-trimeric cytoplasmic protein, but\ heat shock activation results in relocalisation to the nucleus PUBMED:1871105.\ Each HSF monomer contains one C-terminal and three N-terminal leucine zipper\ repeats PUBMED:1871106. Point mutations in these regions result in disruption of\ cellular localisation, rendering the protein constitutively nuclear PUBMED:1871105.\ Two sequences flanking the N-terminal zippers fit the consensus of a bipartite\ nuclear localisation signal (NLS). Interaction between the N- and \ C-terminal zippers may result in a structure that masks the NLS sequences: following activation of HSF, these may then be unmasked, resulting in \ relocalisation of the protein to the nucleus PUBMED:1871106. The DNA-binding component\ of HSF lies to the N-terminus of the first NLS region, and is referred to\ as the HSF domain.\ ' '3003' 'IPR002068' '\

    Prokaryotic and eukaryotic organisms respond to heat shock or other\ environmental stress by inducing the synthesis of proteins collectively known\ as heat-shock proteins (hsp) PUBMED:2853609. Amongst them is a family of proteins with an\ average molecular weight of 20 kDa, known as the hsp20 proteins PUBMED:7925426. These\ seem to act as chaperones that can protect other proteins against heat-induced\ denaturation and aggregation. Hsp20 proteins seem to form large\ heterooligomeric aggregates. Structurally, this family is characterised by the presence of a conserved C-terminal domain of about 100 residues.

    \ ' '3004' 'IPR000397' '\ Hsp33 is a molecular chaperone, distinguished from all\ other known chaperones by its mode of functional regulation.\ Its activity is redox regulated. Hsp33 is a cytoplasmically\ localized protein with highly reactive cysteines that\ respond quickly to changes in the redox environment.\ Oxidizing conditions like H2O2 cause disulphide bonds\ to form in Hsp33, a process that leads to the activation\ of its chaperone function PUBMED:10025400.\ ' '3005' 'IPR013126' '\

    Hsp70 heat shock proteins are chaperones that help to fold many proteins. Hsp70-assisted folding involves repeated cycles of substrate binding and release. Hsp70 activity is ATP dependent. Hsp70 proteins are made up of two regions: the amino terminus is the ATPase domain and the carboxyl terminus is the substrate binding region PUBMED:9476895.

    \ \

    Hsp70 proteins have an average molecular weight of 70 kDa PUBMED:2686623, PUBMED:2944601, PUBMED:3282176. In most species, there are many proteins that belong to the hsp70 family. Some of these are only expressed under stress conditions (strictly inducible), while some are present in cells under normal growth conditions and are not heat-inducible (constitutive or cognate) PUBMED:2143562, PUBMED:2841196. Hsp70 proteins can be found in different cellular compartments (nuclear, cytosolic, mitochondrial, endoplasmic reticulum, for example).

    \ ' '3006' 'IPR020576' '\

    Prokaryotes and eukaryotes respond to heat shock and other forms of \ environmental stress by inducing synthesis of heat-shock proteins (hsp) PUBMED:2853609. The 90 kDa heat shock protein, Hsp90, is one of the most abundant proteins in eukaryotic cells, comprising 1-2% of cellular proteins under non-stress conditions PUBMED:15069952. Its contribution to various cellular processes including signal transduction, protein folding, protein degradation and morphological evolution has been extensively studied PUBMED:8419347, PUBMED:7914036. The full functional activity of Hsp90 is gained in concert with other co-chaperones, playing an important role in the folding of newly synthesised proteins and stabilisation and refolding of denatured proteins after stress. Apart from its co-chaperones, Hsp90 binds to an array of client proteins, where the co-chaperone requirement varies and depends on the actual client.

    The\ sequences of hsp90s show a distinctive domain structure, with a highly-conserved N-terminal domain separated from a conserved, acidic C-terminal\ domain by a highly-acidic, flexible linker region.

    \ ' '3007' 'IPR007250' '\ These heat shock proteins (Hsp9 and Hsp12) are strongly expressed and undergo a 100-fold increase upon entry into stationary phase in yeast PUBMED:2175390, PUBMED:8679693.\ ' '3008' 'IPR007331' '\ This domain is found in HtaA, a secreted protein implicated in iron acquisition and transport PUBMED:10760164.\ ' '3009' 'IPR000847' '\ Numerous bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. \ These proteins are very diverse, but for convenience may be grouped into subfamilies on the basis \ of sequence similarity. One such family, the lysR family, groups together a range of proteins, \ including ampR, catM, catR, cynR, cysB, gltC, iciA, ilvY, irgB, lysR, metR, mkaC, mleR, nahR, nhaR, \ nodD, nolR, oxyR, pssR, rbcR, syrM, tcbR, tfdS and trpI PUBMED:1907267, PUBMED:1592818, PUBMED:1840615, \ PUBMED:3413113, PUBMED:2034653. The majority of these proteins appear to be transcription activators\ and most are known to negatively regulate their own expression. All possess a potential HTH \ DNA-binding motif towards their N-termini.\ ' '3010' 'IPR007050' '\

    Numerous bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. This entry represents the HTH DNA binding domain found in Halobacterium salinarium (Halobacterium halobium) and described as a putative bacterio-opsin activator.

    \ ' '3011' 'IPR001845' '\

    Bacterial transcription regulatory proteins that bind DNA via a helix-turn-helix (HTH) motif can be grouped into families on the basis of sequence similarities. One such group, termed arsR, includes several proteins that appear to dissociate from DNA in the presence of metal ions: arsR, which functions as a transcriptional repressor of an arsenic resistance operon; smtB from Synechococcus sp. (strain PCC 7942), which acts as a transcriptional repressor of the smtA gene that codes for a metallothionein; cadC, a protein required for cadmium-resistance; and hypothetical protein yqcJ from Bacillus subtilis.

    \

    The HTH motif is thought to be located in the central part of these proteins PUBMED:8451191. The motif is characterised by a number of well-conserved residues: at its N-terminal extremity is a cysteine residue; a second Cys is found in arsR and cadC, but not in smtA; and at the C-terminus lie one or two histidines. These residues may be involved in metal-binding (Zn in smtB; metal-oxyanions such as arsenite, antimonite and arsenate for arsR; and cadmium for cadC) PUBMED:8506147. It is believed that binding of a metal ion could induce a conformational change that would prevent the protein from binding DNA PUBMED:8506147.

    \

    The crystal structure of the cyanobacterial smtB shows a fold of five\ alpha-helices (H) and a pair of antiparallel beta-strands (B) in the topology\ H1-H2-H3-H4-B1-B2-H5. Helices 3 and 4 comprise the\ helix-turn-helix motif and the beta-sheet is called the wing as in other wHTH,\ such as the dtxR-type or the merR-type.\ Helix 4 is termed the recognition helix, as in other HTHs where it binds the\ DNA major groove. Most arsR/smtB-like metalloregulators form homodimers PUBMED:14568530.\ The dimer interface is formed by helix 5 and an N-terminal part PUBMED:9466913. Two\ distinct metal-binding sites have been identified. The first site comprises\ cysteine thiolates located in the HTH in helix 3 and in some cases in the\ N-terminus, called the alpha3(N) site PUBMED:8506147. The second metal-binding site\ is located in helix 5 (and C-terminus) and is called the alpha5(C) site. The\ alpha3(N) site binds large thiophilic, toxic metals including Cd, Pb, and Bi, as\ in S. aureus cadC. ArsR lacks the N-terminal arm and its alpha3 site\ coordinates smaller thiophilic ions like As and Sb. The alpha5 site contains\ carboxylate and imidazole ligands and interacts preferentially with\ biologically required metal ions including Zn, Co, and Ni. ArsR-type\ metalloregulators contain one of these sites, both, or other potential\ metal-binding sites PUBMED:12829264, PUBMED:14960585. Binding of metal ions to these sites leads to\ allosteric changes that can derepress the operator/promoter DNA. The\ metal-inducible operons contain one or two imperfect 12-2-12 inverted repeats,\ which can be recognised by multimeric arsR-type metalloregulators.\

    \ ' '3012' 'IPR000281' '\ This domain contains a helix-turn-helix motif PUBMED:8576032.\ Every member of this family is N-terminal to a SIS domain. Members of this family are probably regulators of genes\ involved in phosphosugar metabolism.\ ' '3013' 'IPR006120' '\

    Site-specific recombination plays an important role in DNA rearrangement in prokaryotic organisms. Two types of site-specific recombination are known to occur:

    \
      \
    1. Recombination between inverted repeats resulting in the reversal of a DNA segment.
    \
    2. Recombination between repeat sequences on two DNA molecules resulting in their cointegration, or between repeats on one DNA molecule resulting in the excision of a DNA fragment.
    \
    \

    Site-specific recombination is characterised by a strand exchange mechanism that requires no DNA synthesis or high energy cofactor; the phosphodiester bond energy is conserved in a phospho-protein linkage during strand cleavage and re-ligation.

    \

    Two unrelated families of recombinases are currently known PUBMED:3011407. The first, called the \'phage integrase\' family, groups a number of bacterial phage and yeast plasmid enzymes. The second PUBMED:2896291, called the \'resolvase\' family, groups enzymes which share the following structural characteristics: an N-terminal catalytic and dimerization domain that contains a conserved serine residue involved in the transient covalent attachment to DNA, and a C-terminal helix-turn-helix DNA-binding domain.

    \ ' '3015' 'IPR000005' '\

    Many bacterial transcription regulation proteins bind DNA through a\ \'helix-turn-helix\' (HTH) motif. One major subfamily of these proteins PUBMED:8451183, PUBMED:2314271 is related to the arabinose \ operon regulatory protein AraC PUBMED:8451183, PUBMED:2314271. Except for celD PUBMED:2179047, all of these proteins seem to be positive transcriptional factors.

    \ \

    Although the sequences belonging to this family differ somewhat in length, in nearly every case the HTH motif is situated towards the C-terminus in the third quarter of most of the sequences. The minimal DNA binding domain spans roughly 100 residues and comprises two HTH subdomains; the classical HTH domain and another HTH subdomain with similarity to the classical HTH domain but with an insertion of one residue in the turn-region. The N-terminal and central regions of these proteins are presumed to interact with effector molecules and may be involved in dimerisation PUBMED:8516313.

    \ \

    The known structure of MarA shows that the AraC domain is alpha helical and that the two HTH subdomains both bind the major groove of the DNA. The two HTH subdomains are separated by only 27\ angstroms, which causes the cognate DNA to bend.

    \ ' '3016' 'IPR005697' '\

    This family of enzymes, homoserine O-succinyltransferase, catalyses the first step in the biosynthesis of methionine: \

    \ \

    This enzyme is consequently essential for the survival of bacteria, plants and fungi. Since it is not found in humans, it makes a promising new target for antimicrobial drug development. Homoserine O-succinyltransferase (HST) is a representative from this class and has recently had the key amino acids involved in substrate specificity and catalysis elucidated PUBMED:17442255.

    \ ' '3018' 'IPR007038' '\ This family of proteins are hydrogenase/urease accessory proteins. They contain many conserved histidines that are likely to be involved in nickel binding.\ ' '3019' 'IPR001109' '\

    The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small\ proteins that are hydrogenase precursor-specific chaperones required for this maturation process PUBMED:8497190. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation PUBMED:9485446, PUBMED:10783387.

    \ ' '3020' 'IPR006894' '\ This domain represents a C-terminal conserved region found in these bacterial proteins necessary for hydrogenase synthesis. Their precise function is unknown PUBMED:8045431.\ ' '3022' 'IPR000671' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    Metalloproteases are the most diverse of the four main types of protease, with more than 30 families identified to date PUBMED:7674922. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as abXHEbbHbc, where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesized as a precursor devoid of the metalloenzyme active site. This precursor undergoes a complex post-translational maturation process that requires a number of accessory proteins PUBMED:11336840, PUBMED:12196162, PUBMED:10226043. At one step of this process, after nickel incorporation, each hydrogenase isoenzyme is processed by proteolytic cleavage at the C-terminal end by the corresponding hydrogenase maturation endopeptidase PUBMED:10727938. For example, Escherichia coli HycI is involved in processing of pre-HycE (the large subunit of hydrogenase 3) PUBMED:8125094, PUBMED:10795682; HybD is involved in processing of pre-HybC (the large subunit of hydrogenase 2) PUBMED:10331925; and HyaD is assumed to be involved in processing of the large subunit of hydrogenase 1. This group represents metallopeptidases of the MEROPS peptidase family A31 (HybD endopeptidase family, clan AE).

    \ \

    The cleavage site is after a His or an Arg, liberating a short peptide PUBMED:8405419, PUBMED:8125094. This cleavage occurs only in the presence of nickel, and the endopeptidase probably uses the metal in the large subunit of [NiFe]-hydrogenases as a recognition motif PUBMED:10727938. There is no direct evidence for the active site or substrate-binding site, but there are predictions based on an available structure PUBMED:10331925.

    \ \

    Nomenclature note: the following names are used in different organisms for members of this group: HycI, HybD, HyaD, HoxM, HoxW, HupD, HynC, HupM, VhoD, VhtD PUBMED:11336840. Gene/protein names are sometimes used interchangeably to designate various "hydrogenase cluster" proteins unrelated to each other in various organisms. For example, the following names are used for members of this group, but also for unrelated proteins: HupD is used in Azotobacter chroococcum and Anabaena species to designate an unrelated hydrogenase maturation factor; HydD is used to designate hydrogenase structural genes in Thermococcus litoralis, Pyrococcus abyssi, and other species.

    \ ' '3023' 'IPR002821' '\ This family includes the enzymes hydantoinase and oxoprolinase.\ Both reactions involve the hydrolysis of 5-membered rings via hydrolysis\ of their internal imide bonds PUBMED:8943290.\ ' '3024' 'IPR003692' '\

    An appreciable fraction of the sulphur present in mammals occurs in the form of glutathione. The synthesis of glutathione and its utilization take place by the reactions of the gamma-glutamyl cycle, which include those catalysed by gamma-glutamylcysteine and glutathione synthetases, gamma-glutamyl transpeptidase, cysteinylglycinase, gamma-glutamyl cyclotransferase, and 5-oxoprolinase PUBMED:45011.

    \

    This family includes N-methylhydantoinase B which converts hydantoin to N-carbamyl-amino acids, and 5-oxoprolinase which catalyses the formation of L-glutamate from 5-oxo-L-proline. These enzymes are part of the oxoprolinase family and are related to hydantoinase_A.

    \ ' '3025' 'IPR001338' '\ The surface of many fungal spores is covered by a hydrophobic sheath, the rodlet layer \ whose main component is a protein known as the rodlet protein PUBMED:2065971, PUBMED:1459459. The \ rodlet proteins of Neurospora crassa (gene eas) and Emericella nidulans (gene rodA) are \ evolutionarily related to proteins found in the cell wall of fruiting bodies of the \ mushroom Schizophyllum commune (Bracket fungus) PUBMED:2401401.\ Collectively, these low-molecular-weight, cysteine-rich (eight conserved cysteines), \ hydrophobic proteins are known as hydrophobins.\ ' '3026' 'IPR000688' '\

    Bacterial membrane-bound nickel-dependent hydrogenases require a number of accessory proteins\ which are involved in their maturation. The exact role of these proteins is not yet clear, but some seem\ to be required for the incorporation of the nickel ions PUBMED:8305450. One of these proteins is generally\ known as hypA. It is a protein of about 12 to 14 kDa that contains, in its C-terminal region, four conserved\ cysteines that form a zinc-finger like motif. Escherichia coli has two proteins that belong to this family, hypA and\ hybF. A homologue, MJ0214, has also been found in a number of archaeal species, including the genome of Methanocaldococcus jannaschii (Methanococcus jannaschii).

    \ ' '3028' 'IPR002780' '\

    HypD is involved in the hyp operon which is needed for the activity of the three hydrogenase isoenzymes in Escherichia coli. HypD is one of the genes needed for formation of these enzymes PUBMED:1849603. This protein has been found in Gram-negative and Gram-positive bacteria and Archaea. HypD contains\ many possible metal binding residues, which may bind to nickel.\ Transposon insertions into HypD resulted in Rhizobium leguminosarum mutants that lacked any hydrogenase activity in symbiosis with peas PUBMED:8326860.

    \ ' '3029' 'IPR003410' '\ This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is\ uncertain; it may be involved in cell adhesion. In the Sushi repeat-containing protein (SrpX), this domain is found between two sushi repeats.\ ' '3031' 'IPR002558' '\ I/LWEQ domains bind to actin. It has been shown that the I/LWEQ\ domains from mouse talin and yeast Sla2p\ interact with F-actin PUBMED:9159132. \ The domain has four conserved blocks; the name of the domain is derived from the initial conserved amino acid of\ each of the four blocks PUBMED:9159132. I/LWEQ domains can be\ placed into four major groups based on sequence similarity:\
      \
    1. Metazoan talin.
    \
    2. Dictyostelium discoideum (Slime mould) TalA/TalB and SLA110.
    \
    3. Metazoan Hip1p.
    \
    4. Saccharomyces cerevisiae Sla2p.
    \
    \ ' '3032' 'IPR007648' '\ ATP synthase inhibitor prevents the enzyme from switching to ATP hydrolysis during collapse of the electrochemical gradient, for example during oxygen deprivation PUBMED:8961923. ATP synthase inhibitor forms a one-to-one complex with the F1 ATPase, possibly by binding at the alpha-beta interface. It is thought to inhibit ATP synthesis by preventing the release of ATP. The minimum inhibitory region for bovine inhibitor is from residues 39 to 72. The inhibitor has two oligomeric states, dimer (the active state) and tetramer. At low pH, the inhibitor forms a dimer via antiparallel coiled-coil interactions between the C-terminal regions of two monomers. At high pH, the inhibitor forms tetramers and higher oligomers by coiled-coil interactions involving the N terminus and inhibitory region, thus preventing the inhibitory activity PUBMED:8961923.\ ' '3033' 'IPR002652' '\

    The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments.

    \

    Members of the importin-alpha (karyopherin-alpha) family can form heterodimers with importin-beta. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Proteins can contain one (monopartite) or two (bipartite) NLS motifs. Importin-alpha contains several armadillo (ARM) repeats, which produce a curving structure with two NLS-binding sites, a major one close to the N-terminus and a minor one close to the C-terminus.

    \

    Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. The N-terminal importin-beta-binding (IBB) domain of importin-alpha contains an auto-regulatory region that mimics the NLS motif PUBMED:8692858. The release of importin-beta frees the auto-regulatory region on importin-alpha to loop back and bind to the major NLS-binding site, causing the cargo to be released PUBMED:17170104.

    \

    This entry represents the N-terminal IBB domain of importin-alpha that contains the auto-regulatory region.

    \

    More information about these proteins can be found at Protein of the Month: Importins.

    \ \ ' '3034' 'IPR005214' '\

    The gene product of gene 3 from Infectious bronchitis virus (strain CL190). Currently, the function of this protein remains unknown.

    \ ' '3035' 'IPR005295' '\

    These proteins are the product of ORF 3B from Infectious bronchitis virus. Currently, the function of this protein remains unknown PUBMED:9168126.

    \ ' '3036' 'IPR005296' '\

    These proteins are the product of ORF 3C from Infectious bronchitis virus. Currently, the function of this protein remains unknown.

    \ ' '3037' 'IPR006723' '\ This family includes a 69 kDa protein which has been identified as an islet cell autoantigen in type I diabetes mellitus PUBMED:8975715. Its precise function is unknown.\ ' '3038' 'IPR013768' '\

    Intercellular adhesion molecules (ICAMs) and vascular cell adhesion molecule-1 (VCAM-1) are part of the immunoglobulin superfamily. They are important in inflammation, immune responses and in intracellular signalling events PUBMED:9151947. The ICAM family consists of five members, designated ICAM-1 to ICAM-5. They are known to bind to leucocyte integrins CD11/CD18 during inflammation and in immune responses. In addition, ICAMs may exist in soluble forms in human plasma, due to activation and proteolysis mechanisms at cell surfaces.

    \

    ICAM-1 (CD54) contains five Ig-like domains. It is expressed on leucocytes, endothelial and epithelial cells, and is upregulated in response to bacterial invasion. The protein is a ligand for lymphocyte-function associated (LFA) antigens and also a receptor for CD11a,b/CD18, fibrinogen, human rhinovirus and Plasmodium falciparum-infected erythrocytes. ICAM-1 binding sites for CD11a/CD18 and its other binding partners are located in the first domain and are overlapping. ICAM-1 domain 2 seems to play an important role in maintaining the conformation of domain 1 and particularly the structural integrity of the LFA-1 ligand-binding site PUBMED:10998349.

    \

    The 3-dimensional atomic structure of the tandem N-terminal Ig-like domains (D1 and D2) of ICAM-1 has been determined to 2.2A resolution and fitted into a cryoelectron microscopy reconstruction of a rhinovirus-ICAM-1 complex PUBMED:9539703. Extensive charge interactions between ICAM-1 and human rhinoviruses are largely conserved in major and minor receptor groups of rhinoviruses. The interaction of ICAMs with LFA-1 is mediated by a divalent cation bound to the insertion (I)-domain on the alpha chain of LFA-1 and the carboxyl group of a conserved glutamic acid residue on ICAMs.

    \

    ICAM-2 (CD102) has two Ig-like domains. It is expressed on endothelial cells, leucocytes and platelets, and binds to CD11a,b/CD18. The protein is refractory to proinflammatory cytokines, and plays an important role in the adhesion of leucocytes to the uninduced endothelium PUBMED:10352278.

    \

    ICAM-3 (CD50) contains five Ig-like domains and binds to leucocyte integrins CD11a,d/CD18. The protein plays an important role in the immune response and perhaps in signal transduction PUBMED:10725740.

    \

    ICAM-4 (LW blood group Ag) is red blood cell (RBC) specific and binds to CD11a,b/CD18. It is associated with the RBC Rh antigens and could be important in retaining immature red cells in the bone marrow, or in the uptake of senescent cells into the spleen PUBMED:10846180.

    \

    ICAM-5 (telencephalin) has nine Ig-like domains and is confined to the telencephalon of the brain. The role of this CD11a/CD18 binding molecule is not yet known PUBMED:10741396.

    \

    VCAM-1 was first described as a cytokine-inducible endothelial adhesion molecule. It can bind to leucocyte integrin VLA-4 (very late antigen-4) to recruit leucocytes to sites of inflammation PUBMED:11133225. The predominant form of VCAM-1 in vivo has an N-terminal extracellular region comprising seven Ig-like domains PUBMED:7531291. A conserved integrin-binding motif has been identified in domains 1 and 4, variants of which are present in the N-terminal domain of all members of the integrin-binding subgroup of the immunoglobulin superfamily. The structure of a VLA-4-binding fragment comprising the first two domains of VCAM-1 has been determined to 1.8A resolution. The integrin-binding motif is exposed and forms the N-terminal region of the loop between beta-strands C and D of domain 1 PUBMED:7531291. VCAM-1 domains 1 and 2 are structurally similar to ICAM-1 and ICAM-2 PUBMED:11133225.

    \ \

    This entry represents the N-terminal domain of ICAM proteins such as ICAM-2, ICAM-3 and ICAM-4.

    \ ' '3039' 'IPR000258' '\ Certain Gram-negative bacteria express proteins that enable them to promote nucleation of ice at relatively high temperatures (above -5C) PUBMED:1366726, PUBMED:8224607. These proteins are localised at the outer membrane surface and can cause frost damage to many plants. The primary structure of the proteins contains a highly repetitive domain that dominates the sequence. The domain comprises a number of 48-residue repeats, which themselves contain 3 blocks of 16 residues, the first 8 of which are identical. It is thought that the repetitive domain may be responsible for aligning water molecules in the seed crystal.\
    \
                  [.........48.residues.repeated.domain..........]\
                 /              / |              | \\              \\\
                AGYGSTxTagxxssli  AGYGSTxTagxxsxlt  AGYGSTxTaqxxsxlt\
                [16.residues...]  [16.residues...]  [16.residues...]\
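The repeat layout described above can be sketched in Python. This is an illustration only and not part of the database entry: the function names `split_repeat` and `blocks_share_prefix` and the example sequence are hypothetical, and only the 48/16/8 residue counts are taken from the text.

```python
BLOCK_LEN = 16          # residues per block (from the description above)
BLOCKS_PER_REPEAT = 3   # blocks per 48-residue repeat
PREFIX_LEN = 8          # leading residues reported as identical between blocks


def split_repeat(repeat: str) -> list[str]:
    """Split one 48-residue repeat into its three 16-residue blocks."""
    expected = BLOCK_LEN * BLOCKS_PER_REPEAT
    if len(repeat) != expected:
        raise ValueError(f"expected {expected} residues, got {len(repeat)}")
    return [repeat[i:i + BLOCK_LEN] for i in range(0, expected, BLOCK_LEN)]


def blocks_share_prefix(blocks: list[str]) -> bool:
    """Return True if the first 8 residues of every block are identical."""
    prefixes = {block[:PREFIX_LEN] for block in blocks}
    return len(prefixes) == 1


# Hypothetical repeat built for illustration only (not a real protein sequence).
example_repeat = "AGYGSTQTAGEDSSLI" "AGYGSTQTAGEESSLT" "AGYGSTQTAQENSSLT"
example_blocks = split_repeat(example_repeat)
```

Calling `blocks_share_prefix(example_blocks)` checks the "first 8 residues identical" property for one repeat; a real analysis would presumably tolerate the variable positions marked `x` in the diagram rather than require exact identity.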
    
    \ ' '3041' 'IPR011600' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of sequences represents the p20 (20 kDa) and p10 (10 kDa) subunits of caspases, which together form the catalytic domain of the caspase and are derived from the p45 (45 kDa) precursor () PUBMED:15226512.

    \ \

    Caspases (Cysteine-dependent ASPartyl-specific proteASE) are cysteine peptidases that belong to the MEROPS peptidase family C14 (caspase family, clan CD) based on the architecture of their catalytic dyad or triad PUBMED:11517925. Caspases are tightly regulated proteins that require zymogen activation to become active, and once active can be regulated by caspase inhibitors. Activated caspases act as cysteine proteases, using the sulphydryl group of a cysteine side chain for catalysing peptide bond cleavage at aspartyl residues in their substrates. The catalytic cysteine and histidine residues are on the p20 subunit after cleavage of the p45 precursor.

    \

    Caspases are mainly involved in mediating cell death (apoptosis) PUBMED:10578171, PUBMED:10872455, PUBMED:15077141. They have two main roles within the apoptosis cascade: as initiators that trigger the cell death process, and as effectors of the process itself. Caspase-mediated apoptosis follows two main pathways, one extrinsic and the other intrinsic or mitochondrial-mediated. The extrinsic pathway involves the stimulation of various TNF (tumour necrosis factor) cell surface receptors on cells targeted to die by various TNF cytokines that are produced by cells such as cytotoxic T cells. The activated receptor transmits the signal to the cytoplasm by recruiting FADD, which forms a death-inducing signalling complex (DISC) with caspase-8. The subsequent activation of caspase-8 initiates the apoptosis cascade involving caspases 3, 4, 6, 7, 9 and 10. The intrinsic pathway arises from signals that originate within the cell as a consequence of cellular stress or DNA damage. The stimulation or inhibition of different Bcl-2 family receptors results in the leakage of cytochrome c from the mitochondria, and the formation of an apoptosome composed of cytochrome c, Apaf1 and caspase-9. The subsequent activation of caspase-9 initiates the apoptosis cascade involving caspases 3 and 7, among others. At the end of the cascade, caspases act on a variety of signal transduction proteins, cytoskeletal and nuclear proteins, chromatin-modifying proteins, DNA repair proteins and endonucleases that destroy the cell by disintegrating its contents, including its DNA. The different caspases have different domain architectures depending upon where they fit into the apoptosis cascades; however, they all carry the catalytic p10 and p20 subunits.

    \

    Caspases can have roles other than in apoptosis, such as caspase-1 (interleukin-1 beta convertase) (), which is involved in the inflammatory process. The activation of apoptosis can sometimes lead to caspase-1 activation, providing a link between apoptosis and inflammation, such as during the targeting of infected cells. Caspases may also be involved in cell differentiation PUBMED:15066636.

    \ \ ' '3042' 'IPR000918' '\

    Isocitrate lyase () PUBMED:2696959, PUBMED:2361956 is an enzyme that catalyzes the conversion of \ isocitrate to succinate and glyoxylate. This is the first step in the glyoxylate bypass, an alternative \ to the tricarboxylic acid cycle in bacteria, fungi and plants. A cysteine, a histidine and a glutamate \ or aspartate have been found to be important for the enzyme\'s catalytic activity. Only one cysteine \ residue is conserved between the sequences of the fungal, plant and bacterial enzymes; it is located in \ the middle of a conserved hexapeptide.

    \

    Other enzymes also belong to this family, including carboxyvinyl-carboxyphosphonate phosphorylmutase (), which catalyses the conversion of 1-carboxyvinyl carboxyphosphonate to 3-(hydrohydroxyphosphoryl)pyruvate and carbon dioxide, and phosphoenolpyruvate mutase (), which is involved in the biosynthesis of phosphinothricin tripeptide antibiotics.

    \ ' '3043' 'IPR003521' '\ The nucleotide-sensitive chloride conductance regulatory protein (ICln) is\ found ubiquitously in mammalian (and other) cell types and is postulated to\ play a critical role in cell volume regulation. Initial studies proposed\ that ICln was itself a swelling-activated anion channel; however, further\ studies demonstrated that it is localised primarily to the cell cytoplasm.\ It has therefore been postulated that activation of cell volume regulation\ may involve reversible translocation of ICln from the cytoplasm, and its\ insertion into the plasma membrane. It is not resolved whether the anionic channel involved in cell volume regulation after cell-swelling comprises one or more subunits, and if it does, whether ICln is in fact one of them PUBMED:9696697.\ ' '3044' 'IPR007269' '\ The isoprenylcysteine o-methyltransferase () carries out carboxyl methylation of cleaved eukaryotic proteins that terminate in a CaaX motif. In Saccharomyces cerevisiae (Baker\'s yeast) this methylation is carried out by Ste14p, an integral endoplasmic reticulum membrane protein. Ste14p is the founding member of the isoprenylcysteine carboxyl methyltransferase (ICMT) family, whose members share significant sequence homology PUBMED:11451995.\ ' '3045' 'IPR004436' '\

    This family of enzymes catalyses the NADP(+)-dependent oxidative decarboxylation of isocitrate to form 2-oxoglutarate, CO2, and NADPH within the Krebs cycle (). Thus this enzyme supplies the cell with a key intermediate in energy metabolism, and precursors for biosynthetic pathways. The activity of this enzyme, which is controlled by phosphorylation, helps regulate carbon flux between the Krebs cycle and the glyoxylate bypass, which is an alternate route that accumulates carbon for biosynthesis when acetate is the sole carbon source for growth PUBMED:7836312. The phosphorylation state of this enzyme is controlled by isocitrate dehydrogenase kinase/phosphatase. This family has been found in a number of bacterial species including Azotobacter vinelandii, Corynebacterium glutamicum, Rhodomicrobium vannielii, and Neisseria meningitidis.

    \ \

    The structure of isocitrate dehydrogenase from Azotobacter vinelandii () has been determined PUBMED:12467571. This molecule consists of two distinct domains, a small domain and a large domain, with a folding topology similar to that of dimeric isocitrate dehydrogenase from Escherichia coli (). The structure of the large domain repeats a motif observed in the dimeric enzyme. Such a fusional structure by domain duplication enables a single polypeptide chain to form a structure at the catalytic site that is homologous to the dimeric enzyme, the catalytic site of which is located at the interface of two identical subunits.

    \ ' '3046' 'IPR000898' '\

    Indoleamine 2,3-dioxygenase (IDO, ) PUBMED:1907934 is a cytosolic haem protein which, together with the hepatic enzyme tryptophan 2,3-dioxygenase, catalyzes the conversion of tryptophan and other indole derivatives to kynurenines. The physiological role of IDO is not fully understood but is of great interest, because IDO is widely distributed in human tissues, can be up-regulated via cytokines such as interferon-gamma, and can thereby modulate the levels of tryptophan, which is vital for cell growth. The degradative action of IDO on tryptophan leads to cell death by starvation of this essential and relatively scarce amino acid. IDO is a haem-containing enzyme of about 400 amino acids. Site-directed mutagenesis showed His346 () to be essential for haem binding, indicating that this histidine residue may be the proximal ligand. Mutation of Asp274 also compromised the ability of IDO to bind haem, suggesting that Asp274 may coordinate to haem directly as the distal ligand or is essential in maintaining the conformation of the haem pocket PUBMED:12766158.

    \

    Other proteins that are evolutionarily related to IDO include yeast hypothetical protein YJR078w; and myoglobin from the red muscle of the archaeogastropodic molluscs, Nordotis madaka (Giant abalone) and Sulculus diversicolor PUBMED:8011076, PUBMED:12711393. These unusual globins lack enzymatic activity but have kept the haem group.

    \ ' '3047' 'IPR003403' '\ This regulatory protein is expressed from an immediate early gene in the cell cycle of Herpesviridae. The protein is known by various names including IE-68, US1, ICP22 and IR4.\ ' '3048' 'IPR000649' '\

    Initiation factor 2 binds to Met-tRNA, GTP and the small ribosomal subunit. The eukaryotic translation initiation factor EIF-2B is a complex made up of five different subunits, alpha, beta, gamma, delta and epsilon, and catalyses the exchange of EIF-2-bound GDP for GTP. This family includes initiation factor 2B alpha, beta and delta subunits from eukaryotes, related proteins from archaebacteria and IF-2 from prokaryotes. It also contains a subfamily of proteins found in eukaryotes, archaea (e.g. Pyrococcus furiosus), or eubacteria such as Bacillus subtilis and Thermotoga maritima. Many of these proteins were initially annotated as putative translation initiation factors despite the fact that there is no evidence for the requirement of an IF2 recycling factor in prokaryotic translation initiation. Recently, one of these proteins from B. subtilis has been functionally characterised as a 5-methylthioribose-1-phosphate isomerase (MTNA) PUBMED:14551435. This enzyme participates in the methionine salvage pathway, catalysing the isomerisation of 5-methylthioribose-1-phosphate to 5-methylthioribulose-1-phosphate PUBMED:15215245. The methionine salvage pathway leads to the synthesis of methionine from methylthioadenosine, the end product of spermidine and spermine anabolism in many species.

    \ ' '3049' 'IPR019815' '\

    Initiation factor 3 (IF-3) (gene infC) is one of the three factors required for the \ initiation of protein biosynthesis in bacteria. IF-3 is thought to function as a \ fidelity factor during the assembly of the ternary initiation complex which consists of \ the 30S ribosomal subunit, the initiator tRNA and the messenger RNA. IF-3 is a basic\ protein that binds to the 30S ribosomal subunit PUBMED:8405963. The chloroplast initiation factor IF-3(chl) is a protein that \ enhances the poly(A,U,G)-dependent binding of the initiator tRNA to chloroplast ribosomal\ 30S subunits; its central section is evolutionarily related to the sequence of \ bacterial IF-3 PUBMED:8144528.

    \ ' '3050' 'IPR001040' '\ Eukaryotic translation initiation factor 4E (eIF-4E) PUBMED:1733496 is a protein that\ binds to the cap structure of eukaryotic cellular mRNAs. eIF-4E recognises and binds\ the 7-methylguanosine-containing (m7Gppp) cap during an early step in the initiation\ of protein synthesis and facilitates ribosome binding to a mRNA by inducing the unwinding\ of its secondary structures. A tryptophan in the central part of the sequence of human\ eIF-4E seems to be implicated in cap-binding PUBMED:1672854.\ ' '3051' 'IPR001322' '\

    Intermediate filaments (IF) are primordial components of the cytoskeleton and the \ nuclear envelope PUBMED:8771189. They generally form filamentous structures 8 to 14 nm \ wide. IF proteins are members of a very large multigene family of proteins which has been \ subdivided into five major subgroups, type I: acidic cytokeratins, type II: basic \ cytokeratins, type III: vimentin, desmin, glial fibrillary acidic protein (GFAP),\ peripherin, and plasticin, type IV: neurofilaments L, H and M, alpha-internexin and \ nestin, and type V: nuclear lamins A, B1, B2 and C. The lamins are components of the\ nuclear lamina, a fibrous layer on the nucleoplasmic side of the inner nuclear membrane\ that may provide a framework for the nuclear envelope and may interact with chromatin.

    \

    All IF proteins are structurally similar in that they consist of a central rod domain \ arranged in coiled-coil alpha-helices, with at least two short characteristic \ interruptions; an N-terminal non-helical domain (head) of variable length; and a C-terminal\ domain (tail) which is also non-helical, and which shows extreme length variation between \ different IF proteins. The C-terminal domain has been characterised for the lamins.

    \ ' '3052' 'IPR002069' '\

    Interferon gamma (IFN-gamma) is produced by lymphocytes activated by specific antigens or mitogens. IFN-gamma shows antiviral activity and has important immunoregulatory functions. It is a potent activator of macrophages and has antiproliferative effects on transformed cells. It can potentiate the antiviral and antitumor effects of the type I interferons.

    \ \

    The crystal structures of a number of IFN-gamma proteins have been solved, including bovine interferon-gamma at 2.0-A PUBMED:10666622 and human IFN-gamma at 2.9-A PUBMED:10860730.

    \ ' '3053' 'IPR013151' '\

    This entry is for immunoglobulin-like domains. Studies indicate that the interactions essential for defining the structure of these beta sandwich proteins are also important in nucleation of folding, and that proteins containing this fold may share similar folding pathways even though the proteins may have low sequence homology. The fold consists of a beta-sandwich formed of 7 strands in 2 sheets with a Greek-key topology. Some members of the fold have additional strands. The Pfam alignments do not include the first and last strand of the immunoglobulin-like domain.

    \ \ ' '3054' 'IPR000710' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.
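The relationship between linear triad order and clan can be written as a small lookup. The sketch below is illustrative only: the names `TRIAD_ORDER_BY_CLAN`, `triad_order` and `clans_matching` are hypothetical, the residue positions are placeholders, and the three clan/order pairs are the only data taken from the text above.

```python
# Linear order of catalytic triad residues per clan, as listed in the text above.
TRIAD_ORDER_BY_CLAN = {
    "PA": "HDS",  # chymotrypsin clan
    "SB": "DHS",  # subtilisin clan
    "SC": "SDH",  # carboxypeptidase clan
}


def triad_order(his_pos: int, asp_pos: int, ser_pos: int) -> str:
    """Return the triad residues ordered by their position in the sequence."""
    residues = sorted([(his_pos, "H"), (asp_pos, "D"), (ser_pos, "S")])
    return "".join(letter for _, letter in residues)


def clans_matching(order: str) -> list[str]:
    """List the clans in the lookup whose triad order equals `order`."""
    return [clan for clan, known in TRIAD_ORDER_BY_CLAN.items() if known == order]


# Placeholder positions, chosen only to give an H-D-S order.
example_order = triad_order(his_pos=50, asp_pos=100, ser_pos=200)
example_clans = clans_matching(example_order)
```

A matching order is only consistent with a clan assignment; since the text says the arrangements "commonly" reflect clan relationships, it should not be treated as proof of membership.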

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae PUBMED:7845208. These enzymes cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.

    \ \ ' '3055' 'IPR000867' '\

    Insulin-like Growth Factor Binding Proteins (IGFBP) are a group of vertebrate secreted proteins, which bind to IGF-I and IGF-II with high affinity and modulate the biological actions of IGFs. The IGFBP family has six distinct subgroups, IGFBP-1 through 6, based on conservation of gene (intron-exon) organisation, structural similarity, and binding affinity for IGFs. Across species, IGFBP-5 exhibits the most sequence conservation, while IGFBP-6 exhibits the least sequence conservation. The IGFBPs contain inhibitor domain homologues, which are related to MEROPS protease inhibitor family I31 (equistatin, clan IX).

    \ \

    All IGFBPs share a common domain architecture (:). While the N-terminal (, IGF binding protein domain), and the C-terminal (, thyroglobulin type-1 repeat) domains are conserved across vertebrate species, the mid-region is highly variable with respect to protease cleavage sites and phosphorylation and glycosylation sites. IGFBPs contain 16-18 conserved cysteines located in the N-terminal and the C-terminal regions, which form 8-9 disulphide bonds PUBMED:11874691.

    As demonstrated for human IGFBP-5, the N terminus is the primary binding site for IGF. This region, comprised of Val49, Tyr50, Pro62 and Lys68-Leu75, forms a hydrophobic patch on the surface of the protein PUBMED:9822601. The C terminus is also required for high affinity IGF binding, as well as for binding to the extracellular matrix PUBMED:9725901 and for nuclear translocation PUBMED:7519375, PUBMED:9660801 of IGFBP-3 and -5.

    IGFBPs are unusually pleiotropic molecules. Like other binding proteins, IGFBP can prolong the half-life of IGFs via high affinity binding of the ligands. In addition to functioning as simple carrier proteins, serum IGFBPs also serve to regulate the endocrine and paracrine/autocrine actions of IGF by modulating the IGF available to bind to signalling IGF-I receptors PUBMED:12379487, PUBMED:12379489. Furthermore, IGFBPs can function as growth modulators independent of IGFs. For example, IGFBP-5 stimulates markers of bone formation in osteoblasts lacking functional IGFs PUBMED:11874691. The binding of IGFBP to its putative receptor on the cell membrane may stimulate the signalling pathway independent of an IGF receptor, to mediate the effects of IGFBPs in certain target cell types. IGFBP-1 and -2, but not other IGFBPs, contain a C-terminal Arg-Gly-Asp integrin-binding motif. Thus, IGFBP-1 can also stimulate cell migration of CHO and human trophoblast cells through an action mediated by alpha 5 beta 1 integrin PUBMED:7504269. Finally, IGFBPs transported into the nucleus (via the nuclear localisation signal) may also exert IGF-independent effects by transcriptional activation of genes.

    \ \

    The insulin family of proteins PUBMED:6107857 groups a number of active peptides which are evolutionarily related, including insulin; relaxin; insulin-like growth factors I and II PUBMED:2197088; mammalian\ Leydig cell-specific insulin-like peptide (gene INSL3) PUBMED:8253799 and early placenta insulin-like peptide (ELIP) (gene INSL4) PUBMED:8666396; insect prothoracicotropic hormone (bombyxin) PUBMED:; locust insulin-related peptide (LIRP) PUBMED:1688797; molluscan insulin-related peptides 1 to 5 (MIP)\ PUBMED:1868853; and Caenorhabditis elegans insulin-like peptides PUBMED:9548970. Structurally, all these peptides consist of two polypeptide chains (A and B) linked by two disulphide bonds. They all share a conserved arrangement of four cysteines in their A chain. The first of these cysteines is linked by a disulphide bond to the third one and the second and fourth cysteines are linked by interchain disulphide bonds to cysteines in the B chain.
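The A-chain cysteine pattern described above can be expressed as a short Python sketch. The function names `cysteine_positions` and `a_chain_disulphide_roles` are hypothetical, and the example sequence (the commonly cited human insulin A chain) is an assumption introduced for illustration; it is not part of this entry.

```python
def cysteine_positions(chain: str) -> list[int]:
    """Return 1-based positions of cysteine (C) residues in a chain."""
    return [i for i, residue in enumerate(chain, start=1) if residue == "C"]


def a_chain_disulphide_roles(a_chain: str) -> dict:
    """Assign the four A-chain cysteines to the bonding pattern in the text.

    The first and third cysteines form the intrachain bond; the second and
    fourth are the ones that bond to cysteines in the B chain.
    """
    cys = cysteine_positions(a_chain)
    if len(cys) != 4:
        raise ValueError(f"expected 4 cysteines in the A chain, found {len(cys)}")
    first, second, third, fourth = cys
    return {
        "intrachain": (first, third),
        "interchain_to_B": [second, fourth],
    }


# Commonly cited human insulin A chain, used here only as an illustration.
HUMAN_INSULIN_A_CHAIN = "GIVEQCCTSICSLYQLENYCN"
roles = a_chain_disulphide_roles(HUMAN_INSULIN_A_CHAIN)
```

The sketch only labels which A-chain cysteines take part in which kind of bond; identifying the B-chain partners would need the B-chain sequence as well.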

    \ \

    Insulin is involved in the regulation of normal glucose homeostasis, as well\ as other specific physiological functions PUBMED:6243748. It is synthesised as a prepropeptide from which an endoplasmic reticulum-targeting sequence is cleaved to yield proinsulin. Proinsulin contains regions A and B separated by an intervening connecting region C. The connecting region is cleaved, liberating the active protein, which contains the A and B chains,\ held together by 2 disulphide bonds PUBMED:503234.

    \ This entry represents insulin-like growth factors (IGF-I and IGF-II), which bind to specific binding proteins in \ extracellular fluids with high affinity PUBMED:7680510, PUBMED:1725860, PUBMED:2480830. These IGF-binding\ proteins (IGFBP) prolong the half-life of the IGFs and have been shown to either inhibit or \ stimulate the growth promoting effects of the IGFs on cells in culture. They seem to alter the \ interaction of IGFs with their cell surface receptors. There are at least six different IGFBPs and \ they are structurally related. The following growth-factor inducible proteins are structurally \ related to IGFBPs and could function as growth-factor binding proteins PUBMED:1654338, PUBMED:1309586: \ mouse protein cyr61 and its probable chicken homolog, protein CEF-10; human connective tissue growth \ factor (CTGF) and its mouse homolog, protein FISP-12; and vertebrate protein NOV.\ ' '3056' 'IPR000724' '\ This domain is found as a tandem repeat in Streptococcal cell surface proteins, such as the\ IgG binding proteins G and MIG. These proteins are type I membrane proteins that bind to\ the constant Fc region of IgG with high affinity. The N-terminus of MIG mediates binding to\ plasma proteinase inhibitor alpha 2-macroglobulin after complex formation with proteases.\ ' '3057' 'IPR000807' '\

    Imidazoleglycerol-phosphate dehydratase is the enzyme that catalyses the seventh step in the biosynthesis of histidine in bacteria, fungi and plants. In most organisms it is a monofunctional protein of about 22 to 29 kD. In some bacteria such as Escherichia coli, it is the \ C-terminal domain of a bifunctional protein that includes a histidinol-phosphatase domain PUBMED:3062174.

    \ ' '3058' 'IPR013798' '\

    Indole-3-glycerol phosphate synthase () (IGPS) catalyses the fourth step in the biosynthesis of tryptophan, the ring closure of 1-(2-carboxy-phenylamino)-1-deoxyribulose into indol-3-glycerol-phosphate. In some bacteria, IGPS is a single chain enzyme. In others, such as Escherichia coli, it is the N-terminal domain of a bifunctional enzyme that also catalyses N-(5\'-phosphoribosyl)anthranilate isomerase () (PRAI) activity (see ), the third step of tryptophan biosynthesis. In fungi, IGPS is the central domain of a trifunctional enzyme that contains a PRAI C-terminal domain and a glutamine amidotransferase () (GATase) N-terminal domain (see ).

    A structure of the IGPS domain of the bifunctional enzyme from the mesophilic\ bacterium E. coli (eIGPS) has been compared with the monomeric indole-3-glycerol phosphate\ synthase from the hyperthermophilic archaeon Sulfolobus solfataricus (sIGPS). Both are single-domain\ (beta/alpha)8 barrel proteins, with one (eIGPS) or two (sIGPS) additional helices inserted before the first beta strand PUBMED:8747452.

    \ ' '3059' 'IPR007743' '\

    Interferon-inducible GTPase (IIGP) is thought to play a role in intracellular defence. IIGP is predominantly associated with the Golgi apparatus, also localizes to the endoplasmic reticulum, and exerts a distinct role in IFN-induced intracellular membrane trafficking or processing PUBMED:11907101.

    \ ' '3060' 'IPR000975' '\

    Interleukin-1 alpha and interleukin-1 beta (IL-1 alpha and IL-1 beta) are \ cytokines that participate in the regulation of immune responses, inflammatory reactions, and hematopoiesis PUBMED:2969618. Two types of IL-1 receptor, each with three extracellular immunoglobulin (Ig)-like domains, limited sequence similarity (28%) and different pharmacological characteristics have been cloned from mouse and human cell lines: these have been termed type I and type II receptors PUBMED:8702856. The receptors both exist in transmembrane (TM) and soluble forms: the soluble IL-1 receptor is thought to be post-translationally derived from cleavage of the extracellular portion of the membrane receptors.

    \

    Both IL-1 receptors appear to be well conserved in evolution, and map to the\ same chromosomal location PUBMED:1833184. The receptors can both bind all three forms of IL-1 (IL-1 alpha, IL-1 beta and IL-1RA).

    \

    The crystal structures of IL1A and IL1B PUBMED:2602367 have been solved, showing them to share the same 12-stranded beta-sheet structure as both the heparin binding growth factors and the Kunitz-type soybean trypsin inhibitors PUBMED:1738162. The beta-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel beta-barrel. Several regions, especially the loop between strands 4 and 5, have been implicated in receptor binding.

    \ \

    The Vaccinia virus genes B15R and B18R each encode proteins with N-terminal \ hydrophobic sequences, possible sites for attachment of N-linked carbohydrate and a short C-terminal hydrophobic domain PUBMED:1826022. These properties\ are consistent with the mature proteins being either virion, cell surface or secretory glycoproteins. Protein sequence comparisons reveal that the gene products are related to each other (20% identity) and to the Ig superfamily. The highest degree of similarity is to the human and murine interleukin-1 receptors, although both proteins are related to a wide range of Ig superfamily members, including the interleukin-6 receptor. A novel method for virus immune evasion has been proposed in which the product of one or both of these proteins may bind interleukin-1 and/or interleukin-6, preventing these cytokines reaching their natural receptors PUBMED:1826022. A similar gene product from Cowpox virus (CPV) has also been shown to specifically bind murine IL-1 beta PUBMED:1339315.

    \

    This entry represents Interleukin-1.

    \ ' '3061' 'IPR020443' '\

    Interleukin-10 (IL-10) is a protein that inhibits the synthesis of a number of cytokines, including IFN-gamma, IL-2, IL-3, TNF and GM-CSF produced by activated macrophages and by helper T cells. Structurally, IL-10 is a protein of about 160 amino acids that contains four conserved cysteines involved in disulphide bonds PUBMED:8590020. IL-10 is highly similar to the Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) BCRF1 protein, which inhibits the synthesis of gamma-interferon, and to Equid herpesvirus 2 (Equine herpesvirus 2) protein E7.\ It is also similar, but to a lesser degree, to human protein mda-7 PUBMED:8545104, a protein which has antiproliferative properties in human melanoma cells. Mda-7 only contains two of the four cysteines of IL-10.

    \ ' '3062' 'IPR004281' '\

    Interleukin 12 (IL-12) is a disulphide-bonded heterodimer consisting of a 35kDa alpha subunit and a 40kDa beta subunit. It is involved in the stimulation and maintenance of Th1 cellular immune responses, including the normal host defence against various intracellular pathogens, such as Leishmania, Toxoplasma, Measles virus and Human immunodeficiency virus 1 (HIV). IL-12 also has an important role in pathological Th1 responses, such as in inflammatory bowel disease and multiple sclerosis. Suppression of IL-12 activity in such diseases may have therapeutic benefit. On the other hand, administration of recombinant IL-12 may have therapeutic benefit in conditions associated with pathological Th2 responses PUBMED:11422900, PUBMED:9597139.

    \ ' '3063' 'IPR003443' '\

    Interleukins (IL) are a group of cytokines that play an important role in the immune system. They modulate inflammation and immunity by regulating growth, mobility and differentiation of lymphoid and other cells.

    \ \

    Interleukin-15 (IL-15) has a variety of biological functions, including stimulation and maintenance of cellular immune responses PUBMED:10689297. It is required for division of CD8+ T cells of memory phenotype, a process that is increased by inhibition of IL-2 PUBMED:10784451. The numbers of CD8+ memory T cells in animals may, therefore, be controlled by a balance between IL-15 and -2.

    \ ' '3064' 'IPR003502' '\

    Interleukin-1 alpha and interleukin-1 beta (IL-1 alpha and IL-1 beta) are \ cytokines that participate in the regulation of immune responses, inflammatory reactions, and hematopoiesis PUBMED:2969618. Two types of IL-1 receptor, each with three extracellular immunoglobulin (Ig)-like domains, limited sequence similarity (28%) and different pharmacological characteristics have been cloned from mouse and human cell lines: these have been termed type I and type II receptors PUBMED:8702856. The receptors both exist in transmembrane (TM) and soluble forms: the soluble IL-1 receptor is thought to be post-translationally derived from cleavage of the extracellular portion of the membrane receptors.

    \

    Both IL-1 receptors appear to be well conserved in evolution, and map to the\ same chromosomal location PUBMED:1833184. The receptors can both bind all three forms of IL-1 (IL-1 alpha, IL-1 beta and IL-1RA).

    \

    The crystal structures of IL1A and IL1B PUBMED:2602367 have been solved, showing them to share the same 12-stranded beta-sheet structure as both the heparin binding growth factors and the Kunitz-type soybean trypsin inhibitors PUBMED:1738162. The beta-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel beta-barrel. Several regions, especially the loop between strands 4 and 5, have been implicated in receptor binding.

    \ \

    The Vaccinia virus genes B15R and B18R each encode proteins with N-terminal \ hydrophobic sequences, possible sites for attachment of N-linked carbohydrate and a short C-terminal hydrophobic domain PUBMED:1826022. These properties\ are consistent with the mature proteins being either virion, cell surface or secretory glycoproteins. Protein sequence comparisons reveal that the gene products are related to each other (20% identity) and to the Ig superfamily. The highest degree of similarity is to the human and murine interleukin-1 receptors, although both proteins are related to a wide range of Ig superfamily members, including the interleukin-6 receptor. A novel method for virus immune evasion has been proposed in which the product of one or both of these genes may bind interleukin-1 and/or interleukin-6, preventing these cytokines from reaching their natural receptors PUBMED:1826022. A similar gene product from Cowpox virus (CPV) has also been shown to specifically bind murine IL-1 beta PUBMED:1339315.

    \

    The N-terminal region of interleukin-1 is approximately 115 amino acids long; it forms a propeptide that is cleaved off to release the active interleukin-1. This signature is for the propeptide.

    \ ' '3065' 'IPR000779' '\ T-Lymphocytes regulate the growth and differentiation of certain lymphopoietic and\ haemopoietic cells through the release of various secreted protein factors PUBMED:3918306.\ These factors, which include interleukin-2 (IL2), are secreted by lectin- or antigen-stimulated\ T-cells, and have various physiological effects. IL2 is a lymphokine that induces the\ proliferation of responsive T-cells. In addition, it acts on some B-cells, via receptor-specific\ binding PUBMED:3517854, as a growth factor and antibody production stimulant PUBMED:1510960. The\ protein is secreted as a single glycosylated polypeptide, and cleavage of a signal sequence\ is required for its activity PUBMED:3517854. Solution NMR suggests that the structure of IL2 comprises a\ bundle of 4 helices (termed A-D), flanked by 2 shorter helices and several poorly-defined\ loops. Residues in helix A, and in the loop region between helices A and B, are important for\ receptor binding. Secondary structure analysis has suggested similarity to IL4 and \ granulocyte-macrophage colony stimulating factor (GMCSF) PUBMED:1510960.\ \ ' '3066' 'IPR002183' '\

    Interleukin-3 (IL3) is a cytokine that regulates blood-cell production by controlling the production, differentiation and function of granulocytes and macrophages PUBMED:3497843, PUBMED:2413359. The protein, which exists in vivo as a monomer, is produced in activated T-cells and mast cells PUBMED:3497843, PUBMED:2413359, and is activated by the cleavage of an N-terminal signal sequence PUBMED:2413359.

    \

    IL3 is produced by T-lymphocytes and T-lymphomas only after stimulation with antigens, mitogens, or chemical activators such as phorbol esters. However, IL3 is constitutively expressed in the myelomonocytic leukaemia cell line WEHI-3B PUBMED:2413359. It is thought that the genetic change of the cell line to constitutive production of IL3 is the key event in development of this leukaemia PUBMED:2413359.

    \ ' '3067' 'IPR002354' '\ Cytokines are protein messengers that carry information from cell to cell\ PUBMED:8151703. Interleukin-4 is one such molecule, and participates in several B-cell \ activation processes: e.g., it enhances production and secretion of IgG1\ and IgE PUBMED:3083412; it induces expression of class II major histocompatibility \ complex (MHC) molecules on resting B-cells; and it regulates expression of\ the low affinity Fc receptor for IgE on lymphocytes and monocytes.\ Interleukin-4 (IL4) has a compact, globular fold (similar to other\ cytokines), stabilised by 3 disulphide bonds PUBMED:1993171. One half of the structure\ is dominated by a 4 alpha-helix bundle with a left-handed twist PUBMED:1400355. The\ helices are anti-parallel, with 2 overhand connections, which fall into a\ 2-stranded anti-parallel beta-sheet PUBMED:1400355.\ ' '3068' 'IPR000186' '\ Interleukin-5 (IL5), also known as eosinophil differentiation factor (EDF),\ is a lineage-specific cytokine for eosinophilopoiesis PUBMED:3498940, PUBMED:8483502. It regulates \ eosinophil growth and activation PUBMED:3498940, and thus plays an important role in\ diseases associated with increased levels of eosinophils, including asthma\ PUBMED:8483502. \ IL5 has a similar overall fold to other cytokines (e.g., IL2, IL4 and GCSF)\ PUBMED:8483502, but while these exist as monomeric structures, IL5 is a homodimer. The\ fold contains an anti-parallel 4-alpha-helix bundle with a left-handed twist,\ connected by a 2-stranded anti-parallel beta-sheet PUBMED:8483502, PUBMED:2037074. The monomers are\ held together by 2 interchain disulphide bonds PUBMED:2037074.\ ' '3069' 'IPR003573' '\

    Interleukin-6 (IL6), also referred to as B-cell stimulatory factor-2 (BSF-2) and interferon beta-2, is a cytokine involved in a wide variety of biological functions PUBMED:3491322. It plays an essential role in the final\ differentiation of B-cells into IG-secreting cells, as well as inducing myeloma/plasmacytoma growth, nerve cell differentiation and, in hepatocytes, acute phase reactants PUBMED:3491322, PUBMED:2037043.

    \

    A number of other cytokines may be grouped with IL6 on the basis of sequence similarity PUBMED:3491322, PUBMED:2037043, PUBMED:2472117: these include granulocyte colony-stimulating factor (GCSF) and myelomonocytic growth factor (MGF). GCSF acts in hematopoiesis by affecting the production, differentiation and function of 2 related white cell groups in the blood PUBMED:2472117. MGF also acts in hematopoiesis, stimulating proliferation and colony formation of normal and transformed avian cells of the myeloid lineage.

    \

    Cytokines of the IL6/GCSF/MGF family are glycoproteins of about 170 to 180 amino acid residues that contain four conserved cysteine residues involved in two disulphide bonds PUBMED:2472117. They have a compact, globular fold (similar to other interleukins), stabilised by the 2 disulphide bonds. One half of the structure is dominated by a 4 alpha-helix bundle with a left-handed twist PUBMED:1400355: the helices are anti-parallel, with 2 overhand connections, which fall into a 2-stranded anti-parallel beta-sheet. The fourth alpha-helix is important to the biological activity of the molecule PUBMED:2037043.

    \

    It has been suggested PUBMED:1717982 that this family can be extended by the addition of LIF and OSM (see the relevant entry ), which seem to be structurally related.

    \ ' '3070' 'IPR000226' '\ Interleukin-7 (IL-7) PUBMED:2663018 is a cytokine that serves as a growth factor for\ early lymphoid cells of both B- and T-cell lineages. Interleukin-9 (IL-9) PUBMED:1971295\ is a cytokine that supports IL-2 independent and IL-4 independent growth of\ helper T-cells.\ Interleukin-7 and -9 seem to be evolutionarily related PUBMED:15335670.\ ' '3071' 'IPR001811' '\ Synonym(s): cytokine, intercrine\

    Many low-molecular weight factors secreted by cells including fibroblasts, macrophages and endothelial cells, in response to a variety of stimuli such as growth factors, interferons, viral transformation and bacterial products, are structurally related PUBMED:1910690, PUBMED:2149646, PUBMED:2687068. Most members of this family of proteins seem to have mitogenic, chemotactic or inflammatory activities. These small cytokines are also called intercrines or chemokines. They are cationic proteins of 70 to 100 amino acid residues that share four conserved cysteine residues involved in two disulphide bonds, as shown in the following schematic representation:\

    \
                                 +------------------------------------+\
                                 |                                    |\
         xxxxxxxxxxxxxxxxxxxxxxCxCxxxxxxxxxxxxxxxxxxxxxxxCxxxxxxxxxxxxCxxxxx\
                               |                         |\
                               +-------------------------+\
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    

    \

    These proteins can be sorted into two groups based on the spacing of the two amino-terminal cysteines. In the first group (see ), the two cysteines are separated by a single residue (C-x-C), while in the second group (see ), they are adjacent (C-C).

    \ \ ' '3072' 'IPR002681' '\ This family consists of various coat proteins from the \ Ilarviruses, which belong to the Bromoviridae; members include \ Apple mosaic virus and \ Prune dwarf virus. The Ilarvirus coat protein is required to initiate replication of the viral genome in host plants PUBMED:7730792. Members of the \ Bromoviridae have a positive-strand ssRNA genome with no DNA stage in their replication.\ ' '3073' 'IPR000506' '\ Acetohydroxy acid isomeroreductase catalyses the conversion of acetohydroxy acids into dihydroxy valerates. This reaction is the second in the synthetic pathway of the essential branched side chain\ amino acids valine and isoleucine PUBMED:9218783. The enzyme forms a tetramer of similar but non-identical chains, and requires magnesium as a cofactor.\ \ ' '3074' 'IPR000581' '\

    Two dehydratases, dihydroxy-acid dehydratase () (gene ilvD or ILV3) and 6-phosphogluconate dehydratase () (gene edd) have been shown to be evolutionarily related PUBMED:1624451. Dihydroxy-acid dehydratase catalyses the fourth step in the biosynthesis of isoleucine and valine, the dehydration of 2,3-dihydroxy-isovaleric acid into alpha-ketoisovaleric acid. 6-Phosphogluconate dehydratase catalyses the first step in the Entner-Doudoroff pathway, the dehydration of 6-phospho-D-gluconate into 6-phospho-2-dehydro-3-deoxy-D-gluconate. Another protein containing this signature is the Escherichia coli hypothetical protein\ yjhG. The N-terminal part of the proteins contains a cysteine that could be involved in the binding of a 2Fe-2S iron-sulphur cluster PUBMED:8299945.

    \ ' '3075' 'IPR007740' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \ This family of proteins has been identified as part of the mitochondrial large ribosomal subunit in Saccharomyces cerevisiae PUBMED:12392552.\ ' '3076' 'IPR007285' '\ Chlamydia trachomatis is an obligate intracellular bacterium that develops within a parasitophorous vacuole termed an inclusion. The inclusion is nonfusogenic with lysosomes but intercepts lipids from a host cell exocytic pathway. Initiation of chlamydial development is concurrent with modification of the inclusion membrane by a set of C. trachomatis-encoded proteins collectively designated Incs. One of these Incs, IncA, is functionally associated with the homotypic fusion of inclusions PUBMED:12065525.\ ' '3077' 'IPR003446' '\ These proteins are plasmid encoded and essential for plasmid replication; they are also involved in copy control functions PUBMED:3041379.\ ' '3078' 'IPR003086' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    This family of proteins represents monomeric serralysin inhibitors of about 125 residues, which interact with specific metalloproteases that are synthesised by serralysin secretors; these secretors are characterised by being plant, insect and animal pathogens. It is probable that the serralysin inhibitors protect the host from proteolysis during export of the protease. The members of this family belong to MEROPS proteinase inhibitor family I38, clan IK.

    \ \

    X-ray crystallography of a complex between the Serratia marcescens protease, SmaPI, and the inhibitor of Erwinia chrysanthemi, Inh, reveals that Inh is folded into an eight-stranded beta-barrel with an N-terminal trunk of 10 residues. Residues 1-5 occupy part of the extended active site of the proteinase, thereby preventing access of the substrate. Residues 6-10 form a linker that connects the N-terminal proteinase-binding peptide to the body of the beta-barrel. The backbone carbonyl of Ser-1 interacts with the catalytic zinc; the Ser-2 side chain occupies the S1\'-binding site and also forms a hydrogen bond to the carboxyl end of the catalytic Glu, whereas Leu-3 occupies the S2\' recognition site. Penetration of the trunk region further than 5 residues into the substrate binding cleft appears to be prevented by the beta-barrel, which itself interacts with the proteinase near its Met turn. Peptide mimetics of the trunk at concentrations up to about 100 mM do not inhibit the protease, demonstrating that the barrel is essential for inhibitory activity PUBMED:10770939, PUBMED:7752231.

    \ \

    Structurally and functionally these inhibitors are closely related to the \ lipocalins, fatty acid-binding proteins, avidins and the enigmatic triabin.\ Together these five protein families constitute the calycin superfamily PUBMED:7684291. \ The proteins are characterised by their high specificity for small hydrophobic molecules and by their ability to form complexes with soluble macromolecules either through intramolecular disulphides or protein-protein interactions PUBMED:8761444.

    \ \ ' '3079' 'IPR000990' '\

    The pannexin family combines invertebrate gap junction proteins and their vertebrate homologs. These proteins have been named innexins PUBMED:9769729. Gap junctions are composed of membrane proteins,\ which form a channel permeable for ions and small molecules connecting\ cytoplasm of adjacent cells. Although gap junctions provide similar functions\ in all multicellular organisms, until recently it was believed that\ vertebrates and invertebrates use unrelated proteins for this purpose. While\ the connexin family of gap junction proteins is well-characterised\ in vertebrates, no homologs have been found in invertebrates. In\ turn, gap junction molecules with no sequence homology to connexins have been\ identified in insects and nematodes. It has been suggested that these proteins\ are specific invertebrate gap junctions, and they were thus named innexins\ (invertebrate analog of connexins) PUBMED:9428764. As innexin homologs were recently identified in other taxonomic groups including vertebrates, indicating their ubiquitous distribution in the animal kingdom, they were called pannexins\ (from the Latin pan-all, throughout, and nexus-connection, bond) PUBMED:10898987, PUBMED:12492443, PUBMED:5028292.

    \ \

    Vertebrate genomes probably carry a conserved set of 3 pannexin paralogs\ (PANX1, PANX2 and PANX3). Invertebrate genomes may contain more than a dozen\ pannexin (innexin) genes. Vinnexins, viral homologs of pannexins/innexins,\ were identified in Polydnaviruses that occur in obligate symbiotic\ associations with parasitoid wasps. It was suggested that virally encoded\ vinnexin proteins may function to alter gap junction proteins in infected host\ cells, possibly modifying cell-cell communication during encapsulation\ responses in parasitized insects PUBMED:12205780, PUBMED:14651471. Structurally pannexins are similar to connexins. Both types of protein\ consist of a cytoplasmic N-terminal domain, followed by four transmembrane\ segments that delimit two extracellular loops and one cytoplasmic loop; the C-\ terminal domain is cytoplasmic.

    \ \ \ ' '3080' 'IPR013021' '\

    This is a region of myo-inositol-1-phosphate synthases that is related to the glyceraldehyde-3-phosphate dehydrogenase-like, C-terminal domain.

    \

    1L-myo-Inositol-1-phosphate synthase () catalyzes the conversion of D-glucose 6-phosphate to 1L-myo-inositol-1-phosphate, the first committed step in the production of all inositol-containing compounds, including phospholipids, either directly or by salvage. The enzyme exists in a cytoplasmic form in a wide range of plants, animals, and fungi. It has also been detected in several bacteria, and a chloroplast form is observed in algae and higher plants. Inositol phosphates play an important role in signal transduction.

    \

    In Saccharomyces cerevisiae (Baker\'s yeast), the transcriptional regulation of the INO1 gene has been studied in detail PUBMED:7975896 and its expression is sensitive to the availability of phospholipid precursors as well as growth phase. The regulation of the structural gene encoding 1L-myo-inositol-1-phosphate synthase has also been analyzed at the transcriptional level in the aquatic angiosperm, Spirodela polyrrhiza (Giant duckweed) and the halophyte, Mesembryanthemum crystallinum (Common ice plant) PUBMED:9370339.

    \ ' '3081' 'IPR000760' '\ It has been shown that several proteins share two sequence motifs PUBMED:1660408. Two of these\ proteins, vertebrate and plant inositol monophosphatase (), and vertebrate inositol\ polyphosphate 1-phosphatase (), are enzymes of the inositol phosphate second messenger\ signalling pathway, and share similar enzyme activity. Both enzymes exhibit an absolute requirement\ for metal ions (Mg2+ is preferred), and their amino acid sequences contain a number of conserved\ motifs, which are also shared by several other proteins related to MPTASE (including products of fungal QaX and qutG, bacterial suhB and cysQ, and yeast hal2) PUBMED:7761465. The function of the\ other proteins is not yet clear, but it is suggested that they may act by enhancing the synthesis\ or degradation of phosphorylated messenger molecules PUBMED:1660408. Structural analysis of these\ proteins has revealed a common core of 155 residues, which includes residues essential for metal\ binding and catalysis. An interesting property of the enzymes of this family is their sensitivity\ to Li+. The targets and mechanism of action of Li+ are unknown, but overactive inositol phosphate\ signalling may account for symptoms of manic depression PUBMED:2553271.\ ' '3082' 'IPR003235' '\

    The insulin family of proteins PUBMED:6107857 groups a number of active peptides which are evolutionarily related including insulin; relaxin; insulin-like growth factors I and II PUBMED:2197088; mammalian\ Leydig cell-specific insulin-like peptide (gene INSL3) PUBMED:8253799 and early placenta insulin-like peptide (ELIP) (gene INSL4) PUBMED:8666396; insect prothoracicotropic hormone (bombyxin) PUBMED:; locust insulin-related peptide (LIRP) PUBMED:1688797; molluscan insulin-related peptides 1 to 5 (MIP)\ PUBMED:1868853; and Caenorhabditis elegans insulin-like peptides PUBMED:9548970. Structurally, all these peptides consist of two polypeptide chains (A and B) linked by two disulphide bonds. They all share a conserved arrangement of four cysteines in their A chain. The first of these cysteines is linked by a disulphide bond to the third one and the second and fourth cysteines are linked by interchain disulphide bonds to cysteines in the B chain.

    \ \

    Insulin is involved in the regulation of normal glucose homeostasis, as well\ as other specific physiological functions PUBMED:6243748. It is synthesised as a prepropeptide from which an endoplasmic reticulum-targeting sequence is cleaved to yield proinsulin. Proinsulin contains regions A and B separated by an intervening connecting region C. The connecting region is cleaved, liberating the active protein, which contains the A and B chains,\ held together by 2 disulphide bonds PUBMED:503234.

    \

    This entry represents Caenorhabditis elegans insulin-like peptides PUBMED:1868853, beta type.

    \ ' '3083' 'IPR003220' '\ Insertion elements are mobile elements in DNA, usually encoding proteins required for transposition, for example transposases. This protein is absolutely required for transposition of insertion element 1.\ ' '3084' 'IPR004825' '\

    The insulin family of proteins PUBMED:6107857 groups a number of active peptides which are evolutionarily related including insulin; relaxin; insulin-like growth factors I and II PUBMED:2197088; mammalian\ Leydig cell-specific insulin-like peptide (gene INSL3) PUBMED:8253799 and early placenta insulin-like peptide (ELIP) (gene INSL4) PUBMED:8666396; insect prothoracicotropic hormone (bombyxin) PUBMED:; locust insulin-related peptide (LIRP) PUBMED:1688797; molluscan insulin-related peptides 1 to 5 (MIP)\ PUBMED:1868853; and Caenorhabditis elegans insulin-like peptides PUBMED:9548970. Structurally, all these peptides consist of two polypeptide chains (A and B) linked by two disulphide bonds. They all share a conserved arrangement of four cysteines in their A chain. The first of these cysteines is linked by a disulphide bond to the third one and the second and fourth cysteines are linked by interchain disulphide bonds to cysteines in the B chain.

    \ \

    Insulin is involved in the regulation of normal glucose homeostasis, as well\ as other specific physiological functions PUBMED:6243748. It is synthesised as a prepropeptide from which an endoplasmic reticulum-targeting sequence is cleaved to yield proinsulin. Proinsulin contains regions A and B separated by an intervening connecting region C. The connecting region is cleaved, liberating the active protein, which contains the A and B chains,\ held together by 2 disulphide bonds PUBMED:503234.

    \ ' '3085' 'IPR001037' '\

    Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic core, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus have been studied most carefully with respect to the structural basis of catalysis. Although the active site of avian virus integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis PUBMED:10384242.

    \

    Retroviral integrase is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group.

    \

    HIV-1 integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS PUBMED:9161051. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity PUBMED:10384243.

    \ ' '3086' 'IPR004191' '\ The integrase family of site-specific recombinases catalyzes a diverse array of DNA rearrangements in archaebacteria, eubacteria and yeast. The structure of the\ DNA binding domain of the conjugative transposon Tn916 integrase protein was determined using NMR spectroscopy. The N-terminal domain was found to be structurally similar to the double stranded RNA binding domain (dsRBD). Experimental evidence suggests that the integrase protein interacts with DNA using residues located on the face of its three stranded beta-sheet PUBMED:9665166.\ ' '3087' 'IPR013513' '\ Some alpha subunits are cleaved post-\ translationally to produce a heavy and a light chain linked by a disulphide\ bond PUBMED:3028640, PUBMED:2199285. Integrin alpha chains share a conserved sequence which is found at\ the beginning of the cytoplasmic domain, just after the end of the\ transmembrane region. Within the N-terminal domain of alpha subunits, seven sequence repeats, each\ of approximately 60 amino acids, have been found PUBMED:3327687. It has been predicted \ that these repeats assume the beta-propeller fold. The domains contain seven \ four-stranded beta-sheets arranged in a torus around a pseudosymmetry axis\ PUBMED:8990162. Integrin ligands and a putative Mg2+ ion are predicted to bind to the\ upper face of the propeller, in a manner analogous to the way in which the\ trimeric G-protein beta subunit (G beta) (which also has a beta-propeller\ fold) binds the G protein alpha subunit PUBMED:8990162.\

    Integrin cytoplasmic domains are normally less than 50 amino acids in length, with the beta-subunit sequences\ exhibiting greater homology to each other than the alpha-subunit sequences PUBMED:12826403. This is consistent with\ current evidence that the beta subunit is the principal site for binding of cytoskeletal and signalling\ molecules, whereas the alpha subunit has a regulatory role. The first ten residues of the\ alpha-subunit cytoplasmic domain appear to form an alpha helix that is terminated by a proline residue. The\ remainder of the domain is highly acidic in nature and this loops back to contact the\ membrane-proximal lysine anchor residue.

    \ ' '3088' 'IPR002369' '\

    Integrins are the major metazoan receptors for cell adhesion to extracellular matrix proteins and, in vertebrates, also play important roles in certain cell-cell adhesions, make transmembrane connections to the cytoskeleton and activate many intracellular signalling pathways PUBMED:12297042, PUBMED:12361595. The integrin receptors are composed of alpha and beta subunit heterodimers. Each subunit crosses the membrane once, with most of the polypeptide residing in the extracellular space, and has two short cytoplasmic domains. Some members of this family have EGF repeats at the C terminus and also have a vWA domain inserted within the integrin domain at the N terminus.

    \

    Most integrins recognise relatively short peptide motifs, and in general require an acidic amino acid to be present. Ligand specificity depends upon both the alpha and beta subunits PUBMED:12234368. There are at least 18 types of alpha and 8 types of beta subunits recognised in humans PUBMED:14689578. Each alpha subunit tends to associate only with one type of beta subunit, but there are exceptions to this rule PUBMED:2467745. Each association of alpha and beta subunits has its own binding specificity and signalling properties. Many integrins require activation on the cell surface before they can bind ligands. Integrins frequently intercommunicate, and binding at one integrin receptor can activate or inhibit another.

    \

    The structure of unliganded alphaV beta3 showed the molecule to be folded, with the head bent over towards the C termini of the legs which would normally be inserted into the membrane PUBMED:12714499. The head comprises a beta propeller domain at the N terminus of the alphaV subunit and an I/A domain inserted into a loop on the top of the hybrid domain in the beta subunit. The I/A domain consists of a Rossmann fold with a core of parallel beta sheets surrounded by amphipathic alpha helices.

    \ \

    Integrins are important therapeutic targets in conditions such as atherosclerosis, thrombosis, cancer and asthma PUBMED:2199285.

    \ \

    At the N-terminus of the beta subunit is a cysteine-containing domain reminiscent of that found in plexins and semaphorins, which has hence been termed the PSI domain. C-terminal to the PSI domain is an A-domain, which has been predicted to adopt a Rossmann fold similar to that of the alpha subunit, but with additional loops between the second and third beta strands PUBMED:9009218. The murine gene Pactolus shares significant similarity with the beta subunit PUBMED:9535848, but lacks either one or both of the inserted loops. The C-terminal portion of the beta subunit extracellular domain contains an internally disulphide-bonded cysteine-rich region, while the intracellular tail contains putative sites of interaction with a variety of intracellular signalling and cytoskeletal proteins, such as focal adhesion kinase and alpha-actinin respectively PUBMED:9818167. Integrin cytoplasmic domains are normally less than 50 amino acids in length, with the beta-subunit sequences exhibiting greater homology to each other than the alpha-subunit sequences. This is consistent with current evidence that the beta subunit is the principal site for binding of cytoskeletal and signalling molecules, whereas the alpha subunit has a regulatory role. The first 20 amino acids of the beta-subunit cytoplasmic domain are also alpha helical, but the final 25 residues are disordered and, apart from a turn that follows a conserved NPxY motif, appear to lack defined structure, suggesting that this is adopted on effector binding. The two membrane-proximal helices mediate the link between the subunits via a series of hydrophobic and electrostatic contacts.

    \ ' '3089' 'IPR000471' '\ Interferons PUBMED:3022999 are proteins which produce antiviral and antiproliferative\ responses in cells. On the basis of their sequence, interferons are classified\ into five groups: alpha, alpha-II (or omega), beta, delta (or trophoblast).\ The sequence differences may cause different responses to various inducers, or \ result in the recognition of different target cell types PUBMED:6170983. The main\ conserved structural feature of interferons is a disulphide bond that, \ except in mouse beta interferon, occurs in all alpha, beta and omega\ sequences.\ ' '3090' 'IPR020418' '\ Interleukin-13 (IL-13) is a pleiotropic cytokine which may be important in the regulation of the inflammatory and immune responses PUBMED:8096327. It inhibits inflammatory cytokine production and synergises with IL-2 in regulating interferon-gamma synthesis. The sequences of IL-4 and IL-13 are distantly related.\ ' '3091' 'IPR000442' '\ Group II introns use intron-encoded reverse transcriptase,\ maturase and DNA endonuclease activities for site-specific\ insertion into DNA PUBMED:9362497. Although introns of this type are\ self-splicing in vitro, they require a maturase protein for\ splicing in vivo. It has been shown that a specific region\ of the aI2 intron is needed for the maturase function PUBMED:8029012.\ This region was found to be conserved in group II introns\ and called domain X PUBMED:8255751.\ ' '3093' 'IPR003065' '\ The Salmonella typhimurium surface presentation of antigens K/invasion \ protein B gene (SpaK/InvB) is one of 12 that form a cluster responsible for \ invasion properties. The gene product is required for entry by the \ bacterium into epithelial cells, and is thus considered to be a virulence \ factor PUBMED:8404849. Other Spa genes in the cluster are related to invasion (Inv) genes in similar Salmonella and Shigella species PUBMED:7752894, and to flagella \ biosynthesis genes in Helicobacter pylori PUBMED:10066464. 
A further analogous gene in \ Yersinia (Spa15 homologue) has also been found PUBMED:8045880.\

    The SpaK/InvB protein has a molecular mass of 15 kDa, and is believed to play a part in the sec-independent type III protein secretion system of \ S. typhimurium and Shigella flexneri PUBMED:9159221. In the organisation of the \ Spa/Inv locus, the SpaK/InvB gene is found adjacent to SpaL/InvC PUBMED:8045880, and \ may play a part in the ATPase activity possessed by the latter.

    \ ' '3095' 'IPR006937' '\ This family represents a number of plant neutral invertases ().\ ' '3096' 'IPR006830' '\ This family represents the Salmonella outer membrane lipoprotein InvH. The molecular function of this protein is unknown, but it is required for the localisation to outer membrane of InvG, which is involved in a type III secretion apparatus mediating host cell invasion PUBMED:9680224, PUBMED:9786184.\ ' '3097' 'IPR000354' '\

    Involucrin PUBMED:1359382, PUBMED:8277848 is a protein present in keratinocytes of epidermis and other\ stratified squamous epithelia. Involucrin first appears in the cell cytosol,\ but ultimately becomes cross-linked to membrane proteins by transglutaminase\ thus helping in the formation of an insoluble envelope beneath the plasma\ membrane.

    \ \

    Structurally, involucrin consists of a conserved region of about 75 amino acid\ residues followed by two extremely variable length segments that contain\ glutamine-rich tandem repeats. The glutamine residues in the tandem repeats\ are the substrate for the transglutaminase in the cross-linking reaction. The\ total size of the protein varies from 285 residues (in dog) to 835 residues\ (in orangutan).

    \ ' '3098' 'IPR001666' '\ Phosphatidylinositol transfer protein (PITP) is a ubiquitous cytosolic protein, thought to be involved in transport of phospholipids from their site of synthesis in the endoplasmic reticulum and Golgi to other cell membranes PUBMED:7774006. More recently, PITP has been shown to be an essential component of the polyphosphoinositide synthesis machinery and is hence required for proper signalling by epidermal growth factor and f-Met-Leu-Phe, as well as for exocytosis. The role of PITP in polyphosphoinositide synthesis may also explain its involvement in intracellular vesicular traffic PUBMED:7774006.\ \ ' '3099' 'IPR004959' '\ The IpgB family includes the invasion plasmid antigen from Shigella flexneri PUBMED:11207575, as well as related proteins from Escherichia coli species. Members of this family seem to be involved in pathogenicity of some enterobacteria. However, the exact function of these proteins is unclear.\ ' '3100' 'IPR007062' '\ Protein phosphatase inhibitor 2 (IPP-2) is a phosphoprotein conserved among all eukaryotes, and it appears in both the nucleus and cytoplasm of tissue culture cells PUBMED:12235284.\ ' '3101' 'IPR002627' '\ tRNA isopentenyltransferases are also known as tRNA delta(2)-isopentenylpyrophosphate transferases or IPP transferases. These enzymes modify both cytoplasmic and mitochondrial tRNAs at A(37) to give isopentenyl A(37) PUBMED:8139535.\ ' '3102' 'IPR002648' '\ Isopentenyl transferase / dimethylallyl transferase synthesizes isopentenyladenosine 5\'-monophosphate, a cytokinin that induces shoot formation on host plants infected with the Ti plasmid PUBMED:1465104.\ ' '3104' 'IPR001346' '\ The expression of type I interferon genes (interferons alpha and beta) is induced by many \ agents, including viral attack PUBMED:3409321. 
Induction is mediated by the binding of \ interferon regulatory factor 1 (IRF-1) to a region known as the interferon consensus \ sequence (ICS), located upstream of the interferon genes PUBMED:1460054. Other factors may \ also bind to the ICS, including IRF-2, which does not function as an activator, but \ rather suppresses the function of IRF-1 under certain circumstances PUBMED:2475256. \ IRF proteins contain a conserved N-terminal region of about 120 amino acids, which folds \ into a structure that binds specifically to the ICS; the remaining parts of the\ sequences vary depending on the precise function of the protein PUBMED:1460054.\ ' '3105' 'IPR007251' '\

    The low-affinity iron permease is an integral membrane protein that is required for low-affinity uptake of ferrous iron and is induced by iron deprivation.

    \ ' '3106' 'IPR000369' '\

    Potassium channels are the most diverse group of the ion channel family\ PUBMED:1772658, PUBMED:1879548. They are important in shaping the action potential, and in neuronal excitability and plasticity PUBMED:2451788. The potassium channel family is\ composed of several functionally distinct isoforms, which can be broadly\ separated into 2 groups PUBMED:2555158: the practically non-inactivating \'delayed\' group and the rapidly inactivating \'transient\' group.

    \

    These are all highly similar proteins, with only small amino acid\ changes causing the diversity of the voltage-dependent gating mechanism,\ channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or\ other second messengers PUBMED:2448635. In eukaryotic cells, K+ channels\ are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes PUBMED:1373731. In prokaryotic cells, they play a role in the\ maintenance of ionic homeostasis PUBMED:11178249.

    \

    All K+ channels discovered so far possess a core of \ alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has\ been termed the K+ selectivity sequence.\ In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane.\ However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains.\ The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK)\ PUBMED:11178249, PUBMED:. The 2TM domain family comprises inward-rectifying K+ \ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.

    \

    The KCNE family are single transmembrane (TM) domain proteins which function as ancillary (or beta) subunits of voltage-gated potassium channels PUBMED:15527815. They share no structural relationship with the alpha subunit proteins, which possess pore forming domains. When expressed with the alpha subunits, the KCNE subunits confer changes in channel conductance, gating kinetics and pharmacology. KCNE subunits are formed from short polypeptides of ~130 amino acids, and are divided into five subfamilies: KCNE1 (MinK/IsK), KCNE2 (MiRP1), KCNE3 (MiRP2), KCNE4 (MiRP3) and KCNE1L (AMMECR2). The widespread expression of these subunits in numerous tissues suggests that they play diverse roles, from controlling excitability in the heart and CNS, modulating vascular tone in blood vessels, to facilitating transport processes in the gastrointestinal tract. Inherited mutations in KCNE genes are associated with diseases of cardiac and skeletal muscle, and the inner ear. KCNE subunits are promiscuous, with each shown to interact with several different voltage-gated alpha subunits in vitro and possibly in vivo as well. \

    \ ' '3107' 'IPR001804' '\

    Isocitrate dehydrogenase (IDH) PUBMED:2682654, PUBMED:1939242 is an important enzyme of carbohydrate metabolism which catalyses the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is either dependent on NAD+ () or on NADP+ (). In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli the activity of a NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated.

    \

    3-isopropylmalate dehydrogenase () (IMDH) PUBMED:1748999, PUBMED:7773180 catalyses the third step in the biosynthesis of leucine in bacteria and fungi, the oxidative decarboxylation of 3-isopropylmalate into 2-oxo-4-methylvalerate. Tartrate dehydrogenase () PUBMED:8053675 catalyses the oxidation of tartrate to oxaloglycolate.

    \

    These enzymes are evolutionarily related PUBMED:2682654, PUBMED:1748999, PUBMED:7773180, PUBMED:8053675. The best conserved region of these enzymes is a glycine-rich stretch of residues located in the C-terminal section.

    \ ' '3109' 'IPR006008' '\

    Intracellular septation protein A is a family of proteins which are essential for both normal cell division and bacterial virulence and are believed to play a role in the septation process PUBMED:9746567.

    \ ' '3110' 'IPR001228' '\

    4-diphosphocytidyl-2C-methyl-D-erythritol synthase, a bacterial ispD protein, catalyzes the third step of the deoxyxylulose-5-phosphate (DXP) pathway of isoprenoid biosynthesis: the formation of 4-diphosphocytidyl-2C-methyl-D-erythritol from CTP and 2C-methyl-D-erythritol 4-phosphate PUBMED:10841550. The isoprenoid pathway is a well-known target for anti-infective drug development PUBMED:17081012, PUBMED:17921290.

    \ ' '3111' 'IPR002611' '\

    Proteins in this entry contain an ATP/GTP binding P-loop motif. They are found \ associated with IS21 family insertion sequences PUBMED:7698671. Functionally they have not been characterised, but they may be involved in transposition PUBMED:9141667.

    \ ' '3112' 'IPR003110' '\

    Phosphorylated immunoreceptor tyrosine-based activation motifs (ITAMs) exhibit unique abilities to bind and activate Lyn and Syk tyrosine kinases PUBMED:7594458. The motif, which may be dually phosphorylated on tyrosine, links antigen receptors to downstream signalling machinery.

    \ ' '3113' 'IPR001910' '\

    Inosine-uridine preferring nucleoside hydrolase () (IU-nucleoside hydrolase or IUNH) is an enzyme first identified in protozoa PUBMED:8634237 that catalyses the hydrolysis of all of the commonly occurring purine and pyrimidine nucleosides into ribose and the associated base, but has a preference for inosine and uridine as substrates. This enzyme is important for these parasitic organisms, which are deficient in de novo synthesis of purines, to salvage the host purine nucleosides. IUNH from Crithidia fasciculata has been sequenced and characterised; it is a homotetrameric enzyme with subunits of 34 kDa. A histidine has been shown to be important for the catalytic mechanism; it acts as a proton donor to activate the hypoxanthine leaving group.

    \ \

    A highly conserved region located in the N-terminal extremity contains four conserved aspartates that have been shown PUBMED:8634238 to be located in the active site cavity.

    \ \

    IUNH is evolutionarily related to a number of uncharacterised proteins from various biological sources.

    \ ' '3114' 'IPR007310' '\

    Bacteria solve the iron supply problem caused by the insolubility of Fe(3+) by synthesizing iron-complexing compounds, called siderophores, and by using iron sources of their hosts, such as haem and iron bound to transferrin and lactoferrin. Escherichia coli, as an example of Gram-negative bacteria, forms sophisticated Fe(3+)-siderophore and haem transport systems across the outer membrane. IucA and IucC catalyse discrete steps in biosynthesis of the siderophore aerobactin from N epsilon-acetyl-N epsilon-hydroxylysine and citrate PUBMED:3087960.

    \ ' '3115' 'IPR001229' '\

    This entry represents a mannose-binding lectin domain with a beta-prism fold consisting of three 4-stranded beta-sheets, with an internal pseudo 3-fold symmetry. Some lectins in this group stimulate distinct T- and B- cell functions, such as Jacalin, which binds to the T-antigen and acts as an agglutinin. This domain is found in 1 to 6 copies in lectins. The domain is also found in the salt-stress induced protein from rice and an animal prostatic spermine-binding protein. Proteins containing this domain include:

    \

    \ ' '3116' 'IPR003349' '\

    Jumonji protein is required for neural tube formation in mice PUBMED:7758946. There is evidence of domain swapping within the jumonji family of transcription factors PUBMED:10838566. This domain is often associated with JmjC (see ).

    \ ' '3117' 'IPR005643' '\

    The c-Jun NH(2)-terminal kinase (JNK) is a member of an evolutionarily conserved sub-family of mitogen-activated protein (MAP) kinases PUBMED:11402333, PUBMED:11790549.

    \ ' '3118' 'IPR002487' '\ MADS genes in plants encode key developmental regulators of vegetative and reproductive development. The majority of the plant MADS proteins share a stereotypical MIKC structure. It comprises (from N- to C-terminal) an N-terminal domain, which is, however, present only in a minority of proteins; a MADS domain (see , ), which is the major determinant of DNA-binding but which also performs dimerisation and accessory factor binding functions; a weakly conserved intervening (I) domain, which constitutes a key molecular determinant for the selective formation of DNA-binding dimers; a keratin-like (K-box) domain, which promotes protein dimerisation; and a C-terminal (C) domain, which is involved in transcriptional activation or in the formation of ternary or quaternary protein complexes. The 80-amino acid K-box domain was originally identified as a region with low but significant similarity to a region of keratin, which is part of the coiled-coil sequence constituting the central rod-shaped domain of keratin PUBMED:10805792, PUBMED:12032236, PUBMED:12943540.\ \ The K-box protein-protein interaction domain which mediates heterodimerization of MIKC-type MADS proteins contains several heptad repeats in which the first and the fourth positions are occupied by hydrophobic amino acids suggesting that the K-box domain forms three amphipathic alpha-helices referred to as K1, K2, and K3 PUBMED:12943540.\ ' '3119' 'IPR004121' '\

    Current genotyping systems for Human herpesvirus 8 (HHV-8) are based on the highly variable gene encoding the K1 glycoprotein PUBMED:11172090. This entry represents the C-terminal region of the K1 glycoprotein.

    \ ' '3120' 'IPR001772' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases PUBMED:15320712.

    \ \

    Eukaryotic protein kinases PUBMED:, PUBMED:7768349, PUBMED:1835513, PUBMED:1956325, PUBMED:3291115 are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common to both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the catalytic domain of protein kinases. In the N-terminal extremity of the catalytic domain there is a glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP binding. In the central part of the catalytic domain there is a conserved aspartic acid residue which is important for the catalytic activity of the enzyme PUBMED:1862342.

    \ \

    Members of the KIN2/PAR-1/MARK kinase subfamily are conserved from yeast to human and share the same domain organisation: an N-terminal kinase domain () and a C-terminal kinase associated domain 1 (KA1). Some members of the KIN1/PAR-1/MARK family also contain a UBA domain (). Members of this kinase subfamily are involved in various biological processes such as cell polarity, cell cycle control, intracellular signalling, microtubule stability and protein stability PUBMED:15182702. The function of the KA1 domain is not yet known.

    \ \

    Some proteins known to contain a KA1 domain are listed below:\

    \ \

    This entry represents the KA1 domain.

    \ ' '3121' 'IPR003900' '\ This group of proteins contains the KID repeat as found in Borrelia and spirochete RepA / Rep+ proteins. The function of these proteins is unknown. RepA and related Borrelia proteins have been suggested to play an important genus-wide role in the biology of the Borrelia PUBMED:9733706.\ ' '3122' 'IPR002350' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    This family of Kazal inhibitors belongs to MEROPS inhibitor family I1, clan IA. They inhibit serine peptidases of the S1 family () PUBMED:14705960. The members are primarily metazoan, but include exceptions in the alveolata (apicomplexa), stramenopiles, higher plants and bacteria.

    \ \ \

    Kazal inhibitors, which inhibit a number of serine proteases (such as\ trypsin and elastase), belong to a family of proteins that includes\ pancreatic secretory trypsin inhibitor; avian ovomucoid; acrosin inhibitor;\ and elastase inhibitor. These proteins contain between 1 and 7 Kazal-type\ inhibitor repeats PUBMED:6699915, PUBMED:3828298.

    The structure of the Kazal repeat includes a large quantity of extended chain, 2 short alpha-helices and a 3-stranded anti-parallel beta sheet PUBMED:6699915. The inhibitor makes 11 contacts with its enzyme substrate: unusually, 8 of these important residues are hypervariable PUBMED:3828298. Altering the enzyme-contact residues, and especially those of the active site bond, affects the strength of inhibition and specificity of the inhibitor for particular serine proteases PUBMED:3828298, PUBMED:7046785. The presence of this Pfam domain is usually indicative of serine protease inhibitors; however, Kazal-like domains are also seen in the extracellular part of agrins, which are not known to be proteinase inhibitors.

    \ ' '3123' 'IPR018491' '\

    The K-Cl co-transporter (KCC) mediates the coupled movement of K+ and Cl-\ ions across the plasma membrane of many animal cells. This transport is\ involved in the regulatory volume decrease in response to cell swelling in\ red blood cells, and has been proposed to play a role in the vectorial\ movement of Cl- across kidney epithelia. The transport process involves one\ for one electroneutral movement of K+ together with Cl-, and, in all\ known mammalian cells, the net movement is outward PUBMED:8663127.

    \ \

    In neurones, it appears to play a unique role in maintaining low\ intracellular Cl- concentration, which is required for the functioning of Cl-\ dependent fast synaptic inhibition, mediated by certain neurotransmitters,\ such as gamma-aminobutyric acid (GABA) and glycine.

    \ \

    Three isoforms of the K-Cl co-transporter have been described, termed KCC1, KCC2 and KCC3, containing 1085, 1116 and 1150 amino acids, respectively. They are predicted to have 12 transmembrane (TM) regions in a central hydrophobic\ domain, together with hydrophilic N- and C-termini that are likely\ cytoplasmic. Comparison of their sequences with those of other\ ion-transporting membrane proteins reveals that they are part of a new\ superfamily of cation-chloride co-transporters, which includes the Na-Cl and\ Na-K-2Cl co-transporters. KCC1 and KCC3 are widely expressed in human tissues, while KCC2 is expressed only in brain neurones, making it likely that this is the isoform responsible for maintaining low Cl- concentration in neurones PUBMED:8663311, PUBMED:9930699, PUBMED:10600773.

    \ \ \

    KCC1 is widely expressed in human tissues, and when heterologously expressed,\ possesses the functional characteristics of the well-studied red blood cell\ K-Cl co-transporter, including stimulation by both swelling and\ N-ethylmaleimide. Several splice variants have also been identified.

    \

    KCC3 is widely expressed in human tissues and, like KCC1, is stimulated by both swelling and N-ethylmaleimide. The induction of KCC3 is up-regulated by vascular endothelial growth factor and down-regulated by tumour necrosis factor. Defects in KCC3 are linked to agenesis of the corpus callosum with peripheral neuropathy PUBMED:12368912. This disorder is characterised by severe progressive sensorimotor neuropathy, mental retardation, dysmorphic features and complete or partial agenesis of the corpus callosum.

    \ ' '3124' 'IPR013821' '\

    Potassium channels are the most diverse group of the ion channel family\ PUBMED:1772658, PUBMED:1879548. They are important in shaping the action potential, and in neuronal excitability and plasticity PUBMED:2451788. The potassium channel family is\ composed of several functionally distinct isoforms, which can be broadly\ separated into 2 groups PUBMED:2555158: the practically non-inactivating \'delayed\' group and the rapidly inactivating \'transient\' group.

    \

    These are all highly similar proteins, with only small amino acid\ changes causing the diversity of the voltage-dependent gating mechanism,\ channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or\ other second messengers PUBMED:2448635. In eukaryotic cells, K+ channels\ are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes PUBMED:1373731. In prokaryotic cells, they play a role in the\ maintenance of ionic homeostasis PUBMED:11178249.

    \

    All K+ channels discovered so far possess a core of \ alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has\ been termed the K+ selectivity sequence.\ In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane.\ However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains.\ The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK)\ PUBMED:11178249, PUBMED:. The 2TM domain family comprises inward-rectifying K+ \ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.

    \

    KCNQ channels (also known as KQT-like channels) differ from other voltage-gated 6 TM helix channels, chiefly in that they possess no tetramerisation domain. Consequently, they rely on interaction with accessory subunits, or form heterotetramers with other members of the family PUBMED:10838601. Currently, 5 members of the KCNQ family are known. These have been found to be widely distributed within the body, having been shown to be expressed in the heart, brain, pancreas, lung, placenta and ear. They were initially cloned as a result of a search for proteins involved in cardiac arrhythmia. Subsequently, mutations in other KCNQ family members have been shown to be responsible for some forms of hereditary deafness PUBMED:8528244 and benign familial neonatal epilepsy PUBMED:9430594.

    \

    This entry represents a region found at the C-terminus of these proteins.

    \ ' '3125' 'IPR004684' '\

    This family includes the characterised 2-Keto-3-Deoxygluconate transporters from Bacillus subtilis and Erwinia chrysanthemi. There are homologs of this protein found in both Gram-positive and Gram-negative bacteria.

    \

    In E. chrysanthemi, a phytopathogenic bacterium, degraded pectin products from plant cell walls are transported by 2-keto-3-deoxygluconate permease into the bacterial cell to provide a carbon and energy source PUBMED:2684787. 2-keto-3-deoxygluconate permease can mediate the uptake of glucuronate with a low affinity PUBMED:3571157.

    \ ' '3126' 'IPR004623' '\ Kdp is a high affinity ATP-driven K+ transport system in Escherichia coli. It is composed of three membrane-bound subunits, KdpA, KdpB and KdpC and one small peptide, KdpF. KdpA is the K+-transporting subunit of this complex. During assembly of the complex, KdpA and KdpC bind to each other. This interaction is thought to stabilise the complex. Data indicate that KdpC might connect KdpA, the K+-transporting subunit, to KdpB, the ATP-hydrolysing (energy providing) subunit PUBMED:9858692.\ ' '3127' 'IPR003820' '\

    Kdp, the high affinity ATP-driven K+-transport system of Escherichia coli, is a complex of the membrane-bound subunits KdpA, KdpB, KdpC and the small peptide KdpF. KdpC forms strong interactions with the KdpA subunit, serving to assemble and stabilise the Kdp complex PUBMED:9858692. It has been suggested that KdpC could be one of the connecting links between the energy providing subunit KdpB and the K+-transporting subunit KdpA PUBMED:9858692. The K+ transport system actively transports K+ ions via ATP hydrolysis.

    \ ' '3128' 'IPR003852' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347, PUBMED:11406410. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.

    \

    HKs can be roughly divided into two classes: orthodox and hybrid kinases PUBMED:8029829, PUBMED:1482126. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK PUBMED:10966457. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain.

    \

    This entry represents the N-terminal domain found in KdpD sensor kinase proteins, which regulate the kdpFABC operon responsible for potassium transport PUBMED:9226259. The N-terminal domain forms part of the cytoplasmic region of the protein, which may be the sensor domain responsible for sensing turgor pressure PUBMED:1532388.

    \ ' '3129' 'IPR007045' '\

    This family of bacterial proteins has been characterised as 4-deoxy-L-threo-5-hexosulose-uronate ketol-isomerase. It is encoded by kduI PUBMED:1766386 and is involved in the fourth step in pectin degradation.

    \ \ \ \

    Although this enzyme is found in Escherichia coli, its role is uncertain since E. coli is not known to degrade the polysaccharides which are potential sources of 5-keto-4-deoxyuronate PUBMED:9761873.

    \ ' '3130' 'IPR003461' '\ Keratins are a well known group of intermediate filament proteins. Like actin filaments, keratins are flexible but provide a firm cell skeleton. Unlike actin, however, no known keratins are associated with motor functions. This family represents avian keratin proteins PUBMED:6200321, found in feathers, scale and claw. The avian keratins (F-ker, S-ker, C-ker and B-ker) are a complex mixture of very similar polypeptides.\ ' '3131' 'IPR002494' '\ High sulphur proteins are cysteine-rich proteins synthesized\ during the differentiation of hair matrix cells, and form hair\ fibers in association with hair keratin intermediate filaments PUBMED:9524245.\ This family has been divided up into four regions, with the second\ region containing 8 copies of a short repeat PUBMED:9524245. This family is\ also known as B2 or KAP1.\ ' '3132' 'IPR007659' '\ This is a family of keratins, high-sulphur matrix proteins. The keratin products of mammalian epidermal derivatives such as wool and hair consist of microfibrils embedded in a rigid matrix of other proteins. The matrix proteins include the high-sulphur and high-tyrosine keratins, having molecular weights of 6-20 kDa, whereas microfibrils contain the larger, low-sulphur keratins (40-56 kDa) PUBMED:4678578.\ ' '3133' 'IPR005582' '\

    This family contains MukF, which are proteins involved in the segregation and condensation of prokaryotic chromosomes. MukE () along with MukF interact with MukB () in vivo forming a complex, which is required for chromosome condensation and segregation in Escherichia coli PUBMED:10545099. The Muk complex appears to be similar to the SMC-ScpA-ScpB complex in other prokaryotes where MukB is the homologue of SMC PUBMED:12065423. ScpA () and ScpB () have little sequence similarity to MukE or MukF, though they are predicted to be structurally similar, being predominantly alpha-helical with coiled coil regions.

    \ ' '3134' 'IPR018004' '\

    The amino-terminal module of the poxvirus D6R/NIR proteins defines a novel conserved DNA-binding domain (the KilA-N domain) that is found in a wide range of proteins of large bacterial and eukaryotic DNA viruses PUBMED:11897024. Putative proteins with homology to the KilA-N domain have also been identified in Maverick transposable elements of the parabasalid protozoan Trichomonas vaginalis PUBMED:17034960. The KilA-N domain has been suggested to be homologous to the fungal DNA-binding APSES domain (see ). In all proteins shown to contain the KilA-N domain, it occurs at the extreme amino terminus accompanied by a wide range of distinct carboxy-terminal domains. These carboxy-terminal modules may be enzymes, such as the nuclease domains, or might mediate additional, specific interactions with nucleic acids or proteins, like the RING (see ) or CCCH fingers in the poxviruses PUBMED:11897024. The KilA-N domain is predicted to adopt an alpha-beta fold with four conserved strands and at least two conserved helices PUBMED:11897024. Some proteins known to contain a KilA-N domain are listed below:

    \ \

    \ ' '3135' 'IPR003101' '\

    The nuclear factor CREB activates transcription of target genes in part through direct interactions with the KIX domain of the coactivator CBP in a\ phosphorylation-dependent manner PUBMED:9413984. This provides a model for\ activator:coactivator interactions. The KIX domain of CBP also binds to transactivation domains of other nuclear factors including Myb and Jun.

    \ \ ' '3136' 'IPR004132' '\ Kinetoplastid membrane protein 11 is a major cell surface glycoprotein of the parasite Leishmania donovani. It stimulates T-cell proliferation and may play a role in the immunology of the disease leishmaniasis.\ ' '3137' 'IPR005540' '\

    The MEINOX region is comprised of two domains, KNOX1 and KNOX2. KNOX1 plays a role in suppressing target gene expression. KNOX2, essential for function, is thought to be necessary for homo-dimerization PUBMED:11549765.

    \ ' '3138' 'IPR005541' '\

    The MEINOX region is comprised of two domains, KNOX1 and KNOX2. KNOX1 plays a role in suppressing target gene expression. KNOX2, essential for function, is thought to be necessary for homo-dimerization PUBMED:11549765.

    \ ' '3139' 'IPR000001' '\ Kringles are autonomous structural domains, found throughout the blood clotting and fibrinolytic proteins.\ Kringle domains are believed to play a role in binding mediators (e.g., membranes,\ other proteins or phospholipids), and in the regulation of proteolytic activity\ PUBMED:3886654, PUBMED:6373375, PUBMED:2157850. \ Kringle domains PUBMED:3131537, PUBMED:3891096, PUBMED:1879523 are characterised by a triple loop, 3-disulphide bridge structure, whose conformation is defined by a number of hydrogen bonds and small pieces of anti-parallel beta-sheet. They are found in a varying number of copies in some plasma proteins including prothrombin and urokinase-type plasminogen activator, which are serine proteases belonging to MEROPS peptidase family S1A.\ \ ' '3140' 'IPR002160' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    The Kunitz-type soybean trypsin inhibitor (STI) family consists mainly of proteinase inhibitors from Leguminosae seeds PUBMED:14705960. They belong to MEROPS inhibitor family I3, clan IC. They exhibit proteinase inhibitory activity against serine proteinases; trypsin (MEROPS peptidase family S1, ) and subtilisin (MEROPS peptidase family S8, ), thiol proteinases (MEROPS peptidase family C1, ) and aspartic proteinases (MEROPS peptidase family A1, ) PUBMED:14705960. \

    \

    Inhibitors from cereals are active against subtilisin and endogenous alpha-amylases, while some also inhibit tissue plasminogen activator. The inhibitors are usually specific for either trypsin or chymotrypsin, and some are effective against both. They are thought to protect the seeds against consumption by animal predators, while at the same time existing as seed storage proteins themselves - all the actively inhibitory members contain 2 disulphide bridges. The existence of a member with no inhibitory activity, winged bean albumin 1, suggests that the inhibitors may have evolved from seed storage proteins.

    \

    Proteins from the Kunitz family contain from 170 to 200 amino acid residues and one or two intra-chain disulphide bonds. The best conserved region is found in their N-terminal section. The crystal structures of soybean trypsin inhibitor (STI), trypsin inhibitor DE-3 from the Kaffir tree Erythrina caffra (ETI) PUBMED:1988676 and the bifunctional proteinase K/alpha-amylase inhibitor from wheat (PK13) have been solved, showing them to share the same 12-stranded beta-sheet structure as those of interleukin-1 and heparin-binding growth factors PUBMED:1738162. The beta-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel beta-barrel. Despite the structural similarity, STI shows no interleukin-1 bioactivity, presumably as a result of their primary sequence disparities. The active inhibitory site containing the scissile bond is located in the loop between beta-strands 4 and 5 in STI and ETI.

    \ \ \

    The STIs belong to a superfamily that also contains the interleukin-1 \ proteins, heparin binding growth factors (HBGF) and histactophilin, all of \ which have very similar structures, but share no sequence similarity with \ the STI family.

    \ ' '3142' 'IPR003472' '\

    The four families of large eukaryotic DNA viruses, Poxviridae, Asfarviridae, Iridoviridae, and Phycodnaviridae, referred to collectively as nucleocytoplasmic large DNA viruses or NCLDV, have all been shown to have a lipid membrane, in spite of the major differences in virion structure. The paralogous genes L1R and F9L encode membrane proteins that have a conserved domain architecture, with a single, C-terminal transmembrane helix, and an N-terminal, multiple-disulphide-bonded domain. The conservation of the myristoylated, disulphide-bonded protein L1R/F9L in most of the NCLDV correlates with the conservation of the thiol-disulphide oxidoreductase E10R which, in vaccinia virus, is required for the formation of disulphide bonds in L1R and F9L PUBMED:11689653.

    \ ' '3143' 'IPR007741' '\ Proteins containing this domain are located in the mitochondrion and include ribosomal protein L51, and S25. This domain is also found in mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8) . It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins.\ ' '3144' 'IPR007682' '\ Lantibiotics are antibiotic peptides distinguished by the presence of the rare thioether amino acids lanthionine and/or methyllanthionine. They are produced by Gram-positive bacteria as gene-encoded precursor peptides and undergo post-translational modification to generate the mature peptide. Based on their structural and functional features lantibiotics are currently divided into two major groups: the flexible amphiphilic type-A and the rather rigid and globular type-B. Type-A lantibiotics act primarily by pore formation in the bacterial membrane by a mechanism involving the interaction with specific docking molecules such as the membrane precursor lipid II PUBMED:7601145.\ ' '3145' 'IPR003500' '\

    This entry represents the sugar isomerase enzymes ribose 5-phosphate isomerase B (rpiB), galactose isomerase subunit A (LacA) and galactose isomerase subunit B (LacB).

    \

    Galactose-6-phosphate isomerase () is a heteromultimeric protein consisting of subunits LacA and LacB, and catalyses the conversion of D-galactose 6-phosphate to D-tagatose 6-phosphate in the tagatose 6-phosphate pathway of lactose catabolism PUBMED:1400164. Galactose-6-phosphate isomerase is induced by galactose or lactose. This entry represents the LacB subunit.

    \

    Ribose 5-phosphate isomerase () forms a homodimer and catalyses the interconversion of D-ribose 5-phosphate and D-ribulose 5-phosphate in the non-oxidative branch of the pentose phosphate pathway. This reaction permits the synthesis of ribose from other sugars, as well as the recycling of sugars from nucleotide breakdown. Two unrelated enzymes can catalyse this reaction: RpiA (found in most organisms) and RpiB (found in some bacteria and eukaryotes). RpiB is also involved in metabolism of the rare sugar, allose, in addition to ribose sugars. The structures of RpiA and RpiB are distinct, RpiB having a Rossmann-type alpha/beta/alpha sandwich topology PUBMED:14499611.

    \ ' '3146' 'IPR000843' '\ Numerous bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. \ These proteins are very diverse, but for convenience may be grouped into subfamilies on the basis \ of sequence similarity. One such family groups together a range of proteins, including ascG, ccpA, \ cytR, ebgR, fruR, galR, galS, lacI, malI, opnR, purF, rafR, rbtR and scrR PUBMED:1639817, PUBMED:1805309. \ Within this family, the HTH motif is situated towards the N-terminus.\ ' '3147' 'IPR003386' '\ Lecithin:cholesterol acyltransferase (LACT) also known as phosphatidylcholine-sterol acyltransferase (), is involved in extracellular metabolism of plasma lipoproteins, including cholesterol. It esterifies the free cholesterol transported in plasma lipoproteins, and is activated by apolipoprotein A-I. Defects in LACT cause Norum and Fish eye diseases.\ ' '3148' 'IPR007464' '\ This is a family of bacteriocins from lactic acid bacteria.\ ' '3149' 'IPR000576' '\ In bacteria there are a number of families of transport proteins, including symporters and antiporters, that\ mediate the intake of a variety of sugars with the concomitant uptake of hydrogen ions (proton symporters)\ PUBMED:8438231. The lacY family of Escherichia coli and Klebsiella pneumoniae are proton/beta-galactoside symporters,\ which, like most sugar transporters, are integral membrane proteins with 12 predicted transmembrane (TM) regions.\ Also similar to the lacY family are the raffinose (rafB) and sucrose (cscB) permeases from E. coli PUBMED:1435727.\ ' '3150' 'IPR001982' '\ The LAGLIDADG and HNH domains of site-specific DNA endonucleases encoded by viruses, bacteriophages as well as archaeal, eukaryotic nuclear and organellar genomes are characterised by the sequence motifs \'LAGLIDADG\' and \'HNH\', respectively PUBMED:9187655, PUBMED:9254693. 
Phylogenetic analysis of the two domains indicates a lack of exchange of endonucleases between different mobile elements (environments) and between hosts from different phylogenetic kingdoms. However, there does appear to have been considerable exchange of endonuclease domains amongst elements of the same type. Such events are suggested to be important for the formation of elements of new specificity PUBMED:9358175.\

    \'Homing\' is the lateral transfer of an intervening genetic sequence, either an intron or an intein, to a cognate allele that lacks that element. The end result of homing is the duplication of the intervening sequence. The process is initiated by site-specific endonucleases that are encoded by open reading frames within the mobile elements. These endonucleases may be contrasted with a variety of enzymes involved in nucleic acid strand breakage and rearrangement, particularly restriction endonucleases. They are encoded within\ the intervening sequence and there are interesting limitations on the position and length of their open reading frames, and therefore on their structures. These enzymes display a unique strategy of flexible recognition of very long DNA target sites. This strategy allows these sequences to minimize nonspecific cleavage within the host genome, while maximizing the ability of the endonuclease to cleave closely related variants of the homing site PUBMED:10487208.

    \ ' '3151' 'IPR003192' '\ Maltoporin (LamB protein) forms a trimeric structure which facilitates the diffusion of maltodextrins across the outer membrane of Gram-negative bacteria. The membrane channel is formed by an antiparallel beta-barrel PUBMED:7824948.\ ' '3152' 'IPR005501' '\ This family includes LamB. The lam locus of Emericella nidulans (Aspergillus nidulans) consists of two divergently transcribed genes, lamA and lamB, involved in the utilization of lactams such as 2-pyrrolidinone. Both genes are under the control of the positive regulatory gene amdR and are subject to carbon and nitrogen metabolite repression PUBMED:1729609. The exact molecular function of the proteins in this family is unknown.\ ' '3153' 'IPR013056' '\ Bacteriophage lambda regulatory protein CIII is a small protein that plays a role in stabilising the CII transcriptional activator, via a mechanism that is not yet fully understood PUBMED:1828895, PUBMED:2957696. Stabilised CII activates CI, the gene for the repressor protein that prevents transcription of proteins required for lytic development. The central portion of the protein is well conserved and is both necessary and sufficient for the activity of the protein PUBMED:1828895. Comparative analysis of the CIII sequence in lambda, Bacteriophage HK022 and the lambdoid Enterobacteria phage P22 has led to the suggestion that this central region assumes an amphipathic alpha-helical structure PUBMED:1828895.\ ' '3154' 'IPR000034' '\

    Laminins represent a distinct family of extracellular matrix proteins present only in basement membranes in almost every animal tissue. They are heterotrimeric molecules composed of alpha, beta and gamma subunits (formerly A, B1, and B2, respectively PUBMED:7921537) and form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains, PUBMED:2404817, PUBMED:7827749. Most of the globular domains of the short arms correspond to one of two different motifs, the 200-residue laminin N-terminal (domain VI) (LN) module and the 250-residue laminin domain IV (L4) module PUBMED:8615779. All alpha chains share a unique C-terminal G domain which consists of five laminin G modules. The laminins can self-assemble, bind to other matrix macromolecules, and have unique and shared cell interactions mediated by integrins, dystroglycan, and other receptors. There are at least 14 laminin isoforms that regulate a variety of cellular functions including cell adhesion, migration, proliferation, signalling and differentiation PUBMED:9758133, PUBMED:7827749, PUBMED:11054872.

    \ \

    The laminin B domain (also known as domain IV) is an extracellular module of unknown function. It is found in a number of different proteins, including heparan sulphate proteoglycan from basement membrane, a laminin-like protein from Caenorhabditis elegans and laminin. The laminin IV domain is not found in short laminin chains (alpha4 or beta3).

    \ ' '3155' 'IPR002049' '\ Laminins PUBMED:2404817 are the major noncollagenous components of basement membranes\ that mediate cell adhesion, growth, migration, and differentiation. They are\ composed of distinct but related alpha, beta and gamma chains. The three\ chains form a cross-shaped molecule that consists of a long arm and three short\ globular arms. The long arm consists of a coiled coil structure contributed by\ all three chains and cross-linked by interchain disulphide bonds.\ Besides different types of globular domains each subunit contains, in its first\ half, consecutive repeats of about 60 amino acids in length that include eight\ conserved cysteines PUBMED:2666164. The tertiary structure PUBMED:8648630, PUBMED:8648631 of this domain is\ remotely similar in its N-terminal to that of the EGF-like module (see ). It is known as a \'LE\' or \'laminin-type EGF-like\' domain. The\ number of copies of the LE domain in the different forms of laminins is highly\ variable; from 3 up to 22 copies have been found.\ A schematic representation of the topology of the four disulphide bonds in\ the LE domain is shown below.\ \
    \
             +-------------------+\
           +-|-----------+       |  +--------+  +-----------------+\
           | |           |       |  |        |  |                 |\
         xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx\
           sssssssssssssssssssssssssssssssssss\
    \
    \'C\': conserved cysteine involved in a disulphide bond\
    \'a\': conserved aromatic residue\
    \'G\': conserved glycine (lower case = less conserved)\
    \'s\': region similar to the EGF-like domain\
    
    \ In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the\ only one that binds with a high affinity to nidogen PUBMED:7781764. The binding-sites are\ located on the surface within the loops C1-C3 and C5-C6 PUBMED:8648630, PUBMED:8648631. Long\ consecutive arrays of LE domains in laminins form rod-like elements of limited\ flexibility PUBMED:2404817, which determine the spacing in the formation of laminin\ networks of basement membranes PUBMED:8349613.\ ' '3156' 'IPR012679' '\

    Laminins are large heterotrimeric glycoproteins involved in basement membrane function PUBMED:15037599. The laminin globular (G) domain can be found in one to several copies in various laminin family members, which includes a large number of extracellular proteins. The C-terminus of laminin alpha chain contains a tandem repeat of five laminin G domains, which are critical for heparin-binding and cell attachment activity PUBMED:10747011. Laminin alpha4 is distributed in a variety of tissues including peripheral nerves, dorsal root ganglion, skeletal muscle and capillaries; in the neuromuscular junction, it is required for synaptic specialisation PUBMED:15823034. The structure of the laminin-G domain has been predicted to resemble that of pentraxin PUBMED:9480764.

    \ \

    Laminin G domains can vary in their function, and a variety of binding functions have been ascribed to different LamG modules. For example, the laminin alpha1 and alpha2 chains each have five C-terminal laminin G domains, where only domains LG4 and LG5 contain binding sites for heparin, sulphatides and the cell surface receptor dystroglycan PUBMED:10747011. Laminin G-containing proteins appear to have a wide variety of roles in cell adhesion, signalling, migration, assembly and differentiation. This entry represents one subtype of laminin G domains, which is sometimes found in association with thrombospondin-type laminin G domains ().

    \ ' '3157' 'IPR008211' '\

    Laminin is a large molecular weight glycoprotein present only in basement membranes in almost every animal tissue. Laminin is thought to mediate the attachment, migration and organisation of cells into tissues during embryonic development by interacting with other extracellular matrix components PUBMED:1975589. Each laminin is a heterotrimer assembled from alpha, beta and gamma chain subunits, secreted and incorporated into cell-associated extracellular matrices PUBMED:10842354.

    \

    \ Basement membrane assembly is a cooperative process in which laminins polymerise through their N-terminal domain (LN or domain VI) and anchor to the cell surface through their G domains. Netrins may also associate with this network through heterotypic LN domain interactions PUBMED:8349613. This leads to cell signalling through integrins and dystroglycan (and possibly other receptors) recruited to the adherent laminin. This LN domain dependent self-assembly is considered to be crucial for the integrity of basement membranes, as highlighted by genetic forms of muscular dystrophy containing the deletion of the LN module from the alpha 2 laminin chain PUBMED:7874173. The laminin N-terminal domain is found in all laminin and netrin subunits except laminin alpha 3A, alpha 4 and gamma 2.

    \ ' '3158' 'IPR002000' '\

    Lysosome-associated membrane glycoproteins (lamp) PUBMED:1939168 are integral membrane proteins, specific to lysosomes, and whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. This structure is schematically represented in the figure below.

    \
    \
       +-----+            +-----+         +-----+            +-----+\
       |     |            |     |         |     |            |     |\
      xCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxxxCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxx\
      +--------------------------++Hinge++--------------------------++TM++C+\
    
    \

    In mammals, there are two closely related types of lamp: lamp-1 and lamp-2. In chicken lamp-1 is known as LEP100.

    \ \

    CD68 (also called gp110 or macrosialin) PUBMED:8486654 is a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge, a single lamp-like domain, a transmembrane region and a short cytoplasmic tail.

    \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).\

    \ \ ' '3159' 'IPR007822' '\

    The LanC-like protein superfamily encompasses a highly divergent group of peptide-modifying enzymes, including the eukaryotic and bacterial lanthionine synthetase C-like proteins (LanC) PUBMED:11474189, PUBMED:10944443, PUBMED:12566319; subtilin biosynthesis protein SpaC from Bacillus subtilis PUBMED:1735728, PUBMED:1539969; epidermin biosynthesis protein EpiC from Staphylococcus epidermidis PUBMED:1740156; nisin biosynthesis protein NisC from Lactococcus lactis PUBMED:8161176, PUBMED:7689965, PUBMED:1482192; GCR2 from Arabidopsis thaliana PUBMED:17347412; and many others.

    The 3D structure of the lantibiotic cyclase from L. lactis has been determined by X-ray crystallography to 2.5A resolution PUBMED:16527981. The globular structure is characterised by an all-alpha fold, in which an outer ring of helices envelops an inner toroid composed of 7 shorter, hydrophobic helices. This 7-fold hydrophobic periodicity has led several authors to claim various members of the family, including eukaryotic LanC-1 and GCR2, to be novel G protein-coupled receptors PUBMED:17347412, PUBMED:9512664; some of these claims have since been corrected PUBMED:10944443, PUBMED:18086512, PUBMED:17894782.

    \ ' '3160' 'IPR006827' '\ Lantibiotics are antimicrobial agents derived from ribosomally synthesised peptides PUBMED:1539969. They are produced by bacteria of the Firmicutes phylum, and include mutacin, subtilin, and nisin. Lantibiotic peptides contain thioether bridges termed lanthionines that are thought to be generated by dehydration of serine and threonine residues followed by addition of cysteine residues PUBMED:12127987. This family constitutes the C-terminus of the enzyme proposed to catalyse the dehydration step PUBMED:12127987, PUBMED:10215865.\ ' '3161' 'IPR006826' '\ Lantibiotics are antimicrobial agents derived from ribosomally synthesised peptides PUBMED:1539969. They are produced by bacteria of the Firmicutes phylum, and include mutacin, subtilin, and nisin. Lantibiotic peptides contain thioether bridges termed lanthionines that are thought to be generated by dehydration of serine and threonine residues followed by addition of cysteine residues PUBMED:12127987. This family constitutes the N-terminus of the enzyme proposed to catalyse the dehydration step PUBMED:12127987, PUBMED:10215865.\ ' '3162' 'IPR002210' '\

    This entry represents the major late capsid protein L1 from Papillomaviruses, such as Human papillomavirus (HPV) PUBMED:10882140. Papillomaviruses are members of the papovavirus superfamily. More than 70 different types of papillomavirus have been discovered in humans, some of which have been shown to cause genital carcinomas and cutaneous warts PUBMED:17446671. The viruses contain a circular dsDNA genome surrounded by an icosahedral capsid (non-enveloped). Two proteins are involved in capsid formation: a major (L1) and a minor (L2) protein, in the approximate proportion 95:5%. L1 forms a pentameric assembly unit of the viral shell in a manner that closely resembles VP1 from polyomaviruses. Intermolecular disulphide bonding holds the L1 capsid proteins together PUBMED:7561785. L1 capsid proteins can bind via their nuclear localisation signal (NLS) to karyopherins Kapbeta(2) and Kapbeta(3) and inhibit the Kapbeta(2) and Kapbeta(3) nuclear import pathways during the productive phase of the viral life cycle PUBMED:12620808. Surface loops on L1 pentamers contain sites of sequence variation between HPV types.

    \ ' '3163' 'IPR000784' '\

    This family includes the L2 minor capsid protein, a late protein from Human papillomavirus (HPV). HPV are dsDNA viruses with no RNA stage in their replication cycle. Their dsDNA is contained within a capsid composed of 72 L1 capsomers and about 36 L2 minor capsid proteins. L2 minor capsid proteins enter the nucleus twice during infection: in the initial phase after virion disassembly, and in the productive phase when they assemble into replicated virions along with L1 major capsid proteins. L2 proteins contain two nuclear localisation signals (NLSs), one at the N-terminal (nNLS) and the other at the C-terminal (cNLS). L2 uses its NLSs to interact with a network of karyopherins in order to enter the nucleus via several import pathways. L2 from HPV types 11 and 16 was shown to interact with karyopherins Kapbeta(2) and Kapbeta(3) PUBMED:16873281, PUBMED:15507604. L2 capsid proteins can also interact with viral dsDNA, facilitating its release from the endocytic compartment after viral uncoating.

    \ ' '3164' 'IPR003334' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The secretin-like GPCRs include secretin PUBMED:1646711, calcitonin PUBMED:1658940, parathyroid hormone/parathyroid hormone-related peptides PUBMED:1658941 and vasoactive intestinal peptide PUBMED:1314625, all of which activate adenylyl cyclase and the phosphatidyl-inositol-calcium pathway. These receptors contain seven transmembrane regions, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins (however, there is no significant sequence identity between these families; the secretin-like receptors thus bear their own unique \'7TM\' signature). Their N-terminus is probably located on the extracellular side of the membrane and potentially glycosylated. This N-terminal region contains a long conserved region which allows the binding of large peptidic ligands such as glucagon, secretin, VIP and PACAP; this region contains five conserved cysteine residues which could be involved in disulphide bonds. The C-terminal region of these receptors is probably cytoplasmic. Every receptor gene in this family is encoded on multiple exons, and several of these genes are alternatively spliced to yield functionally distinct products.

    \

    Latrophilins are a family of secretin-like GPCRs that can be subdivided\ into 3 subtypes: LPH1, LPH2 and LPH3. LPH1 is a brain-specific calcium\ independent receptor of alpha-latrotoxin (LTX), a neurotoxin. It is the\ affinity of this form of the receptor for LTX that gives the family its name. LPH2 and LPH3, whilst sharing extensive sequence similarity to LPH1, do not bind LTX. LPH2 is distributed throughout most tissues, whereas LPH3 is also brain-specific PUBMED:10025961. The endogenous ligand(s) for these receptors are at present unknown. Binding of LTX to LPH1 stimulates exocytosis and the\ subsequent release of large amounts of neurotransmitters from neuronal and\ endocrine cells. The latrophilins possess up to 7 sites of alternative splicing; the resulting number of possible splice variants leads to a highly variable family of proteins.

    \

    This entry represents the C-terminal region of latrophilin.

    \ ' '3165' 'IPR017942' '\

    This entry represents the N-terminal domain found in several lipid-binding serum glycoproteins. The N- and C-terminal domains share a similar two-layer alpha/beta structure, but they show little sequence identity. Proteins containing this N-terminal domain include:

    \ \

    \ \

    Bactericidal permeability-increasing protein (BPI) is a potent antimicrobial protein of 456 residues that binds to and neutralises lipopolysaccharides from the outer membrane of Gram-negative bacteria PUBMED:9188532. BPI contains two domains that adopt the same structural fold, even though they have little sequence similarity PUBMED:10843855.

    \ \

    Lipopolysaccharide-binding protein (LBP) is an endotoxin-binding protein that is closely related to, and functions in a co-ordinated manner with BPI to facilitate an integrated host response to invading Gram-negative bacteria PUBMED:12887306.

    \ \

    Cholesteryl ester transfer protein (CETP) is a glycoprotein that facilitates the transfer of lipids (cholesteryl esters and triglycerides) between the different lipoproteins that transport them through plasma, including HDL, LDL, VLDL and chylomicrons. These lipoproteins shield the lipids from water by encapsulating them within a coating of polar lipids and proteins PUBMED:17277799.

    \ \

    Phospholipid transfer protein (PLTP) exchanges phospholipids between lipoproteins and remodels high-density lipoproteins (HDLs) PUBMED:12693940.

    \ \

    Palate, lung and nasal epithelium carcinoma-associated protein (PLUNC) is a potential host defensive protein that is secreted from the submucosal gland to the saliva and nasal lavage fluid. PLUNC appears to be a secreted product of neutrophil granules that participates in an aspect of the inflammatory response that contributes to host defence PUBMED:18245229. Short palate, lung and nasal epithelium clone 1 (SPLUNC1) may bind the lipopolysaccharide of Gram-negative nanobacteria, thereby playing an important role in the host defence of nasopharyngeal epithelium PUBMED:16364440.

    \ ' '3166' 'IPR001124' '\

    This entry represents the C-terminal domain found in several lipid-binding serum glycoproteins. The N- and C-terminal domains share a similar two-layer alpha/beta structure, but they show little sequence identity. Proteins containing this C-terminal domain include:

    \ \

    \ \

    Bactericidal permeability-increasing protein (BPI) is a potent antimicrobial protein of 456 residues that binds to and neutralises lipopolysaccharides from the outer membrane of Gram-negative bacteria PUBMED:9188532. BPI contains two domains that adopt the same structural fold, even though they have little sequence similarity PUBMED:10843855.

    \ \

    Lipopolysaccharide-binding protein (LBP) is an endotoxin-binding protein that is closely related to, and functions in a co-ordinated manner with BPI to facilitate an integrated host response to invading Gram-negative bacteria PUBMED:12887306.

    \ \

    Cholesteryl ester transfer protein (CETP) is a glycoprotein that facilitates the transfer of lipids (cholesteryl esters and triglycerides) between the different lipoproteins that transport them through plasma, including HDL, LDL, VLDL and chylomicrons. These lipoproteins shield the lipids from water by encapsulating them within a coating of polar lipids and proteins PUBMED:17277799.

    \ \

    Phospholipid transfer protein (PLTP) exchanges phospholipids between lipoproteins and remodels high-density lipoproteins (HDLs) PUBMED:12693940.

    \ \

    Palate, lung and nasal epithelium carcinoma-associated protein (PLUNC) is a potential host defensive protein that is secreted from the submucosal gland to the saliva and nasal lavage fluid. PLUNC appears to be a secreted product of neutrophil granules that participates in an aspect of the inflammatory response that contributes to host defence PUBMED:18245229. Short palate, lung and nasal epithelium clone 1 (SPLUNC1) may bind the lipopolysaccharide of Gram-negative nanobacteria, thereby playing an important role in the host defence of nasopharyngeal epithelium PUBMED:16364440.

    \ ' '3167' 'IPR004043' '\

    The LCCL domain has been named after the best characterised proteins that were found to contain it, namely Limulus factor C, Coch-5b2 and Lgl1. It is a domain of about 100 amino acids whose C-terminal part contains a highly conserved histidine in a conserved motif YxxxSxxCxAAVHxGVI. The LCCL module is thought to be an autonomously folding domain that has been used for the construction of various modular proteins through exon-shuffling. It has been found in various metazoan proteins in association with complement B-type domains, C-type lectin domains, von Willebrand type A domains, CUB domains, discoidin lectin domains or CAP domains. It has been proposed that the LCCL domain could be involved in lipopolysaccharide (LPS) binding PUBMED:10971586, PUBMED:9806553. Secondary structure prediction suggests that the LCCL domain contains six beta strands and two alpha helices PUBMED:10971586.

    \

    Some proteins known to contain an LCCL domain include Limulus factor C, an LPS endotoxin-sensitive trypsin type serine protease which serves to protect the organism from bacterial infection; vertebrate cochlear protein cochlin or coch-5b2 (cochlin is probably a secreted protein; mutations affecting the LCCL domain of coch-5b2 cause the deafness disorder DFNA9 in humans); and mammalian late gestation lung protein Lgl1, which contains two tandem copies of the LCCL domain PUBMED:10362728.

    \ ' '3168' 'IPR007213' '\

    This entry represents a group of leucine carboxymethyltransferases which methylate the carboxyl group of leucine residues to form alpha-leucine ester residues. It includes LCTM1 which regulates the activity of serine/threonine phosphatase 2A (PP2A) through methylation of the C-terminal leucine residue of the catalytic subunit of PP2A PUBMED:10600115, PUBMED:11697862, PUBMED:11060018. This affects the heteromultimeric composition of PP2A which in turn affects protein recognition and substrate specificity. Like many other methyltransferases LCTM1 uses S-adenosylmethionine (SAM) as the methyl donor. LCTM1 contains the common SAM-dependent methyltransferase core fold, with various insertions and additions creating a specific PP2A binding site PUBMED:14660564. This entry also contains LCTM2, a homologue of LCTM1 which is not necessary for PP2A methylation and whose function is not clear.

    \ ' '3169' 'IPR005413' '\

    The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell PUBMED:9618447 and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Yersinia spp. secrete effector proteins called YopB and YopD that facilitate the spread of other translocated proteins through the type III needle and the host cell cytoplasm PUBMED:9440524. In turn, the transcription of these moieties is thought to be regulated by another gene, lcrV, found on the Yops virulon that encodes the entire type III system PUBMED:9495760. The product of this gene, LcrV protein, also regulates the secretion of YopD through the type III translocon PUBMED:11443094, and itself acts as a protective "V" antigen for Yersinia pestis, the causative agent of plague PUBMED:11489861.

    \ \

    Recently, a homologue of the Y. pestis LcrV protein (PcrV) was found in Pseudomonas aeruginosa, an opportunistic pathogen. In vivo studies using mice found that immunisation with the protein protected burned animals from infection by P. aeruginosa, and enhanced survival. In addition, it is speculated that PcrV determines the size of the needle pore for type III secreted effectors PUBMED:11500471.

    \ ' '3170' 'IPR001236' '\

    L-lactate dehydrogenases are metabolic enzymes which catalyse the conversion of \ L-lactate to pyruvate, the last step in anaerobic glycolysis PUBMED:11276087. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyse the interconversion of malate to oxaloacetate PUBMED:8117664. The enzyme participates in the citric acid cycle.

    \ ' '3171' 'IPR003767' '\

    The malate dehydrogenase (MDH) of some extremophiles is more similar to the L-lactate dehydrogenases (L-LDH) from various sources than to other MDHs PUBMED:8476859.

    \ \

    This family consists of bacterial and archaeal malate/L-lactate dehydrogenases. The archaebacterial malate dehydrogenase deviates from the eubacterial and eukaryotic enzymes in having a low selectivity for the coenzyme (NAD(H) or NADP(H)) and in catalyzing the reduction of oxalacetate to malate more efficiently than the reverse reaction PUBMED:2110059.

    \ ' '3172' 'IPR001236' '\

    L-lactate dehydrogenases are metabolic enzymes which catalyse the conversion of \ L-lactate to pyruvate, the last step in anaerobic glycolysis PUBMED:11276087. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyse the interconversion of malate to oxaloacetate PUBMED:8117664. The enzyme participates in the citric acid cycle.

    \ ' '3173' 'IPR000033' '\

    The low-density lipoprotein receptor (LDLR) regulates cholesterol homeostasis in mammalian cells. LDLR binds cholesterol-carrying LDL, associates with clathrin-coated pits, and is internalized into acidic endosomes where it separates from its ligand. The ligand is degraded in lysosomes, while the receptor returns to the cell surface PUBMED:3513311. The LDLR has several domains. The ligand-binding domain contains seven LDL receptor class A repeats, each with three disulphide bonds and a coordinated Ca2+ ion. The second conserved region contains two EGF repeats, followed by six YWTD or LDL receptor class B repeats and another EGF repeat PUBMED:9790844. This conserved region is critical for ligand release and recycling of the receptor PUBMED:3494949.

    \

    The structure of the six YWTD repeats of the LDL receptor has been solved PUBMED:11373616. The six YWTD repeats together fold into a six-bladed beta-propeller. Each blade of the propeller consists of four antiparallel beta-strands; the innermost strand of each blade is labeled 1 and the outermost strand, 4. The sequence repeats are offset with respect to the blades of the propeller, such that any given 40-residue YWTD repeat spans strands 2-4 of one propeller blade and strand 1 of the subsequent blade. This offset ensures circularization of the propeller because the last strand of the final sequence repeat acts as an innermost strand 1 of the blade that harbors strands 2-4 from the first sequence repeat. The repeat is found in a variety of proteins that include vitellogenin receptor from Drosophila melanogaster, low-density lipoprotein (LDL) receptor PUBMED:6091915, preproepidermal growth factor, and nidogen (entactin).
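    The offset between sequence repeats and propeller blades described above can be made concrete with a short sketch. It assumes the reading that each repeat supplies strands 2 to 4 of one blade and strand 1 of the next; the 0-based indices and the names N_BLADES and strands_from_repeat are hypothetical, chosen only for illustration.

```python
# Illustrative bookkeeping only: which (blade, strand) positions each
# YWTD sequence repeat contributes in a six-bladed beta-propeller.
# Repeats and blades are numbered from 0 here (an assumption of this sketch).
N_BLADES = 6


def strands_from_repeat(repeat_index):
    """Return the (blade, strand) pairs supplied by one sequence repeat."""
    # Strands 2, 3 and 4 belong to the blade with the same index as the repeat.
    own_blade = [(repeat_index, strand) for strand in (2, 3, 4)]
    # The last strand of the repeat is strand 1 of the following blade;
    # the modulo wraps the final repeat back onto the first blade.
    next_blade = [((repeat_index + 1) % N_BLADES, 1)]
    return own_blade + next_blade


for repeat in range(N_BLADES):
    print(repeat, strands_from_repeat(repeat))
```

    In this sketch the final repeat (index 5) ends on strand 1 of blade 0, which is the wrap-around that closes the propeller.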

    \ ' '3174' 'IPR005513' '\

    LEA proteins are late embryonic proteins abundant in higher plant seed embryos. They may play an essential role in seed survival and control of water exchanges during seed desiccation and imbibition. Family members are conserved along the entire coding region, especially within the hydrophobic internal 20 amino acid motif. This motif may be repeated.

    \ ' '3175' 'IPR004864' '\

    Different types of LEA proteins are expressed at different stages of late embryogenesis in higher plant seed embryos and under conditions of dehydration stress. The function of these proteins is unknown. This family represents a group of LEA proteins that appear to be distinct from those in .

    \ \ ' '3176' 'IPR004926' '\

    Members of this family are similar to late embryogenesis abundant proteins. Members of the family have been isolated in a number of\ different screens. However, the molecular function of these proteins remains obscure.

    \ \ ' '3177' 'IPR001304' '\

    Lectins occur in plants, animals, bacteria and viruses. Initially described for their carbohydrate-binding activity PUBMED:14533786, they are now recognised as a more diverse group of proteins, some of which are involved in protein-protein, protein-lipid or protein-nucleic acid interactions PUBMED:12223269. There are at least twelve structural families of lectins:

    \

    \

    C-type lectins can be further divided into seven subgroups based on additional non-lectin domains and gene structure: (I) hyalectans, (II) asialoglycoprotein receptors, (III) collectins, (IV) selectins, (V) NK group transmembrane receptors, (VI) macrophage mannose receptors, and (VII) simple (single domain) lectins PUBMED:15476922.

    \

    Therefore, lectins are a diverse group of proteins, both in terms of structure and activity. Carbohydrate binding ability may have evolved independently and sporadically in numerous unrelated families, where each evolved a structure that was conserved to fulfil some other activity and function. In general, animal lectins act as recognition molecules within the immune system, their functions involving defence against pathogens, cell trafficking, immune regulation and the prevention of autoimmunity PUBMED:14519388.

    \ ' '3178' 'IPR004283' '\

    The late expression factor 2 (lef-2) protein from Orgyia pseudotsugata multicapsid polyhedrosis virus (OpMNPV) is required for expression of late genes. The lef-2 protein has been shown to be specifically required for expression from the vp39 and polh promoters PUBMED:8445724.

    \ ' '3179' 'IPR005052' '\

    Lectins are structurally diverse proteins that bind to specific carbohydrates. This family includes the VIP36 and ERGIC-53 lectins. These two proteins were the first members of the family of animal lectins similar to the leguminous plant lectins PUBMED:8205612. The alignment for this family is towards the N-terminus, where the similarity of VIP36 and ERGIC-53 is greatest. Although they have been identified as a family of animal lectins, this alignment also includes yeast sequences PUBMED:8205612.

    \

    ERGIC-53 is a 53kDa protein, localised to the intermediate region between the endoplasmic reticulum and the Golgi apparatus (ER-Golgi-Intermediate Compartment, ERGIC). It was identified as a calcium-dependent, mannose-specific lectin PUBMED:8868475. Its dysfunction has been associated with combined factors V and VIII deficiency, suggesting an important and substrate-specific role for ERGIC-53 in the glycoprotein-secreting pathway PUBMED:8868475,PUBMED:10090935.

    \

    The L-type lectin-like domain has an overall globular shape composed of a beta-sandwich of two major twisted antiparallel beta-sheets. The beta-sandwich comprises a major concave beta-sheet and a minor convex beta-sheet, in a variation of the jelly roll fold PUBMED:11850423, PUBMED:14643651, PUBMED:16439369, PUBMED:17652092.

    \ ' '3180' 'IPR001220' '\ Legume lectins are one of the largest lectin families with more than 70 lectins\ reported. Leguminous plant lectins resemble each other in their physicochemical properties although they differ in their carbohydrate specificities. They consist of two or four subunits with relative molecular mass of 30 kDa and each subunit has one carbohydrate-binding site. The interaction with sugars requires tightly bound calcium and manganese ions. The structural similarities of these lectins are reported by the primary structural analyses and X-ray crystallographic studies. X-ray studies have shown that the folding of the polypeptide chains in the region of the carbohydrate-binding sites is also similar, despite differences in the primary sequences. The carbohydrate-binding sites of these lectins consist of two conserved amino acids on beta pleated sheets. One of these loops contains transition metals, calcium and manganese,\ which keep the amino acid residues of the sugar-binding site at the required\ positions. Amino acid sequences of this loop play an important role in the\ carbohydrate-binding specificities of these lectins. These lectins bind either glucose/mannose or galactose.\

    The exact function of legume lectins \ is not known but they may be involved in the attachment of nitrogen-fixing bacteria to legumes and \ in the protection against pathogens.

    \

    Some legume lectins are proteolytically processed to produce two chains, beta (which corresponds to \ the N-terminal) and alpha (C-terminal) (). The lectin concanavalin A (conA) from jack bean is exceptional \ in that the two chains are transposed and ligated (by formation of a new peptide bond). The N-terminus \ of mature conA thus corresponds to that of the alpha chain and the C-terminus to the beta chain.

    \ ' '3181' 'IPR005640' '\

    Animal lectins display a wide variety of architectures.\ They are classified according to the carbohydrate-recognition\ domain (CRD) of which there are two main types, S-type and C-type.

    \

    C-type lectins display a wide range of specificities.\ They require Ca2+ for their activity.\ They are found predominantly but not exclusively in vertebrates.

    \

    This entry represents the N-terminal domain found in C-type lectins.

    \ ' '3182' 'IPR007790' '\

    The baculovirus Autographa californica nuclear polyhedrosis virus (AcMNPV) encodes a DNA-dependent RNA polymerase that is required for\ transcription of viral late genes. This polymerase is composed of four equimolar subunits, LEF-8, LEF-4, LEF-9, and p47. LEF-4 carries out all the enzymatic functions related to mRNA capping PUBMED:12124466.

    \ ' '3183' 'IPR007025' '\ Late expression factor 8 (LEF-8) is one of the primary components of RNA polymerase produced by polyhedrosis viruses. LEF-8 shows homology to the second largest subunit of prokaryotic DNA-directed RNA polymerase PUBMED:12124466.\ ' '3184' 'IPR007786' '\ The baculovirus Autographa californica nuclear polyhedrosis virus (AcMNPV) encodes a DNA-dependent RNA polymerase that is required for transcription of viral late genes. This polymerase is composed of four equimolar subunits, LEF-8, LEF-4, LEF-9, and p47. LEF-9 is homologous to the largest beta-subunit of prokaryotic DNA-directed RNA polymerase PUBMED:12124466.\ ' '3185' 'IPR007825' '\ This family consists of major outer membrane protein precursors from Legionella pneumophila.\ ' '3186' 'IPR003887' '\

    The LEM domain is found in nuclear membrane-associated proteins, including lamina-associated polypeptide 2 and emerin PUBMED:11792821. Defects in the emerin gene are a cause of Emery-Dreifuss muscular dystrophy, an X-linked disorder characterised by early contractures, muscle wasting, weakness and cardiomyopathy.

    \ \ ' '3187' 'IPR007156' '\ The members of this family are related to the LemA protein . LemA contains an N-terminal predicted transmembrane helix. It has been predicted that the small N terminus is extracellular PUBMED:8758895. The exact molecular function of this protein is uncertain.\ ' '3188' 'IPR004247' '\ This family contains retroviral transactivating (Tat) proteins, from a variety of lentiviruses. The Tat protein may have a role in trans-activation of the viral long terminal repeat PUBMED:2536163.\ ' '3189' 'IPR000065' '\ Leptin, a metabolic monitor of food intake and energy need, is expressed\ by the ob obesity gene. The protein may function as part of a signalling\ pathway from adipose tissue that acts to regulate the size of the body\ fat depot PUBMED:7984236, the hormone effectively turning the brain\'s appetite\ message off when it senses that the body is satiated. Obese humans have\ high levels of the protein, suggesting a similarity to type II (adult\ onset) diabetes, in which sufferers over-produce insulin, but can\'t respond\ to it metabolically - they have become insulin resistant. Similarly, it is\ thought that obese individuals may be leptin resistant.\ ' '3190' 'IPR004616' '\ Leucyl/phenylalanyl-tRNA--protein transferase transfers a Leu or Phe to the amino end of certain proteins to enable degradation. The N-terminal residue controls the biological half-life of many proteins via the N-end rule pathway.\ ' '3191' 'IPR002703' '\ The Levivirus coat protein forms the bacteriophage coat that encapsidates the viral RNA. 180 copies of this protein form the virion shell. The Bacteriophage MS2 coat protein controls two distinct processes: sequence-specific RNA encapsidation and repression of replicase translation-by binding to an RNA stem-loop structure of 19 nucleotides containing the initiation codon of the replicase gene. 
The binding of a coat protein dimer to this hairpin shuts off synthesis of the viral replicase, switching the viral replication cycle to virion assembly rather than continued replication PUBMED:7523953.\ ' '3192' 'IPR006199' '\

    This is the DNA binding domain of the LexA SOS regulon\ repressor which prevents expression of DNA repair proteins in bacteria.\ The aligned region contains a variant form of the helix-turn-helix DNA\ binding motif PUBMED:8076591.\ This domain, usually at the N terminus, is found associated with the auto-proteolytic domain of LexA.

    \ ' '3193' 'IPR001640' '\ Prolipoprotein diacylglyceryl transferase PUBMED:8051048 is the bacterial enzyme catalysing the first step in lipoprotein biogenesis. It transfers the n-acyl diglyceride group onto what will become the N-terminal cysteine of membrane lipoproteins. This enzyme is an integral membrane protein.\ ' '3194' 'IPR000066' '\ In photosynthetic bacteria the antenna complexes function as light-harvesting\ systems that absorb light radiation and transfer the excitation energy to the\ reaction centres. The antenna complexes are generally composed of two\ polypeptides (alpha and beta chains); two or three bacteriochlorophyll (BChl)\ molecules and some carotenoids PUBMED:1577009, PUBMED:1460542.\ Both the alpha and the beta chains of antenna complexes are small proteins of\ 42 to 68 residues which share a three-domain organization. They are composed\ of a N-terminal hydrophilic cytoplasmic domain followed by a transmembrane\ region and a C-terminal hydrophilic periplasmic domain. In the transmembrane\ region of both chains there is a conserved histidine which is most probably\ involved in the binding of the magnesium atom of a bacteriochlorophyll group.\ The beta chains contain an additional conserved histidine which is located at\ the C-terminal extremity of the cytoplasmic domain and which is also thought\ to be involved in bacteriochlorophyll-binding.\ ' '3195' 'IPR007074' '\

    The LicD family of proteins show high sequence similarity and are involved in phosphorylcholine metabolism. There is evidence to show that LicD2 mutants have a reduced ability to take up choline, have decreased ability to adhere to host cells and are less virulent PUBMED:10200966.

    \ \

    This entry includes LicD and related proteins such as Fukutin, a human protein which may be involved in the modification of glycan moieties of alpha-dystroglycan; defects in Fukutin are associated with congenital muscular dystrophy PUBMED:11445638.

    \ ' '3196' 'IPR001581' '\

    On the basis of functional and structural similarities, the small cytokines leukemia inhibitory factor (LIF) and oncostatin (OSM) can be classified into a single family PUBMED:1566332, PUBMED:1717982.

    \

    It has been said PUBMED:1717982 that LIF and OSM can be included in the IL-6 family of cytokines (), but while all these cytokines seem to be structurally related, the sequence similarity is not high enough to allow the use of a single consensus pattern.

    \ \ ' '3197' 'IPR005811' '\

    This entry represents a domain found in both the alpha and beta chains of succinyl-CoA synthase ( (GDP-forming) and (ADP-forming)) PUBMED:9917402, PUBMED:10873456. This domain can also be found in ATP citrate synthase () and malate-CoA ligase (). Some members of the domain utilise ATP others use GTP.

    \ ' '3198' 'IPR001781' '\

    This entry represents LIM-type zinc finger (Znf) domains. LIM domains coordinate one or more zinc atoms, and are named after the three proteins (LIN-11, Isl1 and MEC-3) in which they were first found. They consist of two zinc-binding motifs that resemble GATA-like Znf\'s; however, the residues holding the zinc atom(s) are variable, involving Cys, His, Asp or Glu residues. LIM domains are involved in proteins with differing functions, including gene expression, and cytoskeleton organisation and development PUBMED:1970421, PUBMED:1467648. Proteins containing LIM Znf domains include:

    \

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    \

    These proteins generally contain two tandem copies of the LIM domain in their N-terminal section. Zyxin and paxillin are exceptions in that they contain respectively three and four LIM domains at their C-terminal extremity. In apterous, isl-1, LH-2, lin-11, lim-1 to lim-3, lmx-1 and ceh-14 and mec-3 there is a homeobox domain some 50 to 95 amino acids after the LIM domains.

    \

    LIM domains contain seven conserved cysteine residues and a histidine. The arrangement followed by these conserved residues is:

    \
    \
    C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]\
    
    \
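    The arrangement of conserved residues given above is written in a PROSITE-style notation, which maps directly onto a regular expression. The following is a minimal sketch of that mapping, assuming only the notation elements that appear in this pattern (single residues, bracketed alternatives, and x(n) or x(m,n) spacers); the helper name prosite_to_regex and the test sequence are hypothetical and not part of the entry.

```python
import re

# Pattern copied from the entry text above.
LIM_PATTERN = "C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]"


def prosite_to_regex(pattern):
    """Translate the subset of PROSITE syntax used above into a regex string."""
    parts = []
    for element in pattern.split("-"):
        # Split an element such as "x(16,23)" into residue "x" and count "16,23)".
        residue, _, count = element.partition("(")
        if residue == "x":
            residue = "."  # x means any residue
        if count:
            count = "{" + count.rstrip(")") + "}"
        parts.append(residue + count)
    return "".join(parts)


LIM_REGEX = re.compile(prosite_to_regex(LIM_PATTERN))

# Synthetic sequence built to satisfy the spacing rules; not a real protein.
synthetic = (
    "C" + "A" * 2 + "C" + "A" * 16 + "H" + "A" * 2 + "C" + "A" * 2
    + "C" + "A" * 2 + "C" + "A" * 16 + "C" + "A" * 2 + "C"
)
print(LIM_REGEX.pattern)
print(bool(LIM_REGEX.search(synthetic)))
```

    A match from such a scan only indicates that the spacing of the conserved residues is compatible with a LIM domain; it is not evidence of zinc binding on its own.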

    LIM domains bind two zinc ions PUBMED:8506279. LIM does not bind DNA, rather it seems to act as an interface for protein-protein interaction.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '3199' 'IPR002691' '\ The LIM-domain binding protein, binds to the LIM domain of LIM homeodomain proteins which are transcriptional regulators of development. Nuclear LIM interactor (NLI) / LIM domain-binding protein 1 (LDB1) is located in the nuclei of neuronal cells during development, it is co-expressed with Isl1 in early motor neuron differentiation and has a suggested role in the Isl1 dependent development of motor neurons PUBMED:8876198. It is suggested that these proteins act synergistically to enhance transcriptional efficiency by acting as co-factors for LIM homeodomain and Otx class transcription factors both of which have essential roles in development PUBMED:9192866. The Drosophila melanogaster protein Chip is required for segmentation and activity of a remote wing margin enhancer PUBMED:9334334. Chip is a ubiquitous chromosomal factor required for normal expression of diverse genes at many stages of development PUBMED:9334334. It is suggested that Chip cooperates with different LIM domain proteins and other factors to structurally support remote enhancer-promoter interactions PUBMED:9334334.\ ' '3200' 'IPR005818' '\ Linker histone H1 is an essential component of chromatin structure. H1 links nucleosomes into higher order structures.\ Histone H5 performs the same\ function as histone H1, and replaces H1 in certain cells. \ The structure of GH5, the globular domain of the linker\ histone H5 is known PUBMED:8384699, PUBMED:3463990. The fold is similar to the DNA-binding\ domain of the catabolite gene activator protein, CAP, thus providing a\ possible model for the binding of GH5 to DNA.\ ' '3201' 'IPR007544' '\ Many Gram-positive bacteria produce antimicrobial peptides, generally termed bacteriocins. These peptides are usually cationic, less than 50 amino acid residues long, contain an amphiphilic or hydrophobic region, and often kill their target cells by permeabilizing the cell membrane. 
Antimicrobial peptides with these characteristics are also produced by plants and a wide variety of animals, including humans, and are thus widely distributed in nature. The Linocin_M18 region is found mostly in eubacteria, though homologous sequences have been identified in archaea PUBMED:8919789, PUBMED:7986050.\ ' '3202' 'IPR005152' '\

    These lipases are expressed and secreted during the infection cycle of these pathogens. In particular, Candida albicans has a large number of different lipases, possibly reflecting broad lipolytic activity, which may contribute to the persistence and virulence of C. albicans in human tissue PUBMED:11131027.

    \ ' '3203' 'IPR004960' '\

    Bacterial lipopolysaccharides (LPS) are glycolipids that make up the outer monolayer of the outer membranes of most Gram-negative bacteria. Though LPS molecules are variable, they all show the same general features: an outer polysaccharide which is attached to the lipid component, termed lipid A PUBMED:9791168. The polysaccharide component consists of a variable repeat-structure polysaccharide known as the O-antigen, and a highly conserved short core oligosaccharide which connects the O-antigen to lipid A. Lipid A is a glucosamine-based phospholipid that makes up the membrane anchor region of LPS PUBMED:12045108. The structure of lipid A is relatively invariant between species, presumably reflecting its fundamental role in membrane integrity. Recognition of lipid A by the innate immune system can lead to a response even at picomolar levels. In some genera, such as Neisseria and Haemophilus, lipooligosaccharides (LOS) are the predominant glycolipids PUBMED:8894399. These are analogous to LPS except that they lack O-antigens, with the LOS oligosaccharide structures limited to 10 saccharide units.

    \ \ \

    The bacterial lipid A biosynthesis protein, or lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA acyltransferase , transfers myristate or laurate, activated on ACP, to the lipid IVA moiety of (KDO)2-(lauroyl)-lipid IVA during lipopolysaccharide core biosynthesis.

    \ ' '3204' 'IPR013818' '\

    Triglyceride lipases () are lipolytic enzymes that hydrolyse ester linkages of\ triglycerides PUBMED:3147715. Lipases are widely distributed in animals, plants and prokaryotes.\ At least three tissue-specific isozymes exist in higher vertebrates: pancreatic, hepatic and\ gastric/lingual. These lipases are closely related to each other and to lipoprotein lipase\ (), which hydrolyses triglycerides of chylomicrons and very low density lipoproteins\ (VLDL) PUBMED:2917565. The most conserved region in all these proteins is centred around a serine\ residue which has been shown PUBMED:2304545 to participate, with a histidine and an aspartic acid\ residue, in a charge relay system. Such a region is also present in lipases of prokaryotic\ origin and in lecithin-cholesterol acyltransferase () (LCAT) PUBMED:3458198, which\ catalyzes fatty acid transfer between phosphatidylcholine and cholesterol.

    \ ' '3205' 'IPR002918' '\ Lipases or triacylglycerol acylhydrolases hydrolyse ester bonds in triacylglycerol giving diacylglycerol, monoacylglycerol, glycerol and free fatty acids PUBMED:1320940. These have been called class 2 as they are not clearly related to other lipase families.\

    These enzymes catalyse the reaction:

    \ \ ' '3206' 'IPR004961' '\ The Proteobacterial lipase chaperone is a lipase helper protein which seems to assist in the folding of extracellular lipase during its passage through the periplasm.\ ' '3207' 'IPR001087' '\

    A variety of lipolytic enzymes with serine as part of the active site have been\ identified PUBMED:7610479. Members of this entry include: Aeromonas hydrophila lipase, Vibrio mimicus arylesterase, Vibrio parahaemolyticus thermolabile haemolysin, rabbit phospholipase (AdRab-B), and Brassica napus anther-specific proline-rich protein.

    \ ' '3208' 'IPR000566' '\ Proteins which transport small hydrophobic molecules such as steroids, bilins, retinoids, and lipids share limited\ regions of sequence homology and a common tertiary structure architecture PUBMED:3622999, PUBMED:1608945, PUBMED:2217163,\ PUBMED:7684291, PUBMED:3238752. This is an eight stranded antiparallel beta-barrel with a repeated + 1 topology enclosing\ an internal ligand binding site PUBMED:7684291, PUBMED:2217163. The name \'lipocalin\' has been proposed PUBMED:3622999 for\ this protein family, but cytosolic fatty-acid binding proteins are also included. The sequences of most members of the family, the core or kernel lipocalins, are characterised by\ three short conserved stretches of residues, while others, the outlier lipocalin group, share only one or two of these\ PUBMED:1834059, PUBMED:7684291. Proteins known to belong to this family include alpha-1-microglobulin (protein HC);\ alpha-1-acid glycoprotein (orosomucoid) PUBMED:3064105; aphrodisin; apolipoprotein D; beta-lactoglobulin; complement\ component C8 gamma chain PUBMED:1707134; crustacyanin PUBMED:2026162; epididymal-retinoic acid binding protein\ (E-RABP) PUBMED:8069623; insectacyanin; odorant-binding protein (OBP); human pregnancy-associated endometrial alpha-2\ globulin; probasin (PB), a rat prostatic protein; prostaglandin D synthase () PUBMED:1723819; purpurin; Von\ Ebner\'s gland protein (VEGP) PUBMED:7514123; and lizard epididymal secretory protein IV (LESP IV) PUBMED:8486691.\ \ \

    Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee\ King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E.,\ Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of\ the first three letters of the genus; a space; the first letter of the\ species name; a space and an arabic number. In the event that two species\ names have identical designations, they are discriminated from one another\ by adding one or more letters (as necessary) to each species designation.

    \

    The allergens in this family include allergens with the following designations: Bla g 4, Bos d 2, Bos d 5, Can f 1, Can f 2, Equ c 1 and Equ c 2.
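The naming rule described above (three genus letters, a space, one species letter, a space, an arabic number) can be sketched in a few lines of Python. This is a minimal illustration, not an official implementation: the function name is hypothetical, the genus and species names passed in are assumptions supplied for the example, and the tie-breaking rule that adds extra letters when two species collide is not implemented.

```python
def allergen_designation(genus: str, species: str, number: int) -> str:
    """Build a designation: three genus letters, one species letter, a number."""
    genus_part = genus.strip()[:3].capitalize()
    species_part = species.strip()[:1].lower()
    return f"{genus_part} {species_part} {number}"


# Genus/species pairs are assumptions chosen to reproduce designations listed above.
examples = [
    ("Blattella", "germanica", 4),
    ("Bos", "domesticus", 2),
    ("Canis", "familiaris", 1),
    ("Equus", "caballus", 2),
]
for genus, species, number in examples:
    print(allergen_designation(genus, species, number))
```

Keeping the number as a separate argument reflects the fact that it is assigned by the nomenclature subcommittee rather than derived from the organism name.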

    \ ' '3209' 'IPR001809' '\ The ospA and ospB genes encode the major outer membrane proteins of the Lyme disease spirochaete Borrelia burgdorferi PUBMED:2761388. The deduced gene products OspA and OspB, contain 273 and 296 residues respectively PUBMED:2761388. The two Osp proteins show a high degree of sequence similarity, indicating a recent evolutionary event. Molecular analysis and sequence comparison of OspA and OspB with other proteins has revealed similarity to the signal peptides of prokaryotic lipoproteins PUBMED:2761388, PUBMED:1560779.\ ' '3210' 'IPR004890' '\

    This domain is found along with a central domain () in a group of Mycoplasma lipoproteins of unknown function.

    \ ' '3211' 'IPR004943' '\ This family includes Lepidopteran low molecular weight (30 kDa) lipoprotein, which is an extracellular protein of unknown function. Biosynthesis occurs in a stage-dependent fashion in the fat body. \ ' '3212' 'IPR005132' '\ This is a domain found in some bacterial and eukaryotic lipoproteins. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli PUBMED:8576052. This entry contains a\ conserved region in the middle of RlpA.\ ' '3214' 'IPR005297' '\

    This family occurs as tandem repeats in a set of lipoproteins. The alignment contains a Y-X4-D motif.

    \ ' '3215' 'IPR006876' '\

    This group of uncharacterised proteins have a conserved C-terminal region which is found in LMBR1 and in the lipocalin-1 receptor. LMBR1 was thought to play a role in preaxial polydactyly, but recent evidence now suggests this not to be the case PUBMED:12032320.

    \ ' '3216' 'IPR005619' '\ The function of this presumed lipoprotein is unknown. The family includes Escherichia coli YajG .\ ' '3217' 'IPR007326' '\ This presumed domain is about 100 amino acids in length. It is found in lipoprotein of unknown function and is greatly expanded in Mycoplasma pulmonis. The domain is found in up to five copies in some proteins.\ ' '3218' 'IPR000680' '\ This family of lipoproteins is found in Borrelia spirochetes. The function of these proteins is\ uncertain, but it may serve to avoid the host immune response by changing from one surface\ exposed variable major outer membrane lipoprotein to another.\ ' '3219' 'IPR001595' '\

    This family of lipoproteins is Mycoplasma specific, and includes a variety of hypothetical proteins PUBMED:8948633. They all have a prokaryotic membrane lipoprotein lipid attachment site which probably acts as a membrane anchor.

    \ ' '3220' 'IPR001677' '\

    Bacterial transferrin binding proteins act as transferrin receptors and are required for transferrin utilisation. Transferrins are iron-binding glycoproteins that control the level of free iron in biological fluids.

    \ ' '3221' 'IPR001800' '\

    Members of this family are lipoproteins that are probably involved in evasion of the host immune system by pathogens PUBMED:9403685. They are predominantly found in the Spirochaetaceae.

    \ ' '3222' 'IPR002520' '\ This family consists of the p50 and variable adherence-associated antigen\ (Vaa) adhesins from Mycoplasma hominis. M. hominis is a mycoplasma associated with human urogenital diseases, pneumonia, and septic\ arthritis PUBMED:8698503.\ An adhesin is a cell surface molecule that mediates adhesion to other\ cells or to the surrounding surface or substrate.\ The Vaa antigen is a 50-kDa surface lipoprotein that has four tandem\ repetitive DNA sequences encoding a periodic peptide structure, and is\ highly immunogenic in the human host PUBMED:8698503. p50 is also a 50-kDa\ lipoprotein, having three repeats A,B and C, that may be a tetramer of\ 191-kDa in its native environment PUBMED:8926064.\ ' '3223' 'IPR000044' '\ Mycoplasma genitalium has the smallest known genome of any free-living \ organism. Its complete genome sequence has been determined by whole-genome random sequencing and assembly PUBMED:7569993. Only 470 putative coding regions were identified, including genes for DNA replication, transcription and\ translation, DNA repair, cellular transport and energy metabolism PUBMED:7569993. \ A hypothetical protein from the MG045 gene PUBMED:8253680 has a homologue of similarly\ unknown function in Mycoplasma pneumoniae PUBMED:8948633.\ ' '3224' 'IPR004984' '\

    This domain is found along with a C-terminal domain () in a group of Mycoplasma lipoproteins of unknown function.

    \ ' '3225' 'IPR013819' '\

    Lipoxygenases () are a class of iron-containing dioxygenases which catalyse the hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. They are common in plants where they may be involved in a number of diverse aspects of plant physiology including growth and development, pest resistance, and senescence or responses to wounding PUBMED:. In mammals a number of lipoxygenase isozymes are involved in the metabolism of prostaglandins and leukotrienes PUBMED:3017195. Sequence data is available for the following lipoxygenases:

    \ \

    \ \

    The iron atom in lipoxygenases is bound by four ligands, three of which are\ histidine residues PUBMED:8502991. Six histidines are conserved in all lipoxygenase sequences; five of them are found clustered in a stretch of 40 amino acids. This region contains two of the three histidine iron-ligands; the other histidines have been shown PUBMED:1567851 to be important for the activity of lipoxygenases.

    \

    This entry represents the C-terminal region of these proteins.

    \ ' '3226' 'IPR006864' '\ This repeated sequence element is found in the LMP group of surface-located membrane proteins of Mycoplasma hominis. The number of repeats in the protein affects the tendency of cells to spontaneously aggregate. Agglutination may be an important factor in colonization. Non-agglutinating microorganisms might easily be distributed whereas aggregation might provide a better chance to avoid an antibody response since some of the epitopes may be buried PUBMED:7543881.\ ' '3227' 'IPR017867' '\

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPase; ) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation PUBMED:9818190, PUBMED:14625689. The PTP superfamily can be divided into four subfamilies PUBMED:12678841:

    \

    \

    Based on their cellular localisation, PTPases are also classified as:

    \

    \

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif PUBMED:9646865. Functional diversity between PTPases is endowed by regulatory domains and subunits.
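As a rough illustration of the C(X)5R signature mentioned above, the motif can be written as a regular expression: a cysteine, any five residues, then an arginine. The sketch below is an assumption-laden example rather than a curated search tool; the function name and the toy sequence are hypothetical, and a plain regex match says nothing about whether a hit sits in a genuine PTP fold.

```python
import re

# C(X)5R: a cysteine, any five residues, then an arginine.
PTP_SIGNATURE = re.compile("C.{5}R")


def find_ptp_signature(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for each non-overlapping hit."""
    hits = PTP_SIGNATURE.finditer(sequence.upper())
    return [(match.start() + 1, match.group()) for match in hits]


# Hypothetical toy sequence, for illustration only.
toy_sequence = "MAEKVLFVCLGNICRSPMAE"
print(find_ptp_signature(toy_sequence))
```

Positions are reported 1-based to match the usual convention for residue numbering.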

    \ \

    This entry represents the low molecular weight (LMW) protein-tyrosine phosphatases (or acid phosphatase), which act on tyrosine phosphorylated proteins, low-MW aryl phosphates and natural and synthetic acyl phosphates PUBMED:1587862, PUBMED:1304913. The structure of a LMW PTPase has been solved by X-ray crystallography PUBMED:8052313 and is found to form a single structural domain. It belongs to the alpha/beta class, with 6 alpha-helices and 4 beta-strands forming a 3-layer alpha-beta-alpha sandwich architecture.

    \ ' '3228' 'IPR004564' '\

    This protein, LolA, is known so far only in the gamma subdivision of the Proteobacteria. \ In Escherichia coli, lipoproteins are anchored to the\ periplasmic side of either the inner or outer membrane through N-terminal lipids, depending on the lipoprotein-sorting signal present at\ position 2 PUBMED:12032293. Five Lol proteins are involved in the sorting and outer membrane localization of lipoproteins. LolCDE, an ATP\ binding cassette (ABC) transporter, in the inner membrane releases outer membrane-directed lipoproteins from the inner membrane in an ATP-dependent manner, leading to the formation of a water-soluble complex between the lipoprotein and the molecular chaperone, LolA. The LolA-lipoprotein complex crosses the periplasm and then\ interacts with outer membrane receptor LolB, which is essential for the anchoring of lipoproteins to the outer membrane.

    \

    E. coli lipoproteins are anchored to the inner or outer membrane depending on the residue at position 2. Aspartate at this\ position makes lipoproteins specific to the inner membrane, whereas other residues cause the release of lipoproteins from the inner\ membrane.
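The position-2 sorting rule stated above can be expressed as a small Python function. This is a minimal sketch under stated assumptions: the input is taken to be the mature lipoprotein (signal peptide already removed, so position 1 is the lipid-modified residue), the function name is hypothetical, and the toy sequences are invented for illustration.

```python
def predicted_membrane(mature_lipoprotein: str) -> str:
    """Apply the position-2 rule to a mature (signal-peptide-cleaved) sequence."""
    sequence = mature_lipoprotein.strip().upper()
    if len(sequence) < 2:
        raise ValueError("need at least two residues to read position 2")
    # Aspartate at position 2 keeps the lipoprotein in the inner membrane;
    # any other residue leads to release from the inner membrane.
    return "inner membrane" if sequence[1] == "D" else "outer membrane"


# Hypothetical toy sequences, for illustration only.
print(predicted_membrane("CDGSAQ"))
print(predicted_membrane("CSSNAK"))
```

The rule is deliberately reduced to the single residue named in the text; any influence of neighbouring positions is outside the scope of this sketch.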

    \ ' '3229' 'IPR004565' '\

    This protein, LolB, is known so far only in the gamma subdivision of the Proteobacteria. It is a processed, lipid-modified outer membrane protein. \ In Escherichia coli, lipoproteins are anchored to the\ periplasmic side of either the inner or outer membrane through N-terminal lipids, depending on the lipoprotein-sorting signal present at\ position 2 PUBMED:12032293. Five Lol proteins are involved in the sorting and outer membrane localization of lipoproteins. LolCDE, an ATP\ binding cassette (ABC) transporter, in the inner membrane releases outer membrane-directed lipoproteins from the inner membrane in an ATP-dependent manner, leading to the formation of a water-soluble complex between the lipoprotein and LolA. The LolA-lipoprotein complex crosses the periplasm and then\ interacts with outer membrane receptor LolB, which is essential for the anchoring of lipoproteins to the outer membrane.

    \ ' '3230' 'IPR006817' '\

    This repeating sequence, NAKVDQLSNDV, is found in the enterobacterial outer membrane lipoprotein LPP. The outer membrane lipoprotein is the most abundant protein in an Escherichia coli cell. The messenger RNA for the lipoprotein of the E. coli outer membrane codes for a putative precursor, prolipoprotein, which has 20 additional amino acid residues extending from the amino terminus of the lipoprotein.

    \ ' '3231' 'IPR002217' '\

    A major antigen has been recognised in Helicobacter pylori, a protein with an apparent molecular weight of 20,000 and a mass of 18,283 Da PUBMED:7928954. DNA sequence analysis revealed a 525 bp gene, encoding a 175-amino acid residue product with a typical 21-residue lipoprotein signal peptide and consensus prolipoprotein processing site PUBMED:7928954. Results of experimental work with Lpp20 are consistent with it being a nonessential lipoprotein PUBMED:7928954.

    \

    Prokaryotic membrane lipoproteins are synthesised with precursor signal peptides that are cleaved by specific peptidases (signal peptidase II). The enzyme recognises a conserved sequence, cutting upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached PUBMED:2202727.

    \ ' '3232' 'IPR007443' '\ This family includes several bacterial outer membrane antigens, whose molecular function is unknown.\ ' '3233' 'IPR003835' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    These enzymes belong to the glycosyltransferase family 19 . Lipid-A-disaccharide synthetase is involved with acyl-[acyl-carrier-protein]--UDP-N-acetylglucosamine O-acyltransferase and tetraacyldisaccharide 4\'-kinase in the biosynthesis of the phosphorylated glycolipid, lipid A, in the outer membrane of Escherichia coli and other bacteria. These enzymes catalyse the first disaccharide step in the synthesis of lipid-A-disaccharide.

    \ ' '3234' 'IPR004463' '\

    UDP-3-O-N-acetylglucosamine deacetylases are zinc-dependent metalloamidases that catalyse the second and committed step in the biosynthesis of lipid A. Lipid A anchors lipopolysaccharide (the major constituent of the outer membrane) into the membrane in Gram negative bacteria. LpxC shows no homology to mammalian metalloamidases and is essential for cell viability, making it an important target for the development of novel antibacterial compounds PUBMED:15667205. The structure of UDP-3-O-N-acetylglucosamine deacetylase (LpxC) from Aquifex aeolicus has a two-layer alpha/beta structure similar to that of the second domain of ribosomal protein S5, only in LpxC there is a duplication giving two structural repeats of this fold, each repeat being elaborated with additional structures forming the active site. LpxC contains a zinc-binding motif, which resides at the base of an active site cleft and adjacent to a hydrophobic tunnel occupied by a fatty acid PUBMED:12819349. This tunnel accounts for the specificity of LpxC toward substrates and inhibitors bearing appropriately positioned 3-O-fatty acid substituents PUBMED:17296300.

    \

    This entry represents the UDP-3-O-N-acetylglucosamine deacetylase family of proteins.

    \ ' '3235' 'IPR003758' '\

    Tetraacyldisaccharide 4\'-kinase phosphorylates the 4\'-position of a tetraacyldisaccharide 1-phosphate precursor (DS-1-P) of lipid A, but the enzyme has not yet been purified because of instability PUBMED:9575203. This\ enzyme is involved in the synthesis of lipid A portion of the bacterial lipopolysaccharide layer (LPS).

    \ \ ' '3236' 'IPR005538' '\

    This family is uncharacterised. It contains the protein LrgA that has been hypothesised to export murein hydrolases PUBMED:8824633. In Staphylococcus aureus, lrg and cid operons encode homologous proteins that regulate extracellular murein hydrolase activity and penicillin tolerance PUBMED:15659658, PUBMED:15126464. Since the proteins encoded by cidA and lrgA are so similar to the bacteriophage-encoded holin family of proteins, they are considered analogous PUBMED:15916614.

    \ ' '3237' 'IPR007300' '\ The two products of the lrgAB operon are potential membrane proteins, and LrgA and LrgB are both thought to control murein hydrolase activity and penicillin tolerance PUBMED:10714982.\ ' '3238' 'IPR000372' '\

    Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively PUBMED:10357231). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.

    \

    Reaction of amidotransferase domain:

    \ \ \

    Reactions of FMN-binding domain:

    \ \ \

    LRRs are often flanked by cysteine-rich domains: an N-terminal LRR domain and a C-terminal LRR domain (). This entry represents the N-terminal LRR domain.

    \ ' '3239' 'IPR004830' '\

    Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively PUBMED:10357231). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.

    \

    Reaction of amidotransferase domain:

    \ \ \

    Reactions of FMN-binding domain:

    \ \ \

    This signature describes a leucine-rich repeat variant (LRV), which has a novel repetitive structural motif consisting of alternating alpha- and 3(10)-helices arranged in a right-handed superhelix, with the absence of the beta-sheets present in other LRRs PUBMED:8946850.

    \ ' '3240' 'IPR007775' '\ B144/LST1 is a gene encoded in the human major histocompatibility complex that produces multiple forms of alternatively spliced mRNA and encodes peptides fewer than 100 amino acids in length. B144/LST1 is strongly expressed in dendritic cells. Transfection of B144/LST1 into a variety of cells induces morphologic changes including the production of long, thin filopodia PUBMED:11478849.\ ' '3241' 'IPR001783' '\

    The following proteins have been shown PUBMED:1996310, PUBMED: to be structurally and evolutionarily related:\

    \

    These proteins seem to have evolved from the duplication of a domain of about 100 residues. In its C-terminal section, this domain contains a conserved motif [KR]-V-N-[LI]-E which has been proposed to be the binding site for lumazine (Lum) and some of its derivatives. RS-alpha which binds two molecules of Lum has two perfect copies of this motif, while LumP which binds one molecule of Lum, has a Glu instead of Lys/Arg in the first position of the second copy of the motif. Similarly, YFP, which binds to one molecule of FMN, also seems to have a potentially dysfunctional binding site by substitution of Gly for Glu in the last position of the first copy of the motif.
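The conserved motif [KR]-V-N-[LI]-E quoted above maps directly onto a regular expression once the dashes are dropped. The following sketch shows that translation and how the two substitutions described in the text would cause a copy of the motif to stop matching. The function names and the short toy sequences are hypothetical and chosen only to illustrate the pattern.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Turn a dash-separated motif such as [KR]-V-N-[LI]-E into a regex string."""
    return motif.replace("-", "")


LUMAZINE_MOTIF = re.compile(motif_to_regex("[KR]-V-N-[LI]-E"))


def has_motif(sequence: str) -> bool:
    """Report whether the sequence contains at least one copy of the motif."""
    return LUMAZINE_MOTIF.search(sequence.upper()) is not None


# Hypothetical toy sequences, for illustration only.
print(has_motif("AAKVNLEGG"))  # intact copy of the motif
print(has_motif("AAEVNLEGG"))  # Glu in place of Lys/Arg at the first position
print(has_motif("AAKVNLGGG"))  # Gly in place of Glu at the last position
```

Because the bracketed alternatives in the motif already use regex character-class syntax, no further translation is needed for this particular pattern.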

    \ \ ' '3242' 'IPR001517' '\

    Barley yellow dwarf virus (BYDV) can be separated into two groups based on serological relationships, presumably governed by the viral capsid structure PUBMED:2273382. Coding regions of coat proteins have been identified for the MAV-PS1, P-PAV (group 1) and NY-RPV (group 2) isolates of BYDV. Group 1 proteins show 71% sequence similarity to each other, 51% similarity to those of group 2, and a high degree of similarity to those from other luteoviruses (including coat proteins from Beet western yellows virus (BWYV) PUBMED:3194229 and Potato leafroll virus (PLrV) PUBMED:2732704, PUBMED:2732710).

    \

    Among luteovirus coat protein sequences in general, several highly conserved domains can be identified, while other domains differentiate group 1 isolates from group 2 and other luteoviruses. Sequence comparisons between the genomes of PLrV, BWYV and BYDV have revealed ~65% protein sequence similarity between the capsid proteins of BWYV and PLrV and ~45% similarity between BYDV and PLrV PUBMED:2273382. The N-terminal regions of these sequences, like those of many plant virus capsid proteins, are highly basic. These regions may be involved in protein-RNA interaction.

    \ ' '3243' 'IPR000382' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.
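The relationship between the linear order of the catalytic triad and the clan can be illustrated with a short lookup. This is only a sketch of the three examples given in the text (HDS for clan PA, DHS for clan SB, SDH for clan SC); the function names are hypothetical, the residue numbers in the usage lines are illustrative assumptions not taken from this entry, and a real clan assignment rests on structural evidence rather than residue order alone.

```python
# Triad orders and clans exactly as listed in the text.
CLAN_BY_TRIAD_ORDER = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(his: int, asp: int, ser: int) -> str:
    """Return the one-letter codes of the triad sorted by sequence position."""
    positions = {"H": his, "D": asp, "S": ser}
    return "".join(sorted(positions, key=positions.get))


def clan_for_triad(his: int, asp: int, ser: int) -> str:
    """Look up the clan for a triad order, or report that none is listed."""
    return CLAN_BY_TRIAD_ORDER.get(triad_order(his, asp, ser), "unassigned")


# Residue numbers below are illustrative assumptions.
print(triad_order(57, 102, 195), clan_for_triad(57, 102, 195))
print(triad_order(64, 32, 221), clan_for_triad(64, 32, 221))
print(triad_order(397, 338, 146), clan_for_triad(397, 338, 146))
```

Orders that are not among the three named in the text fall through to "unassigned" rather than being guessed.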

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)), subfamily S39B. It is likely that the peptidase domain is involved in the cleavage of the polyprotein PUBMED:9714253.

    \ \

    The nucleotide sequence for the RNA of PLrV has been determined PUBMED:2732710, PUBMED:2466700. The sequence contains six large open reading frames (ORFs). The 5\' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5\' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide PUBMED:2732710. Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined PUBMED:3194229. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus PUBMED:3194229.

    \ ' '3244' 'IPR006755' '\ This is a family of uncharacterised plant pathogen luteovirus proteins.\ ' '3245' 'IPR001964' '\

    The nucleotide sequence of the RNA of Potato leafroll virus (PLrV) has been determined PUBMED:2466700, PUBMED:2732710. The sequence contains six large ORFs. The 3\' coding region encodes three polypeptides: a 23K coat protein, a 17K polypeptide encoded in a different frame, and a 53K polypeptide, immediately following the coat protein sequence in the same frame. It has been suggested that the 53K polypeptide is translated by readthrough of the amber termination codon of the coat protein gene. The amino acid sequences encoded within the 3\' region show many similarities to analogous polypeptides of Barley yellow dwarf virus (BYDV), and Beet western yellows virus (BWYV). It is possible that the ORF5 protein is a VPG-precursor from which, at the onset of RNA synthesis, the VPG molecule is released, in a similar fashion to that proposed for Cowpea mosaic virus (CPMV).

    \ ' '3246' 'IPR007534' '\

    LuxE is an acyl-protein synthetase found in bioluminescent bacteria. LuxE catalyses the formation of an acyl-protein thiolester from a fatty acid and a protein. This is the second step in the bioluminescent fatty acid reduction system, which converts tetradecanoic acid to the aldehyde substrate of the luciferase-catalysed bioluminescence reaction PUBMED:8941351. A conserved cysteine found at position 364 in Photobacterium phosphoreum LuxE () is thought to be acylated during the transfer of the acyl group from the synthetase subunit to the reductase. The C-terminal of the synthetase is thought to act as a flexible arm to transfer acyl groups between the sites of activation and reduction PUBMED:2023262. A luxE domain is also found in the Vibrio cholerae RBFN protein (), which is involved in the biosynthesis of the O-antigen component 3-deoxy-L-glycero-tetronic acid.

    \

    These proteins are found in the archaea and bacteria.

    \ ' '3247' 'IPR003815' '\

    In bacteria, the regulation of gene expression in response to changes in cell density is called quorum sensing. Quorum-sensing bacteria produce, release, and respond to hormone-like molecules (autoinducers) that accumulate in the external environment as the cell population grows. For example, enteric bacteria use quorum sensing to regulate several traits that allow them to establish and maintain infection in their host, including motility, biofilm formation, and virulence-specific genes PUBMED:17133078. The LuxS/AI-2 system is one of several quorum sensing mechanisms. AI-2 (autoinducer-2) is a signalling molecule that functions in interspecies communication by regulating niche-specific genes with diverse functions in various bacteria, often in response to population density. LuxS (S-ribosylhomocysteinase; ) is an autoinducer-production protein that has a metabolic function as a component of the activated methyl cycle. LuxS converts S-ribosylhomocysteine to homocysteine and 4,5-dihydroxy-2,3-pentanedione (DPD); DPD can then spontaneously cyclise to active AI-2 PUBMED:16923076, PUBMED:15287744. LuxS is a homodimeric iron-dependent metalloenzyme containing two identical tetrahedral metal-binding sites similar to those found in peptidases and amidases PUBMED:15751951.

    \ ' '3248' 'IPR000362' '\

    A number of enzymes, belonging to the lyase class, for which fumarate is a substrate, have been shown to share a short conserved sequence around a\ methionine which is probably involved in the catalytic activity of this type\ of enzymes PUBMED:3282546, PUBMED:. The following are examples of members of this family:

    \ \ ' '3249' 'IPR003159' '\

    Proteins containing this central domain consist of a group of secreted bacterial lyase enzymes capable of acting on a variety of substrates. One such enzyme is hyaluronate lyase, a Streptococcal surface enzyme that degrades hyaluronan and chondroitin, thereby helping to spread the bacteria throughout host tissues PUBMED:14523022. Hyaluronate lyase () is a four-domain enzyme containing an N-terminal carbohydrate-binding domain, a spacer domain, a catalytic domain, and a C-terminal domain that modulates access to the catalytic cleft of the enzyme. The central domain has a beta-sandwich topology, with 18 strands in two sheets. Other bacterial enzymes that display this structure include the central domain of chondroitin AC lyase () PUBMED:10329169, the central domain of xanthan lyase () PUBMED:12475987, and the third domain of chondroitin ABC lyase () PUBMED:12706721. This entry represents these domains of hyaluronate lyase, chondroitin AC lyase, xanthan lyase and chondroitin ABC lyase. This domain is almost always associated with the polysaccharide lyase family 8 C-terminal domain ().

    \ ' '3250' 'IPR004103' '\

    Proteins containing this domain consist of a group of secreted bacterial lyase enzymes capable of acting on hyaluronan (hyaluronate lyase, ) and chondroitin (chondroitin AC lyase, ) in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen PUBMED:14523022, PUBMED:10329169. This domain is almost always associated with the polysaccharide lyase family 8, N-terminal domain (see ). This entry represents the C-terminal domain of hyaluronate and chondroitin AC lyase enzymes.

    \ ' '3251' 'IPR001916' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 22 comprises enzymes with two known activities: lysozyme type C () and alpha-lactalbumins. Asp and/or the carbonyl oxygen of the C-2 acetamido group of the substrate acts as the catalytic nucleophile/base.

    \ \

    Alpha-lactalbumin PUBMED:6715332, PUBMED:3104032 is a milk protein that acts as the regulatory subunit of lactose synthetase, acting to promote the conversion of galactosyltransferase to lactose synthase, which is essential for milk production. In the mammary gland, alpha-lactalbumin changes the substrate specificity of galactosyltransferase from N-acetylglucosamine to glucose.

    \ \

    Lysozymes () act as bacteriolytic enzymes by hydrolyzing the beta(1->4) bonds between N-acetylglucosamine and N-acetylmuramic acid in the peptidoglycan of prokaryotic cell walls. Lysozyme has also been recruited for a digestive role in certain ruminants and colobine monkeys PUBMED:2738070. There are at least five different classes of lysozymes PUBMED:3148618: C (chicken type), G (goose type), phage-type (T4), fungi (Chalaropsis), and bacterial (Bacillus subtilis). There are few similarities in the sequences of the different types of lysozymes.

    \ \

    Lysozyme type C and alpha-lactalbumin are similar both in terms of primary sequence and structure, and probably evolved from a common ancestral protein PUBMED:2731545. Around 35 to 40% of the residues are conserved in both proteins as well as the positions of the four disulphide bonds. There is, however, no similarity in function. Another significant difference between the two enzymes is that all lactalbumins have the ability to bind calcium PUBMED:3785375, while this property is restricted to only a few lysozymes PUBMED:3666156.

    \ \

    The binding site was deduced using high resolution X-ray structure analysis and was shown to consist of three aspartic acid residues. It was first suggested that calcium bound to lactalbumin stabilised the structure, but recently it has been claimed that calcium controls the release of lactalbumin from the golgi membrane and that the pattern of ion binding may also affect the catalytic properties of the lactose synthetase complex.

    \ ' '3252' 'IPR001123' '\ Lysine exporter protein is involved in the efflux of excess L-lysine as a\ control for intracellular levels of L-lysine. A number of proteins belong\ to this family. These include the chemotactic transduction protein from\ Pseudomonas aeruginosa, the threonine efflux protein and a number of\ uncharacterised proteins from a variety of sources.\ ' '3253' 'IPR005269' '\

    This family of conserved hypothetical proteins has no known function.

    \ ' '3254' 'IPR003059' '\

    The DNA sequence of the entire colicin E2 operon has been determined PUBMED:3892228.\ The operon comprises the colicin activity gene (ceaB), the colicin immunity\ gene (ceiB) and the lysis gene (celB), which is essential for colicin\ release from producing cells PUBMED:3892228. A putative LexA binding site is located\ upstream from ceaB, and a rho-independent terminator structure is located\ downstream from celB PUBMED:3892228. Comparison of the amino acid sequences of colicin\ E2 and cloacin DF13 reveals extensive similarity. These colicins have\ different modes of action and recognise different cell surface receptors;\ the two major regions of heterology at the C-terminus, and in the C-terminal\ end of the central region are thought to correspond to the catalytic and \ receptor-recognition domains, respectively PUBMED:3892228.

    \ \

    Sequence similarities between colicins E2, A and E1 PUBMED:3936034 are less striking.\ The colicin E2 (pyocin) immunity protein does not share similarity with\ either the colicin E3 or cloacin DF13 PUBMED:6253914 immunity proteins. By contrast,\ the lysis proteins of the ColE2, ColE1 and CloDF13 plasmids are almost\ identical except in the N-terminal regions, which themselves are similar to\ lipoprotein signal peptides PUBMED:3892228. Processing of the ColE2 prolysis protein\ to the mature form is prevented by globomycin, a specific inhibitor of the\ lipoprotein signal peptidase PUBMED:3892228. The mature ColE2 lysis protein is located\ in the cell envelope PUBMED:3892228.

    \ \ \ ' '3255' 'IPR007054' '\ The lysis S protein is a cytotoxic protein forming holes in membranes causing cell lysis. The action of Lysis S is independent of the proportion of acidic phospholipids in the membrane PUBMED:8467992.\ ' '3256' 'IPR001695' '\

    Lysyl oxidase () (LOX) PUBMED:8104038 is an extracellular copper-dependent enzyme that catalyses the oxidative deamination of peptidyl lysine residues in precursors of various collagens and elastins, yielding alpha-aminoadipic-delta-semialdehyde. The deaminated lysines are then able to form semialdehyde cross-links, resulting in the formation of insoluble collagen and elastin fibres in the extracellular matrix PUBMED:1357535.

    \

    The active site of LOX resides towards the C terminus: this region also binds a single copper atom in an octahedral coordination complex involving at least 3 His residues PUBMED:1352776. Four histidine residues are clustered in a central region of the enzyme. This region is thought to be involved in copper-binding and is called the \'copper-talon\' PUBMED:8104038.

    \ ' '3257' 'IPR003451' '\

    Terpenes are among the largest groups of natural products and include compounds such as vitamins, cholesterol and carotenoids. The biosynthesis of all terpenoids begins with one or\ both of the two C5 precursors of the pathway: isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). In\ animals, fungi, and certain bacteria, the synthesis of IPP and DMAPP occurs via the well-known mevalonate pathway; however, a second, nonmevalonate terpenoid pathway has been identified in many eubacteria, algae and the chloroplasts of higher plants PUBMED:11004185.

    LytB (IspH) catalyses the conversion of 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate into IPP and DMAPP in this second pathway. The enzyme appears to be responsible for a branch-step in the nonmevalonate pathway, in that IPP and DMAPP are produced in parallel from a single precursor, although the exact mechanism of this is not currently fully understood PUBMED:11818558. Escherichia coli LytB protein had been found to regulate the activity of RelA (guanosine 3\',5\'-bispyrophosphate synthetase I), which in turn controls the level of a regulatory metabolite. It is involved in penicillin tolerance and the stringent response PUBMED:9537400.

    \ ' '3258' 'IPR007492' '\

    The LytTr domain is a DNA-binding, potential winged helix-turn-helix domain (~100 residues) present in a variety of bacterial transcriptional regulators of the algR/agrA/lytR family. It is named after the lytR response regulators involved in the regulation of cell autolysis. The LytTr domain binds to a specific DNA sequence pattern in the upstream regions of target genes PUBMED:12034833. The N-terminal of the protein contains a response regulator receiver domain ().

    \ ' '3259' 'IPR003345' '\ This short repeat is found in multiple copies in bacterial M proteins. The M proteins bind to IgA and are closely associated with virulence.\ The M protein has been postulated to be a major group A streptococcal (GAS) virulence factor because of its contribution to the bacterial resistance to opsonophagocytosis PUBMED:8830235.\ ' '3260' 'IPR005555' '\ The M-factor is a pheromone produced upon nitrogen starvation. The production of M-factor is increased by the pheromone signal. The protein undergoes post-translational modification to remove the C-terminal signal peptide, the carboxy-terminal cysteine residue is carboxy-methylated and S-alkylated with a farnesyl residue PUBMED:8878833.\ ' '3261' 'IPR006395' '\

    These sequences describe methylaspartate ammonia-lyase, also called beta-methylaspartase. It follows methylaspartate mutase (composed of S and E subunits) in one of several possible pathways of glutamate fermentation.

    \ ' '3262' 'IPR004962' '\ Mab-21 is a homeotic regulator homologue. The protein is found in eukaryotes. \ ' '3263' 'IPR001862' '\

    The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death PUBMED:1722985, PUBMED:. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 PUBMED:3219351, PUBMED:4018030, PUBMED:6095282 and acts as a catalyst in the polymerization of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}.

    \

    Perforin PUBMED:1722985, PUBMED:3419519 is a protein found in cytolytic T-cells and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells PUBMED:2395434.

    \

    There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin.

    \ ' '3264' 'IPR003543' '\ The egg peptide speract receptor is a transmembrane glycoprotein of about\ 500 amino acids PUBMED:2538832. Topologically, it comprises a large extracellular domain of about 450 residues, followed by a transmembrane domain and a short cytoplasmic region of about 12 amino acids. The extracellular\ domain contains 4 repeats of a well-conserved region, which spans 115\ amino acids and contains 6 conserved cysteines. A similar domain is also\ found towards the C-terminus of macrophage scavenger receptor type I PUBMED:1978939, a membrane glycoprotein implicated in the pathologic deposition of\ cholesterol in arterial walls during atherogenesis, and in the CD5\ glycoprotein, which acts as a receptor in regulating T-cell proliferation.\ \

    The type I and type II human scavenger receptors are similar to their \ bovine, rabbit and murine counterparts. They consist of 6 domains:\ cytoplasmic (I); membrane-spanning (II); spacer (III); alpha-helical coiled-\ coil (IV); collagen-like (V); and a type-specific C-terminal (VI) PUBMED:2251254. Immunohistochemical studies have indicated the presence of scavenger\ receptors in the macrophages of lipid-rich atherosclerotic lesions, suggesting the involvement of these receptors in atherogenesis PUBMED:2251254.

    \ \

    The macrophage scavenger receptor is trimeric and has unusual ligand-binding\ properties PUBMED:2300204. The trimeric structure of the bovine type I scavenger \ receptor contains 3 extracellular C-terminal cysteine-rich domains connected\ to the transmembrane domain by a long fibrous stalk. The stalk structure,\ which consists of an alpha-helical coiled coil and a collagen-like triple\ helix, has not previously been observed in an integral membrane protein PUBMED:2300204.

    \ ' '3265' 'IPR004690' '\ The MSS family includes the monobasic malonate:Na+ symporter of Malonomonas rubra. It consists of two integral membrane proteins, MadL and MadM. The transporter is believed to catalyze the electroneutral reversible uptake of H+-malonate with one Na+, and both subunits have been shown to be essential for activity.\ ' '3266' 'IPR018402' '\ The MSS family includes the monobasic malonate:Na+ symporter of Malonomonas rubra. It consists of two integral membrane proteins, MadL and MadM. The transporter is believed to catalyze the electroneutral reversible uptake of H+-malonate with one Na+, and both subunits have been shown to be essential for activity.\ ' '3267' 'IPR003697' '\

    Maf is a putative inhibitor of septum formation in eukaryotes, bacteria, and archaea.\ The Maf protein shares substantial\ amino acid sequence identity with the Escherichia coli OrfE protein PUBMED:8387996.

    \ \ ' '3268' 'IPR004023' '\ This family was originally identified in Drosophila and called mago nashi; it is a strict maternal effect, grandchildless-like gene PUBMED:1765008. The human homologue has been shown to interact with an RNA binding protein, ribonucleoprotein rbm8 () PUBMED:10662555. An RNAi knockout of the Caenorhabditis elegans homologue causes masculinization of the germ line (Mog phenotype) in hermaphrodites, suggesting it is involved in hermaphrodite germ-line sex determination PUBMED:10656761, but the protein is also found in hermaphrodites and other organisms without a sexual differentiation.\ ' '3269' 'IPR004315' '\ The accessory gland of male insects is a genital tissue that secretes many components of the ejaculatory fluid, some of\ which affect the female\'s receptivity to courtship and her rate of oviposition. The protein is expressed exclusively in the\ male accessory glands of adult Drosophila melanogaster. During copulation it is transferred to the female genital tract where it is rapidly altered PUBMED:3142802.\ ' '3270' 'IPR007244' '\

    NatC N(alpha)-terminal acetyltransferases contain Mak10p, Mak31p and Mak3p subunits. All three subunits are associated with each other to form the active complex PUBMED:11274203.

    \ ' '3271' 'IPR006958' '\ The function of these proteins is unknown. The yeast orthologues have been implicated in cell cycle progression and biogenesis of 60S ribosomal subunits. The Schistosoma mansoni (Blood fluke) Mak16 has been shown to target protein transport to the nucleolus PUBMED:10838225.\ ' '3272' 'IPR011076' '\

    This entry represents the core TIM beta/alpha barrel found in malate synthase and in related proteins such as the beta subunit of citrate lyase.

    \ \

    Malate synthase () catalyses the aldol condensation of glyoxylate with acetyl-CoA to form malate as part of the second step of the glyoxylate bypass and an alternative to the tricarboxylic acid cycle in bacteria, fungi and plants. Malate synthase has a TIM beta/alpha-barrel fold PUBMED:10715138.

    \

    This entry is also represented by citrate lyase beta subunit (), a component of citrate lyase (), which catalyses the interconversion of citrate with acetate and oxaloacetate.

    \ ' '3273' 'IPR003428' '\ This mitochondrial matrix protein family contains members of the MAM33 family which bind to the globular \'heads\' of C1Q.\ ' '3274' 'IPR000296' '\ The cation dependent mannose-6-phosphate (man-6-P) receptor is one of two transmembrane proteins involved in the transport of lysosomal enzymes from the Golgi complex and the cell surface to lysosomes PUBMED:1376319. Lysosomal enzymes bearing phosphomannosyl residues bind specifically to man-6-P receptors in the Golgi apparatus and the resulting receptor-ligand complex is transported to an acidic prelysosomal compartment, where the low pH mediates dissociation of the complex. Binding is optimal in the presence of divalent cations.

    The amino acid sequence is a single polypeptide chain that contains a putative signal sequence and a transmembrane domain PUBMED:2954652. The cation-dependent mannose 6-phosphate (M6P)\ receptor (CD-MPR) is present predominantly as a\ stable homodimer in membranes and has a single\ M6P-binding site per polypeptide PUBMED:2954652, PUBMED:2544594. The molecule crystallizes as a homodimer\ with approximately 20% of the entire surface area of each monomer\ having contact with another through predominantly hydrophobic\ interactions PUBMED:12612639. Each monomer contains a single alpha-helix near its\ amino terminus followed by nine primarily anti-parallel beta-strands that form\ two beta-sheets, which are positioned orthogonally to each other. Extensive\ hydrophobic interactions are formed between the two beta-sheets, which\ results in each monomer forming a flattened beta-barrel structure. Six cysteine residues form three intramolecular disulphide bonds that\ are essential for the ligand-binding conformation of the receptor to be\ generated. The structures of the liganded molecules show that the\ carbohydrate-recognition domain of the enzyme lies relatively deep\ inside the protein, so that the terminal M6P residue and the penultimate\ sugar ring of bound pentamannosyl phosphate are mostly buried in the\ receptor. This deep binding pocket facilitates the formation of numerous\ interactions between the CD-MPR and its carbohydrate ligands.

    \ ' '3275' 'IPR013131' '\

    Mannitol-1-phosphate 5-dehydrogenase catalyses the NAD-dependent reduction of mannitol-1-phosphate to fructose-6-phosphate PUBMED:1904856 as part of the phosphoenolpyruvate-dependent phosphotransferase system (PTS). The PTS facilitates the vectorial translocation of metabolisable carbohydrates to form\ the corresponding sugar phosphates, which are then converted to glycolytic intermediates PUBMED:1322373. Mannitol 2-dehydrogenase catalyses the NAD-dependent reduction of mannitol to fructose PUBMED:8254318.\ Several dehydrogenases have been shown PUBMED:8254318 to be evolutionarily related, including mannitol-1-phosphate 5-dehydrogenase () (gene mtlD), mannitol 2-dehydrogenase () (gene mtlK); mannonate oxidoreductase () (fructuronate reductase) (gene uxuB); Escherichia coli hypothetical proteins ydfI and yeiQ; and yeast hypothetical protein YEL070w. This domain has a Rossmann-type fold.

    \ ' '3276' 'IPR001538' '\

    Mannose-6-phosphate isomerase or phosphomannose isomerase () (PMI) is the enzyme that catalyses the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes, PMI is involved in the synthesis of GDP-mannose, a constituent of N- and O-linked glycans and GPI anchors, and in prokaryotes it participates in a variety of pathways, including capsular polysaccharide biosynthesis and D-mannose metabolism. PMIs belong to the cupin superfamily, whose functions range from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria and plants, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals PUBMED:11165500. Three classes of PMI have been defined PUBMED:8307007.

    \ The type II phosphomannose isomerases are bifunctional enzymes . This entry covers the isomerase region of the protein PUBMED:9507048. The guanosine diphospho-D-mannose pyrophosphorylase region is described in another InterPro entry (see ).\ ' '3277' 'IPR007704' '\ PIG-M has a DXD motif. The DXD motif is found in many glycosyltransferases that utilise nucleotide sugars. It is thought that the motif is involved in the binding of a manganese ion that is required for association of the enzymes with nucleotide sugar substrates PUBMED:11226175.\ ' '3278' 'IPR002539' '\ The C terminus of the MaoC protein is found to share similarity with\ a wide variety of enzymes. All these enzymes contain multiple domains.\ This domain is found in parts of two enzymes that have been assigned\ dehydratase activities.\ A deletion mutant of the C-terminal 271 amino acids in \ abolished its 2-enoyl-CoA hydratase activity, suggesting that this\ region may be a hydratase enzyme PUBMED:9891075.\ The maoC gene is part of an operon with maoA which is involved\ in the synthesis of monoamine oxidase PUBMED:1556068.\ ' '3279' 'IPR005298' '\

    Map (MHC class II analogous protein), also known as eap (extracellular\ adherence protein) and p70, is exclusively found in Staphylococcus aureus. It\ is a cell-wall associated protein, which is capable of binding to a number of\ different extracellular matrix glycoproteins and plasma proteins, and to the\ cell surface of S. aureus. Besides the broad binding specificity,\ map has been shown to be important in the adherence to and internalization of\ S. aureus by eukaryotic cells as well as being capable of\ modulating inflammatory response through its interactions with ICAM-1\ (intercellular adhesion molecule-1), although its biological role in vivo\ remains to date unclear PUBMED:14523103.

    \ \

    The protein consists of a signal peptide followed by a unique sequence of\ about 20 amino acids and four to six repeated MAP domains of 110-amino acid\ residues. Within each repeat there is a subdomain consisting of 31 residues\ that was found to be highly homologous to the N-terminal beta-chain of many\ MHC class II molecules PUBMED:7545162.

    \ \

    This entry represents the MAP domain. The crystal structure of this domain has been solved and shows a core fold that is comprised of an alpha-helix lying diagonally across\ a five-stranded, mixed beta-sheet. This structure is very similar to\ the C-terminal domain of bacterial superantigens PUBMED:15691839.

    \ ' '3280' 'IPR004241' '\ Light chain 3 (LC3) may function primarily as a MAP1A and MAP1B subunit and its expression may regulate the microtubule binding activity of the neuronal microtubule-associated proteins (MAPs), MAP1A and MAP1B PUBMED:7908909. Related proteins that belong to this group include the human ganglioside expression factor and a symbiosis-related fungal protein.\ ' '3281' 'IPR000102' '\ In MAP1B the basic region containing the KKEE and KKEVI motifs is responsible for the interaction between MAP1B and microtubules in vivo. This region bears no sequence relationship to the microtubule binding domains of kinesin, MAP2, or tau PUBMED:2480963.\ Neuraxin is a putative structural protein of the rat central nervous system that is immunologically related to\ microtubule-associated protein 5 (MAP5). Neuraxin may be implicated in neuronal membrane-microtubule interactions PUBMED:2555150. Both proteins contain a region that consists of 12 tandem\ repeats of a 17-residue motif.\ ' '3282' 'IPR001129' '\

    This family describes a widespread superfamily of membrane-associated proteins with highly divergent functions in eicosanoid and glutathione metabolism (MAPEG)\ PUBMED:10091672. Included are:

    \

    \ ' '3283' 'IPR002771' '\

    Members of this family are integral membrane proteins that include the antibiotic resistance protein MarC. These proteins may be transporters.

    \ ' '3284' 'IPR002101' '\

    Myristoylated alanine-rich C-kinase substrate (MARCKS) is a predominant\ cellular substrate for protein kinase C (PKC) that has been implicated in the regulation of brain development, \ macrophage activation, neuro-secretion and growth factor-dependent\ mitogenesis PUBMED:8420923, PUBMED:11829734. The N-terminal glycine is the site of myristoylation, \ which allows effective binding of the protein to the plasma membrane, where\ it co-localises with PKC PUBMED:2034276. MARCKS binds calmodulin in a calcium-dependent\ manner; the region responsible for calcium-binding is highly basic, a domain\ of about 25 amino acids known as the PSD or effector domain, which also contains the PKC\ phosphorylation sites and has been shown to contribute to membrane binding. When not phosphorylated, the effector domain can bind\ to filamentous actin PUBMED:1560845. It is believed that MARCKS may be a regulated \ crossbridge between actin and the plasma membrane; modulation of the actin\ cross-linking activity by calmodulin and phosphorylation represents a\ potential convergence of the calcium-calmodulin and PKC signal transduction\ pathways in regulation of the actin cytoskeleton. MARCKS also contains an MH2 domain of unknown function.

    \

    MARCKS-related protein (MRP) is similar to MARCKS in terms of properties\ such as its myristoylation, phosphorylation and calmodulin-binding, and\ shares a high degree of sequence similarity. The two regions that show the highest\ similarity are the kinase C phosphorylation site domain and the N-terminal\ region containing the myristoylation site PUBMED:1864362. MARCKS and MRP amino acid \ compositions are similar, but the alanine content of the latter is lower. MARCKS proteins appear to adopt a native unfolded conformation i.e. as randomly folded chains arranged in non-classical extended conformations, in common with other substrates of PKC.

    \ ' '3285' 'IPR001038' '\

    Equid herpesvirus 1 (Equine herpesvirus 1, EHV-1) glycoprotein 13 (EHV-1 gp13) has the characteristic features of a membrane-spanning protein: an N-terminal signal sequence; a hydrophobic membrane anchor region; a charged C-terminal cytoplasmic tail; and an exterior domain with nine potential N-glycosylation sites PUBMED:2455821. EHV-1 gp13 is the structural homologue of the gC-like glycoproteins of the Human herpesvirus 1 (HHV-1) and Human herpesvirus 2 (HHV-2) (gC-1 and gC-2 respectively), Pseudorabies virus (strain Indiana-Funkhauser/Becker) (PRV) (gIII) and Human herpesvirus 3 (HHV-3) (gp66).

    \

    Secretory glycoprotein GP57-65 precursor (glycoprotein A - GA) is similar to Herpesvirus glycoprotein C, and belongs to the immunoglobulin gene superfamily PUBMED:2836620, PUBMED:2543160. GA is thought to play an immunoevasive role in the pathogenesis of Marek\'s disease. It is a candidate for causing the early-stage immunosuppression that occurs after MDHV infection.

    \ ' '3286' 'IPR000835' '\

    The marR-type HTH domain is a DNA-binding, winged helix-turn-helix (wHTH) domain of about 135 amino acids present in transcription regulators of the marR/slyA family, involved in the development of antibiotic resistance. This family of transcription regulators is named after Escherichia coli marR, a repressor of genes which activate the multiple antibiotic resistance and oxidative stress regulons, and after slyA from Salmonella typhimurium and E. coli, a transcription regulator that is required for virulence and survival in\ the macrophage environment. Regulators with the marR-type HTH domain are\ present in bacteria and archaea and control a variety of biological functions,\ including resistance to multiple antibiotics, household disinfectants, organic\ solvents, oxidative stress agents and regulation of the virulence factor\ synthesis in pathogens of humans and plants. Many of the marR-like regulators\ respond to aromatic compounds PUBMED:10498949, PUBMED:10094687, PUBMED:12649270.

    \ \

    The crystal structures of marR, mexR and slyA have been determined and show a winged HTH DNA-binding core flanked by helices involved in dimerisation. The DNA-binding domains are ascribed to the superfamily of winged\ helix proteins, containing a three (four)-helix (H) bundle and a three-stranded antiparallel beta-sheet (B) in the topology: H1-(H1\')-H2-B1-H3-H4-B2-B3-H5-H6. Helices 3 and 4 comprise the helix-turn-helix motif and the beta-sheet is called the wing. Helix 4 is termed the recognition helix, like in other HTHs where it binds the DNA major groove. The helices 1, 5 and 6 are involved in dimerisation, as most marR-like transcription regulators form dimers PUBMED:12649270, PUBMED:11473263.\

    \ ' '3287' 'IPR002056' '\ Virtually all mitochondrial precursors are imported via the same \ mechanism PUBMED:7709435: precursors first bind to receptors on the mitochondrial\ surface, then insert into the translocation channel in the outer membrane.\ Many outer-membrane proteins participate in the early stages of import,\ four of which (MAS20, MAS22, MAS37 and MAS70) are components of the receptor.\ MAS20, which forms a subcomplex with MAS22, seems to interact with most or\ all mitochondrial precursors, suggesting that the protein binds directly\ to mitochondrial targeting sequences. The MAS37 and MAS70 components also\ form a subcomplex, the two subcomplexes possibly binding via their trans-\ membrane (TM) regions - the TM region of MAS70 promotes oligomerisation\ of attached protein domains and shares sequence similarity with the\ TM region of MAS20 PUBMED:8163528.\ ' '3288' 'IPR006856' '\

    This family includes Saccharomyces cerevisiae (Baker\'s yeast) mating type protein alpha 1 (). MAT alpha 1 is a transcription activator that activates mating-type alpha-specific genes with the help of the MADS-box containing MCM1 transcription factor, which together bind cooperatively to PQ elements upstream of alpha-specific genes. The MCM1-MATalpha1 complex is required for the proper DNA-bending that is needed for transcriptional activation PUBMED:15118075. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone () response pathway PUBMED:8339934.

    \ ' '3289' 'IPR002528' '\

    Characterised members of the Multi Antimicrobial Extrusion (MATE) family function as drug/sodium antiporters. These proteins mediate resistance to a wide range of cationic dyes, fluoroquinolones, aminoglycosides and other structurally diverse antibiotics and drugs. MATE proteins are found in bacteria, archaea and eukaryotes. These proteins are predicted to have 12 alpha-helical transmembrane regions; some of the animal proteins may have an additional C-terminal helix.

    \ ' '3290' 'IPR002866' '\

    The maturases are splicing factors for the plant group II introns. These introns are found in plant organelles PUBMED:16763758. Maturases in higher plants are encoded by nuclear genes PUBMED:12527773 but are otherwise encoded by organellar introns. The Maturase-related, N-terminal domain is found in plant potential maturases, which probably assist in the splicing of chloroplast group II introns PUBMED:8255751. The function of this region is, however, unknown.

    \ ' '3291' 'IPR000982' '\

    The matrix protein plays a crucial role in virus assembly, and interacts with the RNP complex as well as with the viral membrane. It is found in Morbillivirus, Paramyxovirus, Pneumovirus.

    \ ' '3292' 'IPR004518' '\

    This domain is found in a group of prokaryotic proteins which includes Escherichia coli MazG. The domain is about 100 amino acid residues in length and contains four conserved negatively charged residues that probably form an active site or metal binding site.

    \ ' '3293' 'IPR006922' '\

    This family consists of Mbe/Mob proteins defined by an N-terminal conserved region. These proteins are essential for specific plasmid transfer.

    \ ' '3294' 'IPR006983' '\ The MbeD and MobD proteins are plasmid encoded, and are involved in the plasmid mobilisation and transfer in the presence of conjugative plasmids PUBMED:2671664.\ ' '3295' 'IPR005153' '\

    This domain is found in the MbtH protein as well as at the N-terminus of the antibiotic synthesis protein NIKP1. This domain is about 70 amino acids long and contains 3 fully conserved tryptophan residues. Many of the members of this family are found in known antibiotic synthesis gene clusters.

    \ ' '3296' 'IPR003209' '\

    Methenyltetrahydromethanopterin cyclohydrolase catalyses the interconversion of methenyltetrahydromethanopterin and N(5)-formyltetrahydromethanopterin, and is found in both archaea and bacteria. In methanogenic archaea, such as Methanobacterium thermoautotrophicum (strain Marburg / DSM 2133), this enzyme is involved in the production of methane from carbon dioxide PUBMED:8617278. In the sulphate-reducer Archaeoglobus fulgidus, this enzyme is involved in the tetrahydromethanopterin-dependent oxidation of lactate PUBMED:8481088. In Gram-negative methylotrophic bacteria this enzyme is involved in the tetrahydromethanopterin-dependent oxidation of formaldehyde to formate PUBMED:10482517.

    \ ' '3297' 'IPR001208' '\

    MCM proteins are DNA-dependent ATPases required for the initiation of\ eukaryotic DNA replication PUBMED:1454522, PUBMED:8265339, PUBMED:14731643. In eukaryotes there is a family of six proteins, MCM2 to MCM7. They were first identified in yeast where most of them have a\ direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called minichromosome maintenance proteins, MCM proteins PUBMED:8332451.

    \ \

    This family is also present in the archaea in 1 to 4 copies. Methanocaldococcus jannaschii (Methanococcus jannaschii) has four members, MJ0363, MJ0961, MJ1489 and MJECL13.

    \ \

    The "MCM motif" contains Walker-A and Walker-B type nucleotide binding motifs. The diagnostic sequence defining the MCMs is IDEFDKM. Only Mcm2 (aka Cdc19 or Nda1) has been subjected to mutational analysis in this region, and most mutations abolish its activity PUBMED:9383050. The presence of a putative ATP-binding domain implies that these proteins may be involved in an ATP-consuming step in the initiation of DNA replication in eukaryotes.

    \ \

    The MCM proteins bind together in a large complex PUBMED:9366552.\ Within this complex, individual subunits associate with different affinities, and there is a tightly associated core of Mcm4 (Cdc21), Mcm6 (Mis5) and Mcm7 PUBMED:9658174. This core complex in human MCMs has been associated with helicase activity in vitro PUBMED:9305914, leading to the suggestion that the MCM proteins are the eukaryotic replicative helicase.

    \ \

    Schizosaccharomyces pombe (Fission yeast) MCMs, like those in metazoans, are found in the nucleus throughout the cell cycle. This is in contrast to Saccharomyces cerevisiae (Baker\'s yeast), in which MCM proteins move in and out of the nucleus during each cell cycle. The assembly of the MCM complex in S. pombe is required for MCM localisation, ensuring that only intact MCM complexes remain in the nucleus PUBMED:10588642.

    \ ' '3298' 'IPR004089' '\

    Methyl-accepting chemotaxis proteins (MCPs) are a family of bacterial receptors that mediate chemotaxis to diverse signals, responding to changes in the concentration of attractants and repellents in the environment by altering swimming behaviour PUBMED:16359703. Environmental diversity gives rise to diversity in bacterial signalling receptors, and consequently there are many genes encoding MCPs PUBMED:17299051. For example, there are four well-characterised MCPs found in Escherichia coli: Tar (taxis towards aspartate and maltose, away from nickel and cobalt), Tsr (taxis towards serine, away from leucine, indole and weak acids), Trg (taxis towards galactose and ribose) and Tap (taxis towards dipeptides).

    \

    MCPs share similar topology and signalling mechanisms. MCPs either bind ligands directly or interact with ligand-binding proteins, transducing the signal to downstream signalling proteins in the cytoplasm. MCPs undergo two covalent modifications: deamidation and reversible methylation at a number of glutamate residues. Attractants increase the level of methylation, while repellents decrease it. The methyl groups are added by the methyl-transferase CheR and are removed by the methylesterase CheB. Most MCPs are homodimers with the following organisation: an N-terminal signal sequence that acts as a transmembrane domain in the mature protein; a poorly-conserved periplasmic receptor (ligand-binding) domain; a second transmembrane domain; and a highly-conserved C-terminal cytoplasmic domain that interacts with downstream signalling components. The C-terminal domain contains the methylated glutamate residues.

    \ \

    This entry represents the signalling domain found in several methyl-accepting chemotaxis proteins. This domain is thought to transduce the signal to CheA since it is highly conserved in very diverse MCPs.

    \ ' '3299' 'IPR004243' '\ This minor capsid protein may act as a link between the external capsid and the internal DNA-protein core. Residues at the C-terminal end of the protein may act as a protease cofactor leading to activation of the adenovirus proteinase PUBMED:3959314.\ ' '3300' 'IPR009047' '\

    Methyl-coenzyme M reductase (MCR) catalyses the reduction of methyl-coenzyme M (CH3-SCoM) and coenzyme B (HS-CoB) to methane and the corresponding heterosulphide CoM-S-S-CoB (), the final step in methane biosynthesis. This reaction is carried out under anaerobic conditions by methanogenic Archaea PUBMED:16260307, and requires a nickel-porphinoid prosthetic group, coenzyme F430, which is in the EPR-detectable Ni(I) oxidation state in the active enzyme. Studies on a catalytically inactive enzyme aerobically co-crystallized with coenzyme M displayed a fully occupied coenzyme M-binding site with no alternate conformations. The binding of coenzyme M appears to induce specific conformational changes that suggest a molecular mechanism by which the enzyme ensures that methyl-coenzyme M enters the substrate channel prior to coenzyme B, as required by the active-site geometry PUBMED:11491299.

    \

    MCR is a hexamer composed of 2 alpha, 2 beta, and 2 gamma subunits with two identical nickel porphinoid active sites, which form two long active site channels with F430 embedded at the bottom PUBMED:9367957, PUBMED:16234924.

    \

    This entry represents the C-terminal domain of the alpha subunit, which consists of an all-alpha multi-helical bundle.

    \ ' '3301' 'IPR003183' '\

    Methyl-coenzyme M reductase (MCR) catalyses the reduction of methyl-coenzyme M (CH3-SCoM) and coenzyme B (HS-CoB) to methane and the corresponding heterosulphide CoM-S-S-CoB (), the final step in methane biosynthesis. This reaction is carried out under anaerobic conditions by methanogenic Archaea PUBMED:16260307, and requires a nickel-porphinoid prosthetic group, coenzyme F430, which is in the EPR-detectable Ni(I) oxidation state in the active enzyme. Studies on a catalytically inactive enzyme aerobically co-crystallized with coenzyme M displayed a fully occupied coenzyme M-binding site with no alternate conformations. The binding of coenzyme M appears to induce specific conformational changes that suggest a molecular mechanism by which the enzyme ensures that methyl-coenzyme M enters the substrate channel prior to coenzyme B, as required by the active-site geometry PUBMED:11491299.

    \

    MCR is a hexamer composed of 2 alpha, 2 beta, and 2 gamma subunits with two identical nickel porphinoid active sites, which form two long active site channels with F430 embedded at the bottom PUBMED:9367957, PUBMED:16234924.

    \

    This entry represents the N-terminal domain of the alpha subunit, which has a ferredoxin-like alpha/beta-sandwich fold with a duplicated beta-alpha-beta topology.

    \ ' '3302' 'IPR003179' '\

    Methyl-coenzyme M reductase (MCR) catalyses the reduction of methyl-coenzyme M (CH3-SCoM) and coenzyme B (HS-CoB) to methane and the corresponding heterosulphide CoM-S-S-CoB (), the final step in methane biosynthesis. This reaction is carried out under anaerobic conditions by methanogenic Archaea PUBMED:16260307, and requires a nickel-porphinoid prosthetic group, coenzyme F430, which is in the EPR-detectable Ni(I) oxidation state in the active enzyme. Studies on a catalytically inactive enzyme aerobically co-crystallized with coenzyme M displayed a fully occupied coenzyme M-binding site with no alternate conformations. The binding of coenzyme M appears to induce specific conformational changes that suggest a molecular mechanism by which the enzyme ensures that methyl-coenzyme M enters the substrate channel prior to coenzyme B, as required by the active-site geometry PUBMED:11491299.

    \

    MCR is a hexamer composed of 2 alpha, 2 beta, and 2 gamma subunits with two identical nickel porphinoid active sites, which form two long active site channels with F430 embedded at the bottom PUBMED:9367957, PUBMED:16234924.

    \

    This entry represents the beta subunit.

    \ ' '3303' 'IPR003179' '\

    Methyl-coenzyme M reductase (MCR) catalyses the reduction of methyl-coenzyme M (CH3-SCoM) and coenzyme B (HS-CoB) to methane and the corresponding heterosulphide CoM-S-S-CoB (), the final step in methane biosynthesis. This reaction is carried out under anaerobic conditions by methanogenic Archaea PUBMED:16260307, and requires a nickel-porphinoid prosthetic group, coenzyme F430, which is in the EPR-detectable Ni(I) oxidation state in the active enzyme. Studies on a catalytically inactive enzyme aerobically co-crystallized with coenzyme M displayed a fully occupied coenzyme M-binding site with no alternate conformations. The binding of coenzyme M appears to induce specific conformational changes that suggest a molecular mechanism by which the enzyme ensures that methyl-coenzyme M enters the substrate channel prior to coenzyme B, as required by the active-site geometry PUBMED:11491299.

    \

    MCR is a hexamer composed of 2 alpha, 2 beta, and 2 gamma subunits with two identical nickel porphinoid active sites, which form two long active site channels with F430 embedded at the bottom PUBMED:9367957, PUBMED:16234924.

    \

    This entry represents the beta subunit.

    \ ' '3304' 'IPR003901' '\

    Methyl-coenzyme M reductase (MCR) catalyses the reduction of methyl-coenzyme M (CH3-SCoM) and coenzyme B (HS-CoB) to methane and the corresponding heterosulphide CoM-S-S-CoB (), the final step in methane biosynthesis. This reaction is carried out under anaerobic conditions by methanogenic Archaea PUBMED:16260307, and requires a nickel-porphinoid prosthetic group, coenzyme F430, which is in the EPR-detectable Ni(I) oxidation state in the active enzyme. Studies on a catalytically inactive enzyme aerobically co-crystallized with coenzyme M displayed a fully occupied coenzyme M-binding site with no alternate conformations. The binding of coenzyme M appears to induce specific conformational changes that suggest a molecular mechanism by which the enzyme ensures that methyl-coenzyme M enters the substrate channel prior to coenzyme B, as required by the active-site geometry PUBMED:11491299.

    \

    MCR is a hexamer composed of 2 alpha, 2 beta, and 2 gamma subunits with two identical nickel porphinoid active sites, which form two long active site channels with F430 embedded at the bottom PUBMED:9367957, PUBMED:16234924.

    \

    Genes encoding the beta (mcrB) and gamma (mcrG) subunits of MCR are separated by two open reading frames coding for two proteins C and D PUBMED:3170483, PUBMED:8863453. The function of proteins C and D is unknown. This entry represents protein D.

    \ ' '3305' 'IPR003178' '\

    Methyl-coenzyme M reductase (MCR) catalyses the reduction of methyl-coenzyme M (CH3-SCoM) and coenzyme B (HS-CoB) to methane and the corresponding heterosulphide CoM-S-S-CoB (), the final step in methane biosynthesis. This reaction is carried out under anaerobic conditions by methanogenic Archaea PUBMED:16260307, and requires a nickel-porphinoid prosthetic group, coenzyme F430, which is in the EPR-detectable Ni(I) oxidation state in the active enzyme. Studies on a catalytically inactive enzyme aerobically co-crystallized with coenzyme M displayed a fully occupied coenzyme M-binding site with no alternate conformations. The binding of coenzyme M appears to induce specific conformational changes that suggest a molecular mechanism by which the enzyme ensures that methyl-coenzyme M enters the substrate channel prior to coenzyme B, as required by the active-site geometry PUBMED:11491299.

    \

    MCR is a hexamer composed of 2 alpha, 2 beta, and 2 gamma subunits with two identical nickel porphinoid active sites, which form two long active site channels with F430 embedded at the bottom PUBMED:9367957, PUBMED:16234924.

    \

    This entry represents the gamma subunit, which has a complex alpha-helical/beta-sheet topology.

    \ ' '3306' 'IPR003420' '\

    Methanol dehydrogenase (MDH) (), found in Gram-negative bacteria, is a pyrroloquinoline quinone (PQQ)-containing enzyme which oxidises methanol to formaldehyde. It is located in the periplasmic space and passes electrons derived from the oxidation of methanol to the soluble cytochrome cL PUBMED:15234264. The enzyme is a tetramer composed of two large alpha subunits and two smaller beta subunits. The alpha subunit binds the PQQ cofactor and contains the active site, while the function of the beta subunit is currently unknown PUBMED:11502173. The alpha subunit forms an eight-bladed propeller structure, with several novel tryptophan-docking motifs linking the individual blades together.

    \ \

    This entry represents the beta subunit of methanol dehydrogenase.

    \ ' '3308' 'IPR007444' '\

    Membrane-derived oligosaccharides (MDO) are members of a family of glucans found in the periplasmic space of Gram-negative bacteria. MdoG has been shown to be necessary for the synthesis of MDO PUBMED:7934824, but its exact function is not known yet. MdoD, an MdoG paralog, is a twin-arginine-dependent periplasmic protein that controls osmoregulated periplasmic glucan backbone structures PUBMED:15175282. This entry represents the functional portion of the protein and excludes the N-terminal signal sequence.

    \ ' '3309' 'IPR013504' '\

    Methylamine dehydrogenase (MADH) () is a periplasmic quinoprotein found in several methylotrophic bacteria PUBMED:8021187. It is induced when the bacteria are grown on methylamine as a carbon source, and catalyses the oxidative deamination of amines to their corresponding aldehydes. The redox cofactor of this enzyme is tryptophan tryptophylquinone (TTQ). Electrons derived from the oxidation of methylamine are passed to an electron acceptor, which is usually the blue-copper protein amicyanin ().

    \ \ \ \

    MADH is a hetero-tetramer, composed of two heavy subunits and two light subunits. The light subunit forms two antiparallel beta sheets, and contains the active site of this enzyme, which is accessible via a hydrophobic channel between the heavy and light subunits. The redox cofactor TTQ is formed from two post-translationally modified tryptophan residues within this subunit PUBMED:9514722.

    \ ' '3310' 'IPR007018' '\

    Regulation of mRNA synthesis requires intermediary proteins that transduce regulatory signals from upstream transcriptional activator proteins to basal transcription machinery at the core promoter. Three types of intermediary factors that enable the basal transcription machinery to respond to transcriptional activator proteins bound to regulatory DNA sequences have been identified: (i) TAFIIs, which associate with TATA-binding protein (TBP) to form TFIID; (ii) mediator, which associates with RNA polymerase II to form a holo-polymerase; and (iii) coactivators such as human upstream stimulatory activity (USA), mammalian CBP/P300, yeast ADA complex, and HMG proteins. The interaction of these multiprotein complexes with activators and general transcription factors is essential for transcriptional regulation.

    This family represents the transcriptional mediator protein subunit 6, which is required for activation of many RNA polymerase II promoters and is conserved from yeast to humans PUBMED:9234719.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '3311' 'IPR004354' '\

    REC114 is one of 10 genes required for initiation of meiotic recombination \ in Saccharomyces cerevisiae PUBMED:9267437. Located on chromosome XIII, it is \ transcribed only in meiosis and has no detectable function in mitosis PUBMED:8385581.

    \

    REC114 has been shown to possess an intron and is one of only three genes\ in yeast with 3\' introns PUBMED:9267437. The 3\' splice site utilised in REC114 is a\ very rare AAG sequence - only three other genes in yeast use this non-consensus sequence PUBMED:9267437. It appears that the intron is not essential for\ expression of REC114 and is not absolutely required for meiotic function.\ Nevertheless, it is conserved in evolution - two other species of yeast\ contain an intron at the same location in their REC114 genes PUBMED:9267437.

    \ ' '3312' 'IPR004927' '\

    Mercury is a highly toxic metal. Toxicity can result from three different\ mercurial forms: elemental, inorganic ion and organomercurial compounds. The\ ability of bacteria to detoxify mercurial compounds by reduction and\ volatilisation is conferred by the Mer genes, which are usually plasmid\ encoded (although chromosome resistance determinants have also occasionally\ been identified) PUBMED:9168120. Organomercurial lyase (MerB), also known as alkylmercury lyase, mediates the first\ of the two steps in the microbial detoxification of organomercurial salts\ (the other catalysed by mercuric reductase). \

    \

    Organomercurial lyase catalyses the protonolysis of the C-Hg bond in a wide\ range of organomercurial salts (primary, secondary, tertiary, alkyl, vinyl,\ allyl and aryl) to Hg(II) and the respective organic compound PUBMED:10548738:\

    \

    RHg(+) + H(+) = RH + Hg(2+)\

    \

    Hg(II) is subsequently detoxified by mercuric reductase. \

    \

    The enzyme has been purified to homogeneity in Escherichia coli and has been found\ to be a 22.4 kDa monomer with no detectable cofactors or metal ions.

    \ ' '3313' 'IPR000111' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycosyl hydrolase family 27, family 31 and family 36 alpha-galactosidases form the glycosyl hydrolase clan GH-D (), a superfamily of alpha-galactosidases, alpha-N-acetylgalactosaminidases, and isomaltodextranases which are likely to share a common catalytic mechanism and structural topology.

    \

    Alpha-galactosidase () (melibiase) PUBMED:4561015 catalyzes the hydrolysis of\ melibiose into galactose and glucose. In man, the deficiency of this enzyme is\ the cause of Fabry\'s disease (X-linked sphingolipidosis). Alpha-galactosidase\ is present in a variety of organisms. There is a considerable degree of\ similarity in the sequence of alpha-galactosidase from various eukaryotic\ species.\ Escherichia coli alpha-galactosidase (gene melA), which requires NAD and\ magnesium as cofactors, is not structurally related to the eukaryotic enzymes;\ by contrast, an Escherichia coli plasmid-encoded alpha-galactosidase (gene\ rafA) PUBMED:2556373 contains a region of about 50 amino acids which is similar to a\ domain of the eukaryotic alpha-galactosidases.\ Alpha-N-acetylgalactosaminidase () PUBMED:2174888 catalyzes the hydrolysis of\ terminal non-reducing N-acetyl-D-galactosamine residues in N-acetyl-alpha-D-galactosaminides. In man, the deficiency of this enzyme is the cause of\ Schindler and Kanzaki diseases. The sequence of this enzyme is highly related\ to that of the eukaryotic alpha-galactosidases.

    \ ' '3314' 'IPR002116' '\

    Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an Arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.

    \ \

    The allergens in this family include allergens with the following designations: Api m 3.

    \ \

    Melittin is the principal protein component of the venom of the honeybee, Apis mellifera. It inhibits protein kinase C, Ca2+/calmodulin-dependent protein kinase II, myosin light chain kinase and Na+/K+-ATPase (synaptosomal membrane) and is a cell membrane lytic factor. Melittin is a small peptide with no disulphide bridge; the N-terminal part of the molecule is predominantly hydrophobic and the C-terminal part is hydrophilic and strongly basic.

    \

    The molecular mechanisms underlying the various effects of melittin on membranes have not been completely defined and much of the evidence indicates that different molecular mechanisms may underlie different actions of the peptide PUBMED:2187536.

    \ \

    Extensive work with melittin has shown that the venom has multiple effects, probably as a result of its interaction with negatively charged phospholipids. It inhibits well-known transport pumps such as the Na+-K+-ATPase and the H+-K+-ATPase. Melittin increases the permeability of cell membranes to ions, particularly Na+ and, indirectly, Ca2+, because of the Na+-Ca2+-exchange. This effect results in marked morphological and functional changes, particularly in excitable tissues such as cardiac myocytes. In some other tissues, e.g., cornea, not only Na+ but also Cl- permeability is increased by melittin. Effects similar to those of melittin on H+-K+-ATPase have been found with the synthetic amphipathic polypeptide Trp-3 PUBMED:10072885.

    \

    The study of melittin in model membranes has been useful for the development of methodology for determination of membrane protein structures. A molecular dynamics simulation of melittin in a hydrated dipalmitoylphosphatidylcholine (DPPC) bilayer was carried out. The effect of melittin on the surrounding membrane was localised to its immediate vicinity, and its asymmetry with respect to the two layers may be a result of the fact that it is not fully transmembranal. Melittin\'s hydrophilic C terminus anchors it at the extracellular interface, leaving the N terminus "loose" in the lower layer of\ the membrane PUBMED:10692322.

    \ ' '3315' 'IPR004222' '\

    Methane monooxygenases () catalyse the oxidation of methane to methanol in the presence of oxygen and NADH in methanotrophs. They have a broad specificity, hydroxylating many alkanes, and converting alkenes into the corresponding epoxides. In additional reactions, CO is oxidized to CO2, ammonia is oxidized to hydroxylamine, and some aromatic compounds and cyclic alkanes can also be hydroxylated, although more slowly. In Methylococcus capsulatus there are two forms of the enzyme, a soluble and a membrane-bound type. The soluble form consists of 3 components, A, B and C. Protein A is made up of 3 chains, alpha, beta and gamma.

    \

    This entry represents the gamma chain of methane monooxygenases. Structurally, the gamma chain contains two domains, each consisting of three helices arranged in an open bundle topology PUBMED:, PUBMED:9329079.

    \ ' '3316' 'IPR004891' '\ The mercury resistance protein, MerC, is an inner membrane protein that mediates Hg2+ transport into the cytoplasm, thereby conferring mercury resistance.\ ' '3317' 'IPR007746' '\ The prokaryotic MerE (or URF-1) protein is part of the mercury resistance operon. The protein is thought not to have any direct role in conferring mercury resistance to the organism but may be a mercury resistance transposon PUBMED:9479042, PUBMED:11763242.\ ' '3318' 'IPR000551' '\

    The many bacterial transcription regulation proteins which bind DNA through a \'helix-turn-helix\' motif can be classified into subfamilies on the basis of sequence similarities. One of these is the MerR subfamily. MerR, which is found in many bacterial species, mediates the mercuric-dependent induction of the mercury\ resistance operon. In the absence of mercury, MerR represses transcription by binding tightly, as a dimer, to the \'mer\' operator region; when mercury is present the dimeric complex binds a single ion and becomes a potent transcriptional activator, while remaining bound to the mer site. Members of the family include the mercuric resistance operon regulatory protein merR; \ Bacillus subtilis bltR and bmrR; Bacillus glnR;\ Streptomyces coelicolor hspR; Bradyrhizobium japonicum nolA; Escherichia coli superoxide response regulator soxR;\ and Streptomyces lividans transcriptional activator tipA PUBMED:7688297, PUBMED:2492496, PUBMED:7608059, PUBMED:1677938, PUBMED:1988958, PUBMED:2305262. Other members include hypothetical proteins from E. coli, B. subtilis and Haemophilus influenzae. Within this family, the HTH motif is situated towards the N-terminus.

    \ ' '3319' 'IPR003457' '\ MerT is a mercuric transport integral membrane protein and is responsible for transport of the Hg2+ ion from periplasmic MerP (also part of the transport system) to mercuric reductase (MerA).\ ' '3320' 'IPR003402' '\ The methionine-10 mutant allele of Neurospora crassa codes for a protein of unknown function. However, homologous proteins have been found in yeast, suggesting this protein may be involved in methionine biosynthesis, transport and/or utilization PUBMED:7557397.\ ' '3321' 'IPR004223' '\

    Vitamin B12 dependent methionine synthase (5-methyltetrahydrofolate--homocysteine S-methyltransferase) catalyses the conversion of 5-methyltetrahydrofolate and L-homocysteine to tetrahydrofolate and L-methionine as the final step in de novo methionine biosynthesis. The enzyme requires methylcobalamin as a cofactor. In humans, defects in this enzyme are the cause of autosomal recessive inherited methylcobalamin deficiency (CBLG), which causes mental retardation, macrocytic anemia and homocystinuria. Mild deficiencies in activity may result in mild hyperhomocysteinemia, and mutations in the enzyme may be involved in tumorigenesis. Vitamin B12 dependent methionine synthase is found in prokaryotes and eukaryotes, but in prokaryotes the cofactor is cobalamin.

    \

    In Escherichia coli, methionine synthase is a large enzyme composed of four structurally and functionally distinct modules: the first two modules bind homocysteine and tetrahydrofolate, the third module binds the B12 cofactor (, ), and the C-terminal module (activation domain) binds S-adenosylmethionine. The activation domain is essential for the reductive activation of the enzyme. During the catalytic cycle, the highly reactive cob(I)alamin intermediate can be oxidised to produce an inactive cob(II)alamin enzyme; the enzyme is then reactivated via reductive methylation by the activation domain PUBMED:11731805. The activation domain adopts an unusual alpha/beta fold.

    \ ' '3322' 'IPR005184' '\

    A domain found in proteins of unknown function PUBMED:12625841, some of which are described as heat shock protein (HslJ). In Helicobacter pylori (Campylobacter pylori) the protein is secreted e.g. () and implicated in motility. In Leishmania spp. it is described as an essential protein, over-expression of which in Leishmania amazonensis increases virulence (; PUBMED:10403759). A pair of cysteine residues shows correlated conservation, suggesting that they form a disulphide bond.

    \ ' '3323' 'IPR006124' '\ This domain unites alkaline phosphatase,\ N-acetylgalactosamine-4-sulphatase, and cerebroside sulphatase, enzymes with known\ three-dimensional structures, with phosphopentomutase,\ 2,3-bisphosphoglycerate-independent phosphoglycerate mutase, phosphoglycerol\ transferase, phosphonate monoesterase, streptomycin-6-phosphate phosphatase, alkaline\ phosphodiesterase/nucleotide pyrophosphatase PC-1, and several closely related sulphatases. This domain is also related to alkaline phosphatase PUBMED:10082381.\ The most conserved residues are\ probably involved in metal binding and catalysis.\ ' '3324' 'IPR000869' '\

    Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium, \ nickel, etc. They have a high content of cysteine residues that bind the metal ions through clusters \ of thiolate bonds PUBMED:3064814, PUBMED:2959513, PUBMED:1779825. The metallothionein superfamily comprises \ all polypeptides that resemble equine renal metallothionein in several respects, e.g. low molecular\ weight; high metal content; amino acid composition with high Cys and low aromatic residue content; \ unique sequence with characteristic distribution of cysteines, and spectroscopic manifestations \ indicative of metal thiolate clusters. A MT family subsumes MTs that share particular sequence-specific \ features and are thought to be evolutionarily related. Fifteen MT families have been characterised, \ each family being identified by its number and its taxonomic range.\

    Fungi-IV (family 11) MTs are \ proteins of about 55-56 residues, with 9 conserved cysteines. Its members are recognised by the sequence pattern C-X-K-C-x-C-x(2)-C-K-C. \ The taxonomic range of the members extends to ascomycotina. \ The protein contains a number of unusual histidine and phenylalanine residues conserved in the N-terminal part of the sequence. This fragment does not contain any Cys. The protein binds to copper ions.

    \ ' '3325' 'IPR000347' '\ Members of this family are metallothioneins. These\ proteins are cysteine rich proteins that bind to heavy\ metals. Members of this family appear to be closest to\ Class II metallothioneins.\ ' '3326' 'IPR000966' '\

    Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium, and nickel. They have a high content of cysteine residues that bind the metal ions through clusters of thiolate bonds PUBMED:1779825, PUBMED:2959513, PUBMED:3064814. An empirical classification into three classes has been proposed by Fowler and coworkers PUBMED:2959504 and Kojima PUBMED:1779826. Members of class I are defined to include polypeptides related in the positions of their cysteines to equine MT-1B, and include mammalian MTs as well as MTs from crustaceans and molluscs. Class II groups MTs from a variety of species, including sea urchins, fungi, insects and cyanobacteria. Class III MTs are atypical polypeptides composed of gamma-glutamylcysteinyl units. This original classification system has been found to be limited, in the sense that it does not allow clear differentiation of patterns of structural similarities, either between or within classes. Consequently, all class I and class II MTs (the proteinaceous sequences) have now been grouped into families of phylogenetically-related and thus alignable sequences.

    \

    Diptera (Drosophila, family 5) MTs are 40-43 residue proteins that contain 10 conserved cysteines arranged in five Cys-X-Cys groups. In particular, the consensus pattern C-G-x(2)-C-x-C-x(2)-Q-x(5)-C-x-C-x(2)-D-C-x-C has been found to be diagnostic of family 5 MTs. The protein is found primarily in the alimentary canal, and its induction is stimulated by ingestion of cadmium or copper PUBMED:2578462. Mercury, silver and zinc induce the protein to a lesser extent. Family 5 includes two subfamilies, d1 and d2; only one d2 member is known to date. Both subfamilies hit the same entry.

    \ ' '3327' 'IPR000316' '\

    Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium, nickel, etc. They have a high content of cysteine residues that bind the metal ions through clusters of thiolate bonds PUBMED:1779825, PUBMED:2959513, PUBMED:3064814, PUBMED:2959504. An empirical classification into three classes has been proposed by Fowler and coworkers PUBMED:2959504 and Kojima PUBMED:1779826. Members of class I are defined to include polypeptides related in the positions of their cysteines to equine MT-1B, and include mammalian MTs as well as MTs from crustaceans and molluscs. Class II groups MTs from a variety of species, including sea urchins, fungi, insects and cyanobacteria. Class III MTs are atypical polypeptides composed of gamma-glutamylcysteinyl units PUBMED:2959504. \ This original classification system has been found to be limited, in the sense that it does not allow clear differentiation of patterns of structural similarities, either between or within classes. Consequently, all class I and class II MTs (the proteinaceous sequences) have now been grouped into families of phylogenetically-related and thus alignable sequences. This system subdivides the MT superfamily into families, subfamilies, subgroups, and isolated isoforms and alleles. \ The metallothionein superfamily comprises all polypeptides that resemble equine renal metallothionein in several respects PUBMED:2959504: e.g., low molecular weight; high metal content; amino acid composition with high Cys and low aromatic residue content; unique sequence with characteristic distribution of cysteines, and spectroscopic manifestations indicative of metal thiolate clusters. A MT family subsumes MTs that share particular sequence-specific features and are thought to be evolutionarily related. The inclusion of a MT within a family presupposes that its amino acid sequence is alignable with that of all members. 
Fifteen MT families have been characterised, each family being identified by its number and its taxonomic range: e.g., Family 1: vertebrate MTs.

    \

    Family 15 consists of planta MTs. Its members are recognised by the sequence pattern [YFH]-x(5,25)-C-[SKD]-C-[GA]-[SDPAT]-x(0,1)-C-x-[CYF], which yields all plant sequences, but also MTCU_HELPO and the non-MT ITB3_HUMAN. The taxonomic range of the members extends to planta. Planta MTs are 45-84 residue proteins, containing 17 conserved cysteines that bind 5 zinc ions. Generally, there are two Cys-rich regions (domain 1 and domain 3) separated by a Cys-poor region (domain 2), and only domain 2 contains unusual residues. It is believed that the proteins may have a role in Zn2+ homeostasis during embryogenesis. Family 15 includes the following subfamilies: p1, p2, p2v, p3, pec, p21.
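
    Sequence patterns such as the family 15 pattern use PROSITE-style syntax, which maps closely onto regular expressions. The following is a minimal sketch of that mapping, not a reference implementation: the function name prosite_to_regex is an illustrative assumption, anchors and other less common PROSITE syntax are not handled, and the test peptide is synthetic rather than a real metallothionein sequence.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper for illustration. Supported elements: single
    residues, x (any residue), [ABC] classes, {ABC} exclusions, and
    repeat suffixes such as x(2) or x(5,25).
    """
    parts = []
    for element in pattern.strip().split("-"):
        # Separate an optional repeat suffix, e.g. "x(5,25)" -> "x" and "5,25".
        residue, _, repeat = element.partition("(")
        repeat = repeat.rstrip(")")
        if residue in ("x", "X"):
            residue = "."  # any amino acid
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"  # anything except the listed residues
        # [ABC] classes and single letters are already valid regex syntax.
        if repeat:
            residue += "{" + repeat + "}"
        parts.append(residue)
    return "".join(parts)


# Family 15 pattern as quoted in the entry text.
PLANT_MT = "[YFH]-x(5,25)-C-[SKD]-C-[GA]-[SDPAT]-x(0,1)-C-x-[CYF]"
regex = prosite_to_regex(PLANT_MT)

# Synthetic peptide built by hand to satisfy the pattern (assumption, not real data).
synthetic = "YAAAAACSCGSCAC"
print(regex)
print(bool(re.search(regex, synthetic)))
```

    Because the entry states that this pattern also hits MTCU_HELPO and the non-MT ITB3_HUMAN, a regex match is better read as a screening signal than as proof of family membership.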

    \ \ ' '3328' 'IPR000518' '\

    Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium and nickel. They have a high content of cysteine residues that bind the metal ions through clusters of thiolate bonds PUBMED:3064814, PUBMED:2959513, PUBMED:1779825. An empirical classification into three classes was proposed by Kojima PUBMED:1779826, with class III MTs including atypical polypeptides composed of gamma-glutamylcysteinyl units. Class I and class II MTs (the proteinaceous sequences) have now been grouped into families of phylogenetically-related and thus alignable sequences. The MT superfamily is subdivided into families, subfamilies, subgroups, and isolated isoforms and alleles. The metallothionein superfamily comprises all polypeptides that resemble equine renal metallothionein in several respects PUBMED:2959504, e.g., low molecular weight; high metal content; amino acid composition with high Cys and low aromatic residue content; unique sequence with characteristic distribution of cysteines, and spectroscopic manifestations indicative of metal thiolate clusters. A MT family subsumes MTs that share particular sequence-specific features and are thought to be evolutionarily related. Fifteen MT families have been characterised, each family being identified by its number and its taxonomic range.

    \

    Family 14 consists of prokaryota MTs. Its members are recognised by the sequence pattern K-C-A-C-x(2)-C-L-C. The taxonomic range of the members extends to cyanobacteria. Known characteristics are: 53 to 56 amino acids; 9 conserved Cys; one conserved tyrosine residue; one conserved histidine residue; other unusual residues.

    \ ' '3329' 'IPR003019' '\

    Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium, nickel, etc. They have a high content of cysteine residues that bind the metal ions through clusters of thiolate bonds PUBMED:1779825, PUBMED:2959513. An empirical classification into three classes has been proposed by Fowler and coworkers PUBMED:2959504 and Kojima PUBMED:1779826. Members of class I are defined to include polypeptides related in the positions of their cysteines to equine MT-1B, and include mammalian MTs as well as MTs from crustaceans and molluscs. Class II groups MTs from a variety of species, including sea urchins, fungi, insects and cyanobacteria. Class III MTs are atypical polypeptides composed of gamma-glutamylcysteinyl units PUBMED:2959504.

    \

    This original classification system has been found to be limited, in the sense that it does not allow clear differentiation of patterns of structural similarities, either between or within classes. Consequently, all class I and class II MTs (the proteinaceous sequences) have now been grouped into families of phylogenetically-related and thus alignable sequences. This system subdivides the MT superfamily into families, subfamilies, subgroups, and isolated isoforms and alleles.

    \

    The metallothionein superfamily comprises all polypeptides that resemble equine renal metallothionein in several respects PUBMED:2959504: e.g., low molecular weight; high metal content; amino acid composition with high Cys and low aromatic residue content; unique sequence with characteristic distribution of cysteines, and spectroscopic manifestations indicative of metal thiolate clusters. A MT family subsumes MTs that share particular sequence-specific features and are thought to be evolutionarily related. The inclusion of a MT within a family presupposes that its amino acid sequence is alignable with that of all members. Fifteen MT families have been characterised, each family being identified by its number and its taxonomic range: e.g., Family 1: vertebrate MTs [see http://www.bioc.unizh.ch/mtpage/protali.html].

    \

    This entry is a superfamily of metallothioneins, containing 3 families. All members are from eukaryotes.

    \ ' '3330' 'IPR002629' '\ This is a domain of vitamin-B12 independent methionine synthases or 5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferases, from bacteria and plants. Plants are the only higher eukaryotes that have the required enzymes for methionine synthesis PUBMED:9636232. This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to homocysteine PUBMED:9636232. The aligned region makes up the carboxy region of the approximately 750 amino acid protein except in some hypothetical archaeal proteins present in the family, where this region corresponds to the entire length.\ ' '3332' 'IPR014048' '\

    Synonym(s): 6-O-methylguanine-DNA methyltransferase, O-6-methylguanine-DNA-alkyltransferase

    \ \

    This entry represents the DNA binding region of 6-O-methylguanine-DNA methyltransferases.

    The repair of DNA containing O6-alkylated guanine is carried out by DNA-[protein]-cysteine S-methyltransferase. The major mutagenic and carcinogenic effect of methylating agents in DNA is the formation of O6-alkylguanine. The alkyl group at the O-6 position is transferred to a cysteine residue in the enzyme PUBMED:3052269. This is a suicide reaction since the enzyme is irreversibly inactivated and the methylated protein accumulates as a dead-end product. Most, but not all, of the methyltransferases are also able to repair O-4-methylthymine. DNA-[protein]-cysteine S-methyltransferases are widely distributed and are found in various prokaryotic and eukaryotic sources PUBMED:1579490.

    \ ' '3333' 'IPR008332' '\

    Synonym(s): 6-O-methylguanine-DNA methyltransferase, O-6-methylguanine-DNA-alkyltransferase

    \ \

    The repair of DNA containing O6-alkylated guanine is carried out by DNA-[protein]-cysteine S-methyltransferase. The major mutagenic and carcinogenic effect of methylating agents in DNA is the formation of O6-alkylguanine. The alkyl group at the O-6 position is transferred to a cysteine residue in the enzyme PUBMED:3052269. This is a suicide reaction since the enzyme is irreversibly inactivated and the methylated protein accumulates as a dead-end product. Most, but not all, of the methyltransferases are also able to repair O-4-methylthymine. DNA-[protein]-cysteine S-methyltransferases are widely distributed and are found in various prokaryotic and eukaryotic sources PUBMED:1579490.

    \

    This group of proteins is characterised by an N-terminal ribonuclease-like domain associated with 6-O-methylguanine DNA methyltransferase activity.

    \ ' '3334' 'IPR001077' '\

    Methyl transfer from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalysed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol, for regulatory purposes. The various aspects of the role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes, including gene regulation and differentiation, are well documented.

    \ \

    Three classes of DNA Mtases transfer the methyl group from AdoMet to the target base to form either N-6-methyladenine, N-4-methylcytosine, or C-5-methylcytosine. In C-5-cytosine Mtases, ten conserved motifs are arranged in the same order PUBMED:8127644. Motif I (a glycine-rich or closely related consensus sequence; FAGxGG in M.HhaI PUBMED:8343957), shared by other AdoMet-Mtases PUBMED:2684970, is part of the cofactor binding site and motif IV (PCQ) is part of the catalytic site. In contrast, sequence comparison among N-6-adenine and N-4-cytosine Mtases indicated two conserved segments PUBMED:2690010, although more conserved segments may be present. One of them corresponds to motif I in C-5-cytosine Mtases, and the other is named (D/N/S)PP(Y/F). Crystal structures are known for a number of Mtases PUBMED:7607476, PUBMED:8343957, PUBMED:8127644, PUBMED:7971991. The cofactor binding sites are almost identical and the essential catalytic amino acids coincide. The comparable protein folding and the existence of equivalent amino acids in similar secondary and tertiary positions indicate that many (if not all) AdoMet-Mtases have a common catalytic domain structure. This permits tertiary structure prediction of other DNA, RNA, protein, and small-molecule AdoMet-Mtases from their amino acid sequences PUBMED:7897657.

    \ \

    This domain includes a range of O-methyltransferases some of which utilise S-adenosyl methionine as substrate PUBMED:8434913. In prokaryotes, the major role of DNA methylation is to protect host DNA against degradation by restriction enzymes. In eukaryotes, DNA methylation has been implicated in the control of several cellular processes, including differentiation, gene regulation, and embryonic development. O-methyltransferases have a common catalytic domain structure, which might be universal among S-adenosyl-L-methionine (AdoMet)-dependent methyltransferases PUBMED:7773746.

    \

    Comparative analysis of the predicted amino acid sequences of a number of plant O-methyltransferase cDNA clones show that they share some 32-71% sequence identity, and can be grouped according to the different compounds they utilise as substrates PUBMED:9484457.

    \ ' '3335' 'IPR002935' '\ Members of this family are O-methyltransferases. The family includes also bacterial O-methyltransferases that may be involved in antibiotic production PUBMED:8936303.\ ' '3336' 'IPR003358' '\

    This entry represents tRNA (guanine-N-7) methyltransferase, which catalyses the formation of N(7)-methylguanine at position 46 (m7G46) in tRNA. Capping of the pre-mRNA 5\' end by addition of a monomethylated guanosine cap (m(7)G) is an essential, and the earliest, modification in the biogenesis of mRNA PUBMED:17949828. The reaction is catalysed by three enzymes: triphosphatase, guanylyltransferase, and tRNA (guanine-N-7) methyltransferase PUBMED:12403464, PUBMED:18412263.

    \ ' '3337' 'IPR005493' '\

    This entry represents a structural motif found in demethylmenaquinone methyltransferases and in the regulator of ribonuclease E activity A (RraA). These proteins contain a swivelling 3-layer beta/beta/alpha domain that appears to be mobile in most multi-domain proteins known to contain it. These proteins are structurally similar, and may have distant homology, to the phosphohistidine domain of pyruvate phosphate dikinase. The RraA fold is an ancient platform that has been adapted for a wide range of functions. RraA had been identified as a putative demethylmenaquinone methyltransferase and was annotated as MenG, but further analysis showed that RraA lacked the structural motifs usually required for methylases.

    \

    The Escherichia coli protein regulator RraA acts as a trans-acting modulator of RNA turnover, binding the essential endonuclease RNase E and inhibiting RNA processing PUBMED:14499605. RNase E forms the core of a large RNA-catalysis machine termed the degradosome. RraA (and RraB) cause remodelling of degradosome composition, which is associated with alterations in RNA decay and global transcript abundance, and as such is a bacterial mechanism for the regulation of RNA cleavage PUBMED:.

    \

    Demethylmenaquinone methyltransferases convert demethylmenaquinone (DMK) to menaquinone (MK) in the final step of menaquinone biosynthesis.

    \

    This region is also found at the C-terminus of the DlpA protein.

    \ ' '3338' 'IPR005299' '\

    This family of plant methyltransferases contains enzymes that act on a variety of substrates including salicylic acid, jasmonic acid and 7-methylxanthine. Caffeine is synthesized through sequential three-step methylation of xanthine derivatives at positions 7-N, 3-N, and 1-N. The protein 7-methylxanthine methyltransferase (designated as CaMXMT) catalyses the second step to produce theobromine PUBMED:11108716.

    \ ' '3339' 'IPR012327' '\

    In prokaryotes, the major role of DNA methylation is to protect host DNA against degradation by restriction enzymes. There are 2 major classes of DNA methyltransferase that differ in the nature of the modifications they effect. The members of one class (C-MTases) methylate a ring carbon and form C5-methylcytosine. Members of the second class (N-MTases) methylate exocyclic nitrogens and form either N4-methylcytosine (N4-MTases) or N6-methyladenine (N6-MTases). Both classes of MTase utilise the cofactor S-adenosyl-L-methionine (SAM) as the methyl donor and are active as monomeric enzymes PUBMED:7663118.

    \

    N-6 adenine-specific DNA methylases (A-Mtase) are enzymes that specifically methylate the amino group at the C-6 position of adenines in DNA. Such enzymes are found in the three existing types of bacterial restriction-modification systems (in the type I system the A-Mtase is the product of the hsdM gene, and in type III it is the product of the mod gene). All of these enzymes recognise a specific sequence in DNA and methylate an adenine in that sequence. It has been shown PUBMED:3323532, PUBMED:3248728, PUBMED:2541254, PUBMED:7607512 that A-Mtases contain a conserved motif Asp/Asn-Pro-Pro-Tyr/Phe in their N-terminal section; this conserved region could be involved in substrate binding or in the catalytic activity. The structure of N6-MTase TaqI (M.TaqI) has been resolved to 2.4 A PUBMED:7971991. The molecule folds into 2 domains: an N-terminal catalytic domain, which contains the catalytic and cofactor binding sites, and comprises a central 9-stranded beta-sheet, surrounded by 5 helices; and a C-terminal DNA recognition domain, which is formed by 4 small beta-sheets and 8 alpha-helices. The N- and C-terminal domains form a cleft that accommodates the DNA substrate. A classification of N-MTases has been proposed, based on conserved motif (CM) arrangements PUBMED:7607512. According to this classification, N6-MTases that have a DPPY motif (CM II) occurring after the FxGxG motif (CM I) are designated D12 class N6-adenine MTases.
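
    The motif-order criterion in this entry can be illustrated with a short sketch. It assumes that the first occurrence of each motif is the relevant one and that simple regular expressions are an adequate stand-in for the conserved motifs; real classification relies on curated alignments. The function name motif_order and the test strings are hypothetical, and the strings are synthetic rather than real MTase sequences.

```python
import re

# Regex stand-ins for the conserved motifs named in the entry text.
CM_I = re.compile("F.G.G")        # FxGxG motif (CM I)
CM_II = re.compile("[DN]PP[YF]")  # Asp/Asn-Pro-Pro-Tyr/Phe motif (CM II)


def motif_order(sequence: str) -> str:
    """Report the relative order of the first CM I and CM II hits.

    Assumption: only the first occurrence of each motif is considered.
    """
    first_cm1 = CM_I.search(sequence)
    first_cm2 = CM_II.search(sequence)
    if first_cm1 is None or first_cm2 is None:
        return "incomplete"
    if first_cm1.start() < first_cm2.start():
        # The arrangement the entry describes for D12 class N6-adenine MTases.
        return "CM I before CM II"
    return "CM II before CM I"


# Synthetic strings built by hand to exercise each branch (not real sequences).
print(motif_order("AAFAGAGAAADPPYAA"))
print(motif_order("AADPPYAAFAGAGAA"))
print(motif_order("AAAA"))
```

    A sequence reported as "CM I before CM II" has the arrangement that the classification above assigns to the D12 class; the other outcomes are left unlabelled here because the entry does not name them.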

    \ ' '3340' 'IPR002084' '\ Binding of a specific DNA fragment and S-adenosyl methionine (SAM) co-repressor molecules to the Escherichia coli methionine repressor (MetJ) leads to a significant reduction in dynamic flexibility of the ternary complex, with considerable entropy-enthalpy\ compensation, not necessarily involving any overall conformational change PUBMED:8026581. MetJ is a regulatory protein which when combined with\ S-adenosylmethionine (SAM) represses the expression of the methionine\ regulon and of enzymes involved in SAM synthesis. It is also autoregulated.\

    The crystal structure of the met repressor-operator complex shows two dimeric\ repressor molecules bound to adjacent sites 8 base pairs apart on an 18-base-pair\ DNA fragment. Sequence specificity is achieved by insertion of double-stranded\ antiparallel protein beta-ribbons into the major groove of B-form DNA, with direct\ hydrogen-bonding between amino-acid side chains and the base pairs. The\ repressor also recognises sequence-dependent distortion or flexibility of the operator\ phosphate backbone, conferring specificity even for inaccessible base pairs PUBMED:1406951.

    \ ' '3341' 'IPR006742' '\

    This repeated sequence, WHWLQLKPGQPMY, characterises the mating factor alpha-1 or alpha-1 mating pheromone [contains: Mating factor alpha]. The hormone is excreted into the culture medium by haploid cells of the alpha mating type and acts on cells of the opposite mating type (type A) by binding to a cognate G-protein coupled receptor which is coupled to a downstream signal transduction pathway. It inhibits DNA synthesis in type A cells, synchronising them with type alpha, and so mediates the conjugation process.

    \ ' '3342' 'IPR007328' '\ Mytilus foot protein-3 (Mfp-3) is a highly polymorphic protein family located in the byssal adhesive plaques of blue mussels.\ ' '3343' 'IPR000523' '\ Magnesium-chelatase is a three-component enzyme that catalyses the insertion of Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of (bacterio)chlorophyll. As a result, it is thought that Mg-chelatase has an important role in channeling intermediates into the (bacterio)chlorophyll branch in response to conditions suitable for photosynthetic growth. ChlI and BchD have molecular weights between 38-42 kDa.\ ' '3344' 'IPR007737' '\ Mga is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions PUBMED:11952907. The family also contains VirR like proteins which match only at the C terminus of the alignment.\ ' '3345' 'IPR007885' '\ This family contains several Mycoplasma MgpC like-proteins.\ ' '3346' 'IPR003416' '\ The MgtC protein is found in an operon with the Mg2+ transporter protein MgtB. The function of MgtC and its homologues is not known, but it is thought that MgtC may act as an accessory protein for MgtB, thus mediating magnesium influx into the cytosol. Also included in this family are the Bacillus subtilis SapB protein and several hypothetical proteins.\ ' '3347' 'IPR006668' '\

    This domain is found at the N-terminus of eubacterial magnesium transporters of the MgtE family. This domain is an intracellular domain that has an alpha-helical structure. The crystal structure of the MgtE transporter PUBMED:17700703 shows that two of the 5 magnesium ions lie at the interface between the N domain and the CBS domains. In the absence of magnesium there is a large shift between the N and CBS domains.

    \ ' '3348' 'IPR003619' '\

    Mammalian dwarfins are phosphorylated in response to transforming growth factor beta and are implicated in control of cell growth PUBMED:8799132. The dwarfin family also includes the Drosophila protein MAD that is required for the function of decapentaplegic (DPP) and may play a role in DPP signalling. Drosophila Mad binds to DNA and directly mediates activation of vestigial by Dpp PUBMED:9230443. This domain is also found in nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF).

    \ \

    This entry represents the MH1 (MAD homology 1) domain, which is found at the amino terminus of MAD related proteins such as Smads. This domain is separated from the MH2 domain by a non-conserved linker region. The crystal structure of the MH1 domain shows that a highly conserved 11 residue beta hairpin is used to bind the DNA consensus sequence GNCN in the major groove, shown to be vital for the transcriptional activation of target genes. Not all examples of MH1 can bind to DNA, however. Smad2 cannot bind DNA and has a large insertion within the hairpin that presumably abolishes DNA binding. A basic helix (H2) in MH1 with the nuclear localisation signal KKLKK has been shown to be essential for Smad3 nuclear import. Smads also use the MH1 domain to interact with transcription factors such as Jun, TFE3, Sp1, and Runx PUBMED:11532220, PUBMED:9230443, PUBMED:8799132.

    \ ' '3349' 'IPR001132' '\

    Mammalian dwarfins are phosphorylated in response to transforming growth factor beta and are implicated in control of cell growth PUBMED:8799132. The dwarfin family also includes the Drosophila protein MAD that is required for the function of decapentaplegic (DPP) and may play a role in DPP signalling. Drosophila Mad binds to DNA and directly mediates activation of vestigial by Dpp PUBMED:9230443. This domain is also found in nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF).

    \ \

    This entry represents the SMAD (Mothers against decapentaplegic (MAD) homologue) (also called MH2 for MAD homology 2) domain found at the carboxy terminus of MAD related proteins such as Smads. This domain is separated from the MH1 domain by a non-conserved linker region. The MH2 domain mediates interaction with a wide variety of proteins and provides specificity and selectivity to Smad function and also is critical for mediating interactions in Smad oligomers. Unlike MH1, MH2 does not bind DNA. The well-studied MH2 domain of Smad4 is composed of five alpha helices and three loops enclosing a beta sandwich. Smads are involved in the propagation of TGF-beta signals by direct association with the TGF-beta receptor kinase which phosphorylates the last two Ser of a conserved \'SSXS\' motif located at the C-terminus of MH2 PUBMED:11532220, PUBMED:9230443, PUBMED:8799132.

    \ ' '3350' 'IPR001039' '\

    Major Histocompatibility Complex (MHC) glycoproteins are heterodimeric cell surface receptors that function to present antigen peptide fragments to T cells responsible for cell-mediated immune responses. MHC molecules can be subdivided into two groups on the basis of structure and function: class I molecules present intracellular antigen peptide fragments (~10 amino acids) on the surface of the host cells to cytotoxic T cells; class II molecules present exogenously derived antigenic peptides (~15 amino acids) to helper T cells. MHC class I and II molecules are assembled and loaded with their peptide ligands via different mechanisms. However, both present peptide fragments rather than entire proteins to T cells, and are required to mount an immune response.

    \

    Class I MHC glycoproteins are expressed on the surface of all somatic nucleated cells, with the exception of neurons. MHC class I receptors present peptide antigens that are synthesised in the cytoplasm, which includes self-peptides (presented for self-tolerance) as well as foreign peptides (such as viral proteins). These antigens are generated from degraded protein fragments that are transported to the endoplasmic reticulum by TAP proteins (transporter of antigenic peptides), where they can bind MHC I molecules, before being transported to the cell surface via the Golgi apparatus PUBMED:9485452, PUBMED:15526153. MHC class I receptors display antigens for recognition by cytotoxic T cells, which have the ability to destroy viral-infected or malignant (surfeit of self-peptides) cells.

    \

    MHC class I molecules are comprised of two chains: a MHC alpha chain (heavy chain), and a beta2-microglobulin chain (light chain), where only the alpha chain spans the membrane. The alpha chain has three extracellular domains (alpha 1-3, with alpha1 being at the N-terminus), a transmembrane region and a C-terminal cytoplasmic tail. The soluble extracellular beta-2 microglobulin chain associates primarily with the alpha-3 domain and is necessary for MHC stability. The alpha1 and alpha2 domains of the alpha chain are referred to as the recognition region, because the peptide antigen binds in a deep groove between these two domains.

    \ \

    This entry represents the alpha chain domains alpha1 and alpha2 that make up this recognition region (the alpha3 domain is represented separately).

    \

    More information about these proteins can be found at Protein of the Month: MHC PUBMED:.

    \ ' '3351' 'IPR005330' '\

    The MHYT (~190-residue) domain is thought to function as a sensor domain in bacterial signalling proteins, and is named after its conserved amino acid motif, methionine, histidine, and tyrosine. The MHYT domain consists of six predicted transmembrane (TM) segments, connected by short arginine-rich cytoplasmic loops and by periplasmic loops rich in charged residues. Three of the TM segments contain the MHYT motif near the outer face of the cytoplasmic membrane. The MHYT domain has been found in several phylogenetically distinct bacteria, either as a separate, single domain, or in combination with other domains, such as a LytTR-type DNA-binding helix-turn-helix, or the signalling domains histidine kinase, GGDEF, EAL or PAS. Proteins containing this domain include CoxC and CoxH from Pseudomonas carboxydovorans.

    \ ' '3352' 'IPR003464' '\ This small enzyme forms a homodecameric complex that catalyses the third step in the catabolism of catechol to succinate and acetyl-CoA in the beta-ketoadipate pathway. The protein has a ferredoxin-like fold according to SCOP.\ ' '3353' 'IPR003061' '\

    \ The structural and functional relationships among independently cloned\ segments of the plasmid ColE1 region that regulates and codes for colicin E1\ (cea), immunity (imm) and the mitomycin C-induced lethality function (lys)\ have been analysed PUBMED:3936034. A model for the structure and expression of the \ colicin E1 operon has been proposed in which the cea and lys genes are \ expressed from a single inducible promoter that is controlled by the lexA\ repressor in response to the SOS system of Escherichia coli PUBMED:3936034. The imm \ gene lies between the cea and lys genes and is expressed by transcription\ in the opposite direction from a promoter located within the lys gene PUBMED:3936034.\ This arrangement indicates that the transcriptional units for all three\ genes overlap. It is proposed that the formation of anti-sense RNA may \ be an important element in the coordinate regulation of gene expression\ in this system PUBMED:3936034.

    \ \

    Hydropathy analysis of the imm gene products suggests that they have \ hydrophobic domains characteristic of membrane-associated proteins PUBMED:3936034.\ The microcin E1 immunity protein is able to protect a cell that harbours\ the plasmid ColE1 encoding colicin E1 against colicin E1; it is thus\ essential both for autonomous replication and colicin E1 immunity PUBMED:384144.

    \ ' '3354' 'IPR006777' '\

    Bacteriophage PhiX174 is one of the simplest viruses, having a single-stranded, closed circular DNA of 5386 nucleotide bases and four capsid proteins, J, F, G and H. A single molecule of H protein is found on each of the 12 spikes on the microvirus shell of the bacteriophage. H is involved in the ejection of the phage DNA, and at least one copy is injected into the host\'s periplasmic space along with the ssDNA viral genome PUBMED:8158636. Part of H is thought to lie outside the shell, where it recognises lipopolysaccharide from virus-sensitive bacterial strains PUBMED:10225278. Part of H may lie within the capsid, since mutations in H can influence the DNA ejection mechanism by affecting the DNA-protein interactions PUBMED:8433365. H may span the capsid through the hydrophilic channels formed by G proteins PUBMED:8158636.

    \ ' '3355' 'IPR006815' '\ This small protein is involved in DNA packaging, interacting with DNA via its hydrophobic C-terminus. In bacteriophage phi-X174, J is present in 60 copies, and forms an S-shaped polypeptide chain without any secondary structure. It is thought to interact with DNA through simple charge interactions PUBMED:911774.\ ' '3356' 'IPR007605' '\ E protein causes host cell lysis by inhibiting MraY, a peptidoglycan biosynthesis enzyme. This leads to cell wall failure at septation PUBMED:12100551. The N-terminal transmembrane region matches the signal peptide model and must be omitted from the family.\ ' '3357' 'IPR007567' '\ This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway.\ ' '3358' 'IPR001398' '\

    Macrophage migration inhibitory factor (MIF) is a key regulatory cytokine within innate and adaptive immune responses, capable of promoting and modulating the magnitude of the response PUBMED:15225126. MIF is released from T-cells and macrophages, and acts within the neuroendocrine system. MIF is capable of tautomerase activity, although its biological function has not been fully characterised. It is induced by glucocorticoid and is capable of overriding the anti-inflammatory actions of glucocorticoid PUBMED:16331703. MIF regulates cytokine secretion and the expression of receptors involved in the immune response. It can be taken up into target cells in order to interact with intracellular signalling molecules, inhibiting p53 function, and/or activating components of the mitogen-activated protein kinase and Jun-activation domain-binding protein-1 (Jab-1) PUBMED:15225126. MIF has been linked to various inflammatory diseases, such as rheumatoid arthritis and atherosclerosis PUBMED:16628200.

    \

    The MIF homologue D-dopachrome tautomerase is involved in detoxification through the conversion of dopaminechrome (and possibly norepinephrinechrome), the toxic quinone product of the neurotransmitter dopamine (and norepinephrine), to an indole derivative that can serve as a precursor to neuromelanin PUBMED:10644007, PUBMED:10079069.

    \ ' '3359' 'IPR005526' '\

    In Escherichia coli, three Min proteins (MinC, MinD and MinE) negatively regulate FtsZ assembly at the cell poles in order to ensure the Z-ring only assembles at cell midpoint. MinC inhibits formation of the Z-ring by preventing FtsZ assembly. MinD binds to MinC near the cell poles, sequestering MinC away from the cell midpoint so the Z-ring can form there. MinC is an oligomer, probably a dimer, that consists of two domains: the N-terminal domain is responsible for FtsZ inhibition, while the C-terminal domain is responsible for binding to MinD and to a component of the division septum PUBMED:10869074, PUBMED:17085577.

    \ ' '3360' 'IPR007874' '\ In Escherichia coli, FtsZ assembles into a Z ring at midcell. Its assembly at polar sites is prevented by the min system. MinC, a component of this system, is an inhibitor of FtsZ assembly that is positioned within the cell by interaction with the MinDE proteins. MinC is an oligomer, probably a dimer PUBMED:10869074. The C-terminal half of MinC is the most conserved and interacts with MinD. The N-terminal half is thought to interact with FtsZ. MinC rapidly oscillates between the poles of the cell to destabilise FtsZ filaments that have formed before they mature into polar Z rings.\ ' '3361' 'IPR005527' '\

    Cytokinesis needs to be regulated spatially in order to ensure that it occurs between the daughter genomes. In prokaryotes such as Escherichia coli, cytokinesis is initiated by FtsZ, a tubulin-like protein that assembles into a ring structure at the cell centre called the Z ring. A fundamental problem in prokaryotic cell biology is to understand how the midcell division site is identified. Two major negative regulatory systems are known to be involved in preventing Z-ring assembly at all sites except the midcell. One of these systems, called nucleoid occlusion, blocks Z-ring assembly in the area occupied by an unsegregated nucleoid until a critical stage in chromosome replication or segregation is reached. The other system consists of three proteins, MinC, MinD and MinE, which prevent assembly of Z rings in regions of the cell not covered by the nucleoid, such as the cell poles. MinC is an inhibitor of FtsZ polymerisation, resulting in the inhibition of Z ring assembly in the cell; MinD greatly enhances the inhibitory effects of MinC in vivo; and MinE antagonizes the effects of MinC and MinD PUBMED:11378404.

    \

    MinE is a small bifunctional protein. The amino terminus of MinE is required to interact with MinD, while the carboxyl terminus is required for \'topological specificity\' - that is, the ability of MinE to antagonise MinCD inhibition of Z rings at the midcell position but not at the poles.

    \ ' '3362' 'IPR000425' '\

    A number of transmembrane (TM) channel proteins can be grouped together on the basis of sequence similarities PUBMED:8325040, PUBMED:2014003, PUBMED:1715617, PUBMED:7529436.

    \ \

    These include:\ \

    \ \

    MIP family proteins are thought to contain 6 TM domains. Sequence analysis suggests that the proteins may have arisen through tandem, intragenic duplication from an ancestral protein that contained 3 TM domains PUBMED:1715617.

    \ \

    Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins PUBMED:11845000. Aquaporin-CHIP (Aquaporin 1) belongs to the Colton blood group system and is associated with Co(a/b) antigen.

    \ ' '3363' 'IPR004326' '\

    The Mlo-related proteins are a family of plant integral membrane proteins, first discovered in barley. Mutants lacking wild-type Mlo proteins show broad-spectrum resistance to the powdery mildew fungus, and dysregulated cell death control, with spontaneous cell death in response to developmental or abiotic stimuli. Thus wild-type Mlo proteins are thought to be inhibitors of cell death whose deficiency lowers the threshold required to trigger the cascade of events that result in plant cell death.

    \

    Mlo proteins are localized in the plasma membrane and possess seven transmembrane regions; thus the Mlo family is the only major higher plant family to possess 7 transmembrane domains. It has been suggested that Mlo proteins function as G-protein coupled receptors in plants PUBMED:10574976; however, the molecular and biological functions of Mlo proteins are still unclear.

    \ ' '3364' 'IPR004983' '\

    The Mlp (for Multicopy Lipoprotein) family of lipoproteins is found in Borrelia species PUBMED:9488385. This family was previously known as the 2.9 lipoprotein genes PUBMED:10531261. These surface-expressed genes may represent new candidate vaccinogens for Lyme disease PUBMED:9488385. Members of this family are generally located downstream of four ORFs called A, B, C and D that are involved in hemolytic activity.

    \ ' '3365' 'IPR005300' '\

    This group of proteins includes MltA, a membrane-bound, murein-degrading transglycosylase that plays an important role in the controlled growth of the stress-bearing sacculus of Escherichia coli PUBMED:10037771, PUBMED:9287002.

    \ ' '3366' 'IPR006099' '\

    Methylmalonyl-CoA mutase () (MCM) PUBMED:1975493 is an adenosylcobalamin (vitamin B12) dependent enzyme that catalyzes the isomerization between methylmalonyl-CoA and succinyl-CoA. MCM is involved in various catabolic or biosynthetic pathways; for example in man it is involved in the degradation of several amino acids, odd-chain fatty acids and cholesterol via propionyl-CoA to the tricarboxylic acid cycle; while in some bacteria it is involved in the synthesis of propionate from tricarboxylic acid-cycle intermediates.

    \

    Deficiency of MCM in man causes an often fatal disorder of organic acid metabolism termed methylmalonic acidemia. The sequences of eukaryotic and prokaryotic MCM are rather well conserved. In eukaryotes MCM is located in the mitochondrial matrix and is a homodimer of a polypeptide chain of about 710 amino acids. In bacteria MCM is a dimer of two non-identical, yet structurally related chains. This family also includes an Escherichia coli protein (gene sbm) whose function is not yet known.

    \

    A small degree of similarity is said PUBMED:2197274 to exist between MCM and the large subunit of the adenosylcobalamin-dependent enzyme ethanolamine ammonia-lyase, but this similarity is so weak that these two types of enzymes cannot be detected by a single pattern.

    \ ' '3367' 'IPR005656' '\

    This entry represents proteins from the MmgE/PrpD family, which includes 2-methylcitrate dehydratase (PrpD; ). PrpD is required for propionate catabolism, catalysing the third step of the 2-methylcitric acid cycle PUBMED:. This enzyme consists of two domains: a large domain with an all-helical fold and a smaller domain that folds into an alpha+beta domain PUBMED:.

    \ ' '3368' 'IPR003454' '\ This family consists of monooxygenase components such as the methane monooxygenase regulatory protein B (MmoB). When MmoB is present at low concentration it converts methane monooxygenase from an oxidase to a hydroxylase and stabilises intermediates required for the activation of dioxygen PUBMED:10393915. Also found in this family is DmpM, the phenol hydroxylase protein component P2; this protein lacks redox co-factors and is required for optimal turnover of phenol hydroxylase PUBMED:9012665. Phenol hydroxylase catabolises phenol and some of its methylated derivatives in the first step of phenol biodegradation, and is required for growth on phenol. The multicomponent enzyme is made up of P0, P1, P2, P3, P4 and P5 polypeptides.\ ' '3369' 'IPR004869' '\ Proteins of this entry are putative integral membrane proteins from bacteria. Several of the members are mycobacterial proteins. Many of the proteins contain two copies of this aligned region. The function of these proteins is not known, although it has been suggested that they may be involved in lipid transport PUBMED:10694977.\ ' '3370' 'IPR002917' '\ Human HSR1 has been localized to the human MHC class I region and is highly homologous to a putative GTP-binding protein, MMR1 from mouse. These proteins represent a new subfamily of GTP-binding proteins that has both prokaryote and eukaryote members PUBMED:8180467.\ ' '3371' 'IPR001213' '\

    The Mouse mammary tumor virus (MMTV) is a milk-transmitted type B retrovirus. The superantigen (SAg) is encoded in the long terminal repeat PUBMED:7612231.

    \ ' '3372' 'IPR007760' '\

    Catalases () are antioxidant enzymes that catalyse the conversion of hydrogen peroxide to water and molecular oxygen. Hydrogen peroxide is produced as a consequence of oxidative cellular metabolism and can be converted to the highly reactive hydroxyl radical via transition metals, this radical being able to damage a wide variety of molecules within a cell, leading to oxidative stress and cell death. Catalases act to neutralise hydrogen peroxide toxicity, and are produced by all aerobic organisms ranging from bacteria to man. There are three structurally independent classes of catalases: ubiquitous mono-functional haem-containing catalases (), bifunctional haem-containing catalase-peroxidases that are closely related to plant peroxidases (), and non-haem manganese-containing catalases PUBMED:14745498.

    \

    This entry represents the non-haem Mn-catalases, which are found in several bacterial species PUBMED:14871145. The structure of the Mn catalase from Lactobacillus plantarum reveals a homo-hexamer, where each subunit contains a dimanganese active site that is accessed by a single substrate channel PUBMED:11587647. The dimanganese active site performs a two-electron catalytic cycle that alternately oxidises and reduces the dimanganese atoms in a manner that is similar to its haem-counterpart found in other catalases.

    \ ' '3373' 'IPR005647' '\ This family of proteins includes meiotic nuclear division protein 1 (MND1) from Saccharomyces cerevisiae (Baker\'s yeast). The mnd1 protein forms a complex with hop2 to promote homologous chromosome pairing and meiotic double-strand break repair PUBMED:11940665.\ ' '3374' 'IPR007182' '\

    This domain is found in a possible subunit of the Na+/H+ antiporter PUBMED:9852009, PUBMED:11356194 as well as in the bacterial NADH dehydrogenase subunit. Usually four transmembrane regions are found in this domain.

    \ ' '3375' 'IPR002758' '\

    This family contains both characterised and uncharacterised bacterial and archaeal proteins, some of which are possibly transmembrane proteins involved in Na+/H+ or K+/H+ transport.

    \ \ \

    The characterised proteins are mnhE (Staphylococcus aureus) and phaE (Rhizobium meliloti), which are subunits of the Na+/H+ or K+/H+ antiporters that are required for sodium and potassium excretion, respectively PUBMED:9852009, PUBMED:9680201.

    \ \ \ ' '3376' 'IPR005066' '\

    The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism PUBMED:16784786, PUBMED:12114025.

    \ \

    In Escherichia coli, biosynthesis of MoCF is a three-stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg2+-ATP-dependent manner PUBMED:17198377. MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT, resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF PUBMED:12372836. Moreover, the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues of MoeA and MogA exist as fusion proteins: CNX1 of Arabidopsis thaliana (Mouse-ear cress), mammalian Gephyrin and Drosophila melanogaster (Fruit fly) Cinnamon PUBMED:8528286.

    \ \

    This domain is found in molybdopterin cofactor oxidoreductases, such as at the C terminus of Mo-containing sulphite oxidase, which catalyses the conversion of sulphite to sulphate, the terminal step in the oxidative degradation of cysteine and methionine PUBMED:9428520. This domain is involved in dimer formation, and has an Ig-fold structure PUBMED:9428520.

    \ ' '3378' 'IPR002820' '\

    The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism PUBMED:16784786, PUBMED:12114025.

    \ \

    In Escherichia coli, biosynthesis of MoCF is a three-stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg2+-ATP-dependent manner PUBMED:17198377. MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT, resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF PUBMED:12372836. Moreover, the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues of MoeA and MogA exist as fusion proteins: CNX1 of Arabidopsis thaliana (Mouse-ear cress), mammalian Gephyrin and Drosophila melanogaster (Fruit fly) Cinnamon PUBMED:8528286.

    \ \

    This entry contains the molybdenum cofactor biosynthesis protein MoaC.

    \ ' '3379' 'IPR003448' '\

    This family contains the MoaE protein that is involved in biosynthesis of molybdopterin PUBMED:8514782. Molybdopterin, the universal component of the pterin molybdenum cofactors, contains a dithiolene group serving to bind Mo. Addition of the dithiolene sulphurs to a molybdopterin precursor requires the activity of the converting factor. Converting factor contains the MoaE and MoaD proteins.

    \ ' '3380' 'IPR001668' '\ With some plasmids, recombination can occur in a site-specific manner that is independent of RecA. In such cases, the recombination event requires another protein called Pre. Pre is a plasmid recombination enzyme. This protein is also known as Mob (conjugative mobilization) PUBMED:2768188.\ \ ' '3381' 'IPR005053' '\ This family consists of the MobA protein from the Escherichia coli plasmid RSF1010, and the MobL protein from the Thiobacillus ferrooxidans (Acidithiobacillus ferrooxidans) plasmid PTF1. These sequences are mobilization proteins, which are essential for specific plasmid transfer.\ ' '3382' 'IPR004435' '\

    The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism PUBMED:16784786, PUBMED:12114025.

    \ \

    In Escherichia coli, biosynthesis of MoCF is a three-stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg2+-ATP-dependent manner PUBMED:17198377. MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT, resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF PUBMED:12372836. Moreover, the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues of MoeA and MogA exist as fusion proteins: CNX1 of Arabidopsis thaliana (Mouse-ear cress), mammalian Gephyrin and Drosophila melanogaster (Fruit fly) Cinnamon PUBMED:8528286.

    \ \

    The MobB domain is similar to that of the urease accessory protein UreG and the hydrogenase accessory protein HypB, both GTP hydrolases involved in loading nickel into the metallocentres of their respective target enzymes. It is involved in the final step of molybdenum-cofactor biosynthesis. While its precise function has not been identified it is thought to be involved in the transfer of a guanine dinucleotide moiety to molybdopterin, as it shows GTP-binding and weak GTPase activity PUBMED:9219527. The MobB protein () from Escherichia coli, which is comprised of this domain, is a homodimer PUBMED:14646116. Each molecule is composed of two distinct regions - an outer region comprised of 6 beta-strands and three alpha helices, and an inner region comprised of a two-strand beta hairpin followed by an alpha helix. These regions require interaction with the second monomer to allow proper folding to occur. The two monomers are intertwined and form an extensive 16-stranded beta-sheet. While the active site could not be positively identified, the presence of highly conserved residues suggests the substrate binding site occurs in the central solvent channel.

    \ ' '3383' 'IPR006788' '\ MOBP is abundantly expressed in central nervous system myelin, and shares several characteristics with myelin basic protein (MBP), in terms of regional distribution and function. MOBP has been shown to be essential for normal arrangement of the radial component in central nervous system myelin PUBMED:10103078, PUBMED:11793190.\ ' '3384' 'IPR007835' '\ The MOFRL (multi-organism fragment with rich Leucine) domain is found in bacteria and eukaryotes. The function of this domain is not clear, although it exists in some putative enzymes such as reductases and kinases.\ ' '3385' 'IPR007681' '\ Segregation of nuclear and cytoplasmic processes facilitates regulation of many eukaryotic cellular functions such as gene expression and cell cycle progression. Trafficking through the nuclear pore requires a number of highly conserved soluble factors that escort macromolecular substrates into and out of the nucleus. The Mog1 protein has been shown to interact with RanGTP, which stimulates guanine nucleotide release, suggesting Mog1 regulates the nuclear transport functions of Ran PUBMED:11733047. The human homologue of Mog1 is thought to be alternatively spliced.\ ' '3386' 'IPR006963' '\

    The molybdopterin oxidoreductase Fe4S4 domain is found in a number of reductase/dehydrogenase families, which include the periplasmic nitrate reductase precursor and the formate dehydrogenase alpha chain PUBMED:9036855.

    \ ' '3387' 'IPR006656' '\

    This domain is found in a number of molybdopterin-containing oxidoreductases, tungsten formylmethanofuran dehydrogenase subunit d (FwdD) and molybdenum formylmethanofuran dehydrogenase subunit (FmdD), where a single domain constitutes almost the entire subunit. The formylmethanofuran dehydrogenase catalyses the first step in methane formation from CO2 in methanogenic archaea and has a molybdopterin dinucleotide cofactor PUBMED:9818358.

    \ ' '3388' 'IPR006657' '\

    A domain in this entry corresponds to the C-terminal domain IV in dimethyl sulphoxide (DMSO) reductase, which interacts with the 2-amino pyrimidone ring of both molybdopterin guanine dinucleotide molecules PUBMED:8890912.

    \ ' '3390' 'IPR006833' '\

    Ammonia monooxygenase and the particulate methane monooxygenase are both integral membrane proteins, occurring in ammonia oxidisers and methanotrophs respectively, which are thought to be evolutionarily related PUBMED:7590173. These enzymes have a relatively wide substrate specificity and can catalyse the oxidation of a range of substrates including ammonia, methane, halogenated hydrocarbons and aromatic molecules PUBMED:12209257. These enzymes are composed of 3 subunits - A, B and C - and contain various metal centres, including copper. Particulate methane monooxygenase from Methylococcus capsulatus str. Bath is an ABC homotrimer, which contains mononuclear and dinuclear copper metal centres, and a third metal centre containing a metal ion whose identity in vivo is not certain PUBMED:15674245.

    \

    The soluble regions of particulate methane monooxygenase from Methylococcus capsulatus str. Bath derive primarily from the B subunit. This subunit forms two antiparallel beta sheets and contains the mono- and di- nuclear copper metal centres PUBMED:15674245.

    \ ' '3391' 'IPR005302' '\

    Molybdenum cofactor (MOCO) sulphurases PUBMED:16784786 catalyse the insertion of a terminal sulphur ligand into the molybdenum cofactor, thereby converting the oxo form of MOCO to a sulphurylated form. Sulphurylated MOCO is required by several enzymes, including aldehyde oxidase, which functions in the last step of abscisic acid biosynthesis in plants PUBMED:11549764, and xanthine dehydrogenase, which synthesises uric acid from xanthine during nitrogen metabolism PUBMED:12650690.

    \ \

    This entry represents the C-terminal domain of MOCO sulphurase (MOSC domain), which has a beta-barrel structure similar to that of the beta-barrel domain in pyruvate kinase and contains a highly conserved cysteine residue required for activity. MOSC domains are found in several diverse metal-sulphur cluster biosynthesis proteins from both eukaryotes and prokaryotes. MOSC domains occur either as stand-alone forms, such as the YiiM protein from Escherichia coli, or fused to other domains, such as a NifS-like catalytic domain in MOCO sulphurase. The MOSC domain is predicted to be a sulphur-carrier domain that receives sulphur abstracted from pyridoxal phosphate-dependent NifS-like enzymes, on its conserved cysteine, and delivers it for the formation of diverse sulphur-metal clusters PUBMED:11886751.

    \ \

    The MOSC domain contains several patches of hydrophobic residues and an absolutely conserved cysteine residue situated closer to the C-terminal end of the domain. The absolutely conserved cysteine in the MOSC domain is reminiscent of the analogous conservation of a cysteine in the active site of the thioredoxin and rhodanese superfamilies. Members of both these superfamilies, especially of the latter one, have been implicated in the synthesis of Fe-S clusters, through mobilisation of sulphur with their active cysteine.

    \ ' '3392' 'IPR003872' '\ This is a family of spirochete major outer sheath protein C-terminal regions. These proteins are present on the bacterial cell surface. In Treponema denticola the major outer sheath protein (Msp) binds immobilized laminin and fibronectin, supporting the hypothesis that Msp mediates the extracellular matrix binding activity of T. denticola PUBMED:9023187.\ ' '3393' 'IPR003857' '\ This is a family of spirochete major outer sheath protein N-terminal regions. These proteins are present on the bacterial cell surface. In Treponema denticola the major outer sheath protein (Msp) binds immobilized laminin and fibronectin, supporting the hypothesis that Msp mediates the extracellular matrix binding activity of T. denticola PUBMED:9023187.\ ' '3394' 'IPR002898' '\ This family groups together integral membrane proteins that appear to be involved in translocation of proteins across a membrane. These proteins are probably proton channels. MotA is an essential component of the flagellar motor that uses a proton gradient to generate rotational motion in the flagellum PUBMED:10348868. ExbB is part of the TonB-dependent transduction complex. The TonB complex uses the proton gradient across the inner bacterial membrane to transport large molecules across the outer bacterial membrane.\ ' '3395' 'IPR007151' '\ This family includes proteins related to Mpp10 (M phase phosphoprotein 10). The U3 small nucleolar ribonucleoprotein (snoRNP) is required for three cleavage events that generate the mature 18S rRNA from the pre-rRNA. In Saccharomyces cerevisiae, depletion of Mpp10, a U3 snoRNP-specific protein, halts 18S rRNA production and impairs cleavage at the three U3 snoRNP-dependent sites PUBMED:9391061.\ ' '3396' 'IPR007846' '\ The MPPN (Mitotic PhosphoProtein N end) family is uncharacterised; however, it probably plays a role in the cell cycle because the family includes mitotic phosphoproteins PUBMED:9115395.
This family also includes a suppressor of thermosensitive mutations in the DNA polymerase delta gene, Pol III PUBMED:7862092. The conserved central region appears to be distantly related to the RNA-binding region RNP-1 (RNA recognition motif, ), suggesting an RNA binding function for this protein.\ ' '3397' 'IPR013342' '\

    Mandelate racemase (MR) and muconate lactonising enzyme (MLE) are two bacterial enzymes involved in aromatic acid catabolism. They catalyse mechanistically distinct reactions yet they are related at the level of their primary, quaternary (homooctamer) and tertiary structures PUBMED:2215699, PUBMED:8256284. A number of other proteins also seem to be evolutionarily related to these two enzymes. These include various plasmid-encoded chloromuconate cycloisomerases, Escherichia coli protein rspA PUBMED:7545940, E. coli bifunctional DGOA protein, E. coli hypothetical proteins ycjG, yfaW and yidU, and a hypothetical protein from Streptomyces ambofaciens PUBMED:8277241.

    \

    This entry represents the C-terminal region of these proteins.

    \ ' '3398' 'IPR013341' '\

    Mandelate racemase (MR) and muconate lactonizing enzyme (MLE) are two bacterial enzymes involved in aromatic acid catabolism. They catalyse mechanistically distinct reactions yet they are related at the level of their primary, quaternary (homooctamer) and tertiary structures PUBMED:2215699, PUBMED:8256284. A number of other proteins also seem to be evolutionarily related to these two enzymes. These include various plasmid-encoded chloromuconate cycloisomerases, Escherichia coli protein rspA PUBMED:7545940, E. coli bifunctional DGOA protein, E. coli hypothetical proteins ycjG, yfaW and yidU, and a hypothetical protein from Streptomyces ambofaciens PUBMED:8277241.

    \

    This entry represents the N-terminal region of these proteins.

    \ ' '3399' 'IPR007281' '\ The Mre11 complex is a multi-subunit nuclease that is composed of Mre11, Rad50 and Nbs1/Xrs2, and is involved in checkpoint signalling and DNA replication PUBMED:11988766. Mre11 has an intrinsic DNA-binding activity that is stimulated by Rad50 on its own or in combination with Nbs1 PUBMED:10823903.\ ' '3400' 'IPR007221' '\ MreC (murein formation C) is involved in the rod shape determination in Escherichia coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped.\ ' '3401' 'IPR007227' '\

    The MreD (murein formation D) protein is involved in bacterial cell shape determination. Most rod-shaped bacteria depend on MreB and RodA to achieve either a rod shape or some other non-spherical morphology such as coil or stalk formation. MreD is encoded in an operon with MreB, and often with RodA and PBP-2 as well. It is highly hydrophobic (therefore somewhat low-complexity) and highly divergent, and therefore cannot always be identified on the basis of sequence similarity.

    \ ' '3402' 'IPR013846' '\

    This domain is found at the C terminus of the mRNA capping enzyme. The mRNA capping enzyme in yeasts is composed of two separate chains: alpha, an mRNA guanyltransferase, and beta, an RNA 5\'-triphosphatase. X-ray crystallography reveals a large conformational change during guanyl transfer by mRNA capping enzymes PUBMED:9160746. Binding of the enzyme to nucleotides is specific to the GMP moiety of GTP. The viral mRNA capping enzyme is a monomer that transfers a GMP cap onto the end of mRNA that terminates with a 5\'-diphosphate tail.

    \ ' '3403' 'IPR001339' '\

    The mRNA capping enzyme in yeasts is composed of two separate chains: alpha, an mRNA guanyltransferase, and beta, an RNA 5\'-triphosphatase. X-ray crystallography reveals a large conformational change during guanyl transfer by mRNA capping enzymes PUBMED:9160746. Binding of the enzyme to nucleotides is specific to the GMP moiety of GTP. The viral mRNA capping enzyme is a monomer that transfers a GMP cap onto the end of mRNA that terminates with a 5\'-diphosphate tail.

    \ ' '3404' 'IPR004206' '\ The mRNA capping enzyme in yeasts is composed of two separate subunits, an mRNA guanyltransferase and an RNA 5\'-triphosphatase. This is the beta subunit of the mRNA capping enzyme, which has triphosphatase activity PUBMED:8720151, PUBMED:1315757, PUBMED:9345280. The beta chain (polynucleotide 5\'-phosphatase) converts the 5\'-triphosphate end of a nascent mRNA chain into a diphosphate in the first step of mRNA capping. The function of the capping enzyme also depends on the guanylyltransferase activity conferred by the alpha chain.\ ' '3406' 'IPR003330' '\ The immunogenic major surface antigen (MSG), also termed glycoprotein A (gpA), is involved in the immunopathogenesis of Pneumocystis carinii. MSG from all P. carinii has conserved secondary structure, as well as function PUBMED:9679195, PUBMED:9712777.\ ' '3407' 'IPR007208' '\ Members of the PhaF/MrpF family are predicted to be integral membrane proteins with three transmembrane regions, involved in regulation of pH. PhaF is part of a potassium efflux system involved in pH regulation. It is also involved in symbiosis in Rhizobium meliloti (Sinorhizobium meliloti) PUBMED:11356194. MrpF is a part of a Na+/H+ antiporter complex, also involved in pH homeostasis. MrpF is thought to be an efflux system for Na+ and cholate PUBMED:10198001. The Mrp system in Gram-positive species may also have primary energisation capacities PUBMED:9680201.\ ' '3408' 'IPR007560' '\

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents Mrr, a type IV restriction endonuclease involved in the acceptance of modified foreign DNA, restricting both adenine- and cytosine-methylated DNA. Plasmids carrying HincII, HpaI, and TaqI R and M genes are severely restricted in Escherichia coli strains that are Mrr+ PUBMED:1650347. Mrr appears to be the final effector of the bacterial SOS response, which is not only a vital reply to DNA damage but also constitutes an essential mechanism for the generation of genetic variability that in turn fuels adaptation and resistance development in bacterial populations PUBMED:16313623. Mrr possesses a cleavage domain that is similar to that found in type II restriction enzymes; however, it has an unusual glutamine residue at the central position of the (D/E)-(D/E)XK hallmark of the active site PUBMED:11313145.

    \ ' '3409' 'IPR006685' '\

    Mechanosensitive (MS) channels provide protection against hypo-osmotic shock, responding both to stretching of the cell membrane and to membrane depolarisation. They are present in the membranes of organisms from the three domains of life: bacteria, archaea, and eukarya PUBMED:12626684. There are two families of MS channels: large-conductance MS channels (MscL) and small-conductance MS channels (MscS or YGGB). The pressure threshold for MscS opening is 50% that of MscL PUBMED:12446901. The MscS family is much larger and more variable in size and sequence than the MscL family. Much of the diversity in MscS proteins occurs in the size of the transmembrane regions, which ranges from three to eleven transmembrane helices, although the three C-terminal helices are conserved. This family contains sequences from the MscS family of proteins.

    \

    MscS folds as a homo-heptamer with a cylindrical shape, and can be divided into transmembrane and extramembrane regions: an N-terminal periplasmic region, a transmembrane region, and a C-terminal cytoplasmic region (middle and C-terminal domains). The transmembrane region forms a channel through the membrane that opens into a chamber enclosed by the extramembrane portion, the latter connecting to the cytoplasm through distinct portals PUBMED:12446901.

    \ ' '3410' 'IPR001136' '\ \ The merozoite surface antigen 2 (MSA-2) may play a role in the merozoite\ attachment to the erythrocyte. It is thought to be attached to the membrane\ by a GPI-anchor.\ \ ' '3411' 'IPR001185' '\ Mechanosensitive ion channels (MscL) play a critical role in transducing physical stresses\ at the cell membrane into an electrochemical response. MscL is a protein which forms a\ channel organised as a homopentamer, with each subunit containing two transmembrane\ regions PUBMED:9856938. Prokaryotes harbor a\ large-conductance mechanosensitive channel (gene mscL) that opens in response to stretch\ forces in the membrane lipid bilayer and may participate in the regulation of osmotic\ pressure changes within the cell PUBMED:9632260.\ ' '3412' 'IPR002628' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    In PSII, the oxygen-evolving complex (OEC) is responsible for catalysing the splitting of water to O(2) and 4H+. The OEC is composed of a cluster of manganese, calcium and chloride ions bound to extrinsic proteins. In cyanobacteria there are five extrinsic proteins in OEC (PsbO, PsbP-like, PsbQ-like, PsbU and PsbV), while in plants there are only three (PsbO, PsbP and PsbQ), PsbU and PsbV having been lost during the evolution of green plants PUBMED:15258264.

    \

    This family represents the PSII OEC protein PsbO, which appears to be the most important extrinsic protein for oxygen evolution. PsbO lies closest to the Mn cluster where water oxidation occurs, and has a stabilising effect on the Mn cluster. As a result, PsbO is often referred to as the Mn-stabilising protein (MSP), although none of its amino acids are likely ligands for Mn. Calcium ions were found to modify the conformation of PsbO in solution PUBMED:14529295.

    \ \ ' '3413' 'IPR005091' '\

    The major surface protein (MSP1) of the cattle pathogen Anaplasma is a heterodimer comprised of MSP1a and MSP1b. This family is the MSP1b chain. The MSP1\ proteins are putative adhesins for bovine erythrocytes.

    \ ' '3414' 'IPR007515' '\

    Guanine nucleotide exchange factor MSS4 (Rab interacting factor) is a guanine-nucleotide releasing protein that acts on members of the SCE4/YPT1/RAB subfamily. It stimulates release of GDP and may play a role in vesicular transport.

    \ ' '3415' 'IPR005634' '\

    This is a family of Drosophila proteins that are typified by the repetitive motif C-G-P.

    \ ' '3416' 'IPR007757' '\ MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs.\ ' '3418' 'IPR002844' '\ This archaeal enzyme family is involved in the formation of methane from\ carbon dioxide. The enzyme requires coenzyme F420 PUBMED:7852356.\ ' '3419' 'IPR003690' '\

    This family currently contains one sequence of known function, human mitochondrial transcription termination factor (mTERF), a multizipper protein that nonetheless binds to DNA as a monomer, with evidence pointing to intramolecular leucine zipper interactions PUBMED:9118945. The precursors contain a mitochondrial targeting sequence, and the mature mTERF exhibits three leucine zippers, of which one is bipartite, and two widely spaced basic domains. Both basic domains and the three leucine zipper motifs are necessary for DNA binding. The leucine zippers are not implicated in a dimerisation role as in other leucine zippers PUBMED:9118945.

    \ \

    The rest of the family consists of hypothetical proteins, none of which has any functional information.

    \ ' '3420' 'IPR003171' '\ This family includes the 5,10-methylenetetrahydrofolate reductase from bacteria and methylenetetrahydrofolate reductase from eukaryotes. The structure for this domain is known PUBMED:10201405 to be a TIM barrel.\ ' '3421' 'IPR007761' '\ The mannitol operon of Escherichia coli, encoding the mannitol-specific enzyme II of the phosphotransferase system (MtlA) and mannitol phosphate dehydrogenase (MtlD), contains an additional downstream open reading frame which encodes the mannitol repressor (MtlR).\ ' '3422' 'IPR018169' '\ This family includes proteins such as Drosophila saliva PUBMED:9739134, MtN3 involved in root nodule development PUBMED:8634476 and a protein\ involved in activation and expression of recombination activation genes (RAGs) PUBMED:8630032. Although the molecular function of\ these proteins is unknown, they are almost certainly transmembrane proteins. This family contains a region of two\ transmembrane helices that is found in two copies in most members of the family.\ ' '3423' 'IPR004687' '\ The proteins of the MET family have 4 TMS regions and are located in late endosomal or lysosomal membranes. Substrates of the mouse MTP transporter include thymidine, both nucleoside and nucleobase analogues, antibiotics, anthracyclines, ionophores and steroid hormones. MET transporters may be involved in the subcellular compartmentation of steroid hormones and other compounds. Drug sensitivity by mouse MET was regulated by compounds that inhibit lysosomal function, interfere with intracellular cholesterol transport, or modulate the multidrug resistance phenotype of mammalian cells. Thus, MET family members may compartmentalize diverse hydrophobic molecules, thereby affecting cellular drug sensitivity, nucleoside/nucleobase availability and steroid hormone responses.\ ' '3424' 'IPR013340' '\

    This domain is mostly found in N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit A (MtrA) in methanogenic archaea. This methyltransferase is a\ membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. \ \ Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase (encoded by subunit A) is involved in the transfer of a \'methyl\' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase.

    \

    In some organisms this domain is found at the N-terminal region of what appears to be a fusion of the MtrA and MtrF proteins PUBMED:15466049, PUBMED:15353801. The function of these proteins is unknown, though it is likely that they are involved in C1 metabolism.

    \ ' '3425' 'IPR005865' '\

    This model describes the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit C in methanogenic archaea. This methyltransferase is a\ membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of a methyl group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase.

    \ ' '3426' 'IPR005779' '\

    This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit D in methanogenic archaea. This methyltransferase is a\ membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. \ \ Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase (encoded by subunit A) is involved in the transfer of a \'methyl\' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase.

    \ ' '3427' 'IPR005780' '\

    This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit E in methanogenic archaea. This methyltransferase is a\ membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. \ \ Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase (encoded by subunit A) is involved in the transfer of a \'methyl\' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase.

    \ ' '3428' 'IPR005866' '\

    This model describes the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit G in methanogenic archaea. This methyltransferase is a\ membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of a methyl group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase.

    \ ' '3429' 'IPR002856' '\

    This family of methyltransferases occurs in both archaea and bacteria. In archaea, members of this family (MtrH) are involved in the energy conservation step of methanogenesis, while in bacteria, members of this family whose function has been defined (CmuB) are involved in the metabolism of chloromethane.

    \ \

    In archaea the enzyme tetrahydromethanopterin S-methyltransferase is composed of eight subunits, MtrA-H. The enzyme is a membrane-associated enzyme complex which catalyzes an energy-conserving, sodium-ion-translocating step in methanogenesis from hydrogen and carbon dioxide PUBMED:7737157. Subunit MtrH catalyzes the methylation reaction and was shown to exhibit methyltetrahydromethanopterin:cob(I)alamin methyltransferase activity PUBMED:10338124.

    \ \ \

    In bacteria, the pathway of chloromethane utilisation allows the microorganisms that possess it to grow with chloromethane as the sole carbon and energy source. It is initiated by a corrinoid-dependent methyltransferase system involving methyltransferase I (CmuA) and methyltransferase II (CmuB), which transfer the methyl group of chloromethane onto tetrahydrofolate PUBMED:10200311. The methyl group of chloromethane is first transferred by the protein CmuA to its corrinoid moiety, from where it is transferred to tetrahydrofolate by CmuB, thereby yielding methyltetrahydrofolate PUBMED:10447694, PUBMED:11358510.

    \

    CmuB has methylcobalamin:tetrahydrofolate methyltransferase activity, and catalyzes the conversion of methylcobalamin and tetrahydrofolate to cob(I)alamin and methyltetrahydrofolate.

    \ \ \ ' '3430' 'IPR007848' '\ This domain is found in ribosomal RNA small subunit methyltransferase C as well as other methyltransferases.\ ' '3431' 'IPR003369' '\ Members of this protein family are involved in a sec-independent translocation mechanism. This pathway has been called the DeltapH pathway in chloroplasts PUBMED:9367960. Members of this family in Escherichia coli are involved in export of redox proteins with a "twin arginine" leader motif (S/T-R-R-X-F-L-K) PUBMED:9546395. This sec-independent pathway is termed TAT for twin-arginine translocation system. This system mainly transports proteins with bound cofactors that require folding prior to export.\ ' '3432' 'IPR003314' '\

    This family consists of MuA-transposase and repressor protein CI. The Bacteriophage Mu transposase is essential for integration, replication-transposition, and excision of Mu DNA. The N-terminus of the Mu transposase has considerable sequence homology with the Mu repressor and with the NH2 terminus of the transposase of the Mu-like Bacteriophage D108. These three proteins are known to share binding sites on DNA. An internal sequence in the Mu A protein also shares these features PUBMED:2999776.

    \

    The repressor protein of Bacteriophage Mu establishes and maintains lysogeny by shutting down transposition functions needed for phage DNA replication. It interacts with several repeated DNA sequences within the early operator, preventing transcription from two divergent promoters. It also directly represses transposition by competing with the MuA transposase for an internal activation sequence (IAS) that is coincident with the operator and required for efficient transposition. The transposase and repressor proteins compete for the operator/IAS region using homologous DNA-binding domains located at their amino termini PUBMED:10387082.

    \

    More information about these proteins can be found at Protein of the Month: Transposase.

    \ ' '3433' 'IPR004189' '\

    This transposase is essential for integration, replication-transposition and excision of Bacteriophage Mu DNA. Transposition requires transposase and a transposition enhancer, and the DNA can be transposed into multiple sites in bacterial genomes.

    \

    The crystal structure of the core domain of Mu transposase, MuA, has been determined. The first of two subdomains contains the active site and, despite very limited sequence homology, exhibits a striking similarity to the core domain of Human immunodeficiency virus 1 integrase. The enzymatic activity of MuA is known to be activated by formation of a DNA-bound tetramer of the protein PUBMED:7628012.

    \

    More information about these proteins can be found at Protein of the Month: Transposase.

    \ ' '3434' 'IPR005588' '\

    The members of this family are regulators of the anti-sigma E protein RseD.

    \ ' '3435' 'IPR004332' '\

    The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements PUBMED:7672579, PUBMED:1661256. The function of these proteins is unknown.

    \

    More information about these proteins can be found at Protein of the Month: Transposase.

    \ ' '3436' 'IPR007406' '\

    This is the N-terminal region of MukB. MukB is involved in the segregation and condensation of prokaryotic chromosomes. MukE along with MukF interact with MukB in vivo forming a complex, which is required for chromosome condensation and segregation in Escherichia coli PUBMED:10545099. The Muk complex appears to be similar to the SMC-ScpA-ScpB complex in other prokaryotes where MukB is the homologue of SMC PUBMED:12065423. ScpA and ScpB have little sequence similarity to MukE or MukF, though they are predicted to be structurally similar, being predominantly alpha-helical with coiled coil regions.

    \ \ \

    The structure of the N-terminal domain consists of an antiparallel six-stranded beta sheet surrounded by one helix on one side and by five helices on the other side PUBMED:10545328. It contains an exposed Walker A loop in an unexpected helix-loop-helix motif. In other proteins, Walker A motifs generally adopt a P loop conformation as part of a strand-loop-helix motif embedded in a conserved topology of alternating helices and (parallel) beta strands PUBMED:10545328.

    \ ' '3437' 'IPR007385' '\

    This family contains MukE, which are proteins involved in the segregation and condensation of prokaryotic chromosomes. MukE along with MukF interact with MukB in vivo forming a complex, which is required for chromosome condensation and segregation in Escherichia coli PUBMED:10545099. The Muk complex appears to be similar to the SMC-ScpA-ScpB complex in other prokaryotes where MukB is the homologue of SMC PUBMED:12065423. ScpA and ScpB have little sequence similarity to MukE or MukF, though they are predicted to be structurally similar, being predominantly alpha-helical with coiled coil regions.

    \ ' '3438' 'IPR000390' '\

    Members of this family which have been characterised belong to the small multidrug resistance (Smr) protein family and are integral membrane proteins. They confer resistance to a wide range of toxic compounds by removing them from the cells. The efflux is coupled to an influx of protons.\ An example is Escherichia coli mvrC which prevents the incorporation of methyl viologen into cells PUBMED:1320256 and is involved in ethidium bromide efflux PUBMED:1936950.

    \ ' '3439' 'IPR000713' '\

    The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria is comprised of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages:

    \

    \

    Stage two involves four key Mur ligase enzymes: MurC PUBMED:17139082, MurD PUBMED:17427948, MurE PUBMED:16595662 and MurF PUBMED:16322581. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDPMurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales PUBMED:16934839.

    \ \

    This entry represents the N-terminal domain of several stage 2 Mur ligases, including: UDP-N-acetylmuramate-L-alanine ligase (MurC), \ UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate and cyanophycin synthetase that catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin) PUBMED:9652408.

    \

    The N-terminal domain is almost always associated with the C-terminal domain of the cytoplasmic peptidoglycan synthetases.

    \ ' '3440' 'IPR004101' '\

    The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria is comprised of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages:

    \

    \

    Stage two involves four key Mur ligase enzymes: MurC PUBMED:17139082, MurD PUBMED:17427948, MurE PUBMED:16595662 and MurF PUBMED:16322581. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDPMurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales PUBMED:16934839.

    \ \

    This entry represents the C-terminal domain from all four stage 2 Mur enzymes: UDP-N-acetylmuramate-L-alanine ligase (MurC), UDP-N-acetylmuramoylalanine-D-glutamate ligase (MurD), UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes the C-terminal domain of folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate and cyanophycin synthetase that catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin) PUBMED:9652408.

    \

    The C-terminal domain is almost always associated with the N-terminal domain of the cytoplasmic peptidoglycan synthetases.

    \ ' '3441' 'IPR011601' '\

    This entry represents a C-terminal conserved region of UDP-N-acetylenolpyruvoylglucosamine reductase, which is also called UDP-N-acetylmuramate dehydrogenase. It is a part of the pathway for the biosynthesis of the UDP-N-acetylmuramoyl-pentapeptide, which is a precursor of bacterial peptidoglycan.

    \ \ ' '3442' 'IPR018140' '\ MutS, MutL and MutH are the three essential proteins for initiation of methyl-directed DNA mismatch repair to correct mistakes made during DNA replication in Escherichia coli. MutH cleaves a newly synthesized and unmethylated daughter strand 5\' to the sequence d(GATC) in a hemi-methylated duplex. Activation of MutH requires the recognition of a DNA mismatch by MutS and MutL PUBMED:9482749.\ ' '3443' 'IPR007695' '\

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication PUBMED:17919654. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base PUBMED:17599803. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch PUBMED:17951114. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level PUBMED:17426027. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    \

    MutS is a modular protein with a complex structure PUBMED:11048711, and is composed of:

    \

    \

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions PUBMED:9722651. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts PUBMED:17965091.

    \ \

    This entry represents the N-terminal domain of proteins in the MutS family of DNA mismatch repair proteins, as well as closely related proteins. The N-terminal domain of MutS is responsible for mismatch recognition and forms a 6-stranded mixed beta-sheet surrounded by three alpha-helices, which is similar to the structure of tRNA endonuclease. Yeast MSH3 PUBMED:8510668, bacterial proteins involved in DNA mismatch repair, and the predicted protein product of the Rep-3 gene of mouse share extensive sequence similarity. Human MSH is a mismatch binding protein and has been implicated in hereditary non-polyposis colorectal carcinoma (HNPCC).

    \ ' '3444' 'IPR007860' '\

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication PUBMED:17919654. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base PUBMED:17599803. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch PUBMED:17951114. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level PUBMED:17426027. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    \

    MutS is a modular protein with a complex structure PUBMED:11048711, and is composed of:

    \

    \

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions PUBMED:9722651. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts PUBMED:17965091.

    \ \

    This entry represents the connector domain (domain 2) found in proteins of the MutS family. The structure of the MutS connector domain consists of a parallel beta-sheet surrounded by four alpha helices, which is similar to the structure of the Holliday junction resolvase ruvC.

    \ ' '3445' 'IPR007696' '\

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication PUBMED:17919654. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base PUBMED:17599803. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch PUBMED:17951114. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level PUBMED:17426027. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    \

    MutS is a modular protein with a complex structure PUBMED:11048711, and is composed of:

    \

    \

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions PUBMED:9722651. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts PUBMED:17965091.

    \ \

    This entry represents the core domain (domain 3) found in proteins of the MutS family. The core domain of MutS adopts a multi-helical structure comprised of two subdomains, which are interrupted by the clamp domain. Two of the helices in the core domain comprise the levers that extend towards the DNA.

    \ ' '3446' 'IPR007861' '\

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication PUBMED:17919654. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base PUBMED:17599803. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch PUBMED:17951114. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level PUBMED:17426027. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    \

    MutS is a modular protein with a complex structure PUBMED:11048711, and is composed of:

    \

    \

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions PUBMED:9722651. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts PUBMED:17965091.

    \ \

    This entry represents the clamp domain (domain 4) found in proteins of the MutS family. The clamp domain is inserted within the core domain at the top of the lever helices. It has a beta-sheet structure PUBMED:11048710.

    \ ' '3447' 'IPR000432' '\

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication PUBMED:17919654. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base PUBMED:17599803. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch PUBMED:17951114. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level PUBMED:17426027. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    \

    MutS is a modular protein with a complex structure PUBMED:11048711, and is composed of:

    \

    \

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions PUBMED:9722651. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts PUBMED:17965091.

    \ \

    This entry represents the C-terminal region found in proteins in the MutS family of DNA mismatch repair proteins. The C-terminal region of MutS comprises the ATPase domain and the HTH (helix-turn-helix) domain, the latter being involved in dimer contacts. Yeast MSH3 PUBMED:8510668, bacterial proteins involved in DNA mismatch repair, and the predicted protein product of the Rep-3 gene of mouse share extensive sequence similarity. Human MSH has been implicated in hereditary non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein.

    \ ' '3448' 'IPR004268' '\

    This entry represents MviN, a family of integral membrane proteins predicted to have ten or more transmembrane regions. Although frequently listed as a virulence protein, it is not restricted to pathogens and it is an essential protein in Sinorhizobium meliloti. In a number of species its gene is adjacent to that of the uridylyltransferase GlnD, the signal-transducing enzyme that performs the key modification to the nitrogen regulatory protein PII PUBMED:11274131.

    \ \ \

    Disruption of the MviN open reading frame results in flagellar structures that contain only the basal body and hook complex and lack the flagellum, suggesting that MviN might be involved in flagellin export or assembly PUBMED:8200538. Genome comparison studies led to MviN being predicted to be a peptidoglycan lipid II flippase, though currently there is no direct evidence to support this annotation PUBMED:18832143.

    \ \ \ \ ' '3449' 'IPR003327' '\ This family consists of the leucine zipper dimerisation domain found in both cellular c-Myc proto-oncogenes and viral v-Myc oncogenes. Dimerisation via the leucine zipper motif with other basic helix-loop-helix-leucine\ zipper (b/HLH/lz) proteins is required for efficient DNA binding PUBMED:9680483. The Myc-Max\ dimer is a transactivating complex activating expression of growth related genes promoting cell proliferation.\ The dimerisation is facilitated via interdigitating leucine residues at every 7th position of the alpha helix. Like\ charge repulsion of adjacent residues in this region perturbs the formation of homodimers, with heterodimers\ being promoted by opposing charge attractions. It has been demonstrated that in transgenic mice the balance between oncogene-induced proliferation and apoptosis in a given tissue can be a critical determinant in the initiation and maintenance of the tumor PUBMED:10679391.\ ' '3450' 'IPR012682' '\ The class III basic helix-loop-helix (bHLH) transcription factors have proliferative and apoptotic roles and are characterised by the presence of a leucine zipper adjacent to the bHLH domain. The myc oncogene was first discovered in small-cell lung cancer cell lines where it is found to \ be deregulated PUBMED:2827002. Although the biochemical function of the gene product is unknown, as a nuclear protein with a short half-life it may play a direct or indirect role in controlling gene expression PUBMED:3018999. Myc forms a heterodimer with Max, and this complex regulates cell growth through direct activation of genes involved in cell replication PUBMED:9175477.\

    This entry represents the N-terminal domain found adjacent to the basic helix-loop-helix (bHLH) region ().

    \ ' '3451' 'IPR000548' '\ The myelin sheath is a multi-layered membrane, unique to the nervous system, that functions as an insulator to greatly increase the velocity of axonal impulse conduction PUBMED:2435734. Myelin basic protein (MBP) PUBMED:1710177, PUBMED:1710279 is a hydrophilic protein that may function to maintain the correct structure of myelin, interacting with the lipids in the myelin membrane by electrostatic and hydrophobic interactions. In mammals various forms of MBP exist which are produced by the alternative splicing of a single gene; these forms differ by the presence or the absence of short (10 to 20 residues) peptides in various internal locations in the sequence. The major form of MBP is generally a protein of about 18.5 Kd (170 residues). MBP is the target of many post-translational modifications: it is N-terminally acetylated, methylated on an arginine residue, phosphorylated by various serine/threonine protein-kinases, and deamidated on some glutamine residues.\ ' '3452' 'IPR001614' '\

    The myelin sheath is a multi-layered membrane, unique to the nervous system, that functions as an insulator to greatly increase the velocity of axonal impulse conduction PUBMED:2435734. Myelin proteolipid protein (PLP or lipophilin) PUBMED:1711121 is the major myelin protein from the central nervous system (CNS). It probably plays an important role in the formation or maintenance of the multilamellar structure of myelin. In humans, point mutations in PLP are the cause of Pelizaeus-Merzbacher disease (PMD), a neurologic disorder of myelin metabolism. In animals, dysmyelinating diseases such as mouse \'jimpy\' (jp), rat md, or dog \'shaking pup\' are also caused by mutations in PLP.

    \

    PLP is a highly conserved PUBMED:1722981 hydrophobic protein of 276 to 280 amino acids which seems to contain four transmembrane segments, two disulphide bonds and which covalently binds lipids (at least six palmitate groups in mammals) PUBMED:1281423.

    \

    PLP is highly related to M6, a neuronal membrane glycoprotein PUBMED:8398137.

    \ ' '3453' 'IPR001609' '\

    Muscle contraction is caused by sliding between the thick and thin filaments of the myofibril. Myosin is a major component of thick filaments and exists as a hexamer of 2 heavy chains PUBMED:1939027, 2 alkali light chains, and 2 regulatory light chains. The heavy chain can be subdivided into the N-terminal globular head and the C-terminal coiled-coil rod-like tail, although some forms have a globular region in their C-terminal. There are many cell-specific isoforms of myosin heavy chains, coded for by a multi-gene family PUBMED:2806546. Myosin interacts with actin to convert chemical energy, in the form of ATP, to mechanical energy PUBMED:3540939. The 3-D structure of the head portion of myosin has been determined PUBMED:8316857 and a model for actin-myosin complex has been constructed PUBMED:8316858.

    \

    The globular head is well conserved, some highly-conserved regions possibly relating to functional and structural domains PUBMED:6576334. The rod-like tail starts with an invariant proline residue, and contains many repeats of a 28 residue region, interrupted at 4 regularly-spaced points known as skip residues. Although the sequence of the tail is not well conserved, the chemical character is, with hydrophobic, charged and skip residues occurring in a highly ordered and repeated fashion PUBMED:6576334.

    \ ' '3454' 'IPR002928' '\

    Muscle contraction is caused by sliding between the thick and thin filaments of the myofibril. Myosin is a major component of thick filaments and exists as a hexamer of 2 heavy chains PUBMED:1939027, 2 alkali light chains, and 2 regulatory light chains. The heavy chain can be subdivided into the N-terminal globular head and the C-terminal coiled-coil rod-like tail, although some forms have a globular region in their C-terminal. There are many cell-specific isoforms of myosin heavy chains, coded for by a multi-gene family PUBMED:2806546. Myosin interacts with actin to convert chemical energy, in the form of ATP, to mechanical energy PUBMED:3540939. The 3-D structure of the head portion of myosin has been determined PUBMED:8316857 and a model for actin-myosin complex has been constructed PUBMED:8316858.

    \

    This family consists of the coiled-coil myosin heavy chain tail region.\ The coiled-coil is composed of the tail from two molecules of myosin.\ These can then assemble into the macromolecular thick filament PUBMED:3783701.\ The coiled-coil region provides the structural backbone of the thick filament PUBMED:3783701.

    \ ' '3455' 'IPR000881' '\

    Myotoxins PUBMED:2253781, PUBMED:1862521, PUBMED:12709056 are small basic peptides (42 to 45 residues) found in rattlesnake venom that cause severe muscle necrosis by a non-enzymatic mechanism. Myotoxins act extremely rapidly and serve two primary biological functions: limiting the flight of prey by causing instantaneous paralysis of the hind limbs and promoting rapid death by paralysis of the diaphragm. Myotoxins have a well-conserved structure containing six cysteines involved in three disulphide bridges.

    \ ' '3456' 'IPR003356' '\ This domain is found in N-6 adenine-specific DNA methylase () from Type I and Type IC restriction systems.\ These enzymes are responsible for the methylation of specific DNA sequences in order to prevent the host from digesting its own genome via its restriction enzymes. These methylases have the same sequence specificity as their corresponding restriction enzymes. The type I restriction and modification system is composed of three polypeptides R, M and S. The M and S subunits together form a methyltransferase that methylates two adenine residues in complementary strands of a bipartite DNA recognition sequence. In the presence of the R subunit, the complex can also act as an endonuclease, binding to the same target sequence but cutting the DNA some distance from this site. Whether the DNA is cut or modified depends on the methylation state of the target sequence. When the target site is unmodified, the DNA is cut. When the target site is hemimethylated, the complex acts as a maintenance methyltransferase, modifying the DNA so that both strands become methylated.\ ' '3457' 'IPR002941' '\

    This domain is found in DNA methylases. In prokaryotes, the major role of DNA methylation is to protect host DNA against degradation by restriction enzymes. This family contains both N-4 cytosine-specific DNA methylases and N-6 adenine-specific DNA methylases. N-4 cytosine-specific DNA methylases () PUBMED:7607512 are enzymes that\ specifically methylate the amino group at the C-4 position of cytosines in\ DNA. Such enzymes are found as components of type II restriction-modification\ systems in prokaryotes. Such enzymes recognise a specific sequence in DNA and\ methylate a cytosine in that sequence. By this action they protect DNA from\ cleavage by type II restriction enzymes that recognise the same sequence. N-6 adenine-specific DNA methylases () (A-Mtase) are enzymes that specifically methylate the amino group at the C-6 position of adenines in DNA. Such enzymes are found in the three existing types of bacterial restriction-modification systems (in the type I system the A-Mtase is the product of the hsdM gene, and in type III it is the product of the mod gene). All of these enzymes recognise a specific sequence in DNA and methylate an adenine in that sequence.

    \ ' '3458' 'IPR007358' '\

    The Escherichia coli nucleoid contains DNA in a condensed but functional form. Analysis of proteins released from isolated spermidine nucleoids after treatment with DNase I reveals significant amounts of two proteins not previously detected in wild-type E. coli. Partial amino-terminal sequencing has identified them as the products of rdgC and yejK. These proteins are strongly conserved in Gram-negative bacteria, suggesting that they have important cellular roles PUBMED:10368163.

    \ ' '3459' 'IPR001463' '\

    Sodium symporters can be divided by sequence and functional similarity\ into various groups. One such group is the sodium/alanine symporter family,\ the members of which transport alanine in association with sodium ions.

    \ \

    These transporters are believed to possess 8 transmembrane (TM) helices\ PUBMED:1447975, PUBMED:1400476, forming a channel or pore through the cytoplasmic membrane, the\ interior face being hydrophilic to allow the passage of alanine molecules\ and sodium ions PUBMED:1447975. This family is restricted to the bacteria and archaea; examples are the alanine carrier protein from the Bacillus PS3 (Thermophilic bacterium PS-3); the D-alanine/glycine permease from Pseudoalteromonas haloplanktis (Alteromonas haloplanktis); and the\ hypothetical protein yaaJ from Escherichia coli.

    \ ' '3460' 'IPR004679' '\

    The 2-hydroxycarboxylate transporter family is a family of secondary transporters found exclusively in the bacterial kingdom. They function in the metabolism of the di- and tricarboxylates malate and citrate, mostly in fermentative pathways involving decarboxylation of malate or oxaloacetate PUBMED:16339740.

    \ \

    The majority of proteins in this entry are known or predicted members of the citrate:cation symporter (CCS) family. They contain the predicted twelve-transmembrane helix motif common to many secondary transporters PUBMED:8810332. Most of the characterised proteins in this entry are specific for citrate, with either Na+ or H+ as the cotransported cation. However, one member is capable of cotransporting either citrate or malate with H+ PUBMED:11566984, while another has been shown to be an Na+-dependent malate cotransporter PUBMED:12949159.

    \ ' '3461' 'IPR018461' '\ A single member of the NhaC family, a protein from Bacillus firmus, has been functionally characterised. It is involved in pH homeostasis and sodium extrusion. Members of the NhaC family are found in both Gram-negative bacteria and Gram-positive bacteria.\ ' '3462' 'IPR000402' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    P-ATPases (sometimes known as E1-E2 ATPases) () are found in bacteria and in a number of eukaryotic plasma membranes and organelles PUBMED:9419228. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.

    \

    This entry represents the beta subunit found in the P-type cation exchange ATPases located in the plasma membranes of animal cells. These P-ATPases include both H+/K+-ATPases () and Na+/K+-ATPases (), which belong to the IIC subfamily of ATPases PUBMED:9419228, PUBMED:10963432. These ATPases catalyse the hydrolysis of ATP coupled with the exchange of cations, pumping one cation out of the cell (H+ or Na+) in exchange for K+. These ATPases contain an alpha subunit () that is the catalytic component, and a glycosylated beta subunit that regulates the number of sodium pumps transported to the plasma membrane through the assembly of alpha/beta heterodimers. The beta subunit has three highly conserved disulphide bonds within the extracellular domain that stabilise the alpha subunit, the alpha/beta interaction, and the catalytic activity of the alpha subunit PUBMED:7891030. Different beta isoforms exist, permitting greater regulatory control.

    \

    An example of a H+/K+-ATPase is the gastric pump responsible for acid secretion in the stomach, transporting protons from the cytoplasm of parietal cells to create a large pH gradient in exchange for the internalisation of potassium ions, using ATP hydrolysis to drive the pump PUBMED:15096097.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '3463' 'IPR001898' '\

    Integral membrane proteins that mediate the intake of a wide variety of\ molecules with the concomitant uptake of sodium ions (sodium symporters) can\ be grouped, on the basis of sequence and functional similarities into a number\ of distinct families. One of these families currently consists of the\ following proteins:\

    \

    These transporters are proteins of 430 to 620 amino acids which are\ highly hydrophobic and which probably contain about 12 transmembrane regions.

    \ ' '3464' 'IPR006986' '\ Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of eukaryotic (metazoa) transcription factors PUBMED:9418898. This C-terminal region is found only in the Nab1 subfamily.\ ' '3465' 'IPR002715' '\

    Nascent polypeptide-associated complex (NAC) is among the first ribosome-associated entities to bind the nascent polypeptide after peptide bond formation. The nascent polypeptide-associated complex (NAC) of yeast functions in the targeting process of ribosomes to the ER membrane PUBMED:10518932. NAC may prevent binding of ribosome nascent chains (RNCs) without a signal sequence to yeast membranes.

    \ ' '3466' 'IPR001433' '\

    Bacterial ferredoxin-NADP+ reductase may be bound to the thylakoid membrane or anchored to the thylakoid-bound phycobilisomes.\ Chloroplast ferredoxin-NADP+ reductase () may play a key role in regulating the relative amounts of cyclic and non-cyclic electron flow to meet the demands of the plant for ATP and reducing power. It is involved in the final step in the linear photosynthetic electron transport chain and has also been implicated in cyclic electron flow around photosystem I where its role would be to return electrons from ferredoxin to the cytochrome B-F complex.

    \ \

    This domain is present in a variety of proteins that include bacterial flavohemoprotein, mammalian NADH-cytochrome b5 reductase, eukaryotic NADPH-cytochrome P450 reductase, nitrate reductase from plants, nitric-oxide synthase, bacterial vanillate demethylase, as well as others.

    \ ' '3467' 'IPR006115' '\

    6-Phosphogluconate dehydrogenase () (6PGD) is an oxidative carboxylase that catalyses the decarboxylating reduction of 6-phosphogluconate into ribulose 5-phosphate in the presence of NADP. This reaction is a component of the hexose mono-phosphate shunt and pentose phosphate pathways (PPP) PUBMED:2113917, PUBMED:6641716. Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose sequences are highly conserved PUBMED:1659648. The protein is a homodimer in which the monomers act independently PUBMED:6641716: each contains a large, mainly alpha-helical domain and a smaller beta-alpha-beta domain, containing a mixed parallel and anti-parallel 6-stranded beta sheet PUBMED:6641716. NADP is bound in a cleft in the small domain, the substrate binding in an adjacent pocket PUBMED:6641716.

    This family represents the NAD binding domain of 6-phosphogluconate dehydrogenase which adopts a Rossmann fold. The C-terminal domain is described in .

    \ ' '3468' 'IPR005106' '\

    Bacteria, plants and fungi metabolise aspartic acid to produce four amino acids - lysine, threonine, methionine and isoleucine - in a series of reactions known as the aspartate pathway. Additionally, several important metabolic intermediates are produced by these reactions, such as diaminopimelic acid, an essential component of bacterial cell wall biosynthesis, and dipicolinic acid, which is involved in sporulation in Gram-positive bacteria. Members of the animal kingdom do not possess this pathway and must therefore acquire these essential amino acids through their diet. Research into improving the metabolic flux through this pathway has the potential to increase the yield of the essential amino acids in important crops, thus improving their nutritional value. Additionally, since the enzymes are not present in animals, inhibitors of them are promising targets for the development of novel antibiotics and herbicides. For more information see PUBMED:11352712.

    \

    Homoserine dehydrogenase () catalyses the third step in the aspartate pathway: the NAD(P)-dependent reduction of aspartate beta-semialdehyde into homoserine PUBMED:8500624, PUBMED:8395899. Homoserine is an intermediate in the biosynthesis of threonine, isoleucine, and methionine. The enzyme can be found in a monofunctional form, in some bacteria and yeast, or a bifunctional form consisting of an N-terminal aspartokinase domain and a C-terminal homoserine dehydrogenase domain, as found in bacteria such as Escherichia coli and in plants. Structural analysis of the yeast monofunctional enzyme () indicates that the enzyme is a dimer composed of three distinct regions: an N-terminal nucleotide-binding domain, a short central dimerisation region, and a C-terminal catalytic domain PUBMED:10700284. The N-terminal domain forms a modified Rossmann fold, while the catalytic domain forms a novel alpha-beta mixed sheet.

    \ \

    This entry represents the NAD(P)-binding domain of aspartate and homoserine dehydrogenase. Aspartate dehydrogenase () is strictly specific for L-aspartate as substrate and catalyses the first step in NAD biosynthesis from aspartate. The enzyme has a higher affinity for NAD+ than NADP+ PUBMED:12496312.

    \ \ \ Note that the C-terminus of the protein contributes a helix to this domain that is not covered by this model.

    \ ' '3469' 'IPR011128' '\ NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyses the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain PUBMED:10801498.\ ' '3470' 'IPR002504' '\ Members of this family are ATP-NAD kinases . The enzymes catalyse the phosphorylation of NAD to NADP utilizing ATP and other nucleoside triphosphates as well as inorganic polyphosphate as a source of phosphorus.\ \ ' '3471' 'IPR003694' '\ NAD+ synthase () catalyzes the last step in the biosynthesis of nicotinamide adenine dinucleotide and is induced by stress factors such as heat shock and glucose limitation. The three-dimensional structure of NH3-dependent NAD+ synthetase from Bacillus subtilis, in its free form and in complex with ATP shows that the enzyme consists of a tight homodimer with alpha/beta subunit topology PUBMED:8895556.\ ' '3472' 'IPR001694' '\

    This entry represents subunit 1 of NADH:ubiquinone oxidoreductase PUBMED:2029890. Among the many polypeptide subunits that make up complex I, there are fifteen which are located in the membrane part, seven of which are encoded by the mitochondrial and chloroplast genomes of most species. The most conserved of these organelle-encoded subunits is known as subunit 1 (gene ND1 in mitochondrion, and NDH1 in chloroplast) and seems to contain the ubiquinone binding site.

    \

    The ND1 subunit is highly similar to subunit 4 of Escherichia coli formate hydrogenlyase (gene hycD), subunit C of hydrogenase-4 (gene hyfC). Paracoccus denitrificans NQO8 and Escherichia coli nuoH NADH-ubiquinone oxidoreductase subunits also belong to this family PUBMED:7690854.

    \

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits plus approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '3473' 'IPR003486' '\

    The nucleoprotein of the ssRNA negative-strand Nairovirus is an internal part of the virus particle.

    \ ' '3474' 'IPR007260' '\

    This family represents a putative ManNAc-6-P-to-GlcNAc-6P epimerase in the N-acetylmannosamine (ManNAc) utilization pathway found mainly in pathogenic bacteria for the reaction:\ \ It is probably encoded by the yhcJ gene PUBMED:986431.

    \ ' '3475' 'IPR006753' '\

    This is a family of conserved coat proteins from the single stranded DNA Nanoviruses PUBMED:10795525.

    \ ' '3476' 'IPR002164' '\

    It is thought that NAPs act as histone chaperones, shuttling both core and linker histones from their site of synthesis in the cytoplasm to the nucleus. The proteins may be involved in regulating gene expression and therefore cellular differentiation PUBMED:9325046, PUBMED:8923009.

    \

    The centrosomal protein c-Nap1, also known as Cep250, has been implicated in the\ cell-cycle-regulated cohesion of microtubule-organizing centres. This 281 kDa\ protein consists mainly of domains predicted to form coiled coil structures. The C-terminal\ region defines a novel histone-binding domain that is responsible for targeting CNAP1, and possibly condensin, to mitotic\ chromosomes PUBMED:12138188. During interphase, C-Nap1 localizes to the proximal\ ends of both parental centrioles, but it dissociates from these structures at the onset of mitosis. Re-association with centrioles\ then occurs in late telophase or at the very beginning of G1 phase, when daughter cells are still connected by post-mitotic\ bridges. Electron microscopic studies performed on isolated centrosomes suggest that a proteinaceous linker connects parental centrioles and C-Nap1 may be part of a linker structure that assures the cohesion of duplicated centrosomes during interphase, but that is dismantled upon centrosome separation at the onset of mitosis PUBMED:12140259.

    \ ' '3477' 'IPR005591' '\

    The napB gene encodes a dihaem cytochrome c, the small subunit of a heterodimeric periplasmic nitrate reductase PUBMED:11389694.

    \ ' '3478' 'IPR005623' '\ This is an uncharacterised protein involved in formation of periplasmic nitrate reductase.\ ' '3479' 'IPR004298' '\ Nicotianamine synthase catalyzes the trimerization of S-adenosylmethionine to yield one molecule of\ nicotianamine. Nicotianamine has an important role in plant iron uptake mechanisms. Plants adopt two strategies (termed I and II) of iron acquisition. Strategy I is adopted by all higher plants except graminaceous plants, which adopt strategy II\ PUBMED:10359845, PUBMED:9952442. In strategy I plants, the role of nicotianamine is not fully determined: possible roles include the formation of more\ stable complexes with ferrous than with ferric ion, which might serve as a sensor of the physiological status of iron within\ a plant, or which might be involved in the transport of iron PUBMED:10359845. In strategy II (graminaceous) plants, nicotianamine is the\ key intermediate (and nicotianamine synthase the key enzyme) in the synthesis of the mugineic family (the only known\ family in plants) of phytosiderophores. Phytosiderophores are iron chelators whose secretion by the roots is greatly\ increased in instances of iron deficiency PUBMED:9952442.\ ' '3480' 'IPR007288' '\ The NB glycoprotein is found in Influenza type B virus. Its function is unknown.\ ' '3481' 'IPR002182' '\

    This is the NB-ARC domain, a novel signalling motif found in bacteria and eukaryotes, shared by plant resistance gene products and regulators of cell death in animals PUBMED:9545207.

    \ ' '3482' 'IPR007574' '\ In the cyanobacterium Synechococcus species PCC 7942 (), nblA triggers degradation of light-harvesting phycobiliproteins in response to deprivation of nutrients including nitrogen, phosphorus and sulphur. The mechanism of nblA function is not known, but it has been hypothesised that nblA may act by disrupting phycobilisome structure, activating a protease or tagging phycobiliproteins for proteolysis. Members of this family have also been identified in the chloroplasts of some red algae.\ ' '3483' 'IPR005550' '\

    Members of this family are components of the mitotic spindle. It has been shown that Ndc80 from yeast is part of a complex called the Ndc80p complex PUBMED:11266451. This complex is thought to bind to the microtubules of the spindle.

    \ ' '3484' 'IPR001564' '\

    Nucleoside diphosphate kinases () (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerisation.

    \

    In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B) PUBMED:1851158. By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.

    \

    NDK are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.

    \

    NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown PUBMED:2175255 that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism PUBMED:1851158. Our signature pattern contains this residue.

    \ ' '3485' 'IPR006635' '\

    This domain identifies a small family of proteins of unknown function that are found exclusively in bacteria.

    \ ' '3486' 'IPR000900' '\ Nebulin is a 600-800kDa protein found in the thin filaments of striated vertebrate muscle. It is \ presumed to play a role in binding and stabilising F-actin PUBMED:8609630, essentially by providing \ a template for actin polymerisation (i.e., acting as an "actin zipper"). The amino acid sequence \ shows a uniform repeating pattern along its length, a repeated 35-residue motif constituting up to \ 97% of the polypeptide. Analysis of individual repeats reveals a progressive N- to C-terminal \ divergence, coupled with an increasing alpha-helix propensity. This correlates with a higher\ binding affinity for F-actin at the C-terminus. Thus, it is postulated that once the repeats have \ formed an initiation complex, the whole length of the nebulin molecule may then associate in a highly \ co-operative process with the thin filament, in a manner similar to the closing of a zipper PUBMED:8609630.\ ' '3487' 'IPR007396' '\ In Bacillus subtilis, family member , PAI 2, is involved in the negative regulation of protease synthesis and sporulation PUBMED:2108124.\ ' '3488' 'IPR002186' '\

    This family is comprised of antitumour antibiotic chromoproteins, as represented by neocarzinostatin PUBMED:8235619. These chromoproteins consist of a noncovalently bound, labile enediyne chromophore and its stabilising carrier apoprotein. The protein component of the chromophore displays an unusual bicyclic dienediyne structure. The chromoprotein inter-chelates the DNA, where its cycloaromatisation produces a biradical intermediate that has the ability to abstract hydrogens from the sugar moiety of DNA. This causes single- and double-strand breaks in the DNA PUBMED:11491295. In addition to their ability to cleave DNA at sites specific for each chromophore, results indicate that these chromoproteins also possess proteolytic activity against histones, with histone H1 as the preferred substrate PUBMED:9383447.

    \

    Neocarzinostatin has 2 disulphide bridges and is kidney-shaped with 2 defined domains that hold a binding cavity. The larger domain forms a 7-stranded antiparallel beta-barrel and the smaller domain consists of 2 anti-parallel strands of beta sheet that are perpendicular to each other PUBMED:8235619. Other members of this family include macromycin, actinoxanthine, kedarcidin PUBMED:9383447, and C-1027 PUBMED:11491295.

    \ ' '3489' 'IPR005054' '\ The proteins of this entry are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family\ aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and\ forms a pseudo T = 3 icosahedral capsid structure PUBMED:9519407.\ ' '3490' 'IPR005305' '\

    This domain occurs within nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This domain aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure PUBMED:9519407.

    \ ' '3491' 'IPR005306' '\

    The members of this family are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure PUBMED:9519407.

    \ ' '3492' 'IPR001860' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 34 comprises enzymes with only one known activity: sialidase or neuraminidase.

    \ \

    Neuraminidases cleave the terminal sialic acid residues from carbohydrate chains in glycoproteins. Sialic acid is a negatively charged sugar associated with the protein and lipid portions of lipoproteins. In Influenza virus, neuraminidases prevent self-aggregation by removing the carbohydrate from the viral envelope thus facilitating the mobility of the virus to and from the site of infection.\ Antiviral agents that inhibit influenza viral neuraminidase activity are of major importance in the control of influenza PUBMED:10623375.

    \ ' '3493' 'IPR006029' '\

    Neurotransmitter ligand-gated ion channels are transmembrane receptor-ion channel complexes that open transiently upon binding of specific ligands, allowing rapid transmission of signals at chemical synapses PUBMED:1721053, PUBMED:1846404. Five of these ion channel receptor families have been shown to form a sequence-related superfamily:

    \

    \

    These receptors possess a pentameric structure (made up of varying subunits), surrounding a central pore. All known sequences of subunits from neurotransmitter-gated ion-channels are structurally related. They are composed of a large extracellular glycosylated N-terminal ligand-binding domain, followed by three hydrophobic transmembrane regions which form the ionic channel, followed by an intracellular region of variable length. A fourth hydrophobic region is found at the C-terminal of the sequence PUBMED:1721053, PUBMED:1846404.

    \ \

    This domain represents four transmembrane helices of a variety of neurotransmitter-gated ion-channels.

    \ ' '3494' 'IPR002154' '\

    Neuregulins are a sub-family of EGF-like molecules that have been shown to play multiple essential roles in vertebrate embryogenesis including: cardiac development, Schwann cell and oligodendrocyte differentiation, some aspects of neuronal development, as well as the formation of neuromuscular synapses PUBMED:9892702, PUBMED:9208852. Included in the family are heregulin; neu differentiation factor; acetylcholine receptor synthesis stimulator; glial growth factor; and sensory and motor-neuron derived factor PUBMED:9804837. Multiple family members are generated by alternate splicing or by use of several cell type-specific transcription initiation sites. In general, they bind to and activate the erbB family of receptor tyrosine kinases (erbB2 (HER2), erbB3 (HER3), and erbB4 (HER4)), functioning both as heterodimers and homodimers.

    The transmembrane forms of neuregulin 1 (NRG1) are present within synaptic vesicles, including those containing glutamate PUBMED:12145742. After exocytosis, NRG1 is in the presynaptic membrane, where the ectodomain of NRG1 may be cleaved off. The ectodomain then migrates across the synaptic cleft and binds to and activates a member of the EGF-receptor family on the postsynaptic membrane. This has been shown to increase the expression of certain glutamate-receptor subunits. NRG1 appears to signal for glutamate-receptor subunit expression, localisation, and/or phosphorylation, facilitating subsequent glutamate transmission.

    \ \

    The NRG1 gene has been identified as a potential gene determining susceptibility to schizophrenia by a combination of genetic linkage and association approaches PUBMED:12145742.

    \ ' '3495' 'IPR003635' '\ Tachykinins PUBMED:3284438, PUBMED:1969374, PUBMED:1324401 are a group of biologically active peptides which excite\ neurons, evoke behavioral responses, are potent vasodilators and contract\ (directly or indirectly) many smooth muscles. This family includes neurokinins, as well as many other peptides. Like other tachykinins, neurokinins are synthesized as larger protein precursors that are enzymatically converted to their mature forms.\ ' '3496' 'IPR002072' '\ During the development of the vertebrate nervous system, many neurons\ become redundant (because they have died, failed to connect to target\ cells, etc.) and are eliminated. At the same time, developing neurons send\ out axon outgrowths that contact their target cells PUBMED:2369898. Such cells control\ their degree of innervation (the number of axon connections) by the\ secretion of various specific neurotrophic factors that are essential for\ neuron survival. One of these is nerve growth factor (NGF or beta-NGF), a vertebrate protein that stimulates\ division and differentiation of sympathetic and embryonic sensory neurons PUBMED:3589669,\ PUBMED:8488558. NGF is mostly found outside the central\ nervous system (CNS), but slight traces have been detected in adult CNS\ tissues, although a physiological role for this is unknown PUBMED:2369898; it has also\ been found in several snake venoms PUBMED:1477101, PUBMED:1995338.\

    NGF is a protein of about 120 residues that is cleaved from a larger\ precursor molecule. It contains six cysteines all involved in intrachain\ disulphide bonds. A schematic representation of the structure of NGF is shown\ below:\

    \
                                        +------------------------+\
                                        |                        |\
                                        |                        |\
            xxxxxxCxxxxxxxxxxxxxxxxxxxxxCxxxxCxxxxxCxxxxxxxxxxxxxCxCxxxx\
                  |                          |     |               |\
                  +--------------------------|-----+               |\
                                             +---------------------+\
    \
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    

    \

    This entry also contains NGF-related proteins such as neurotrophin 3, which promotes the survival of visceral and proprioceptive sensory neurons, and brain-derived neurotrophin, which promotes the survival of neuronal populations that are located either in the central nervous system or directly connected to it PUBMED:2236018, PUBMED:8527932.

    \ ' '3497' 'IPR004232' '\

    Nitrile hydratases () are bacterial enzymes that catalyse the hydration of nitrile compounds to the corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and are normally active as a tetramer, alpha(2)beta(2). Nitrile hydratases contain either a non-haem iron or a non-corrinoid cobalt centre, both types sharing a highly conserved peptide sequence in the alpha subunit (CXLCSC) that provides all the residues involved in coordinating the metal ion. Each type of nitrile hydratase specifically incorporates its metal with the help of activator proteins encoded by flanking regions of the nitrile hydratase genes that are necessary for metal insertion. The Fe-containing enzyme is photo-regulated: in the dark the enzyme is inactivated due to the association of nitric oxide (NO) to the iron, while in the light the enzyme is active by photo-dissociation of NO. The NO is held in place by a claw setting formed through specific oxygen atoms in two modified cysteines and a serine residue in the active site PUBMED:9586994, PUBMED:9195885. The cobalt-containing enzyme is unaffected by NO, but was shown to undergo a similar effect with carbon monoxide PUBMED:17267045, PUBMED:14717710. Fe- and cobalt-containing enzymes also display different inhibition patterns with nitrophenols.

    \

    This entry represents the alpha subunit of both iron- and cobalt-containing nitrile hydratases; the alpha subunit is a duplication of two structural repeats, each consisting of 4 layers, alpha/beta/beta/alpha. This entry also includes the related protein, the gamma subunit of thiocyanate hydrolase (SCNase). SCNase is a cobalt-containing metalloenzyme with a cysteine-sulphinic acid ligand that hydrolyses thiocyanate to carbonyl sulphide and ammonia PUBMED:16417356. The two enzymes, nitrile hydratase and SCNase, are homologous over regions corresponding to almost the entire coding regions of the genes: the beta and alpha subunits of thiocyanate hydrolase were homologous to the amino- and carboxyl-terminal halves of the beta subunit of nitrile hydratase, and the gamma subunit of thiocyanate hydrolase was homologous to the alpha subunit of nitrile hydratase PUBMED:9573140.

    \ ' '3498' 'IPR003168' '\

    Nitrile hydratases () are bacterial enzymes that catalyse the hydration of nitrile compounds to the corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and are normally active as a tetramer, alpha(2)beta(2). Nitrile hydratases contain either a non-haem iron or a non-corrinoid cobalt centre, both types sharing a highly conserved peptide sequence in the alpha subunit (CXLCSC) that provides all the residues involved in coordinating the metal ion. Each type of nitrile hydratase specifically incorporates its metal with the help of activator proteins encoded by flanking regions of the nitrile hydratase genes that are necessary for metal insertion. The Fe-containing enzyme is photo-regulated: in the dark the enzyme is inactivated due to the association of nitric oxide (NO) to the iron, while in the light the enzyme is active by photo-dissociation of NO. The NO is held in place by a claw setting formed through specific oxygen atoms in two modified cysteines and a serine residue in the active site PUBMED:9586994, PUBMED:9195885. The cobalt-containing enzyme is unaffected by NO, but was shown to undergo a similar effect with carbon monoxide PUBMED:17267045, PUBMED:14717710. Fe- and cobalt-containing enzymes also display different inhibition patterns with nitrophenols.

    \

    This entry represents the beta subunit.

    \ ' '3499' 'IPR011577' '\

    Cytochrome b561 is an integral membrane electron transport protein that binds two haem groups non-covalently. This domain is also found in a number of nickel-dependent hydrogenase subunits which are also B-type cytochromes that interact with quinones and anchor the hydrogenase to the membrane. Members of the \'eukaryotic cytochrome b561\' family can be found in .

    \ ' '3500' 'IPR007231' '\

    Nup93/Nic96 is a component of the nuclear pore complex. It is required for the correct assembly of the nuclear pore complex PUBMED:17897938. In Saccharomyces cerevisiae, Nic96 has been shown to be involved in the distribution and cellular concentration of the GTPase Gsp1 PUBMED:9372936. The structure of Nic96 has revealed a mostly alpha helical structure PUBMED:9348540.

    \ ' '3501' 'IPR011541' '\

    High affinity nickel transporters are involved in the incorporation of nickel into H2-uptake hydrogenase PUBMED:7934894, PUBMED:7651142 and urease PUBMED:8197192 enzymes and are essential for the expression of catalytically active hydrogenase and urease. Ion uptake is dependent on proton motive force. HoxN in Ralstonia eutropha (Alcaligenes eutrophus) is thought to be an integral membrane protein with seven transmembrane helices PUBMED:8288539. The family also includes a cobalt transporter.

    \ ' '3502' 'IPR001501' '\

    Hydrogenases are enzymes that catalyze the reversible activation of hydrogen and which occur widely in prokaryotes as well as in some eukaryotes. There are various types of hydrogenases, but all of them seem to contain at least one iron-sulphur cluster. They can be broadly divided into two groups: hydrogenases containing nickel and, in some cases, also selenium (the [NiFe] and [NiFeSe] hydrogenases) and those lacking nickel (the [Fe] hydrogenases).

    \

    The [NiFe] and [NiFeSe] hydrogenases are heterodimers that consist of a small subunit that contains a signal peptide and a large subunit. All the known large subunits seem to be evolutionarily related PUBMED:2180913; they contain two Cys-x-x-Cys motifs, one at their N-terminal end and the other at their C-terminal end. These four cysteines are involved in the binding of nickel. In the [NiFeSe] hydrogenases the first cysteine of the C-terminal motif is a selenocysteine which has experimentally been shown to be a nickel ligand PUBMED:2521386.

    \ ' '3503' 'IPR006975' '\ NifQ is involved in early stages of the biosynthesis of the iron-molybdenum cofactor (FeMo-co) PUBMED:8316214, which is an integral part of the active site of dinitrogenase PUBMED:7954845. The conserved C-terminal cysteine residues may be involved in metal binding PUBMED:8316214.\ ' '3504' 'IPR001075' '\

    Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S] PUBMED:16221578. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.

    \

    The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins PUBMED:16211402, PUBMED:16843540. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly PUBMED:15937904.

    \

    The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA PUBMED:17350000. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA PUBMED:15278785, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.

    \

    In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins PUBMED:11498000. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen PUBMED:8875867.

    \ \

    This entry represents the C-terminal of NifU and homologous proteins. NifU contains two domains: an N-terminal () and a C-terminal domain PUBMED:8048161. These domains exist either together or on different polypeptides, both domains being found in organisms that do not fix nitrogen (e.g. yeast), so they have a broader significance in the cell than nitrogen fixation.

    \ ' '3505' 'IPR002871' '\

    Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S] PUBMED:16221578. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.

    \

    The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins PUBMED:16211402, PUBMED:16843540. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly PUBMED:15937904.

    \

    The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA PUBMED:17350000. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA PUBMED:15278785, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.

    \

    In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins PUBMED:11498000. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen PUBMED:8875867.

    \ \

    This entry represents the N-terminal of NifU and homologous proteins. NifU contains two domains: an N-terminal and a C-terminal domain () PUBMED:8048161. These domains exist either together or on different polypeptides, both domains being found in organisms that do not fix nitrogen (e.g. yeast), so they have a broader significance in the cell than nitrogen fixation.

    \ ' '3506' 'IPR004893' '\ Nitrogenase is a complex metalloenzyme composed of two proteins designated the Fe-protein and the MoFe-protein. Apart from\ these two proteins, a number of accessory proteins are essential for the maturation and assembly of nitrogenase. Even though\ experimental evidence suggests that these accessory proteins are required for nitrogenase activity, the exact roles played by many of\ these proteins in the functions of nitrogenase are unclear PUBMED:9514861. Using yeast two-hybrid screening it has been shown that NifW can\ interact with itself as well as NifZ. \ \ ' '3507' 'IPR007415' '\ This short protein is found in the nif (nitrogen fixation) operon. Its function is unknown but it is probably involved in nitrogen fixation or regulating some component of this process. The 75 residue region that defines these proteins is found in isolation in some members and in the N-terminal half of the longer NifZ proteins.\ ' '3509' 'IPR006067' '\

    Sulphite reductases (SiRs) and related nitrite reductases (NiRs) catalyse the six-electron reduction reactions of sulphite to sulphide, and nitrite to ammonia, respectively. The Escherichia coli SiR enzyme is a complex composed of two proteins, a flavoprotein alpha-component (SiR-FP) and a haemoprotein beta-component (SiR-HP) (), and has an alpha(8)beta(4) quaternary structure PUBMED:10984484. SiR-FP contains both FAD and FMN, while SiR-HP contains a Fe(4)S(4) cluster coupled to a sirohaem through a cysteine bridge. Electrons are transferred from NADPH to FAD, and on to FMN in SiR-FP, from which they are transferred to the metal centre of SiR-HP, where they reduce the sirohaem-bound sulphite.

    \

    SiR-HP has a two-fold symmetry, which generates a distinctive three-domain alpha/beta fold that controls assembly and reactivity PUBMED:7569952. In the E. coli SiR-HP enzyme (), the iron is bound to cysteine residues at positions 433, 439, 478 and 482, the latter also forming the sirohaem ligand.

    \ ' '3510' 'IPR005117' '\

    Sulphite reductases (SiRs) and related nitrite reductases (NiRs) catalyse the six-electron reduction reactions of sulphite to sulphide, and nitrite to ammonia, respectively. The Escherichia coli SiR enzyme is a complex composed of two proteins, a flavoprotein alpha-component (SiR-FP) and a haemoprotein beta-component (SiR-HP), and has an alpha(8)beta(4) quaternary structure PUBMED:10984484. SiR-FP contains both FAD and FMN, while SiR-HP contains a Fe(4)S(4) cluster coupled to a sirohaem through a cysteine bridge. Electrons are transferred from NADPH to FAD, and on to FMN in SiR-FP, from which they are transferred to the metal centre of SiR-HP, where they reduce the sirohaem-bound sulphite.

    \

    SiR-HP has a two-fold symmetry, which generates a distinctive three-domain alpha/beta fold that controls assembly and reactivity PUBMED:. This entry describes the ferredoxin-like (alpha/beta sandwich) domain, which consists of a duplication containing two subdomains of this fold.

    \ ' '3511' 'IPR003816' '\

    The nitrate reductase enzyme () is composed of three subunits: an alpha, a beta and two gamma. It is the second nitrate reductase enzyme, which can substitute for the NRA enzyme in Escherichia coli, allowing the organism to use nitrate as an electron acceptor during anaerobic respiration PUBMED:2233673.

    \ \

    Nitrate reductase gamma subunit resembles cytochrome b and transfers electrons from quinones to the beta subunit PUBMED:9738886.

    \ ' '3512' 'IPR002351' '\

    Nitrophorins are haemoproteins found in saliva of blood-feeding insects PUBMED:10093938, PUBMED:11058753. Saliva of the blood-sucking bug Rhodnius prolixus (Triatomid bug) contains four homologous nitrophorins, designated NP1 to NP4 in order of their relative abundance in the glands PUBMED:7721773. As isolated, nitrophorins contain nitric oxide (NO) ligated to the ferric (FeIII) haem iron. Histamine, which is released\ by the host in response to tissue damage, is another nitrophorin ligand. Nitrophorins transport NO to the feeding site.\ Dilution, binding of histamine and increase in pH (from pH ~5 in salivary gland to pH ~7.4 in the host tissue) facilitate the release of NO into the tissue where it induces vasodilatation.

    \ \

    The salivary nitrophorin from the hemipteran Cimex lectularius (Bed bug) has no sequence similarity to R. prolixus nitrophorins. It is suggested that the two classes of insect nitrophorins have arisen through convergent\ evolution PUBMED:9716517.

    \ \

    3-D structures of several nitrophorin complexes are known PUBMED:11058753. The nitrophorin structures reveal a lipocalin-like\ eight-stranded beta-barrel, three alpha-helices and two disulphide bonds, with haem inserted into one end of the barrel. Members of the lipocalin family are known to bind a variety of small hydrophobic ligands, including biliverdin, in a similar fashion (see PUBMED:8761444 for review). The haem iron is ligated to His59. The position of His59 is restrained through a water-mediated\ hydrogen bond to the carboxylate of Asp70. The His59-Fe bond is bent ~15 degrees out of the imidazole plane. Asp70 forms an unusual hydrogen bond with one of the haem propionates, suggesting the residue has an altered pKa. In the NP1-histamine\ structure (PDB 1NP1), the planes of His59 and histamine imidazole rings lie in an arrangement almost identical to that found in oxidised cytochrome b5.

    \ ' '3513' 'IPR000415' '\

    This entry represents a family of proteins consisting of nitroreductase enzymes and related oxidoreductases.\ Members of this family utilise FMN as a cofactor and are\ often found to be homodimers. Possible characteristics include Oxygen-insensitive NAD(P)H nitroreductase (FMN-dependent nitroreductase) (Dihydropteridine reductase) () and NADH dehydrogenase (). A number of the proteins are described as oxidoreductases. They are primarily found in bacterial lineages though a number of eukaryotic homologs have been identified: Caenorhabditis elegans , Drosophila melanogaster (Fruit fly) , Mus musculus (Mouse) and Homo sapiens (Human) . This protein is not found in photosynthetic eukaryotes. The sequences containing this entry in photosynthetic organisms are possible false positives.

    \ ' '3514' 'IPR007876' '\

    This family comprises several flagellar sheath adhesin proteins, also called neuraminyllactose-binding haemagglutinin precursor (NLBH or HpaA) or N-acetylneuraminyllactose-binding fibrillar haemagglutinin receptor-binding subunits. NLBH is found exclusively in Helicobacter, which are gut-colonising bacteria that bind to sialic acid-rich macromolecules present on the gastric epithelium PUBMED:11855744. The sialic acid-sensitive agglutination of erythrocytes by certain strains of Helicobacter pylori has been attributed to the NLBH protein PUBMED:7592366.

    \ ' '3515' 'IPR007298' '\ This family represents a bacterial outer membrane lipoprotein that is necessary for signalling by the Cpx pathway PUBMED:11830644. This pathway responds to cell envelope disturbances and increases the expression of periplasmic protein folding and degradation factors. While the molecular function of the NlpE protein is unknown, it may be involved in detecting bacterial adhesion to abiotic surfaces. NlpE from Escherichia coli and Salmonella typhi is also known to confer copper tolerance in copper-sensitive strains of E. coli, and may be involved in copper efflux and delivery of copper to copper-dependent enzymes PUBMED:7635807.\ ' '3516' 'IPR006419' '\

    The PnuC protein of Escherichia coli is a membrane protein responsible for nicotinamide mononucleotide transport, subject to regulation by interaction with the NadR (also called NadI) protein (see ). The extreme N- and C-terminal regions are poorly conserved.

    \ ' '3517' 'IPR000903' '\ Myristoyl-CoA:protein N-myristoyltransferase () (Nmt) PUBMED:8322618 is the enzyme responsible \ for transferring a myristate group to the N-terminal glycine of a number of cellular eukaryotic and \ viral proteins. Nmt is a monomeric protein of about 50 to 60kDa whose sequence appears to be well \ conserved.\ ' '3518' 'IPR000903' '\ Myristoyl-CoA:protein N-myristoyltransferase () (Nmt) PUBMED:8322618 is the enzyme responsible \ for transferring a myristate group to the N-terminal glycine of a number of cellular eukaryotic and \ viral proteins. Nmt is a monomeric protein of about 50 to 60kDa whose sequence appears to be well \ conserved.\ ' '3519' 'IPR008199' '\

    Neuromedin U (NmU) PUBMED:3239891, PUBMED:1455013 is a vertebrate peptide which stimulates uterine smooth muscle contraction and causes selective vasoconstriction. Like most other active peptides, it is proteolytically processed from a larger precursor protein. The mature peptides are 8 (NmU-8) to 25 (NmU-25) residues long and C-terminally amidated.

    \

    The sequence of the C-terminal extremity of NmU is extremely well conserved.

    \ ' '3520' 'IPR007128' '\ NNF1 is an essential yeast gene required for proper spindle orientation, nucleolar and nuclear envelope structure and mRNA export PUBMED:9247195.\ ' '3521' 'IPR004030' '\

    Nitric oxide synthase () (NOS) enzymes produce nitric oxide (NO) by catalysing a five-electron oxidation of a guanidino nitrogen of L-arginine (L-Arg). Oxidation of L-Arg to L-citrulline occurs via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine as an intermediate. 2 mol of O(2) and 1.5 mol of NADPH are consumed per mole of NO formed PUBMED:8782597.

    \

    Arginine-derived NO synthesis has been identified in mammals, fish, birds, invertebrates, plants, and bacteria PUBMED:8782597. Best studied are mammals, where three distinct genes encode NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3) PUBMED:7510950. iNOS and nNOS are soluble and found predominantly in the cytosol, while eNOS is membrane associated. The enzymes exist as homodimers, each monomer consisting of two major domains: an N-terminal oxygenase domain, which belongs to the class of haem-thiolate proteins, and a C-terminal reductase domain, which is homologous to NADPH:P450 reductase (). The interdomain linker between the oxygenase and reductase domains contains a calmodulin (CaM)-binding sequence. NOSs are the only enzymes known to simultaneously require five bound cofactors. Animal NOS isozymes are catalytically self-sufficient. The electron flow in the NO synthase reaction is: NADPH --> FAD --> FMN --> haem --> O(2).

    \

    eNOS localisation to endothelial membranes is mediated by cotranslational N-terminal myristoylation and post-translational palmitoylation PUBMED:9199168. The subcellular localisation of nNOS in skeletal muscle is\ mediated by anchoring of nNOS to dystrophin. nNOS contains an additional \ N-terminal domain, the PDZ domain PUBMED:7535955. Some bacteria, like Bacillus halodurans, Bacillus subtilis or Deinococcus radiodurans, contain homologs of the NOS oxygenase domain. The pattern is directed against the N-terminal haem binding site.

    \ \

    This entry represents the oxygenase domain of NOS.

    \ ' '3522' 'IPR003484' '\

    Rhizobial nodulation (Nod) factors are signalling molecules secreted by root-nodulating rhizobia in response to flavonoids excreted by the host plant. They induce various symbiotic responses on the roots of the leguminous host plant at low concentrations, and are required for successful infection. Rhizobial Nod factors are lipo-chitooligosaccharides carrying various substituents which are important determinants of host specificity PUBMED:11732607.

    \ \

    NodA is an N-acyl transferase which specifies the transfer of an acyl chain to the oligosaccharide backbone of Nod factor. Allelic variation of the nodA gene can contribute to the determination of host range PUBMED:8930915.

    \ \ ' '3523' 'IPR002687' '\ This domain is present in various pre-mRNA processing ribonucleoproteins. The function of the domain is unknown; however, it may be a common RNA or snoRNA or Nop1p binding domain.\

    Proteins have been implicated in an expanding variety of functions during\ pre-mRNA splicing. Molecular cloning has identified genes encoding spliceosomal proteins that potentially act as novel RNA helicases, GTPases, or protein isomerases. Novel protein-protein and protein-RNA interactions that are required for functional spliceosome formation have also been described. Finally, growing evidence suggests that proteins may contribute directly to the spliceosome\'s active sites PUBMED:9159080.

    \ ' '3524' 'IPR003387' '\ Nodulin is a plant protein of unknown function. It is induced during nodulation in legume roots after rhizobium infection.\ ' '3525' 'IPR001678' '\

    This domain is found in archaeal, bacterial and eukaryotic proteins.

    \ \

    In the archaea and bacteria, they are annotated as putative nucleolar protein, Sun (Fmu) family protein or tRNA/rRNA cytosine-C5-methylase. The majority have the S-adenosyl methionine (SAM) binding domain and are related to Escherichia coli Fmu (Sun) protein (16S rRNA m5C 967 methyltransferase) whose structure has been determined PUBMED:14656444.

    \ \

    In the eukaryota, the majority are annotated as being \'hypothetical protein\', nucleolar protein or the Nop2/Sun (Fmu) family. Unlike their bacterial homologues, few of the eukaryotic members in this family have the SAM binding signature. Despite this, Saccharomyces cerevisiae (Baker\'s yeast) Nop2p is a probable RNA m5C methyltransferase PUBMED:12872006. It is essential for processing and maturation of 27S pre-rRNA and large ribosomal subunit biogenesis PUBMED:12872006, is localized to the nucleolus and is essential for viability PUBMED:7806561. Reduced Nop2p expression limits yeast growth and decreases levels of mature 60S ribosomal subunits while altering rRNA processing PUBMED:8972218. There is substantial identity between Nop2p and Homo sapiens (Human) p120 (NOL1), which is also called the proliferation-associated nucleolar antigen PUBMED:7806561, PUBMED:2576976.

    \ ' '3526' 'IPR007264' '\ Nop10p is a nucleolar protein that is specifically associated with H/ACA snoRNAs. It is essential for normal 18S rRNA production and rRNA pseudouridylation by the ribonucleoprotein particles containing H/ACA snoRNAs (H/ACA snoRNPs). Nop10p is probably necessary for the stability of these RNPs PUBMED:9843512.\ ' '3527' 'IPR007742' '\

    Bacterial nitrous oxide (N(2)O) reductase is the terminal oxidoreductase of a respiratory process that generates dinitrogen from\ N(2)O. To attain its functional state, the enzyme is subjected to a maturation process which involves the protein-driven synthesis of a\ unique copper-sulphur cluster and metallation of the binuclear Cu(A) site in the periplasm. NosD is a periplasmic protein which is thought to insert copper into the exported reductase apoenzyme PUBMED:8626275.

    \ ' '3528' 'IPR007196' '\

    The Ccr4-Not complex is a global regulator of gene expression that is conserved from yeast to human. It affects genes positively and negatively and is thought to regulate transcription factor IID function. In Saccharomyces cerevisiae, it exists in two prominent forms and consists of at least nine core subunits: the five Not proteins (Not1p to Not5p), Caf1p, Caf40p, Caf130p and Ccr4p PUBMED:10637334. The Ccr4-Not complex regulates many different cellular functions, including RNA degradation and transcription initiation. It may be a regulatory platform that senses nutrient levels and stress PUBMED:12957374. Caf1p and Ccr4p are directly involved in mRNA deadenylation, and Caf1p is associated with Dhh1p, a putative RNA helicase thought to be a component of the decapping complex PUBMED:11696541. Pop2, a component of the Ccr4-Not complex, functions as a deadenylase PUBMED:18430587.

    \ The Ccr4-Not complex is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID PUBMED:11696541.\ ' '3529' 'IPR000800' '\

    The Notch domain is also called the \'DSL\' domain or the Lin-12/Notch repeat (LNR). The LNR region is present only in Notch related proteins C-terminal to EGF repeats. The lin-12/Notch proteins act as transmembrane receptors for intercellular signals that specify cell fates during animal development. In response to a ligand, proteolytic cleavages release the intracellular domain of\ Notch, which then gains access to the nucleus and acts as a transcriptional co-activator PUBMED:3119223. The LNR region is thought to negatively regulate the activity of the Lin-12/Notch proteins. It is a triplication of a module of around 35-40 amino acids present on the extracellular part of the protein PUBMED:7697721, PUBMED:8139658. Each module contains six cysteine residues engaged in three disulphide bonds and three conserved aspartate\ and asparagine residues PUBMED:3119223. The biochemical characterisation of a recombinantly expressed LIN-12.1 module from the human Notch1 receptor indicates that the disulphide bonds are formed between the first\ and fifth, second and fourth, and third and sixth cysteines. The formation of this particular disulphide isomer is favored by the presence of Ca2+, which is also required to maintain the structural integrity of the rLIN-12.1 module. The conserved aspartate and asparagine residues are likely to be important for\ Ca2+ binding, and thereby contribute to the native fold.

    \ ' '3530' 'IPR004249' '\ The RPT2 protein is a signal transducer of the phototropic response in Arabidopsis thaliana. The RPT2 gene is light inducible and encodes a novel protein with putative phosphorylation sites, a nuclear localization signal, a BTB/POZ domain (), and a coiled-coil domain. RPT2 belongs to a large gene family that includes the recently isolated NPH3 gene PUBMED:10662859. The NPH3 protein is an NPH1 photoreceptor-interacting protein that is essential for phototropism.\ Phototropism of A. thaliana seedlings in response to a blue light source is initiated by nonphototropic hypocotyl 1 (NPH1), a light-activated serine-threonine protein kinase PUBMED:10542152. NPH3 is a member of\ a large protein family, apparently specific to higher plants, and may function as an adapter or scaffold protein to bring\ together the enzymatic components of an NPH1-activated phosphorelay PUBMED:10542152. Many of the proteins in this group also contain the BTB/POZ domain () at the N-terminal.\ ' '3531' 'IPR004338' '\ This family of proteins describes the Nqr2 (NqrB) subunit of the bacterial 6-subunit sodium-translocating NADH-ubiquinone oxidoreductase (i.e. a respiration linked sodium pump). In Vibrio cholerae, it negatively regulates the expression of virulence factors through inhibiting (by an unknown mechanism) the transcription of the transcriptional activator ToxT PUBMED:10077658. The family also includes RnfD, which is involved in nitrogen fixation. The similarity of RnfD to NADH-ubiquinone oxidoreductases was previously noted PUBMED:9154934.\ ' '3532' 'IPR001046' '\

    The natural resistance-associated macrophage protein (NRAMP) family consists of Nramp1, Nramp2, and yeast proteins Smf1 and Smf2. The NRAMP family is a novel family of functionally related proteins \ defined by a conserved hydrophobic core of ten transmembrane domains PUBMED:7479731. Nramp1 is an integral membrane protein expressed exclusively in cells of \ the immune system and is recruited to the membrane of a phagosome upon \ phagocytosis. Nramp2 is a multiple divalent cation transporter for Fe2+, Mn2+ and Zn2+\ amongst others. It is expressed at high levels in the intestine and is the \ major transferrin-independent iron uptake system in mammals PUBMED:9719491. The yeast proteins Smf1 and Smf2 may also transport divalent cations PUBMED:9632246.

    \ \

    The natural resistance of mice to infection with intracellular parasites is\ controlled by the Bcg locus, which modulates the cytostatic/cytocidal\ activity of phagocytes. Nramp1, the gene responsible, is expressed exclusively in\ macrophages and polymorphonuclear leukocytes, and encodes a polypeptide\ (natural resistance-associated macrophage protein) with features typical of integral\ membrane proteins. Other transporter proteins from a variety of sources also belong\ to this family.

    \ ' '3533' 'IPR005554' '\ Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus PUBMED:11895476. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript PUBMED:11895476.\ ' '3534' 'IPR005614' '\

    NrfD is an integral transmembrane protein with loops in both the periplasm and the cytoplasm. NrfD is thought to participate in the transfer of electrons from the quinone pool into the terminal components of the Nrf pathway PUBMED:8057835.

    \ ' '3535' 'IPR003873' '\

    This is a family of small nonstructural proteins, well conserved among Coronavirus strains. This protein is also found in Murine hepatitis virus as small envelope protein E.

    \ ' '3536' 'IPR000744' '\

    Regulated exocytosis of neurotransmitters and hormones, as well as intracellular traffic, requires fusion of two lipid bilayers. SNARE proteins are thought to form a protein bridge, the SNARE complex, between an incoming vesicle and the acceptor compartment. SNARE proteins contribute to the specificity of membrane fusion, implying that the mechanisms by which SNAREs are targeted to subcellular compartments are important for specific docking and fusion of vesicles. This mechanism involves a family of conserved proteins, members of which appear to function at all sites of constitutive and regulated secretion in eukaryotes PUBMED:7846761. Among them are 2 types of cytosolic protein, NSF (N-ethyl-maleimide-sensitive protein) and the SNAPs (alpha-, beta- and gamma-soluble NSF attachment proteins). The yeast vesicular fusion protein, sec17, a cytoplasmic peripheral membrane protein involved in vesicular transport between the\ endoplasmic reticulum and the golgi apparatus, shows a high degree of sequence similarity to the alpha-SNAP family.

    \

    SNAP-25 and its non-neuronal homologue Syndet/SNAP-23 are synthesized as soluble proteins in the cytosol. Both SNAP-25 and Syndet/SNAP-23 are palmitoylated at cysteine residues clustered in a loop\ between two N- and C-terminal coils, and palmitoylation is essential for membrane binding and plasma membrane targeting. The C-terminal and the N-terminal helices of SNAP-25 are each targeted to the plasma membrane by two distinct cysteine-rich domains and appear to regulate the availability of SNAP to form complexes with SNARE PUBMED:12140265.

    \ ' '3537' 'IPR007758' '\

    The NSP1-like protein appears to be an essential component of the nuclear pore complex; for example, preribosome nuclear export requires the Nup82p-Nup159p-Nsp1p complex. The C-terminal of Nsp1 is involved in binding Nup82 PUBMED:11689687, probably via coiled-coil formation PUBMED:11689687, PUBMED:17037504. The family is related to the rotavirus nonstructural protein NSP1, which is the least conserved protein in the rotavirus genome. Its function in the replication process is not fully understood.

    \ ' '3538' 'IPR004850' '\

    Agrin is a multidomain heparan sulphate proteoglycan that is a key organiser for the induction of postsynaptic specializations at the\ neuromuscular junction. Binding of agrin to basement membranes requires the amino terminal (NtA) domain PUBMED:9321698. This region mediates\ high affinity interaction with the coiled-coil domain of laminins. The binding of agrin to laminins via the NtA domain is subject to\ tissue-specific regulation. The NtA domain-containing form of agrin is expressed in non-neuronal cells or in neurons that project to\ non-neuronal cells, such as motor neurons. The NtA domain forms the most N-terminal part, followed by 9 Kazal-like domains and 2 LE domains. The C-terminal part consists of a SEA domain, 4 EGF-like domains and 3 Laminin G domains, responsible for the clustering of acetylcholine receptors PUBMED:11473262.

    \

    \ Tertiary structures show that the NtA domain folds as a beta-barrel core flanked by N- and C-terminal helical regions. The core of the domain consists of 5 beta-strands that form 2 beta-sheets. The structure belongs to the OB fold family and shows similarity with the protease inhibition domain of TIMP-1, suggesting alternative functions for agrin in addition to synaptogenic activity PUBMED:11473262. Residues Leu 117 and Val 124 in helix 3 of the NtA domain are essential for binding to the laminin gamma1 chain PUBMED:12554653.

    \ ' '3539' 'IPR002075' '\

    Ran () is an evolutionarily conserved member of the Ras superfamily of small GTPases that regulates all receptor-mediated transport between the\ nucleus and the cytoplasm. Import\ receptors bind their cargos in the cytoplasm where the concentration of RanGTP is low and release their cargos in the\ nucleus where the concentration of RanGTP is high PUBMED:12019565. Export receptors respond to RanGTP in the opposite\ manner.

    Nuclear transport factor 2 (NTF2) is a homodimer of approximately 14kDa subunits which stimulates efficient nuclear import\ of a cargo protein. NTF2 binds to both RanGDP and FxFG repeat-containing nucleoporins. NTF2 binds to RanGDP\ sufficiently strongly for the complex to remain intact during transport through NPCs,\ but the interaction between NTF2 and FxFG nucleoporins is much more transient,\ which would enable NTF2 to move through the NPC by hopping from one repeat to\ another PUBMED:11129791, PUBMED:10930458.

    NTF2 folds into a cone with a deep hydrophobic cavity, the opening of which is surrounded by several negatively charged residues. RanGDP binds to NTF2 by inserting a conserved phenylalanine residue into the hydrophobic pocket of NTF2 and making electrostatic interactions with the conserved negatively charged residues that surround the cavity.

    \

    A structurally similar domain appears in other nuclear import proteins.

    \ ' '3540' 'IPR005835' '\

    Nucleotidyl transferases transfer nucleotides from one compound to another. This domain is found in a number of enzymes that transfer nucleotides onto phosphosugars.

    \ ' '3541' 'IPR007710' '\

    Nucleoside 2-deoxyribosyltransferase () catalyses the cleavage of the glycosidic bonds of 2-deoxyribonucleosides. Nucleoside 2-deoxyribosyltransferases can be divided into two groups based on their substrate specificity: class I enzymes are specific for the transfer of deoxyribose between two purines, while class II enzymes will transfer the deoxyribose between either purines or pyrimidines. The structures of the class I PUBMED:14992575 and class II PUBMED:8805514 enzymes are very similar. In class I enzymes, the purine base shields the active site from solvent, which the smaller pyrimidine base cannot do, while in class II enzymes the active site is shielded by a loop (residues 48-62). Both classes of enzymes are found in various Lactobacillus species and participate in nucleoside recycling in these microorganisms. This entry represents both classes of enzymes.

    \ ' '3542' 'IPR004740' '\

    This family of proteins transports nucleosides with high affinity. The transport mechanism is driven by proton motive force. This family includes nucleoside permease (NupG) and xanthosine permease (XapB) from Escherichia coli.

    \ ' '3543' 'IPR003423' '\ The OEP family (Outer membrane efflux protein) form trimeric channels that allow export of a variety of substrates in Gram negative bacteria. Each member of this family is composed of two repeats. The trimeric channel is composed of a 12\ stranded all beta sheet barrel that spans the outer membrane, and a long all helical barrel that spans the periplasm. Examples include the Escherichia coli TolC outer membrane protein, which is required for proper expression of outer membrane protein genes; the Rhizobium nodulation protein; and the Pseudomonas FusA protein, which is involved in resistance to fusaric acid.\ ' '3544' 'IPR003154' '\ This family contains both S1 and P1 nucleases () which cleave RNA and single stranded DNA with no base specificity. \ ' '3545' 'IPR004301' '\ Nucleoplasmins are also known as chromatin decondensation proteins. They bind to core histones and transfer DNA to them in a reaction that requires ATP. This is thought to play a role in the assembly of regular\ nucleosomal arrays.\ ' '3546' 'IPR004870' '\

    This is a family of nucleoporin proteins (Nups). Nucleoporins are the main components of the nuclear pore complex in eukaryotic cells, and mediate bidirectional nucleocytoplasmic transport, especially of mRNA and proteins. Two subsets of nucleoporins that contain peptide repeats have been identified: one is characterised by the FG (Phe-Gly) repeat; the other, which is included in this family, contains WD (Trp-Asp) repeats. WD repeat Nups (Nup37, Nup43, Seh1, ALADIN, RAE, and Sec13) are thought to be involved in the assembly of structural domains of the nuclear pore complex PUBMED:14517296.

    \


    \ \ ' '3548' 'IPR002668' '\

    This entry contains nucleoside transport proteins. S282_RAT is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane PUBMED:8027026. S281_RAT is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC PUBMED:7775409.

    \ ' '3549' 'IPR005549' '\

    Members of this family are components of the mitotic spindle. It has been shown that Nuf2 from yeast is part of a complex called the Ndc80p complex PUBMED:11266451. This complex is thought to bind to the microtubules of the spindle. An Arabidopsis protein that has not previously been identified as a member has been included in this family. The match is not strong but, in common with other members of this family, the protein contains a coiled-coil C-terminal to this region.

    \ ' '3551' 'IPR007252' '\

    Nup84p forms a complex with five proteins, including Nup120p, Nup85p, Sec13p, and a Sec13p homolog. This Nup84p complex in conjunction with Sec13-type proteins is required for correct nuclear pore biogenesis PUBMED:8565072.

    \ ' '3552' 'IPR006027' '\

    This domain is found in a number of functionally different proteins:

    \ \ \

    NusB is a prokaryotic transcription factor involved in antitermination processes, during which it interacts with the boxA portion of the mRNA nut site. Previous studies have shown that NusB exhibits an all-helical fold, and that the protein from Escherichia coli forms monomers, while Mycobacterium tuberculosis NusB is a dimer. The functional significance of NusB dimerization is unknown. \ \ An N-terminal arginine-rich sequence is the probable RNA binding site, exhibiting aromatic residues as potential stacking partners for the RNA bases. The RNA binding region is hidden in the subunit interface of dimeric NusB proteins, such as NusB from M. tuberculosis, suggesting that such dimers have to undergo a considerable conformational change or dissociate for engagement with RNA. In certain organisms, dimerization may be employed to package NusB in an inactive form until recruitment into antitermination complexes PUBMED:9670024, PUBMED:15279620.

    \ \

    The antitermination proteins of E. coli are recruited in the replication cycle of\ Bacteriophage lambda, where they play an important role in switching from the\ lysogenic to the lytic cycle.

    \ ' '3553' 'IPR006645' '\

    This sequence is identified by the NGN domain and is represented by the bacterial antitermination protein NusG.

    \

    This protein influences transcription termination and anti-termination and acts as a component of the transcription complex. In addition to this, it interacts with the termination factor Rho and RNA polymerase PUBMED:15162485, PUBMED:12198166.

    \ ' '3554' 'IPR005899' '\

    This family comprises distantly related, low complexity, hydrophobic small\ subunits of several related sodium ion-pumping decarboxylases. These include\ oxaloacetate decarboxylase gamma subunit and methylmalonyl-CoA decarboxylase delta subunit PUBMED:11248185.

    \ ' '3555' 'IPR005661' '\

    Members of this family are integral membrane proteins. The decarboxylation reactions they catalyse are coupled to the vectorial transport of Na+ across the cytoplasmic membrane, thereby creating a sodium ion motive force that is used for ATP synthesis PUBMED:.

    \ ' '3556' 'IPR007340' '\ This family includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonization, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation PUBMED:.\ ' '3558' 'IPR005038' '\ This octapeptide repeat is found in several bacterial proteins. The function of this repeat is unknown.\ ' '3559' 'IPR003421' '\

    This group of enzymes acts on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. This domain is found primarily in octopine dehydrogenase (), nopaline dehydrogenase (), and lysopine dehydrogenase (). NADPH is the preferred cofactor, but NADH is also used. Octopine dehydrogenase is involved in the reductive condensation of arginine and pyruvic acid to D-octopine PUBMED:9665174.

    \

    Opine dehydrogenases can be found in both bacteria and marine cephalopods. In bacteria, some of these opine dehydrogenases are involved in crown gall tumours that are produced by Agrobacterium spp., which encode the opine dehydrogenases on a Ti-plasmid. These bacteria can transfer a portion of this plasmid (T-DNA) to a susceptible plant cell; the T-DNA then integrates into the plant nuclear genome, where its genes can be expressed. Some of these genes direct the synthesis and secretion of unusual amino acid and sugar derivatives called opines; these opines are used as a carbon and sometimes a nitrogen source by the infecting bacteria.

    \

    Opine dehydrogenases are also found in the marine invertebrate cephalopods (octopuses, squid, and cuttlefish). For example, octopine dehydrogenase activity in mantle muscle is significantly correlated with a species\' ability to buffer the acidic end products of anaerobic metabolism, with activity declining strongly with a species\' habitat depth PUBMED:10786948.

    \ ' '3561' 'IPR001414' '\ Ocular albinism type 1 (OA1) is an X-linked disorder characterised by severe\ impairment of visual acuity, retinal hypopigmentation and the presence of\ macromelanosomes. A novel transcript from the OA1\ critical region is expressed in high levels in RNA samples from\ retina and from melanoma and encodes a potential integral membrane\ protein PUBMED:7647783. This protein is of unknown function but is known to bind heterotrimeric G proteins.\ ' '3562' 'IPR002993' '\ Ornithine decarboxylase antizyme (ODC-AZ) PUBMED:7813017 binds to, and destabilises, ornithine decarboxylase (ODC), a key enzyme in polyamine synthesis. ODC is then rapidly degraded. The expression of ODC-AZ requires programmed, ribosomal frameshifting which is modulated according to the cellular concentration of polyamines. High levels of polyamines induce a +1 ribosomal frameshift in the translation of mRNA for the antizyme leading to the expression of a full-length protein. At least two forms of ODC-AZ exist in mammals PUBMED:9782076 and the protein has been found in Drosophila (protein Gutfeeling).\ ' '3563' 'IPR003462' '\

    This family contains the bacterial ornithine cyclodeaminase enzyme, which catalyses the deamination of ornithine to proline PUBMED:2644238. This family also contains mu-crystallin, the major component of the eye lens in several Australian marsupials; mRNA for this protein has also been found in human retina PUBMED:1384048.

    \ ' '3564' 'IPR001292' '\

    The oestrogen receptors (ERs) are steroid or nuclear hormone receptors that act as transcription regulators involved in diverse physiological functions. Oestrogen receptors function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. The ER consists of three functional and structural domains: an N-terminal modulatory domain, a highly conserved DNA-binding domain that recognises specific sequences, and a C-terminal ligand-binding domain.

    \

    The N-terminal modulatory domain spans the first 180 residues and contains the activation function 1 (AF1) region. Nuclear receptors differ considerably with respect to AF1 activity and regulation, as it is a poorly conserved region PUBMED:15831449. There is another activation function region, namely AF2, which resides in the C-terminal end of the ligand-binding domain. Transcription activation is facilitated by both AF1 and AF2, which appear to act synergistically in the ER complex PUBMED:15728727, PUBMED:14612550. For example, the ER can recruit TIF2 (transcription intermediary factor 2) via the AF1 and AF2 regions, whose synergistic action results in the activation of transcription.

    \

    This entry represents the AF1-containing modulatory domain found at the N-terminus in oestrogen alpha-type receptors.

    \ \ \ ' '3565' 'IPR006757' '\ Opioid peptides act as growth factors in neural and non-neural cells and tissues, in addition to serving in neurotransmission/neuromodulation in the nervous system. The opioid growth factor receptor is an integral membrane protein associated with the nucleus. This conserved region is situated at the N-terminus of the member proteins with a series of imperfect repeats lying immediately to its C-terminal PUBMED:11890982.\ ' '3566' 'IPR006770' '\

    Opioid peptides act as growth factors in neural and non-neural cells and tissues, in addition to serving in\ neurotransmission/neuromodulation in the nervous system. The native opioid growth factor (OGF), [Met(5)]-enkephalin, is an\ inhibitory peptide that plays a role in cell proliferation and tissue organization during development, cancer, cellular renewal, wound\ healing, and angiogenesis. OGF action is mediated by a receptor mechanism; the receptor for OGF (OGFr) is an\ integral membrane protein associated with the nucleus.

    \ \ \

    OGFr is distinguished by containing a series of imperfect repeats. This entry describes a proline-rich repeat found in a human opioid growth factor receptor PUBMED:11890982.

    \ ' '3567' 'IPR007684' '\ This is a viral family of phage zinc-binding transcriptional activators, which also contains cryptic members in some bacterial genomes PUBMED:1597424. The P4 phage delta protein contains two such domains attached covalently, while the P2 phage Ogr proteins possess one domain but function as dimers. All the members of this family have the following consensus sequence: C-X(2)-C-X(3)-A-(X)2-R-X(15)-C-X(4)-C-X(3)-F PUBMED:9143285.\ ' '3568' 'IPR000310' '\ Pyridoxal-dependent decarboxylases are bacterial proteins acting on ornithine, lysine, arginine and related substrates PUBMED:8181483.\ One of the regions of sequence similarity contains a conserved lysine residue, which is the site of attachment of the pyridoxal-phosphate group.\ ' '3569' 'IPR008286' '\ Pyridoxal-dependent decarboxylases are bacterial proteins acting on ornithine, lysine, arginine and related substrates PUBMED:8181483.\ One of the regions of sequence similarity contains a conserved lysine residue, which is the site of attachment of the pyridoxal-phosphate group.\ ' '3570' 'IPR005308' '\

    This domain has a flavodoxin-like fold, and is termed the "wing" domain because of its position in the overall 3D structure. Ornithine decarboxylase from Lactobacillus 30a (L30a OrnDC) is representative of the large, pyridoxal-5\'-phosphate-dependent\ decarboxylases that act on lysine, arginine or ornithine. The crystal structure of the L30a OrnDC has been solved to 3.0 A resolution. Six dimers related by C6 symmetry compose the enzymatically active\ dodecamer (approximately 10^6 Da). Each monomer of L30a OrnDC can be described in terms of five sequential folding domains.\ The amino-terminal domain, residues 1 to 107, consists of a five-stranded beta-sheet termed the "wing" domain. Two wing domains of\ each dimer project inward towards the centre of the dodecamer and contribute to dodecamer stabilisation PUBMED:7563080.

    \ ' '3571' 'IPR000136' '\ Oleosins PUBMED:1989697 are the proteinaceous components of plants\' lipid storage bodies\ called oil bodies. Oil bodies are small droplets (0.2 to 1.5 mu-m in diameter)\ containing mostly triacylglycerol that are surrounded by a phospholipid/\ oleosin annulus. Oleosins may have a structural role in stabilising the lipid\ body during desiccation of the seed, by preventing coalescence of the oil.\ They may also provide recognition signals for specific lipase anchorage in\ lipolysis during seedling growth. Oleosins are found in the monolayer lipid/\ water interface of oil bodies and probably interact with both the lipid and\ phospholipid moieties.\ Oleosins are proteins of 16 kDa to 24 kDa and are composed of three domains: an\ N-terminal hydrophilic region of variable length (from 30 to 60 residues); a\ central hydrophobic domain of about 70 residues; and a C-terminal amphipathic\ region of variable length (from 60 to 100 residues). The central hydrophobic\ domain is proposed to be made up of beta-strand structure and to interact with\ the lipids PUBMED:1639802. It is the only domain whose sequence\ is conserved.\ ' '3572' 'IPR003112' '\

    The olfactomedin-domain was first identified in olfactomedin, an extracellular matrix protein of the olfactory neuroepithelium PUBMED:12615070. Members of this extracellular domain-family have since been shown to be present in several metazoan proteins, such as latrophilins, myocilins, optimedins and noelins, the latter being involved in the generation of neural crest cells. Myocilin is of considerable interest, as mutations in its olfactomedin-domain can lead to glaucoma PUBMED:14764620. The olfactomedin-domains in myocilin and optimedin are essential for the interaction between these two proteins PUBMED:12019210.

    \ ' '3573' 'IPR006665' '\

    This entry represents a domain with a beta/alpha/beta/alpha-beta(2) structure found in the C-terminal region of many Gram-negative bacterial outer membrane proteins PUBMED:1538702, including porin-like integral membrane proteins (such as ompA) PUBMED:2202726, small lipid-anchored proteins (such as pal) PUBMED:10515919, and MotB proton channels PUBMED:17052729. The N-terminal half is variable, although some of the proteins in this group have the OmpA-like transmembrane domain at the N terminus. OmpA from Escherichia coli is required for pathogenesis, and can interact with host receptor molecules PUBMED:17368067. MotB (and MotA) serves two functions in E. coli: the MotA(4)-MotB(2) complex attaches to the cell wall via MotB to form the stator of the flagellar motor, and the MotA-MotB complex couples the flow of ions across the cell membrane to movement of the rotor PUBMED:17052729.

    \ ' '3574' 'IPR000498' '\

    The OmpA-like transmembrane domain is present in a number of different outer membrane proteins of several Gram-negative bacteria. Many of the proteins having this domain at the N terminus also have the conserved bacterial outer membrane protein domain at the C terminus. The outer membrane protein A of Escherichia coli (OmpA) is one of the most studied proteins in this group PUBMED:10554771. It is multifunctional: OmpA is required for the action of colicins K and L and for the stabilisation of mating aggregates in conjugation. It also serves as a receptor for a number of T-even like phages and can act as a porin with low permeability that allows slow penetration of small solutes PUBMED:1974149.

    \ \

    OmpA consists of a regular, extended eight-stranded beta-barrel and appears to be constructed like an inverse micelle with large water-filled cavities, but does not form a pore. The cavities seem to be highly conserved during evolution. The structure corroborates the concept that all outer membrane proteins consist of beta-barrels PUBMED:9808047. The beta-barrel membrane anchor appears to be the outer membrane equivalent of the single-chain alpha-helix anchor of the inner membrane.

    \ ' '3575' 'IPR005632' '\

    This family contains proteins annotated as OmpH (outer membrane protein H). OmpH is a major structural protein of the outer membrane. In Pasteurella multocida it acts as a channel-forming transmembrane porin PUBMED:9401047. Porins act as molecular sieves to allow the diffusion of small hydrophilic solutes through the outer membrane and also act as receptors for bacteriophages and bacteriocins. Porins are highly immunogenic and are conserved in bacterial families, making them attractive vaccine candidates PUBMED:10067687.

    \

    The 17-kDa protein (Skp, OmpH) of Escherichia coli is a homotrimeric periplasmic chaperone for newly synthesised outer-membrane proteins, the X-ray structure of which has been reported at resolutions of 2.35 A and 2.30 A PUBMED:15361861, PUBMED:15304217. Three hairpin-shaped alpha-helical extensions reach out by approximately 60 A from a trimerisation domain, which is composed of three intersubunit beta-sheets that wind around a central axis. The alpha-helical extensions approach each other at their distal turns, resulting in a fold that resembles a \'three-pronged grasping forcep\'. The overall shape of Skp is reminiscent of the cytosolic chaperone prefoldin, although it is based on a radically different topology. The peculiar architecture, with apparent plasticity of the prongs and distinct electrostatic and hydrophobic surface properties, supports the recently proposed biochemical mechanism of this chaperone: formation of a Skp(3)-Omp complex protects the outer membrane protein from aggregation during passage through the bacterial periplasm.

    \ \

    The ability of Skp to prevent the aggregation of model substrates in vitro is independent of ATP. Skp can interact directly with membrane lipids and lipopolysaccharide. These interactions are needed for efficient Skp-assisted folding of membrane proteins PUBMED:15304217.

    \ ' '3576' 'IPR000036' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of aspartic peptidases belongs to the MEROPS family A26 (clan AF). The omptin family comprises a number of novel outer membrane-associated\ serine proteases that are distinct from trypsin-like proteases in that \ they cleave polypeptides between two basic amino acids PUBMED:3056908. The enzyme is sensitive to the serine protease inhibitor diisopropylfluoro-phosphate and to divalent cations such as Cu2+, Zn2+ and Fe2+ PUBMED:3056908, and is\ temperature regulated, activity decreasing at lower temperatures PUBMED:3056908, PUBMED:8288530. Temperature regulation is most prominently shown in the Yersinia pestis\ coagulase/fibrinolysin protein, where coagulase activity is prevalent \ below 30 degrees Celsius, and fibrinolysin (protease) activity is prevalent\ above this point, the optimum temperature being 37 degrees Celsius PUBMED:2526282. It is possible that this assists in \'flea blockage\' and transmission of the bacteria to animals PUBMED:2526282.

    \ \

    The Escherichia coli OmpT has previously been classified as a serine protease with Ser(99) and His(212) as active site residues. The X-ray structure of the enzyme is inconsistent with this classification, and the involvement of a nucleophilic water molecule that is activated by the Asp(210)/His(212) catalytic dyad classifies this as an aspartic endopeptidase, where activity is also strongly dependent on Asp(83) and Asp(85). Both of these residues may function in binding of the water molecule and/or oxyanion stabilisation. The proposed mechanism implies a novel proteolytic catalytic site PUBMED:11576541, PUBMED:11566868.

    \ ' '3577' 'IPR005618' '\ This family includes outer membrane protein W (OmpW) proteins from a variety of bacterial species. This protein may form the receptor for S4 colicins in Escherichia coli PUBMED:10348872.\ ' '3578' 'IPR000839' '\

    The outer membrane-spanning (Oms) proteins of Borrelia burgdorferi have been\ isolated and their porin activities characterised; 0.6-nS porin activity\ was found to reside in a 28kDa protein, designated Oms28 PUBMED:8759855. The gene\ sequence of oms28 was found to encode a 257-amino-acid precursor protein\ with a putative 24-amino-acid leader peptidase I signal sequence PUBMED:8759855. The\ Oms28 protein partly fractionated to the outer membrane, and was\ characterised by an average single-channel conductance of 1.1 nS in a\ planar lipid bilayer assay, confirming Oms28 to be a porin PUBMED:8759855.

    \ ' '3579' 'IPR003394' '\ Pathogenic Neisseria spp. possess a repertoire of phase-variable opacity proteins that mediate various pathogen/host cell interactions PUBMED:10036728. These proteins are integral membrane proteins related to other porins and the Haemophilus influenzae OpA protein.\ ' '3580' 'IPR006024' '\

    Vertebrate endogenous opioid neuropeptides are released by post-translational proteolytic cleavage of precursor proteins. The precursors consist of the following components: a signal sequence that precedes a conserved region of about 50 residues; a variable-length region; and the sequence of the\ neuropeptide itself. Three types of precursor are known: preproenkephalin A \ (gene PENK), which is processed to produce 6 copies of Met-enkephalin, plus \ Leu-enkephalin; preproenkephalin B (gene PDYN), which is processed to\ produce neoendorphin, dynorphin, leumorphin, rimorphin and Leu-enkephalin; \ and prepronociceptin (gene PNOC), whose processing produces nociceptin\ (orphanin FQ) and two other potential neuropeptides.

    \

    Sequence analysis reveals that the conserved N-terminal region of the\ precursors contains 6 cysteines, which are probably involved in disulphide\ bond formation. It is speculated that this region might be important for \ neuropeptide processing PUBMED:8710928.

    \ ' '3581' 'IPR007049' '\

    The carbohydrate-selective porin OprB family includes the Pseudomonas aeruginosa porin B, a substrate-selective channel for a variety of different sugars. This protein may facilitate diffusion of a variety of diverse compounds, but is probably restricted to carbohydrates, and does facilitate glucose diffusion across the outer membrane.

    \ \ ' '3582' 'IPR005318' '\

    This family contains bacterial outer membrane porins with serine protease activity PUBMED:9636669. The serine peptidase domain belongs to MEROPS peptidase family S43 (clan PA(S)).

    \ \

    However, many of these proteins are not peptidases and are classified as non-peptidase homologues, as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases in the S43 family. The putative role of these proteins could be to bind ligands and to facilitate their diffusion through the outer membrane.

    \ ' '3583' 'IPR004813' '\ The OPT family comprises transporters of small oligopeptides, demonstrated\ experimentally in three different species of yeast. OPT1 is not a member of the ABC or PTR membrane transport families PUBMED:9043116.\ ' '3584' 'IPR007210' '\ This domain is a part of a high affinity multicomponent binding-protein-dependent transport system involved in bacterial osmoregulation. This domain is often fused to the permease component of the transporter complex. It is often found in integral membrane proteins or proteins predicted to be attached to the membrane by a lipid anchor. Glycine betaine is involved in protection from high osmolarity environments, for example in Bacillus subtilis PUBMED:7622480. OpuBC is closely related and involved in choline transport. Choline is necessary for the biosynthesis of glycine betaine PUBMED:10216873. L-carnitine is important for osmoregulation in Listeria monocytogenes. This domain is found also in proteins binding l-proline (ProX), histidine (HisX) and taurine (TauA).\ ' '3585' 'IPR002630' '\ This family consists of orbivirus non-structural protein NS1, or hydrophobic tubular protein. NS1 has no specific function in virus replication; it is, however, thought to play a role in transport of mature virus particles from virus inclusion bodies to the cell membrane PUBMED:9152425. Orbiviruses are part of the larger Reoviridae, which have a dsRNA genome of at least 10 segments encoding at least 10 viral proteins PUBMED:9152425; orbiviruses found in this family include Bluetongue virus and African horsesickness virus.\ ' '3586' 'IPR002565' '\ This is a family of Orbivirus non-structural proteins of unknown function, which may play a role in release of the\ virus from infected cells PUBMED:1654377.\ ' '3587' 'IPR001742' '\

    This family contains the outer capsid, VP2 proteins from the orbiviruses; these are dsRNA viruses belonging to the Reoviridae. VP2 acts as an anchor for VP1 and VP3 and contains a non-specific DNA and RNA binding domain in the N-terminus PUBMED:9311813, PUBMED:9281498.

    \ ' '3588' 'IPR002614' '\

    This entry represents the inner layer core protein VP3 from Orbiviruses, a genus within the Reoviridae, a family of viruses that have dsRNA genomes of 10-12 linear segments PUBMED:9774103. Orbiviruses include Broadhaven virus (BRD), Epizootic hemorrhagic disease virus and Bluetongue virus (BTV) PUBMED:1328474. The Orbivirus VP3 protein is part of the virus core and makes a \'subcore\' shell made up of 120 copies of the 100K protein PUBMED:9774103. VP3 particles can also bind RNA and are fundamental in the early stages of viral core formation PUBMED:9774103. The structural core protein VP2 from BRD is similar to VP3 from BTV.

    \ ' '3589' 'IPR007753' '\ Orbiviruses are double-stranded RNA viruses, of which the Bluetongue virus (BTV) is a member. The core of BTV is a multienzyme complex composed of two major proteins (VP7 and VP3) and three minor proteins (VP1, VP4 and VP6) in addition to the viral genome. VP4 has been shown to perform all RNA capping activities and has both methyltransferase type 1 and type 2 activities associated with it PUBMED:9811835.\ ' '3590' 'IPR000145' '\

    The orbivirus VP5 protein is one of the two proteins (with VP2) which make up the virus particle outer capsid. Cryoelectron microscopy indicates that VP5 is a trimer, suggesting that there are 360 copies of VP5 per virion PUBMED:9281498.

    \ ' '3591' 'IPR001399' '\

    Bluetongue virus VP6 protein binds ATP and exhibits an\ RNA-dependent ATPase function and a helicase activity that\ catalyses the unwinding of double-stranded RNA substrates PUBMED:9311795. VP6 from five United States\ prototype bluetongue virus (BTV) serotypes contains unusually high concentrations of glycine, \ few aromatic amino acids, and a high concentration of charged amino acids,\ a characteristic of hydrophilic proteins PUBMED:1329371.

    \ \

    VP6 is an inner capsid protein that surrounds the genomic dsRNA. Its\ hydrophilic nature, coupled with a capability to bind ss- and ds-RNA,\ suggests that it interacts directly with the BTV genomic RNA.

    \ ' '3592' 'IPR001803' '\

    Bluetongue virus is a representative of the Orbivirus genus of the Reoviridae PUBMED:7816101. Orbiviruses infect mammalian hosts through insect vectors, causing economically-important diseases of domesticated animals PUBMED:7816101. They possess a segmented, double-stranded RNA genome within a capsid that comprises four major polypeptides, designated VP2, VP3, VP5 and VP7. On entering a target cell, an outer layer, formed from VP2 and VP5, is removed, leaving an intact core within the cell PUBMED:7816101. The core, which is 70nm across, contains 780 copies of VP7, which together form 260 trimeric \'bristly\' capsomeres clothing an inner scaffold constructed from VP3 PUBMED:7816101.

    \ \

    The 3D structure of VP7 reveals two domains, one a beta-sandwich, the other a bundle of alpha-helices, and a short C-terminal arm, which is thought to unite trimers during capsid formation PUBMED:7816101. A concentration of methionine residues at the core of the molecule could provide plasticity, relieving structural mismatches during assembly PUBMED:7816101.

    \

    The 3D structure of baculovirus-expressed core protein VP7 of African horse sickness virus 4 (AHSV-4) has been determined to 2.3 A resolution PUBMED:8648715. During crystallisation, the two-domain protein is cleaved, leaving only the top domain, in a manner reminiscent of BTV VP7; this suggests that connections between top and bottom domains are relatively weak for these two distinct orbiviruses PUBMED:8648715. The top domains of both BTV and AHSV VP7 are trimeric and structurally very similar. Electron density maps indicate an extra density feature along their molecular 3-fold axes, probably the result of an unidentified ion PUBMED:8648715. The characteristics of the molecular surface indicate that, for both of these orbiviruses, attachment to the cell may occur via binding of an Arg-Gly-Asp (RGD) motif in the top domain of VP7 to a cellular integrin PUBMED:8648715.

    \ ' '3593' 'IPR001704' '\

    Orexins (also known as hypocretins) are recently identified neuropeptides that are specifically localised to the hypothalamus. They are thought to interact with autonomic, neuroendocrine and neuroregulatory systems, and play an important role in the regulation of feeding behaviour PUBMED:9892705, PUBMED:9419374. When applied to hypothalamic neurones, these peptides are neuroexcitatory, an action that is probably mediated by their binding to a new family of G-protein-coupled receptors (orexin receptors 1 and 2), which were previously orphan PUBMED:9491897.

    \

    To date, two orexins have been characterised (orexin-A and -B), both encoded by a single mRNA transcript (prepro-orexin): orexin-A is a 33-residue peptide with two intramolecular disulphide bonds in the N-terminal region; and orexin-B is a linear 28-residue peptide. These peptides have 46% identity at the amino acid sequence level, and show some similarity to the glucagon/vasoactive intestinal polypeptide/secretin peptide family.

    \ \ ' '3594' 'IPR004060' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The rhodopsin-like GPCRs themselves represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7\ transmembrane (TM) helices PUBMED:2111655, PUBMED:2830256, PUBMED:8386361.

    \

    The hypothalamus plays a central role in the integrated control of feeding\ and energy homeostasis PUBMED:9491897. A new family of neuropeptides, the orexins, has been identified that bind and activate two closely related (previously) orphan GPCRs PUBMED:9491897, PUBMED:9656726. Orexins stimulate appetite and food consumption PUBMED:9656726. Their genes are expressed bilaterally and symmetrically in the lateral hypothalamus, which has been shown to be the "feeding centre". By contrast, the "satiety centre" is located in the ventromedial hypothalamus and is dominated by the leptin-regulated neuropeptide network.

    \

    Both orexin receptors exhibit a similar pharmacology - the 2 orexin peptides, orexin-A and orexin-B, bind to both receptors and, in each case, agonist binding results in an increase in intracellular calcium levels. However, orexin-B shows a 10-fold selectivity for orexin receptor type 2, whilst orexin-A is equipotent at both receptors PUBMED:10498827.

    \ ' '3595' 'IPR004248' '\

    Borrelia burgdorferi supercoiled plasmids encode multicopy tandem open reading frames called Orf-A, Orf-B, Orf-C and Orf-D. This entry represents the putative Orf-D protein, which has no known function PUBMED:8655511.

    \ ' '3596' 'IPR007289' '\ This short protein has no known function and is found in Jaagsiekte sheep retrovirus. Jaagsiekte sheep retrovirus (JSRV) is the etiological agent of a contagious lung tumour of sheep known as sheep pulmonary adenomatosis. JSRV exhibits a simple genetic organization, characteristic of the type D and type B retroviruses, with the canonical retroviral sequences gag, pro, pol and env encoding the structural proteins of the virion and an additional open reading frame (orf-x), of approximately 500 bp overlapping pol PUBMED:10653922.\ ' '3597' 'IPR007203' '\

    ORMDL1 belongs to a novel gene family comprising three genes in humans (ORMDL1, ORMDL2 and ORMDL3), and homologs in yeast, microsporidia, plants, Drosophila, urochordates and vertebrates. ORMDLs are involved in protein folding in the endoplasmic reticulum.

    \ ' '3598' 'IPR000183' '\ These enzymes are collectively known as group IV decarboxylases PUBMED:8181483.\ Pyridoxal-dependent decarboxylases acting on ornithine, lysine, arginine and\ related substrates can be classified into two different families on the basis\ of sequence similarities PUBMED:3143046, PUBMED:8181483.\ Members of this family, while most probably evolutionarily related, do not share\ extensive regions of sequence similarity. The proteins contain a conserved lysine\ residue which is known, in mouse ODC PUBMED:1730582, to be the site of attachment of the\ pyridoxal-phosphate group. The proteins also contain a stretch of three\ consecutive glycine residues that has been proposed to be part of a substrate-\ binding region PUBMED:2198270.\ ' '3599' 'IPR000183' '\ These enzymes are collectively known as group IV decarboxylases PUBMED:8181483.\ Pyridoxal-dependent decarboxylases acting on ornithine, lysine, arginine and\ related substrates can be classified into two different families on the basis\ of sequence similarities PUBMED:3143046, PUBMED:8181483.\ Members of this family, while most probably evolutionarily related, do not share\ extensive regions of sequence similarity. The proteins contain a conserved lysine\ residue which is known, in mouse ODC PUBMED:1730582, to be the site of attachment of the\ pyridoxal-phosphate group. The proteins also contain a stretch of three\ consecutive glycine residues that has been proposed to be part of a substrate-\ binding region PUBMED:2198270.\ ' '3600' 'IPR002463' '\ Ornatin is a potent glycoprotein IIb-IIIa (GP IIb-IIIa) antagonist and\ platelet aggregation inhibitor PUBMED:1765068. The protein is 41-52 residues in length\ and contains the RGD recognition motif common in adhesion proteins, and\ 6 conserved cysteine residues. The sequences of ornatin isoforms B, C, D \ and E are highly similar, while isoforms A2 and A3 are less similar, lacking\ the N-terminal 9 residues. 
Ornatins share ~40% identity with decorsin,\ a GP IIb-IIIa antagonist isolated from the leech (Macrobdella decora) PUBMED:1765068.\ ' '3601' 'IPR003184' '\ This family of orthopoxvirus secreted proteins (also known as T1 and A41) interact with members of both the CC and CXC superfamilies of chemokines. It has been suggested that these secreted proteins modulate leukocyte influx into virus-infected tissues PUBMED:9123853.\ ' '3602' 'IPR000711' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This family represents subunits called delta in bacterial and chloroplast ATPase, or OSCP (oligomycin sensitivity conferral protein) in mitochondrial ATPase (note that in mitochondria there is a different delta subunit). The OSCP/delta subunit appears to be part of the peripheral stalk that holds the F1 complex alpha(3)beta(3) catalytic core stationary against the torque of the rotating central stalk, and links subunit A of the F0 complex with the F1 complex. In mitochondria, the peripheral stalk consists of OSCP, as well as F0 components F6, B and D. In bacteria and chloroplasts the peripheral stalks have different subunit compositions: delta and two copies of F0 component B (bacteria), or delta and F0 components B and B\' (chloroplasts) PUBMED:11309608, PUBMED:16045926.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '3603' 'IPR003718' '\ Osmotically inducible protein C (OsmC) is a stress-induced protein found in Escherichia coli. The transcription of the osmC gene of E. coli is regulated as a function of the phase of growth and is induced during the late exponential phase when the growth rate slows before entry into stationary phase. The transcription is initiated by two overlapping promoters, osmCp1 and osmCp2 PUBMED:8820643.\

    An organic hydroperoxide detoxification protein (OHR) from Xanthomonas campestris pv. phaseoli is highly induced by organic hydroperoxides, weakly induced by H2O2, and not induced at all by a superoxide generator. Ohr may be a new type of organic hydroperoxide detoxification protein PUBMED:9573147, PUBMED:18084893.

    \ ' '3604' 'IPR004894' '\ This is a family of outer surface proteins from Borrelia. The function of these proteins is unknown.\ ' '3605' 'IPR003483' '\ This is a family of outer surface proteins (Osp) from the Borrelia spp. spirochete PUBMED:8982001. The family includes OspE, OspF, and OspEF-related proteins (Erp) PUBMED:8655548. These proteins are coded for on different circular plasmids in the Borrelia genome.\ ' '3606' 'IPR005653' '\

    The proteins in this family are mostly uncharacterised. However, the family does include Escherichia coli OstA, which has been characterised as an organic solvent tolerance protein PUBMED:7811102.

    \ ' '3607' 'IPR007543' '\

    This family is involved in organic solvent tolerance in bacteria. The region contains several highly conserved, potentially catalytic, residues. ostA is one of a number of genes that confer organic solvent tolerance in Escherichia coli PUBMED:7811102, PUBMED:7765247. This protein has significant medical importance since endoscopes are disinfected by pre-cleaning and soaking them in glutaraldehyde. Tolerant bacteria may, therefore, survive this disinfecting procedure PUBMED:17241305.

    \ ' '3608' 'IPR002038' '\ The major event of endochondral ossification is the proteolytic\ degradation of calcified cartilage and the extracellular matrix, and their\ substitution with bone-specific extracellular matrix produced and organised\ by osteoblasts PUBMED:2033080. One of the most abundant products of osteoblasts is\ osteopontin, a glycosylated phosphoprotein with a high acidic amino acid\ content and one copy of the cell attachment sequence RGD PUBMED:2033080. It is thought\ that osteopontin may act as a bridge between osteoblasts and the apatite\ mineral of the bone PUBMED:2033080. Osteopontin-K is a kidney protein, similar to\ osteopontin and probably also involved in cell adhesion PUBMED:1414488.\ ' '3609' 'IPR006131' '\

    This family contains two related enzymes:\

      \
    1. Aspartate carbamoyltransferase (ATCase) catalyzes the conversion\ of aspartate and carbamoyl phosphate to carbamoylaspartate, the second step\ in the de novo biosynthesis of pyrimidine nucleotides PUBMED:3015959. In prokaryotes\ ATCase consists of two subunits: a catalytic chain (gene pyrB) and a\ regulatory chain (gene pyrI), while in eukaryotes it is a domain in a multifunctional\ enzyme (called URA2 in yeast, rudimentary in Drosophila, and CAD\ in mammals PUBMED:8098212) that also catalyzes other steps of the biosynthesis of\ pyrimidines.
    2. Ornithine carbamoyltransferase (OTCase) catalyzes the conversion\ of ornithine and carbamoyl phosphate to citrulline. In mammals this enzyme\ participates in the urea cycle PUBMED:2662961 and is located in the mitochondrial\ matrix. In prokaryotes and eukaryotic microorganisms it is involved in the\ biosynthesis of arginine. In some bacterial species it is also involved in the\ degradation of arginine PUBMED:3109911 (the arginine deaminase pathway).
    \ It has been shown PUBMED:6379651 that these two enzymes are evolutionarily related. The\ predicted secondary structures of both enzymes are similar and there are some\ regions of sequence similarity. One of these regions includes three\ residues which have been shown, by crystallographic studies PUBMED:6377306, to be\ implicated in binding the phosphoryl group of carbamoyl phosphate; this region is described in a separate entry. The carboxyl-terminal, aspartate/ornithine-binding domain is connected to the amino-terminal\ domain by two alpha-helices, which comprise a hinge between domains PUBMED:10318893.

    \ ' '3610' 'IPR006132' '\

    This entry contains two related enzymes:\

      \
    1. Aspartate carbamoyltransferase (ATCase) catalyzes the conversion\ of aspartate and carbamoyl phosphate to carbamoylaspartate, the second step\ in the de novo biosynthesis of pyrimidine nucleotides PUBMED:3015959. In prokaryotes\ ATCase consists of two subunits: a catalytic chain (gene pyrB) and a\ regulatory chain (gene pyrI), while in eukaryotes it is a domain in a multifunctional\ enzyme (called URA2 in yeast, rudimentary in Drosophila, and CAD\ in mammals PUBMED:8098212) that also catalyzes other steps of the biosynthesis of\ pyrimidines.
    2. Ornithine carbamoyltransferase (OTCase) catalyzes the conversion\ of ornithine and carbamoyl phosphate to citrulline. In mammals this enzyme\ participates in the urea cycle PUBMED:2662961 and is located in the mitochondrial\ matrix. In prokaryotes and eukaryotic microorganisms it is involved in the\ biosynthesis of arginine. In some bacterial species it is also involved in the\ degradation of arginine PUBMED:3109911 (the arginine deaminase pathway).
    \ It has been shown PUBMED:6379651 that these two enzymes are evolutionarily related. The\ predicted secondary structures of both enzymes are similar and there are some\ regions of sequence similarity. One of these regions includes three\ residues which have been shown, by crystallographic studies PUBMED:6377306, to be\ implicated in binding the phosphoryl group of carbamoyl phosphate and may also play a role in trimerization of the molecules PUBMED:10318893. The carboxyl-terminal, aspartate/ornithine-binding domain is described in a separate entry. \

    \ ' '3611' 'IPR003323' '\

    This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65).

    \ \ \

    None of these proteins has a known\ biochemical function, but low sequence similarity with the polyprotein regions\ of arteriviruses, together with conserved cysteine, histidine and possibly aspartate residues, suggests that those not yet recognised as peptidases could possess cysteine protease activity PUBMED:10664582.

    \ ' '3612' 'IPR001155' '\

    The TIM-barrel fold is a closed barrel structure composed of an eight-fold repeat of beta-alpha units, where the eight parallel beta strands on the inside are covered by the eight alpha helices on the outside PUBMED:11257493. It is a widely distributed fold which has been found in many enzyme families that catalyse completely unrelated reactions PUBMED:12206759. The active site is always found at the C-terminal end of this domain.

    \

    Proteins in this entry are a variety of NADH:flavin oxidoreductase/NADH oxidase enzymes, found mostly in bacteria or fungi, that contain a TIM-barrel fold. They commonly use FMN/FAD as cofactor and include:\

    \

    \ ' '3613' 'IPR000572' '\

    A number of different eukaryotic oxidoreductases that require and bind a molybdopterin cofactor have been shown PUBMED:2015248 to share a few regions of sequence similarity. These enzymes include xanthine dehydrogenase, aldehyde oxidase, nitrate reductase, and sulphite oxidase. The multidomain redox enzyme NAD(P)H:nitrate reductase (NR) catalyses the reduction of nitrate to nitrite in a single polypeptide electron transport chain with electron flow from NAD(P)H-FAD-cytochrome b5-molybdopterin-NO(3). Three forms of NR are known: an \ NADH-specific enzyme found in higher plants and algae; an NAD(P)H-bispecific enzyme found in higher plants, \ algae and fungi; and an NADPH-specific enzyme found only in fungi PUBMED:2204158. The mitochondrial enzyme sulphite oxidase (sulphite:ferricytochrome c oxidoreductase) catalyses oxidation of \ sulphite to sulphate, using cytochrome c as the physiological electron acceptor. Sulphite oxidase consists of two structure/function domains: an N-terminal haem domain, similar to cytochrome b5, and a C-terminal molybdopterin domain \ PUBMED:9428520.

    \ ' '3614' 'IPR000510' '\ Enzymes belonging to this family include cofactor-requiring nitrogenases and protochlorophyllide reductase. The key enzymatic reactions in nitrogen fixation are catalysed by the nitrogenase complex, which has two components, the iron protein (component 2), and a component (component 1) which is either a molybdenum-iron, vanadium-iron or iron-iron protein. The enzyme forms a hexamer of two alpha, two beta and two delta chains. Protochlorophyllide reductase is involved in the light-dependent accumulation of chlorophyll, probably at the step of reduction of protochlorophyllide to chlorophyllide.\ ' '3615' 'IPR001750' '\

    This domain is found in the NADH:ubiquinone oxidoreductase (complex I) and NADH-plastoquinone oxidoreductase PUBMED:1470679.

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues of these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by their reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49 kDa subunits of complex I PUBMED:18982432.

    \ \ ' '3616' 'IPR001516' '\

    This domain represents an N-terminal extension of . It contains NADH-Ubiquinone chain 5 and eubacterial chain L; these are found in the NADH:ubiquinone oxidoreductase (complex I) which catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane PUBMED:1470679.

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues of these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by their reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49 kDa subunits of complex I PUBMED:18982432.

    \ \ ' '3617' 'IPR001133' '\

    This entry represents NADH:ubiquinone oxidoreductase, chain 4L, as well as NADH-quinone oxidoreductase. In eukaryotes, these enzymes are usually found in either mitochondria or chloroplasts as part of the respiratory-chain NADH dehydrogenase (also known as complex I or\ NADH-ubiquinone oxidoreductase), an oligomeric enzymatic complex PUBMED:15843018. However, they are also found in bacteria PUBMED:18307315 and archaea PUBMED:10940377.

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues of these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by their reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49 kDa subunits of complex I PUBMED:18982432.

    \ \ ' '3618' 'IPR001457' '\

    This entry represents chain 6 from NADH:ubiquinone oxidoreductase and NADH-plastoquinone oxidoreductase. Bacterial proton-translocating NADH-quinone oxidoreductase (NDH-1) is composed of 14 different subunits. The chain belonging to this family is a subunit that constitutes the membrane sector of the complex. It reduces ubiquinone to ubiquinol utilising NADH. Plant chloroplastic NADH-plastoquinone oxidoreductase reduces plastoquinone to plastoquinol. Mitochondrial NADH-ubiquinone oxidoreductase from a variety of sources reduces ubiquinone to ubiquinol.

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues of these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by their reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49 kDa subunits of complex I PUBMED:18982432.

    \ \ ' '3619' 'IPR000440' '\

    This family contains chain 3 of the NADH-ubiquinone / plastoquinone oxidoreductase.

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues of these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by their reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49 kDa subunits of complex I PUBMED:18982432.

    \ \ ' '3620' 'IPR000260' '\

    This entry represents chain 4 of NADH:ubiquinone oxidoreductase (complex I) PUBMED:1470679. This signature is found upstream of .

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues of these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by their reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49 kDa subunits of complex I PUBMED:18982432.

    \ \ ' '3621' 'IPR006137' '\

    Among the many polypeptide subunits that make up complex I, there is one with a molecular weight of 20 kDa (in mammals) PUBMED:1577158, which is a component of the iron-sulphur (IP) fragment of the enzyme. It seems to bind a 4Fe-4S iron-sulphur cluster. The 20 kDa subunit has been found to be nuclear encoded, as a precursor form with a transit peptide, in mammals and in Neurospora crassa. It is chloroplast encoded in various higher plants (gene ndhK or psbG).

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues of these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by their reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49 kDa subunits of complex I PUBMED:18982432.

    \ \ ' '3622' 'IPR000648' '\ A number of eukaryotic proteins that seem to be involved with sterol synthesis and/or its regulation have\ been found PUBMED:8017104 to be evolutionarily related. These include mammalian oxysterol-binding protein\ (OSBP), a protein of about 800 amino-acid residues that binds a variety of oxysterols (oxygenated derivatives\ of cholesterol); yeast OSH1, a protein of 859 residues that also plays a role in ergosterol synthesis; yeast\ proteins HES1 and KES1, highly related proteins of 434 residues that seem to play a role in ergosterol synthesis;\ and yeast hypothetical proteins YHR001w, YHR073w and YKR003w.\ ' '3623' 'IPR002187' '\

    In Gram-negative bacteria, the activity and concentration of glutamine synthetase (GS) are regulated in response to nitrogen source availability. P-II, a tetrameric protein encoded by the glnB gene, is a component of the adenylation cascade involved in the regulation of GS activity PUBMED:1702507. In nitrogen-limiting conditions, when the ratio of glutamine to 2-ketoglutarate decreases, P-II is uridylylated on a tyrosine residue to form P-II-UMP. P-II-UMP allows the deadenylation of GS, thus activating the enzyme. Conversely, in nitrogen excess, P-II-UMP is deuridylylated and then promotes the adenylation of GS. P-II also indirectly controls the transcription of the GS gene (glnA) by preventing NR-II (ntrB) from phosphorylating NR-I (ntrC), which is the transcriptional activator of glnA. Once P-II is uridylylated, these events are reversed.

    \

    P-II is an extremely well conserved protein of about 110 amino acid residues. The tyrosine which is uridylylated is located in the central part of the protein. In cyanobacteria, P-II seems to be phosphorylated on a serine residue rather than being uridylylated. In methanogenic archaebacteria, the nitrogenase iron protein gene (nifH) is followed by two open reading frames highly similar to the eubacterial P-II protein PUBMED:2068380. These proteins could be involved in the regulation of nitrogen fixation. In the red alga Porphyra purpurea, there is a glnB homologue encoded in the chloroplast genome.

    \

    Other proteins highly similar to glnB are:

    \ \ ' '3624' 'IPR005919' '\

    Phosphomevalonate kinase catalyzes the phosphorylation of 5-phosphomevalonate to 5-diphosphomevalonate,\ an essential step in isoprenoid biosynthesis via the mevalonate pathway. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found: the higher eukaryotic form and the ERG8 type. This model represents the form of the enzyme found in animals.

    \ \ ' '3625' 'IPR006789' '\ The Arp2/3 protein complex has been implicated in the control of actin polymerisation. The human complex consists of seven subunits which include the actin related proteins Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc. The precise function of p16-Arc is currently unknown. Its structure consists of a single domain containing a bundle of seven alpha helices PUBMED:9230079, PUBMED:11721045.\ ' '3626' 'IPR001429' '\

    P2X purinoceptors are cell membrane ion channels, gated by adenosine 5\'-triphosphate (ATP) and other nucleotides; they have been found to be widely expressed on mammalian cells and, by means of their functional properties, can be differentiated into three sub-groups. The first group is almost equally well activated by ATP and its analogue alpha,beta-methylene ATP, whereas the second group is not activated by the latter compound. A third type of receptor (also called P2Z) is distinguished by the fact that repeated or prolonged agonist application leads to the opening of much larger pores, allowing large molecules to traverse the cell membrane. This increased permeability rapidly leads to cell death and lysis.

    \ \

    Molecular cloning studies have identified seven P2X receptor subtypes, designated P2X1-P2X7. These receptors are proteins that share 35-48% amino acid identity, and possess two putative transmembrane (TM) domains, separated by a long (~270 residues) intervening sequence, which is thought to form an extracellular loop. Around 1/4 of the residues within the loop are invariant between the cloned subtypes, including 10 characteristic cysteines.

    \ \

    Studies of the functional properties of heterologously expressed P2X receptors, together with the examination of their distribution in native tissues, suggest that they likely occur as both homo- and heteromultimers in vivo PUBMED:10414359, PUBMED:12270951.

    \ \

    This entry represents all P2X purinoceptor subtypes.

    \ ' '3627' 'IPR007188' '\ Arp2/3 protein complex has been implicated in the control of actin polymerisation in cells. The human complex consists of seven subunits, which include the actin related Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc PUBMED:9230079. This family represents the p34-Arc subunit.\ ' '3628' 'IPR001128' '\

    Cytochrome P450 enzymes are a superfamily of haem-containing mono-oxygenases that are found in all kingdoms of life, and which show extraordinary diversity in their reaction chemistry. In mammals, these proteins are found primarily in microsomes of hepatocytes and other cell types, where they oxidise steroids, fatty acids and xenobiotics, and are important for the detoxification and clearance of various compounds, as well as for hormone synthesis and breakdown, cholesterol synthesis and vitamin D metabolism. In plants, these proteins are important for the biosynthesis of several compounds such as hormones, defensive compounds and fatty acids. In bacteria, they are important for several metabolic processes, such as the biosynthesis of antibiotic erythromycin in Saccharopolyspora erythraea (Streptomyces erythraeus).

    \

    Cytochrome P450 enzymes use haem to oxidise their substrates, using protons derived from NADH or NADPH to split the oxygen so a single atom can be added to a substrate. They also require electrons, which they receive from a variety of redox partners. In certain cases, cytochrome P450 can be fused to its redox partner to produce a bi-functional protein, such as with P450BM-3 from Bacillus megaterium PUBMED:17023115, which has haem and flavin domains.

    \

    Organisms produce many different cytochrome P450 enzymes (at least 58 in humans), which together with alternative splicing can provide a wide array of enzymes with different substrate and tissue specificities. Individual cytochrome P450 proteins follow the nomenclature: CYP, followed by a number (family), then a letter (subfamily), and another number (protein); e.g. CYP3A4 is the fourth protein in family 3, subfamily A. In general, family members should share >40% identity, while subfamily members should share >55% identity.
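
    The naming convention described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the helper names parse_cyp_name and expected_grouping are hypothetical, the regular expression allows one or more subfamily letters (an assumption, since the text mentions a single letter), and the 40% and 55% thresholds are the rule-of-thumb values quoted in the text rather than strict cut-offs.

```python
import re
from typing import NamedTuple


class CypName(NamedTuple):
    family: int      # number directly after the CYP prefix
    subfamily: str   # letter(s) following the family number
    protein: int     # trailing number identifying the individual protein


# Assumed pattern: CYP, family number, subfamily letter(s), protein number.
_CYP_PATTERN = re.compile(r"^CYP(\d+)([A-Z]+)(\d+)$")


def parse_cyp_name(name: str) -> CypName:
    """Split a name such as CYP3A4 into family, subfamily and protein number."""
    match = _CYP_PATTERN.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised CYP name: {name}")
    family, subfamily, protein = match.groups()
    return CypName(int(family), subfamily, int(protein))


def expected_grouping(percent_identity: float) -> str:
    """Apply the general identity thresholds quoted in the text (>40%, >55%)."""
    if percent_identity > 55:
        return "same subfamily"
    if percent_identity > 40:
        return "same family"
    return "different family"


if __name__ == "__main__":
    print(parse_cyp_name("CYP3A4"))
    print(expected_grouping(48.0))
```

    For example, parse_cyp_name("CYP3A4") yields family 3, subfamily A and protein 4, matching the reading given in the text.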

    \

    Cytochrome P450 proteins can also be grouped by two different schemes. One scheme was based on a taxonomic split: class I (prokaryotic/mitochondrial) and class II (eukaryotic microsomes). The other scheme was based on the number of components in the system: class B (3-components) and class E (2-components). These classes merge to a certain degree. Most prokaryotes and mitochondria (and fungal CYP55) have 3-component systems (class I/class B) - a FAD-containing flavoprotein (NAD(P)H-dependent reductase), an iron-sulphur protein and P450. Most eukaryotic microsomes have 2-component systems (class II/class E) - NADPH:P450 reductase (FAD and FMN-containing flavoprotein) and P450. There are exceptions to this scheme, such as 1-component systems that resemble class E enzymes PUBMED:16042601, PUBMED:15128046, PUBMED:8637843. The class E enzymes can be further subdivided into five sequence clusters, groups I-V, each of which may contain more than one cytochrome P450 family (eg, CYP1 and CYP2 are both found in group I). The divergence of the cytochrome P450 superfamily into B- and E-classes, and further divergence into stable clusters within the E-class, appears to be very ancient, occurring before the appearance of eukaryotes.

    \ \

    More information about these proteins can be found at Protein of the Month: Cytochrome P450.

    \ ' '3629' 'IPR011615' '\

    This domain is found in p53 transcription factors, where it is responsible for DNA-binding. These transcription factors play diverse roles in the regulation of cellular functions: the p53 tumour suppressor upregulates the expression of genes involved in cell cycle arrest and apoptosis PUBMED:12826037. The DNA-binding domain acts to clamp, or in the case of TonEBP, encircle the DNA target in order to stabilise the protein-DNA complex PUBMED:11780147. Protein interactions may also serve to stabilise the protein-DNA complex, for example in the STAT-1 dimer the SH2 (Src homology 2) domain in each monomer is coupled to the DNA-binding domain to increase stability PUBMED:9630226. The DNA-binding domain consists of a beta-sandwich formed of 9 strands in 2 sheets with a Greek-key topology. This structure is found in many transcription factors, often within the DNA-binding domain.

    \ ' '3631' 'IPR006730' '\

    Exposure of mammalian cells to hypoxia, radiation and certain chemotherapeutic agents promotes cell cycle arrest and/or apoptosis.\ Activation of p53-responsive genes is believed to play an important role in mediating such responses. PA26 is differentially induced\ by genotoxic stress (UV, gamma-irradiation and cytotoxic drugs) in a p53-dependent manner.\ \ The PA26 gene is a novel p53 target gene with properties common to the GADD family of growth arrest and\ DNA damage-inducible stress-response genes, and, thus, a potential novel regulator of cellular growth PUBMED:9926927. A homologue found in Xenopus, XPA26, was initially detected in the anterior portion of the developing notochord at neurula stages, and later in the entire\ notochord except its posterior region at tailbud stages PUBMED:11165487.

    \ ' '3632' 'IPR003185' '\

    The PA28 activator complex (also known as the 11S regulator of the 20S proteasome) is a ring-shaped hexameric structure of alternating alpha (PA28alpha) and beta (PA28beta) subunits. The catalytic properties of the PA28alpha- and PA28beta-activated proteasome are similar PUBMED:9346951, PUBMED:9228287. This entry represents the alpha subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner.

    \ ' '3633' 'IPR003186' '\

    The PA28 activator complex (also known as the 11S regulator of the 20S proteasome) is a ring-shaped hexameric structure of alternating alpha (PA28alpha) and beta (PA28beta) subunits. The catalytic properties of the PA28alpha- and PA28beta-activated proteasome are similar PUBMED:11147828, PUBMED:9346951. This entry represents the beta subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner.

    \ ' '3634' 'IPR007814' '\ This family includes proteins such as PaaA and PaaC that are part of a catabolic pathway of phenylacetic acid PUBMED:9748275. These proteins may form part of a dioxygenase complex.\ ' '3635' 'IPR004020' '\

    The pyrin domain was identified as a putative protein-protein interaction domain at the N-terminal region of several proteins thought to function in apoptotic and inflammatory signalling pathways. Using secondary structure prediction and potential-based fold recognition methods, the PYRIN domain is predicted to be a member of the six-helix bundle death domain-fold superfamily that includes death domains (DDs), death effector domains (DEDs), and caspase recruitment domains (CARDs). Members of the death domain-fold superfamily are well-established mediators of protein-protein interactions found in many proteins involved in apoptosis and inflammation, further indicating that PYRIN domains serve a similar function. Comparison of a circular dichroism spectrum of the PYRIN domain of CARD7/DEFCAP/NAC/NALP1 with spectra of several proteins known to adopt the death domain-fold provides experimental support for the structure prediction PUBMED:11514682. The domain is found in interferon-inducible proteins, pyrin and myeloid cell nuclear differentiation antigen.

    \ ' '3636' 'IPR002004' '\

    The polyadenylate-binding protein (PABP) has a conserved C-terminal domain (PABC), which is also found in the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains () PUBMED:11287654. PABP recognises the 3\' mRNA poly(A) tail and plays an essential role in eukaryotic translation initiation and mRNA stabilisation/degradation. PABC domains of PABP are peptide-binding domains that mediate PABP homo-oligomerisation and protein-protein interactions. In mammals, the PABC domain of PABP functions to recruit several different translation factors to the mRNA poly(A) tail PUBMED:11940585.

    \ ' '3637' 'IPR013530' '\

    In the presence of calcium ions, protein-arginine deiminase (PAD) enzymes catalyse the\ post-translational modification reaction responsible for the formation of citrulline residues from protein-bound arginine residues PUBMED:10092850. Four isotypes of PAD have been identified in mammals; a fifth may also exist. Non-mammalian vertebrates appear to have only a single PAD enzyme. All known natural substrates of PAD are proteins known to have an important structural function, such as keratin (PAD1), intermediate filaments or proteins associated with intermediate filaments. Citrullination may have consequences for the structural integrity and interactions of these proteins. Physiological levels of calcium appear to be too low to activate these enzymes, suggesting a link between PAD activation and loss of calcium homeostasis during terminal differentiation and cell death (apoptosis).

    \ \ \ \ \ ' '3638' 'IPR007466' '\

    Peptidyl-arginine deiminase (PAD) enzymes catalyse the deimination of the guanidino group from carboxy-terminal arginine residues of various peptides to produce ammonia. PAD from Porphyromonas gingivalis (Bacteroides gingivalis) (PPAD) appears to be evolutionarily unrelated to mammalian PAD (), which is a metalloenzyme. PPAD is thought to belong to the same superfamily as aminotransferase and arginine deiminase, and to form an alpha/beta propeller structure. This family has previously been named PPADH (Porphyromonas peptidyl-arginine deiminase homologs) PUBMED:11504612. The predicted catalytic residues in PPAD () are Asp130, Asp187, His236, Asp238 and Cys351 PUBMED:11504612. These are absolutely conserved, with the exception of Asp187, which is absent in two family members. PPAD is also able to catalyse the deimination of free L-arginine, but has primarily peptidyl-arginine specificity. It may have an FMN cofactor PUBMED:10377098.

    \ ' '3639' 'IPR005149' '\

    Phenolic acids, also called substituted hydroxycinnamic acids, are abundant in the plant kingdom because they are involved in the structure of plant cell walls and are present in some vacuoles. In plant-soil ecosystems they are released as free acids by hemicellulases produced by several fungi and bacteria. Of these weak acids, the most abundant are p-coumaric, ferulic, and caffeic acids, considered to be natural toxins that inhibit the growth of microorganisms, especially at low pHs. In spite of this chemical stress, some bacteria can use phenolic acids as a sole source of carbon. For other microorganisms, these compounds induce a specific response by which the organism adapts to its environment. The ubiquitous lactic acid bacterium Lactobacillus plantarum exhibits an inducible phenolic acid decarboxylase (PAD) activity which converts these substrates into less-toxic vinyl phenol derivatives. PadR acts as a repressor of padA gene expression in the phenolic acid stress response PUBMED:15066807.

    \ ' '3640' 'IPR004963' '\ This family contains a number of uncharacterised proteins. Some of these are thought to be putative pectinacetylesterases.\ ' '3641' 'IPR005065' '\

    Platelet-activating factor acetylhydrolase (PAF-AH) is a subfamily of phospholipase A2, , responsible for inactivation of platelet-activating factor through cleavage of an\ acetyl group. Three known PAF-AHs are the brain heterotrimeric PAF-AH Ib, the extracellular,\ plasma PAF-AH (pPAF-AH), and the intracellular PAF-AH isoform II (PAF-AH II).

    \ ' '3642' 'IPR007133' '\ Members of this family are components of the RNA polymerase II associated Paf1 complex. The Paf1 complex functions during the elongation phase of transcription in conjunction with Spt4-Spt5 and Spt16-Pob3i PUBMED:11927560, PUBMED:11884586.\ ' '3643' 'IPR003822' '\

    This family contains the paired amphipathic helix (PAH) repeat. The family contains the eukaryotic Sin3 proteins, which have at least three PAH domains (PAH1, PAH2, and PAH3). Sin3 proteins are components of a co-repressor complex that silences transcription, playing important roles in the transition between proliferation and differentiation. Sin3 proteins are recruited to the DNA by various DNA-binding transcription factors such as the Mad family of repressors, Mnt/Rox, PLZF, MeCP2, p53, REST/NRSF, MNFbeta, Sp1, TGIF and Ume6 PUBMED:11101889. Sin3 acts as a scaffold protein that in turn recruits histone-binding proteins RbAp46/RbAp48 and histone deacetylases HDAC1/HDAC2, which deacetylate the core histones resulting in a repressed state of the chromatin PUBMED:14705930. The PAH domains are protein-protein interaction domains through which Sin3 fulfils its role as a scaffold. The PAH2 domain of Sin3 can interact with a wide range of unrelated and structurally diverse transcription factors that bind using different interaction motifs. For example, the Sin3 PAH2 domain can interact with the unrelated Mad and HBP1 factors using alternative interaction motifs that involve binding in opposite helical orientations PUBMED:15235594.

    \ ' '3644' 'IPR001106' '\

    This entry represents phenylalanine ammonia-lyase (PAL; ) and the mechanistically related protein histidine ammonia-lyase (HAL; ). Both contain a catalytic Ala-Ser-Gly triad that is post-translationally cyclised PUBMED:16478474. PAL is a key biosynthetic catalyst in phenylpropanoid assembly in plants and fungi, and is involved in the biosynthesis of a wide variety of secondary metabolites such as flavonoids, furanocoumarin phytoalexins and cell wall components. These compounds are important for normal growth and in responses to environmental stress. HAL catalyses the first step in\ histidine degradation, the removal of an ammonia group from histidine to produce\ urocanic acid. The core domains of PAL and HAL share about 30% sequence identity, with PAL containing approximately 160 additional residues extending from the common fold PUBMED:15350127.

    \ ' '3645' 'IPR002472' '\ Neuronal ceroid lipofuscinoses (NCL) represent a group of encephalopathies\ that occur in 1 in 12,500 children. Mutations in the palmitoyl protein thioesterase gene cause infantile neuronal\ ceroid lipofuscinosis (INCL) PUBMED:7637805. \ \ The most common mutation results in intracellular\ accumulation of the polypeptide and undetectable enzyme activity in\ the brain.\ Direct sequencing of cDNAs derived from brain RNA of INCL patients has\ shown a missense transversion of A to T at nucleotide position 364, which\ results in substitution of Trp for Arg at position 122 in the protein; \ Arg 122 is immediately adjacent to a lipase consensus sequence that \ contains the putative active site Ser of PPT. The occurrence of this and\ two other independent mutations in the PPT gene strongly suggests that\ defects in this gene cause INCL.\ ' '3646' 'IPR003721' '\ D-Pantothenate is synthesized via four enzymes from ketoisovalerate, which is an\ intermediate of branched-chain amino acid synthesis PUBMED:10223988.\ Pantoate-beta-alanine ligase, also known as pantothenate synthase, () catalyzes the formation of\ pantothenate from pantoate and beta-alanine in the pantothenate biosynthesis pathway PUBMED:8760912.\ ' '3647' 'IPR003700' '\

    The panB gene from Escherichia coli encodes the first enzyme of the pantothenate biosynthesis pathway, ketopantoate hydroxymethyltransferase (KPHMT) . Fungal ketopantoate hydroxymethyltransferase is essential for the biosynthesis of coenzyme A, while the pathway intermediate 4\'-phosphopantetheine is required for penicillin production PUBMED:10503542.

    \ ' '3648' 'IPR003861' '\ This is a family of Papillomavirus proteins, E4, coded for by ORF4. A splice variant, E1--E4, exists but the function of neither E4 nor E1--E4 is known PUBMED:9454695.\ ' '3649' 'IPR006843' '\ This family identifies a conserved region found in a number of plastid lipid-associated proteins (PAPs), and in a number of putative fibrillin proteins.\ ' '3650' 'IPR007010' '\

    In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. Poly(A) polymerase, the enzyme at the heart of the polyadenylation machinery, is a template-independent RNA polymerase that specifically incorporates ATP at the 3\' end of mRNA. The crystal structure of bovine poly(A) polymerase bound to an ATP analogue at 2.5 A resolution has been determined PUBMED:10944102. The structure revealed expected and unexpected similarities to other proteins. As expected, the catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase.

    \ \

    The C-terminal domain unexpectedly folds into a compact domain reminiscent of the RNA-recognition motif fold. The three invariant aspartates of the catalytic triad ligate two of the three active site metals. One of these metals also contacts the adenine ring. Furthermore, conserved, catalytically important residues contact the nucleotide. These contacts, taken together with metal coordination of the adenine base, provide a structural basis for ATP selection by poly(A) polymerase.

    \ \ ' '3651' 'IPR006880' '\ This is a group of proteins with a conserved C-terminal region which is found in PAPA-1, a PAP-1 binding protein, . \ ' '3652' 'IPR004356' '\

    P pili, or fimbriae, are ~68A in diameter and 1 micron in length, the\ bulk of which is a fibre composed of the main structural protein PapA PUBMED:1348107.\ At its tip, the pilus is terminated by a fibrillum consisting of repeating\ units of the PapE protein. This, in turn, is topped by the adhesins, PapF\ and PapG, both of which are needed for receptor binding. The tip fibrillum\ is anchored to the main PapA fibre by the PapK pilus-adaptor protein. PapH,\ an outer membrane protein, then anchors the entire rod in the bacterial\ envelope PUBMED:7816100. A cytoplasmic chaperone (PapD) assists in assembling the \ monomers of the macromolecule in the membrane.

    \

    All of the functional pap genes are arranged in a cluster (operon) on the \ Escherichia coli genome. It is believed that selective pressure exerted by the \ host\'s urinary and intestinal tract isoreceptors forced the spread of this \ operon to other strains via lateral transfer PUBMED:1357526. PapB, encoded within the \ cluster, acts as a transcriptional regulator of the functional pap genes\ and is located in the bacterial cytoplasm PUBMED:2568258. Its mechanism involves\ differential binding to separate sites in the cluster, suggesting that \ this protein is both an activator and repressor of pilus-adhesion \ transcription. The protein shares similarity with other E. coli fimbrial-\ adhesion transcription regulators, such as AfaA, DaaA and FanB.\

    \ ' '3653' 'IPR005309' '\

    PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus and chaperone binding C-terminus (this domain). The chaperone-binding domain is highly conserved, and is essential for the correct assembly of the pili structure when aided by the chaperone molecule PapD PUBMED:11454740, PUBMED:11440716.

    \ ' '3654' 'IPR005310' '\

    PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus (this domain) and chaperone binding C-terminus. The carbohydrate-binding domain interacts with the receptor glycan PUBMED:11454740, PUBMED:11440716.

    \ ' '3655' 'IPR004270' '\ The E5 protein from papillomaviruses is about 80 amino acids long and contains three regions that have been predicted to be transmembrane alpha helices. The function of this protein is unknown.\ ' '3656' 'IPR003354' '\ This domain represents a conserved region in papovavirus small and middle T-antigens. It is found as the N-terminal domain in the small T-antigen, and is centrally located in the middle T-antigen.\ ' '3657' 'IPR002500' '\ This domain is found in phosphoadenosine phosphosulphate (PAPS) reductase\ enzymes or PAPS sulphotransferase. PAPS reductase is part of the adenine \ nucleotide alpha hydrolases superfamily, which also includes N-type ATP PPases\ and ATP sulphurylases PUBMED:9261082. The enzyme uses thioredoxin as an electron \ donor for the reduction of PAPS to phospho-adenosine-phosphate (PAP) PUBMED:9261082, PUBMED:7588765.\ The domain is also found in NodP nodulation protein P from Rhizobium meliloti (Sinorhizobium meliloti), which has ATP\ sulphurylase activity (sulphate adenylate transferase) PUBMED:2250719.\ ' '3659' 'IPR004965' '\

    Paralemmin was identified in the chicken lens as a protein with a molecular weight of 65 kDa (isoform 1) and a splice variant of 60 kDa (isoform 2). Isoform 2 is predominant during infancy and levels of isoform 1 increase with age. Paralemmin is localised to the plasma membrane of fibre cells, and was not detected in the annular pad cells. Its localisation to the short side of the fibre cell and the sites of fibre cell interlocking suggests that paralemmin may play a role in the development of such interdigitating processes PUBMED:12874826. Palmitoylation is important for localising these proteins to the filopodia of dendritic cells where they have been implicated in the regulation of membrane dynamics and process outgrowth.

    \ ' '3660' 'IPR002895' '\ The G surface protein of Paramecium primaurelia has important internal homologies and a periodic structure, which could be dictated in part by the rigid scaffolding of cysteine residues. The predicted secondary structure shows a quasi absence of alpha-helix and an abundance of beta-pleated sheets and random coils. The monotony of the amino acid sequence is in favour of a structural role for the protein PUBMED:3783679. This structure is based on the presence of 37 periods of about 75 residues, each period containing eight cysteine residues PUBMED:2308165. Homologies with other proteins are limited to surface antigens of trypanosomes.\ ' '3661' 'IPR004897' '\

    Paramyxoviral P genes are able to generate more than one product, using alternative reading frames and RNA editing. The P gene encodes the structural phosphoprotein P. In addition, it encodes several non-structural proteins present in the infected cell but not in the virus particle. This family includes phosphoprotein P and the non-structural phosphoprotein V from different paramyxoviruses. Phosphoprotein P is essential for the activity of the RNA polymerase complex which it forms with another subunit, L . Although all the catalytic activities of the polymerase are associated with the L subunit, its function requires specific interactions with phosphoprotein P PUBMED:11336555. The P and V phosphoproteins are amino co-terminal, but diverge at their C-termini. This difference is generated by an RNA-editing mechanism in which one or two non-templated G residues are inserted into P-gene-derived mRNA. In Measles virus and Sendai virus, one G residue is inserted and the edited transcript encodes the V protein. In Mumps virus, Simian virus 5 and Newcastle disease virus, two G residues are inserted, and the edited transcript codes for the P protein PUBMED:11336555. Being phosphoproteins, both P and V are rich in serine and threonine residues over their whole lengths. In addition, the V proteins are rich in cysteine residues at the C-termini PUBMED:8277263.

    \ \ ' '3662' 'IPR001016' '\

    Paramyxoviridae, like other non-segmented negative strand RNA viruses, have an RNA-dependent RNA polymerase composed of two subunits, a large protein L and a phosphoprotein P. The L protein confers the RNA polymerase activity on the complex while the P protein acts as a transcription factor PUBMED:9224928.

    \ ' '3663' 'IPR002608' '\

    This family consists of the C proteins (C\', C, Y1, Y2) found in the Paramyxovirinae, e.g. Human parainfluenza virus 3 and Sendai virus. The C proteins affect viral RNA synthesis, having both positive and negative effects during the course of infection PUBMED:9621061.\ The Paramyxovirinae have a negative-strand ssRNA genome of 15.3 kb from which six mRNAs are transcribed; five of these are monocistronic. \ The P/C mRNA is polycistronic and has two overlapping open reading frames, P and C; C encodes the nested C proteins C\', C, Y1 and Y2 PUBMED:2542021.

    \ ' '3664' 'IPR002021' '\ The nucleocapsid protein is referred to as NP. NP is the major\ structural component of the nucleocapsid. The protein is approximately\ 58 kDa. 2600 NP molecules tightly encapsidate the viral RNA.\ NP interacts with several other virally encoded proteins, all of which are \ involved in controlling replication: NP-NP, NP-P, NP-(PL), \ and NP-V PUBMED:9125045, PUBMED:8806522, PUBMED:8396656.\ ' '3665' 'IPR003875' '\ This family consists of the polymerase accessory protein C from members of the Paramyxoviridae.\ ' '3666' 'IPR002693' '\

    Sendai virus is a member of the Paramyxovirinae family. Its negative-sense ssRNA genome is packaged by the viral nucleoprotein (N) within a helical nucleocapsid. Paramyxovirinae use this N-RNA (nucleoprotein-RNA) complex as a template for both transcription and replication. During viral genome replication, the synthesis of viral RNA and its encapsidation by N are concomitant. Viral transcription and replication are carried out by viral RNA-dependent RNA polymerase, which consists of two proteins: L polymerase and phosphoprotein P. The L polymerase carries the enzyme activity. Phosphoprotein P binds the viral nucleocapsid, and positions the L polymerase on the template for transcription and replication formed by nucleoprotein-RNA (N-RNA) PUBMED:10400742.

    \

    This entry represents phosphoprotein P from Sendai virus as well as from close family members. Phosphoprotein P, an indispensable subunit of the viral polymerase complex, is a modular protein organised into two moieties that are both functionally and structurally distinct: a well-conserved C-terminal moiety that contains all the regions required for transcription, and a poorly conserved, intrinsically unstructured N-terminal moiety that provides several additional functions required for replication. The N-terminal moiety is responsible for binding to newly synthesized free N(0) (nucleoprotein that has not yet bound RNA), in order to prevent the binding of N(0) to cellular RNA. The C-terminal moiety consists of an oligomerisation domain, an N-RNA (nucleoprotein-RNA)-binding domain and an L polymerase-binding domain PUBMED:14980481, PUBMED:17459940.

    \ ' '3667' 'IPR001415' '\ Parathyroid hormone (PTH) is a polypeptide hormone that elevates calcium\ levels by dissolving the salts in bone and preventing their renal excretion.\ \ The \'parathyroid hormone-related\ protein\' (PTH-rP) is structurally related to PTH PUBMED:2682846 and seems to play a physiological role in lactation,\ possibly as a hormone for the mobilization and/or transfer of calcium to the\ milk. PTH and\ PTH-rP bind to the same G-protein-coupled receptor.\ ' '3668' 'IPR003115' '\ Proteins containing this domain appear to be related to the Escherichia coli\ plasmid protein ParB, which preferentially cleaves single-stranded DNA. ParB also nicks\ supercoiled plasmid DNA, preferably at sites with potential single-stranded\ character, such as AT-rich regions and sequences that can form cruciform structures. ParB also exhibits 5\'-3\' exonuclease activity.\ ' '3669' 'IPR012317' '\

    Poly(ADP-ribose) polymerases (PARP) are a family of enzymes\ present in eukaryotes, which catalyze the poly(ADP-ribosyl)ation of a limited\ number of proteins involved in chromatin architecture, DNA repair, or in DNA\ metabolism, including PARP itself. PARP, also known as poly(ADP-ribose)\ synthetase and poly(ADP-ribose) transferase, transfers the ADP-ribose moiety\ from its substrate, nicotinamide adenine dinucleotide (NAD), to carboxylate\ groups of aspartic and glutamic acid residues. Whereas some PARPs might function in\ genome protection, others appear to play different roles in the cell,\ including telomere replication and cellular transport. PARP-1 is a\ multifunctional enzyme. The polypeptide has a highly conserved modular\ organization consisting of an N-terminal DNA-binding domain, a central\ regulating segment, and a C-terminal or F region accommodating the catalytic\ centre. The F region is composed of two parts: a purely alpha-helical N-terminal\ domain (alpha-hd), and the mixed alpha/beta C-terminal catalytic\ domain bearing the putative NAD binding site. Although proteins of the PARP\ family are related through their PARP catalytic domain, they do not resemble\ each other outside of that region; rather, they contain unique domains\ that distinguish them from each other and hint at their discrete functions.\ Domains with which the PARP catalytic domain is found associated include\ zinc fingers, SAP, ankyrin, BRCT, Macro, SAM, WWE and UIM domains PUBMED:8016868, PUBMED:15273990, PUBMED:15561303.

    \

    The alpha-hd domain is about 130 amino acids in length and consists of an up-up-down-up-down-down motif of helices. It is\ thought to relay the activation signal issued on binding to damaged DNA PUBMED:8755499, PUBMED:14739238.\ The PARP catalytic domain is about 230 residues in length. Its core consists of a five-stranded antiparallel beta-sheet and\ a four-stranded mixed beta-sheet. The two sheets are consecutive and are\ connected via a single pair of hydrogen bonds between two strands that run at\ an angle of 90 degrees. These central beta-sheets are surrounded by five alpha-helices, three 3(10)-helices, and by a three- and a two-stranded beta-sheet in\ a 37-residue excursion between two central beta-strands PUBMED:8755499, PUBMED:14739238. The active\ site, known as the \'PARP signature\', is formed by a block of 50 amino acids \ that is strictly conserved among the vertebrates and\ highly conserved among all species. The \'PARP signature\' is characteristic of\ all PARP protein family members. It consists of a segment of conserved amino\ acid residues that form a beta-sheet, an alpha-helix, a 3(10)-helix, a beta-sheet, and an alpha-helix PUBMED:15561303.

    \ ' '3670' 'IPR004102' '\

    Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The regulatory domain of the polymerase is almost always associated with the C-terminal catalytic domain (see ).

    \

    This domain consists of a duplication of two helix-loop-helix structural repeats PUBMED:9521710.

    \ ' '3671' 'IPR000176' '\

    This family contains viral proteins that are bifunctional, acting as both an mRNA cap-specific RNA 2\'-O-methyltransferase, which methylates the ribose 2\' OH group of the first transcribed nucleotide, thereby producing a 2\'-O-methylpurine cap, and a poly(A) polymerase processivity factor, which binds to poly(A)\ but has no catalytic activity. The structure of this protein is known PUBMED:8612277.

    \ ' '3672' 'IPR001403' '\

    The Parvovirus coat protein VP2 together with VP1 forms a capsomer. Both of these proteins are formed from the same transcript using alternative splicing. As a result, VP1 and VP2 differ only in the N-terminal region. VP2 is involved in packaging the viral DNA PUBMED:9129667. The mature virion contains three capsid proteins, VP1, VP2 and VP3, and a noncapsid protein, NS-1.

    \ ' '3673' 'IPR001257' '\

    Parvoviruses encode two noncapsid/non-structural proteins, NS1 and NS2. NS1 is essential\ for viral DNA replication PUBMED:8372437. These proteins include the ATP/GTP-binding site \ motif A (P-loop) .

    \ ' '3674' 'IPR013767' '\

    PAS domains are found in many signalling proteins, where they\ are used as a signal sensor domain. PAS domains appear in archaea,\ bacteria and eukaryotes. Several PAS-domain proteins are known to\ detect their signal by way of an associated cofactor. Haem,\ flavin, and a 4-hydroxycinnamyl chromophore are used in different\ proteins. The PAS domain was named after three proteins that it\ occurs in:

    \
  • Per- period circadian protein
  • \
  • Arnt- Ah receptor nuclear translocator protein
  • \
  • Sim- single-minded protein.
  • \

    PAS domains are often associated with\ PAC domains . It appears that these domains are directly linked, and that together they form the conserved 3D PAS fold. The division between the PAS and PAC domains is caused by major differences in sequences in the region connecting these two motifs PUBMED:15009198. In human PAS kinase, this region has been shown to be very flexible, and adopts different conformations depending on the bound ligand PUBMED:12377121.\ Probably the most surprising identification of a PAS domain was that in\ EAG-like K+-channels PUBMED:9301332.

    \ \ ' '3675' 'IPR005543' '\

    The PASTA domain is found at the C-termini of several Penicillin-binding proteins (PBP) and bacterial serine/threonine kinases. It binds the beta-lactam stem, which implicates it in sensing D-alanyl-D-alanine - the PBP transpeptidase substrate. In PknB of Mycobacterium tuberculosis (), all of the extracellular portion is predicted to be made up of four PASTA domains, which strongly suggests that it is a signal-binding sensor\ domain. The domain has also been found in proteins involved in cell wall biosynthesis, where it is implicated in localizing the\ biosynthesis complex to unlinked peptidoglycan.

    PASTA is a small globular fold consisting of 3 beta-sheets and an alpha-helix, with a loop region of variable length between the first and\ second beta-strands. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain PUBMED:12217513.

    \ ' '3676' 'IPR001523' '\

    The paired box is a conserved 124 amino acid N-terminal domain of unknown function that usually, but not always, precedes a homeobox domain (see ) PUBMED:7527137, PUBMED:7981748. Paired box genes are expressed in alternate segments of the developing fruit fly, the observed grouping of segments into pairs depending on the position of the segment in the segmental array, and not on the identity of the segment as in the case of homeotic genes. This implies that the genes affect different processes from those altered by homeotic genes.

    \ ' '3677' 'IPR001904' '\

    Paxillin is a cytoskeletal protein involved in actin-membrane attachment at sites of cell adhesion to the extracellular matrix (focal adhesion) PUBMED:7534286, PUBMED:7525621. Extensive tyrosine phosphorylation occurs during integrin-mediated cell adhesion, embryonic development, fibroblast transformation and following stimulation of cells by mitogens that operate through the 7TM family of G-protein-coupled receptors PUBMED:7525621. Paxillin binds in vitro to the focal adhesion protein vinculin, as well as to the SH3 domain of c-Src, and, when tyrosine phosphorylated, to the SH2 domain of v-Crk PUBMED:7525621. An N-terminal region has been identified that supports the binding of both vinculin and the focal adhesion tyrosine kinase, pp125Fak PUBMED:7525621.

    \

    Paxillin is a 68 kDa protein containing multiple domains, including four tandem C-terminal LIM domains (each of which binds 2 zinc ions); an N-terminal proline-rich domain, which contains a consensus SH3 binding site; and three potential Crk-SH2 binding sites PUBMED:7534286. The predicted structure of paxillin suggests that it is a unique cytoskeletal protein capable of interaction with a variety of intracellular signalling and structural molecules important in growth control and the regulation of cytoskeletal organisation PUBMED:7534286, PUBMED:7525621.

    \ ' '3678' 'IPR005311' '\

    This domain is found at the N-terminus of Class B High Molecular Weight Penicillin-Binding Proteins. Its function has not been precisely defined, but is strongly implicated in PBP polymerisation. The domain forms a largely disordered "sugar tongs" structure.

    \ ' '3679' 'IPR006170' '\

    The olfactory receptors of terrestrial animals exist in an aqueous environment, yet detect odorants that are primarily hydrophobic. The aqueous solubility of hydrophobic odorants is thought to be greatly enhanced via odorant binding proteins which exist in the extracellular fluid surrounding the odorant receptors PUBMED:2010751. This family is composed of pheromone binding proteins (PBP), which are male-specific and associate with pheromone-sensitive neurons, and general-odorant binding proteins (GOBP).

    \ ' '3680' 'IPR001297' '\ The phycobilisome linker polypeptide determines the state of aggregation and the location \ of the disc-shaped phycobiliprotein units within the phycobilisome and modulates their\ spectroscopic properties in order to mediate a directed and optimal energy transfer.\ The phycobilisome is a hemidiscoidal structure that is composed of two distinct\ substructures, a core complex (that contains the phycobiliproteins) and a number of\ rods radiating from the core. The linker polypeptide is also found in the chloroplast of\ some eukaryotes where it is required for attachment of phycocyanin to allophycocyanin\ in the core of the phycobilisome.\ ' '3681' 'IPR005542' '\

    Pbx proteins are members of the TALE (three-amino-acid loop extension) family of atypical homeodomain proteins, whose members\ are characterised by a three-residue insertion in the first helix of the homeodomain involved in their interaction with Hox proteins. Examination\ of Pbx1 has shown that, in addition to the homeodomain, a short 16-residue C-terminal tail is essential for maximal cooperative interactions with\ Hox partners as well as for maximal monomeric binding of Pbx1 to DNA.

    The PBX domain is a bipartite acidic domain PUBMED:1363814.

    \ ' '3682' 'IPR003173' '\ p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain PUBMED:8062392. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain.\ ' '3683' 'IPR002015' '\ A weakly conserved repeat module of unknown function, which occurs\ in two regulatory subunits of the 26S-proteasome and in one subunit\ of the APC-complex (cyclosome) PUBMED:9204704.\ ' '3684' 'IPR000682' '\

    Protein-L-isoaspartate(D-aspartate) O-methyltransferase () (PCMT) PUBMED:9253175 (which is also known as L-isoaspartyl protein carboxyl methyltransferase) is an enzyme that catalyses the transfer of a methyl group from S-adenosylmethionine to the free carboxyl groups of D-aspartyl or L-isoaspartyl residues in a variety of peptides and proteins. The enzyme does not act on normal L-aspartyl residues. L-isoaspartyl and D-aspartyl residues are the products of the spontaneous deamidation and/or isomerisation of normal L-aspartyl and L-asparaginyl residues in proteins. PCMT plays a role in the repair and/or degradation of these damaged proteins; the enzymatic methyl esterification of the abnormal residues can lead to their conversion to normal L-aspartyl residues. The SAM domain is present in most of these proteins.

    \ ' '3685' 'IPR000730' '\

    Proliferating cell nuclear antigen (PCNA), or cyclin, is a non-histone acidic nuclear protein PUBMED:2884104 that plays a key role in the control of eukaryotic DNA replication PUBMED:1346518. It acts as a co-factor for DNA polymerase delta, which is responsible for leading strand DNA\ replication PUBMED:2565339. The sequence of PCNA is well conserved between plants and animals, indicating a strong selective pressure for structure conservation, and suggesting that this type of DNA replication mechanism is conserved throughout eukaryotes PUBMED:1671766. In Saccharomyces cerevisiae (Baker\'s yeast), POL30 is associated with polymerase III, the yeast analog of polymerase delta.

    \

    Homologues of PCNA have also been identified in the archaea (Euryarchaeota and Crenarchaeota) and in Paramecium bursaria Chlorella virus 1 (PBCV-1) and in nuclear polyhedrosis viruses.

    \ ' '3686' 'IPR000730' '\

    Proliferating cell nuclear antigen (PCNA), or cyclin, is a non-histone acidic nuclear protein PUBMED:2884104 that plays a key role in the control of eukaryotic DNA replication PUBMED:1346518. It acts as a co-factor for DNA polymerase delta, which is responsible for leading strand DNA\ replication PUBMED:2565339. The sequence of PCNA is well conserved between plants and animals, indicating a strong selective pressure for structure conservation, and suggesting that this type of DNA replication mechanism is conserved throughout eukaryotes PUBMED:1671766. In Saccharomyces cerevisiae (Baker\'s yeast), POL30 is associated with polymerase III, the yeast analog of polymerase delta.

    \

    Homologues of PCNA have also been identified in the archaea (Euryarchaeota and Crenarchaeota) and in Paramecium bursaria Chlorella virus 1 (PBCV-1) and in nuclear polyhedrosis viruses.

    \ ' '3687' 'IPR003376' '\ Peridinin-chlorophyll-protein, a water-soluble light-harvesting complex that has a blue-green absorbing carotenoid as its main pigment, is present in most photosynthetic dinoflagellates. These proteins are composed of two similar repeated domains. These domains constitute a scaffold with pseudo-twofold symmetry surrounding a hydrophobic cavity filled by two lipid, eight peridinin, and two chlorophyll a molecules PUBMED:8650577.\ ' '3688' 'IPR008205' '\

    This family contains prokaryotic proteins that are related to pcrB. Staphylococcus aureus chromosomal gene pcrA encodes a protein with significant similarity (40% identity) to two Escherichia coli helicases: the helicase II encoded by the uvrD gene and the Rep helicase. PcrB gene seems to belong to an operon containing at least one other gene, pcrBA, downstream from pcrB PUBMED:8232203. The PcrB proteins often contain an FMN binding site although the function of these proteins is still unknown.

    \ ' '3689' 'IPR007320' '\

    PDCD2 is localized predominantly in the cytosol of cells situated at the opposite pole of the germinal centre from the centroblasts as well as in cells in the mantle zone. It has been shown to interact with BCL6, an evolutionarily conserved Kruppel-type zinc finger protein that functions as a strong transcriptional repressor and is required for germinal centre development. The rat homologue, Rp8, is associated with programmed cell death in thymocytes.

    \ ' '3690' 'IPR006952' '\ Retinal rod and cone cGMP phosphodiesterases function as the effector enzymes in the vertebrate visual transduction cascade. This family represents the inhibitory gamma subunit PUBMED:11900530, which is also expressed outside retinal tissues and has been shown to interact with the G-protein-coupled receptor kinase 2 signalling system to regulate the epidermal growth factor- and thrombin-dependent stimulation of p42/p44 mitogen-activated protein kinase in human embryonic kidney 293 cells PUBMED:11502744.\ ' '3691' 'IPR002073' '\

    The cyclic nucleotide phosphodiesterases (PDE) comprise a group of enzymes that degrade the phosphodiester bond in the second messenger molecules cAMP and cGMP. They are divided into 11 families. They regulate the localisation, duration and amplitude of cyclic nucleotide signalling within subcellular domains. PDEs are therefore important for signal transduction.

    \ \

    PDE enzymes are often targets for pharmacological inhibition due to their unique tissue distribution, structural properties, and functional properties. Inhibitors include: Roflumilast for chronic obstructive pulmonary disease and asthma PUBMED:18447606, Sildenafil for erectile dysfunction PUBMED:18367027 and Cilostazol for peripheral arterial occlusive disease PUBMED:18436153, amongst others.

    \ \

    Retinal 3\',5\'-cGMP phosphodiesterase is located in photoreceptor outer segments PUBMED:: it is light activated, playing a pivotal role in signal transduction. In rod cells, PDE is oligomeric, comprising an alpha-, a beta- and 2 gamma-subunits, while in cones, PDE is a homodimer of alpha chains, which are associated with several smaller subunits. Both rod and cone PDEs catalyse the hydrolysis of cAMP or cGMP to the corresponding nucleoside 5\' monophosphates, both enzymes also binding\ cGMP with high affinity. The cGMP-binding sites are located in the\ N-terminal half of the protein sequence, while the catalytic core \ resides in the C-terminal portion.

    \ \ ' '3692' 'IPR000396' '\ Cyclic-AMP phosphodiesterase () (PDE) catalyses the hydrolysis of cAMP to the\ corresponding nucleoside 5\' monophosphate. On the basis of sequence\ similarity, most PDEs can be grouped together PUBMED:2159198, but 2 enzymes lie apart\ from the main family and represent a second distinct class PUBMED:2824992: this\ includes PDEs from Dictyostelium and yeast. \ There is, in the central part of these enzymes, a highly conserved region\ which contains three histidines.\ ' '3693' 'IPR000072' '\ Platelet-derived growth factor (PDGF) PUBMED:2546599, PUBMED:1425569 is a potent mitogen for cells of\ mesenchymal origin, including smooth muscle cells and glial cells. In both mouse and human, the PDGF signalling network consists of four ligands, PDGFA-D, and two receptors, PDGFRalpha and PDGFRbeta. All PDGFs function as secreted, disulphide-linked\ homodimers, but only PDGFA and B can form functional heterodimers. PDGFRs also function as homo- and heterodimers. All known PDGFs have characteristic \'PDGF domains\',\ which include eight conserved cysteines that are involved in inter- and intramolecular bonds.\ Alternate splicing of the A chain transcript can give rise to two different\ forms that differ only in their C-terminal extremity. The transforming protein\ of Woolly monkey sarcoma virus (WMSV) (Simian sarcoma virus), encoded by the v-sis oncogene, is derived from the B chain of PDGF.\

    PDGFs are mitogenic during early developmental stages, driving the proliferation of undifferentiated mesenchyme and some progenitor populations. During later maturation stages, PDGF signalling has been implicated in tissue remodelling and cellular differentiation, and in inductive events involved in patterning and morphogenesis. In addition to driving\ mesenchymal proliferation, PDGFs have been shown to direct the migration, differentiation and function of a variety of specialised mesenchymal and migratory cell types, both during development and in the\ adult animal PUBMED:12952899. Other growth factors in this family include vascular endothelial growth factors B and C (VEGF-B, VEGF-C) PUBMED:8637916, PUBMED:8617204 which are active in angiogenesis and endothelial cell growth, and placenta growth factor (PlGF) which is also active in angiogenesis PUBMED:7681160.

    \

    PDGF is structurally related to a number of other growth factors which also form disulphide-linked homo- or heterodimers.

    \ ' '3694' 'IPR006782' '\

    Platelet-derived growth factor (PDGF) PUBMED:2546599, PUBMED:1425569 is a potent mitogen for cells of\ mesenchymal origin, including smooth muscle cells and glial cells. In both mouse and human, the PDGF signalling network consists of four ligands, PDGFA-D, and two receptors, PDGFRalpha and PDGFRbeta. All PDGFs function as secreted, disulphide-linked\ homodimers, but only PDGFA and B can form functional heterodimers. PDGFRs also function as homo- and heterodimers. All known PDGFs have characteristic \'PDGF domains\',\ which include eight conserved cysteines that are involved in inter- and intramolecular bonds.\ Alternate splicing of the A chain transcript can give rise to two different\ forms that differ only in their C-terminal extremity. The transforming protein\ of Woolly monkey sarcoma virus (WMSV) (Simian sarcoma virus), encoded by the v-sis oncogene, is derived from the B chain of PDGF.

    \ \

    PDGFs are mitogenic during early developmental stages, driving the proliferation of undifferentiated mesenchyme and some progenitor populations. During later maturation stages, PDGF signalling has been implicated in tissue remodelling and cellular differentiation, and in inductive events involved in patterning and morphogenesis. In addition to driving\ mesenchymal proliferation, PDGFs have been shown to direct the migration, differentiation and function of a variety of specialised mesenchymal and migratory cell types, both during development and in the\ adult animal PUBMED:12952899.

    \ \

    PDGF is structurally related to a number of other growth factors which also form disulphide-linked homo- or heterodimers.

    \ \ \

    This domain consists of the N-terminal regions of PDGF A and B.

    \ ' '3695' 'IPR001086' '\

    Prephenate dehydratase (, PDT) catalyses the decarboxylation of prephenate to phenylpyruvate. In microorganisms it is part of the terminal pathway of phenylalanine biosynthesis. In some bacteria such as Escherichia coli PDT is part of a bifunctional enzyme (P-protein) that also catalyses the transformation of chorismate into prephenate (chorismate mutase, , ) while in other bacteria it is a monofunctional enzyme. The sequence of monofunctional PDT aligns well with the C-terminal part of P-proteins PUBMED:9642265.

    \ ' '3696' 'IPR004569' '\

    Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination PUBMED:8690703, PUBMED:7748903, PUBMED:15189147. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors PUBMED:17109392. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy PUBMED:16763894.

    \

    PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic PUBMED:15581583.

    \ \ \

    In Escherichia coli, the pdx genes involved in vitamin B6 have been characterised PUBMED:10225425, PUBMED:15242009, PUBMED:17344055. This entry represents PdxJ, which catalyses the condensation of 1-amino-3-oxo-4-(phosphohydroxy)propan-2-one and 1-deoxy-D-xylulose-5-phosphate to form pyridoxine-5\'-phosphate. The product of the PdxJ reaction is then oxidized by PdxH to pyridoxal 5\'-phosphate.

    \ ' '3697' 'IPR001478' '\

    PDZ domains are found in diverse signalling proteins in bacteria, yeasts,\ plants, insects and vertebrates PUBMED:9041651, PUBMED:9204764. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences PUBMED:9204764. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.

    \ \

    PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.

    \ ' '3698' 'IPR002022' '\

    Pectate lyase is an enzyme involved in the maceration and soft rotting of plant tissue. \ Pectate lyase is responsible for the eliminative cleavage of pectate,\ yielding oligosaccharides with 4-deoxy-alpha-D-mann-4-enuronosyl groups\ at their non-reducing ends. The protein is maximally expressed late in\ pollen development. It has been suggested that the pollen expression of \ pectate lyase genes might relate to a requirement for pectin degradation\ during pollen tube growth PUBMED:1983191.

    \ \

    The structure and the folding kinetics of one member of this family, pectate lyase C\ (pelC)1 from Erwinia chrysanthemi have been investigated in some detail PUBMED:11926834, PUBMED:8502994. PelC contains a parallel beta-helix folding motif. The majority of the regular secondary structure is composed of parallel beta-sheets (about\ 30%). The individual strands of the sheets are connected by unordered loops of varying length. The backbone is then formed by a large helix composed of beta-sheets. There are two disulphide bonds in pelC and 12 proline residues. One of these prolines, Pro220, is involved in a cis peptide bond. The folding mechanism of pelC involves two slow phases that have been attributed to proline isomerization.

    \ \

    Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee\ King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E.,\ Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of\ the first three letters of the genus; a space; the first letter of the\ species name; a space and an arabic number. In the event that two species\ names have identical designations, they are discriminated from one another\ by adding one or more letters (as necessary) to each species designation.

    \

    The allergens in this family include allergens with the following designations: Amb a 1, Amb a 2, Amb a 3, Cha o 1, Cup a 1, Cry j 1, Jun a 1.

    \ \

    Two of the major allergens in the pollen of short ragweed (Ambrosia \ artemisiifolia) are Amb aI and Amb aII. The primary structure of Amb aII\ has been deduced and has been shown to share ~65% sequence identity with\ the Amb alpha I multigene family of allergens PUBMED:1717566. Members of the Amb aI/aII\ family include Nicotiana tabacum (Common tobacco) pectate lyase, which is similar to the deduced amino\ acid sequences of two pollen-specific pectate lyase genes identified in\ Solanum lycopersicum (Tomato) (Lycopersicon esculentum) PUBMED:1421152; Cry jI, a major allergenic glycoprotein of Cryptomeria japonica (Japanese cedar) - the most common pollen allergen in Japan PUBMED:7920021; and P56 and P59, which share sequence similarity with pectate lyases of plant \ pathogenic bacteria PUBMED:1983191.

    \ ' '3699' 'IPR007735' '\ This family consists of the C-terminal region of the pecanex protein homologues. The pecanex protein is a maternal-effect neurogenic gene found in Drosophila PUBMED:1460533.\ ' '3700' 'IPR004898' '\

    Pectate lyase is responsible for the maceration and soft-rotting of plant tissue. It catalyses the eliminative cleavage of pectate to produce oligosaccharides with 4-deoxy-alpha-D-gluc-4-enuronosyl groups at their non-reducing ends. Pectate lyase is an extracellular enzyme and is induced by pectin. It is subject to self-catabolite repression, and has been implicated in plant disease.

    \ \ The structure and the folding kinetics of one member of this family, pectate lyase C\ (pelC)1 from Erwinia chrysanthemi have been investigated in some detail PUBMED:11926834. PelC contains a parallel beta-helix folding motif. The majority of the regular secondary structure is composed of parallel beta-sheets (about\ 30%). The individual strands of the sheets are connected by unordered loops of varying length. The backbone is then formed by a large helix composed of beta-sheets. There are two disulphide bonds in pelC and 12 proline residues. One of these prolines, Pro220, is involved in a cis peptide bond. The folding mechanism of pelC involves two slow phases that have been attributed to proline isomerization.\ ' '3701' 'IPR007318' '\ The Saccharomyces cerevisiae (Baker\'s yeast) phospholipid methyltransferase () has a broad substrate specificity of unsaturated phospholipids PUBMED:2445736.\ ' '3702' 'IPR005650' '\

    Proteins in this entry are transcriptional regulators found in a variety of bacteria and a small number of archaea. Many are BlaI/MecI proteins which regulate resistance to penicillins (beta-lactams), though at least one protein () appears to be involved in the regulation of copper homeostasis PUBMED:7876197. BlaI regulators repress the expression of penicillin-degrading enzymes (penicillinases) until the cell encounters the antibiotic, at which point repression ceases and penicillinase expression occurs, allowing cell growth PUBMED:14568532. MecI regulators repress the expression of MecA, a cell-wall biosynthetic enzyme not inhibited by penicillins at clinically achievable concentrations, until the presence of the antibiotic is detected PUBMED:12881514. At this point repression ends and MecA expression occurs which, together with the switching off of the penicillin-sensitive enzymes, allows the cell to grow despite the presence of antibiotic.

    \ ' '3703' 'IPR001646' '\ These repeats were first identified in many cyanobacterial proteins but they are also found in bacterial as well as in plant proteins PUBMED:9654141. The repeats were first identified in hglK PUBMED:7592418. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix PUBMED:9655353. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid.\ ' '3704' 'IPR001759' '\

    Pentaxins (or pentraxins) PUBMED:6356809, PUBMED:7772283 are a family of proteins which show, under electron microscopy, a discoid arrangement of five noncovalently bound subunits. Proteins of the pentaxin family are involved in acute immunological responses PUBMED:7772283. Three of the principal members of the pentaxin family are serum proteins: namely, C-reactive protein (CRP) PUBMED:9614930, serum amyloid P component protein (SAP) PUBMED:9514915, and female protein (FP) PUBMED:9583999.

    \

    CRP is expressed during acute phase response to tissue injury or inflammation in mammals. The protein resembles antibody and performs several functions associated with host defence: it promotes agglutination, bacterial capsular swelling and phagocytosis, and activates the classical complement pathway through its calcium-dependent binding to phosphocholine. CRPs have also been sequenced in an invertebrate, Limulus polyphemus (Atlantic horseshoe crab), where they are a normal constituent of the hemolymph.

    \

    SAP is a vertebrate protein that is a precursor of amyloid component P. It is found in all types of amyloid deposits, in glomerular basement membrane and in elastic fibres in blood vessels. SAP binds to various lipoprotein ligands in a calcium-dependent manner, and it has been suggested that, in mammals, this may have important implications in atherosclerosis and amyloidosis.

    \

    FP is a SAP homologue found in Mesocricetus auratus (Golden hamster). The concentration of this plasma protein is altered by sex steroids and stimuli that elicit an acute phase response.

    \

    Pentaxin proteins expressed in the nervous system are neural pentaxin I (NPI) and II (NPII) PUBMED:8884281. NPI and NPII are homologous and can exist within one species. It is suggested that both proteins mediate the uptake of synaptic macromolecules and play a role in synaptic plasticity. Apexin, a sperm acrosomal protein, is a homologue of NPII found in Cavia porcellus (Guinea pig) PUBMED:7798266.

    \

    PTX3 (or TSG-14) protein is a cytokine-induced protein that is homologous to CRPs and SAPs, but its function is not yet known.

    \ ' '3705' 'IPR000121' '\ A number of enzymes that catalyze the transfer of a phosphoryl group from\ phosphoenolpyruvate (PEP) via a phospho-histidine intermediate have been shown\ to be structurally related PUBMED:7686067, PUBMED:8973315, PUBMED:2176881, PUBMED:1557039. All these enzymes share the same catalytic mechanism: they bind PEP and\ transfer the phosphoryl group from it to a histidine residue. The sequence\ around that residue is highly conserved. This domain is often found associated with the pyruvate phosphate dikinase, PEP/pyruvate-binding domain () at its N-terminus and the PEP-utilizing enzyme mobile domain.\ ' '3706' 'IPR000181' '\

    Peptide deformylase (PDF) is an essential metalloenzyme required for the \ removal of the formyl group at the N-terminus of nascent polypeptide chains\ in eubacteria PUBMED:9846875 . The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction::\ \ Catalytic efficiency strongly depends on the identity of the bound metal PUBMED:9565550.

    \

    The structure\ of these enzymes is known PUBMED:8845003, PUBMED:9665852. PDF, a member of the zinc metalloproteases family, comprises an active core\ domain of 147 residues and a C-terminal tail of 21 residues.\ The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR.\ Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal \ helix contains the characteristic HEXXH motif of metalloenzymes, which is\ crucial for activity. The helical arrangement, and the way the histidine\ residues bind the zinc ion, is reminiscent of other metalloproteases, such\ as thermolysin or metzincins. However, the arrangement of secondary and\ tertiary structures of PDF, and the positioning of its third zinc ligand (a\ cysteine residue), are quite different. These discrepancies, together with \ notable biochemical differences, suggest that PDF constitutes a new class of\ zinc-metalloproteases. \ PUBMED:8845003.

    \ ' '3707' 'IPR002870' '\

    This signature covers the region of the propeptide for members of the MEROPS peptidase family M12B (clan MA(M), adamalysin family). The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins, which mediate cell-cell or cell-matrix interactions.

    \ ' '3708' 'IPR005075' '\

    This signature, PepSY, is found in the propeptide of members of the MEROPS peptidase family M4 (clan MA(E)), which contains the thermostable thermolysins (), and related thermolabile neutral proteases (bacillolysins) () from various species of Bacillus. It is also in many non-peptidase proteins, including Bacillus subtilis YpeB protein - a regulator of SleB spore cortex lytic enzyme - and a large number of eubacterial and archaeal cell wall-associated and secreted proteins which are mostly annotated as \'hypothetical protein\'.

    \ \

    Many extracellular bacterial proteases are produced as proenzymes. The propeptides usually have a dual function, i.e. they function as an intramolecular chaperone required for the folding of the polypeptide and as an inhibitor preventing premature activation of the enzyme. Analysis of the propeptide region of the M4 family of peptidases reveals two regions of conservation, the PepSY domain and a second domain, proximate to the N terminus, the FTP domain (), which is also found in isolation in the propeptide of eukaryotic peptidases belonging to MEROPS peptidase family M36.

    \ \

    In propeptide domain-swapping experiments, replacing the propeptide of PA protease with that of vibrolysin (both propeptides contain the FTP and PepSY domains) allows the PA protease domain to fold correctly and inhibits the C-terminal autoprocessing activity. However, replacing the propeptide of PA protease with the thermolysin propeptide facilitates neither the correct folding nor the processing of the chimaeric protein into an active peptidase PUBMED:12589825. Mutational analysis of the Pseudomonas aeruginosa elastase gene revealed two mutations in the propeptide which resulted in the loss of inhibitory activity but not chaperone activity: A-15V and T-153I (where +1 is defined as the first residue of the mature peptidase). Both mutations resulted in peptidase activity, the T-153I mutation being much less effective than the A-15V mutation in activating the peptidase PUBMED:11021931. The T-153I mutation lies N-terminal to the FTP domain while the A-15V mutation is C-terminal to the PepSY domain.

    \ \

    Given the diverse range of other proteins in which both domains occur in isolation, the exact function of each is still unclear, though it has been proposed that the PepSY domain primarily has inhibitory activity and acts in conjunction with the FTP domain in chaperone activity.

    \ ' '3709' 'IPR001449' '\

    Phosphoenolpyruvate carboxylase (PEPCase), an enzyme found in all multicellular plants, catalyses the formation of oxaloacetate from phosphoenolpyruvate (PEP) and a hydrocarbonate ion PUBMED:1450389. This reaction is harnessed\ by C4 plants to capture and concentrate carbon dioxide into the photosynthetic bundle sheath cells. It also plays a key role in the nitrogen\ fixation pathway in legume root nodules: here it functions in concert with\ glutamine, glutamate and asparagine synthetases and aspartate amido transferase, to synthesise aspartate and asparagine, the major nitrogen transport compounds in various amine-transporting plant species PUBMED:1421147.

    \

    PEPCase also plays an anaplerotic role in bacteria and plant cells, supplying oxaloacetate to the TCA cycle, which requires continuous input of C4 molecules in order to replenish the intermediates removed for amino acid biosynthesis PUBMED:2779518. The C-terminus of the enzyme contains the active site that includes a conserved lysine residue, involved in substrate binding, and other conserved residues important for the catalytic mechanism PUBMED:1508152.

    \ ' '3710' 'IPR008209' '\

    Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria PUBMED:16330239. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle.

    \

    PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding PUBMED:16126724. With type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels PUBMED:17403375.

    \

    PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology that is partly similar to one another PUBMED:15023367, PUBMED:8609605. Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site PUBMED:15890557. PCK uses an alpha/beta/alpha motif for nucleotide binding, this motif differing from other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium.

    \ \

    This entry represents GTP-utilising phosphoenolpyruvate carboxykinase enzymes.

    \ ' '3711' 'IPR001272' '\

    Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria PUBMED:16330239. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle.

    \

    PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding PUBMED:16126724. With type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels PUBMED:17403375.

    \

    PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology that is partly similar to one another PUBMED:15023367, PUBMED:8609605. Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site PUBMED:15890557. PCK uses an alpha/beta/alpha motif for nucleotide binding, this motif differing from other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium.

    \ \

    This entry represents ATP-utilising phosphoenolpyruvate carboxykinase enzymes.

    \ ' '3712' 'IPR001328' '\

    Peptidyl-tRNA hydrolase (PTH) is a bacterial enzyme that cleaves peptidyl-tRNA or N-acyl-aminoacyl-tRNA to yield free peptides or N-acyl-amino acids and tRNA. The natural substrates for this enzyme may be peptidyl-tRNAs that drop off the ribosome during protein synthesis PUBMED:1833189, PUBMED:8635758. Bacterial PTH has been found to be evolutionarily related to a yeast protein PUBMED:8563640.

    \ ' '3713' 'IPR005312' '\

    This is a small family of proteins of unknown function.

    \ ' '3714' 'IPR005313' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartic peptidases are endopeptidases. These enzymes have been assigned to clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of aspartic peptidases belongs to the MEROPS family A21 (clan AB). The protein fold of the peptidase active site domain for members of this family is that of the nodavirus endopeptidase, the type example for clan AB. The type example for the family is the tetravirus endopeptidase from Nudaurelia capensis omega virus. Members of this family are found as a capsid protein in some of the tetraviridae.

    \ ' '3715' 'IPR000588' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartic peptidases are endopeptidases. These enzymes have been assigned to clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of sequences contains an aspartic peptidase signature that belongs to MEROPS peptidase family A3, subfamily A3A (cauliflower mosaic virus-type endopeptidase, clan AA). Cauliflower mosaic virus belongs to the Retro-transcribing viruses, which have a double-stranded DNA genome. The genome includes an open reading frame (ORF V) that shows similarities to the pol gene of retroviruses. This ORF codes for a polyprotein that includes a reverse transcriptase, which, on the basis of a DTG triplet near the N-terminus, was suggested to include an aspartic protease. The presence of an aspartic protease has been confirmed by mutational studies, implicating Asp-45 in catalysis. The protease releases itself from the polyprotein and is involved in reactions required to process the ORF IV polyprotein, which includes the viral coat protein PUBMED:7674916. The viral aspartic peptidase signature has also been found associated with a polyprotein encoded by integrated pararetrovirus-like sequences in the genome of Nicotiana tabacum (Common tobacco) PUBMED:10557305.

    \ ' '3716' 'IPR000250' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The peptidases in family G1 form a subset of what were formerly termed \'pepstatin-insensitive carboxyl proteinases\'. After its discovery in about 1970, the pentapeptide pepstatin soon came to be thought of as a very general inhibitor of the endopeptidases that are active at acidic pH. More recently, however, several acid-acting endopeptidases from bacteria and fungi have been found to be resistant to pepstatin. The unusual active sites of the \'pepstatin-insensitive carboxyl peptidases\' proved difficult to characterise, but it has now been established that the enzymes from bacteria are acid-acting serine peptidases in family S53 (clan SB), whereas the fungal enzymes are in family G1 (formerly A4). The importance of glutamate (\'E\') and glutamine (\'Q\') residues in the active sites of the family G1 enzymes led to the family name, Eqolisin PUBMED:14993599.

    \ \

    This group of glutamate/glutamine peptidases belongs to MEROPS peptidase family G1 (eqolisin family, clan GA). An example of this group is scytalidoglutamic peptidase. The proteins are thermostable, pepstatin-insensitive and active at low pH PUBMED:7674922. The enzyme has a unique heterodimeric structure, with a 39-residue light chain and a 173-residue heavy chain bound to each other non-covalently PUBMED:1918060. The tertiary structure of the active site of scytalidoglutamic peptidase (MEROPS G01.001) with a bound tripeptide product has been interpreted as showing that Glu136 is the primary catalytic residue. The most likely mechanism is suggested to be nucleophilic attack by a water molecule activated by the Glu136 side chain on the si-face of the scissile peptide bond carbon atom to form the tetrahedral intermediate. Electrophilic assistance and oxyanion stabilisation are provided by the side-chain amide of Gln53.

    \ \ \

    Both scytalidoglutamic peptidase (MEROPS G01.001) and aspergilloglutamic peptidase (MEROPS G01.002) cleave the Tyr26-Thr27 bond in the B chain of oxidized insulin, a bond not cleaved by other acid-acting endopeptidases. Scytalidoglutamic peptidase is most active on casein at pH 2 and is inhibited by 1,2-epoxy-3-(p-nitrophenoxy)propane (EPNP), a compound that also inhibits pepsin.

    \ ' '3717' 'IPR000696' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartic peptidases are endopeptidases. These enzymes have been assigned to clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    The proteins in this group, which include the Nodavirus coat precursor endopeptidases, are aspartic peptidases that belong to the MEROPS peptidase family A6 (clan AB).

    \ \

    Nodaviruses are small, icosahedral viruses, pathogenic to insects and mammals. A virus particle consists of a single virion, within which are packaged two RNA strands, RNA1 and RNA2. Nodavirus coat precursor endopeptidase (also known as protein alpha) is the only protein encoded by RNA2. During the process of virion assembly, this precursor is cleaved into coat proteins beta and gamma. RNA1 encodes two proteins, at least one of which is involved in RNA replication. The relatively uncomplicated nature of their structural protein and RNA constituents makes the nodaviruses a good virus model PUBMED:2116525.

    \ \

    The 3D structure of the capsid protein has been determined by X-ray crystallography to 2.8 Å resolution PUBMED:2116525. The structure contains a beta-barrel domain, with a prominent protrusion composed largely of beta-sheet. This protrusion, together with similar protrusions from neighbouring subunits, forms a prominent trigonal pyramid with quasi-3-fold symmetry PUBMED:2116525. Two alpha-helices extend toward the interior of the particle PUBMED:2116525.

    \ ' '3718' 'IPR001872' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartic peptidases are endopeptidases. These enzymes have been assigned to clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of aspartic peptidases belongs to the MEROPS peptidase family A8 (signal peptidase II family, clan AC). The catalytic residues have not been identified, but three conserved aspartates can be identified from sequence alignments. The type example is the Escherichia coli lipoprotein signal peptidase, or SPase II. This enzyme recognises a conserved sequence and cuts in front of a cysteine residue to which a glyceride-fatty acid lipid is attached. SPase II is an integral membrane protein.

    \ \ \

    Bacterial cell walls contain large amounts of murein lipoprotein, a small protein that is both N-terminally bound to lipid and attached to membrane peptidoglycan (murein) through the epsilon-amino group of its C-terminal lysine residue PUBMED:7674916. Secretion of this lipoprotein is facilitated by the action of the lipoprotein signal peptidases in this entry, located in the inner membrane PUBMED:7674916, PUBMED:6368552. The enzymes are inhibited by globomycin and also by pepstatin, suggesting that they are aspartic peptidases PUBMED:7674916.

    \ ' '3719' 'IPR000200' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belong to MEROPS peptidase family C10 (streptopain family, clan CA). Streptopain is a cysteine protease found in Streptococcus pyogenes that shows some structural and functional similarity to papain (family C1) PUBMED:7845226, PUBMED:1270417. The order of the catalytic cysteine/histidine dyad is the same and the surrounding sequences are similar. The two proteins also show similar specificities, both preferring a hydrophobic residue at the P2 site PUBMED:7845226, PUBMED:4683008.

    \ \

    Streptopain shows a high degree of sequence similarity to the S. pyogenes exotoxin B, and strong similarity to the prtT gene product of\ Porphyromonas gingivalis (Bacteroides gingivalis), both of which have been included in the family PUBMED:7845226.

    \ ' '3720' 'IPR005077' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belong to the MEROPS peptidase family C11 (clostripain family, clan CD).

    \ ' '3721' 'IPR001578' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to the MEROPS peptidase family C12 (ubiquitin C-terminal hydrolase family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. The type example is the human ubiquitin C-terminal hydrolase UCH-L1.

    \ \

    Ubiquitin is highly conserved, commonly found conjugated to proteins in\ eukaryotic cells, where it may act as a marker for rapid degradation, or\ it may have a chaperone function in protein assembly PUBMED:7845226. The ubiquitin is released by cleavage from the bound protein by a protease PUBMED:7845226. A number of\ deubiquitinising proteases are known: all are activated by thiol compounds\ PUBMED:7845226, PUBMED:3015923, and inhibited by thiol-blocking agents and ubiquitin aldehyde PUBMED:7845226, PUBMED:3031653, and as such have the properties of cysteine proteases PUBMED:7845226.

    \ \

    The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa and 100-200 kDa) PUBMED:7845226: this family comprises the 20-30 kDa peptides, which include the yeast yuh1. Yeast yuh1 protease is known to be active only against small ubiquitin conjugates, being inactive against conjugated beta-galactosidase PUBMED:7845226. A mammalian homologue, UCH (ubiquitin conjugate hydrolase), is one of the most abundant proteins in the brain PUBMED:7845226. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad PUBMED:7845226.

    \ ' '3722' 'IPR000816' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to MEROPS peptidase family C15 (pyroglutamyl peptidase I, clan CF). The type example is pyroglutamyl peptidase I of Bacillus amyloliquefaciens.

    \ \ \

    Pyroglutamyl/pyrrolidone carboxyl peptidase (Pcp or PYRase) is an exopeptidase that\ hydrolytically removes the pGlu from pGlu-peptides or pGlu-proteins PUBMED:7824521, PUBMED:1353026.\ PYRase has been found in prokaryotes and eukaryotes where at least two different classes have been characterised: the first\ containing bacterial and animal type I PYRases, and the second containing\ animal type II and serum PYRases. Type I and bacterial PYRases are soluble\ enzymes, while type II PYRases are membrane-bound. The primary application\ of PYRase has been its utilisation for protein or peptide sequencing, and\ bacterial diagnosis PUBMED:1353026. The conserved residues Cys-144 and His-168 have\ been identified by inhibition and mutagenesis studies PUBMED:7824521, PUBMED:7909543.

    \ ' '3723' 'IPR002705' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This entry contains coronavirus cysteine endopeptidases that belong to MEROPS peptidase families C30 (clan PA) and C16 (subfamilies C16A and C16B, clan CA). These peptidases are involved in viral polyprotein processing. All coronaviruses encode one or two accessory cysteine proteinases that recognise and process one or two sites in the amino-terminal half of the replicase polyprotein during assembly of the viral replication complex. MHV, HCoV and TGEV encode two accessory proteinases, called coronavirus papain-like proteinase 1 and 2 (PL1-PRO and PL2-PRO). IBV and SARS encode only one, called PL-PRO PUBMED:10725411. Coronavirus papain-like proteinases 1 and 2 have restricted specificities, cleaving respectively two and one bond(s) in the polyprotein. This restricted activity may be due to extended specificity sites: Arg or Lys at the cleavage site position P5 are required for PL1-PRO PUBMED:8396668, and Phe at the cleavage site position P6 is required for PL2-PRO PUBMED:12805436. PL1-PRO releases p28 and p65 from the N-terminus of the polyprotein; PL2-PRO cleaves between p210 and p150.

    \ ' '3724' 'IPR001300' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belong to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium PUBMED:2539381. The protein is a complex of 2\ polypeptide chains (light and heavy), with three known forms in mammals\ PUBMED:7845226, PUBMED:2555341: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only PUBMED:2555341.

    \ \

    All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases PUBMED:7845226, PUBMED:2539381. The crystallographic structure of m-calpain reveals six "domains" in the 80-kDa subunit:

    \ \
    1. A 19-amino acid NH2-terminal sequence;
    2. Active site domain IIa;
    3. Active site domain IIb. Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related PUBMED:7845226;
    4. Domain III;
    5. An 18-amino acid extended sequence linking domain III to domain IV;
    6. Domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity PUBMED:7845226. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft, allowing for the proper formation of the catalytic triad PUBMED:11914728.
    \ \ \

    Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma PUBMED:12843408.

    \ ' '3725' 'IPR000045' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of aspartic endopeptidases belong to MEROPS peptidase family A24 (type IV prepilin peptidase family, clan AD), subfamily A24A.

    \ \

    Bacteria produce a number of protein precursors that undergo post-translational methylation and proteolysis prior to secretion as active proteins. Type IV prepilin leader peptidases are enzymes that mediate this type of post-translational modification. Type IV pilin is a protein found on the surface of Pseudomonas aeruginosa, Neisseria gonorrhoeae and other Gram-negative pathogens. Pilin subunits attach the infecting organism to the surface of host epithelial cells. They are synthesised as prepilin subunits, which differ from mature pilin by virtue of containing a 6-8 residue leader peptide consisting of charged amino acids. Mature type IV pilins also contain a methylated N-terminal phenylalanine residue.

    \ \

    The bifunctional enzyme prepilin peptidase (PilD) from Pseudomonas aeruginosa is a key determinant in both type-IV pilus biogenesis and extracellular protein secretion, in its roles as a leader peptidase and methyl transferase (MTase). It is responsible for endopeptidic cleavage of the unique leader peptides that characterise type-IV pilin precursors, as well as proteins with homologous leader sequences that are essential components of the general secretion pathway found in a variety of Gram-negative pathogens. Following removal of the leader peptides, the same enzyme is responsible for the second posttranslational modification that characterises the type-IV pilins and their homologues, namely N-methylation of the newly exposed N-terminal amino acid residue PUBMED:9224881.

    \ ' '3726' 'IPR000317' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \ The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase that belongs to MEROPS peptidase family C24 (clan PA(C)). \ \

    Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein PUBMED:8892921, while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs.

    \ \

    Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.

    \ ' '3727' 'IPR001769' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belong to MEROPS peptidase family C25 (gingipain, clan CD). The protein fold of the peptidase domain for members of this entry resembles that of caspase 1, the type example for clan CD.

    \

    This is a protein family found only in bacteria. Porphyromonas gingivalis (Bacteroides gingivalis) is a Gram-negative anaerobic bacterial species strongly associated with adult periodontitis. One of its distinguishing characteristics and putative virulence properties is the ability to agglutinate erythrocytes PUBMED:8926061. It is a highly proteolytic organism which metabolises small peptides and amino acids. Indirect evidence suggests that the proteases produced by this microorganism constitute an important virulence factor PUBMED:1322368. Protease-encoding genes have been shown to contain multiple copies of repeated nucleotide sequences. These conserved sequences have also been found in haemagglutinin genes PUBMED:9632563.

    \ ' '3728' 'IPR005536' '\

    This domain is found in almost all members of MEROPS peptidase family C25 (clan CD). Peptidase family C25 is a protein family found in the bacterium Porphyromonas gingivalis (Bacteroides gingivalis), a Gram-negative anaerobic bacterial species strongly associated with adult periodontitis. One of its distinguishing characteristics and putative virulence properties is the ability to agglutinate erythrocytes PUBMED:8926061. It is a highly proteolytic organism which metabolises small peptides and amino acids. Indirect evidence suggests that the proteases produced by this microorganism constitute an important virulence factor PUBMED:1322368. Protease-encoding genes have been shown to contain multiple copies of repeated nucleotide sequences. These conserved sequences have also been found in haemagglutinin genes PUBMED:9632563.

    \ ' '3729' 'IPR005074' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of sequences defined by this cysteine peptidase domain belong to the MEROPS peptidase family C39 (clan CA). It is found in a wide range of ABC transporters, which are maturation proteases for peptide bacteriocins, the proteolytic domain residing in the N-terminal region of the protein PUBMED:7674922. A number of the proteins are classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.

    \ \ \

    Lantibiotic and non-lantibiotic bacteriocins are synthesised as precursor peptides containing N-terminal extensions (leader peptides) which are cleaved off during maturation. Most non-lantibiotics and also some lantibiotics have leader peptides of the so-called double-glycine type. These leader peptides share consensus sequences and also a common processing site with two conserved glycine residues in positions -1 and -2. The double-glycine-type leader peptides are unrelated to the N-terminal signal sequences which direct proteins across the cytoplasmic membrane via the sec pathway. Their processing sites are also different from typical signal peptidase cleavage sites, suggesting that a different processing enzyme is involved.

    \ \ \ ' '3730' 'IPR000855' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belong to the peptidase family C5 (adenain family, clan CE). Several adenovirus proteins are synthesised as precursors, requiring processing by a protease before the virion is assembled PUBMED:7845226, PUBMED:3052288. Until recently, the adenovirus endopeptidase was classified as a serine protease, having been reported to be inhibited by serine protease inhibitors PUBMED:7845226, PUBMED:462815. However, it has since been shown to be inhibited by cysteine protease inhibitors, and the catalytic residues are believed to be His-54 and Cys-104 PUBMED:7845226, PUBMED:3052288.

    \ ' '3731' 'IPR005314' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belong to MEROPS peptidase family C50 (separase family, clan CD). The active site residues for members of this family and family C14 occur in the same order in the sequence: H,C.

    \ \

    The separases are caspase-like proteases, which play a central role in chromosome segregation. In yeast they cleave the rad21 subunit of the cohesin complex at the onset of anaphase. During most of the cell cycle, separase is inactivated by the securin/cut2 protein, which probably covers its active site.

    \ ' '3732' 'IPR005083' '\

    The infection of mammalian host cells by Yersinia sp. causes a rapid induction of the mitogen-activated protein kinase (MAPK; including the ERK, JNK and p38 pathways) and nuclear factor kappaB (NF-kappaB) signalling pathways that would typically result in cytokine production and initiation of the innate immune response. However, these pathways are rapidly inhibited, promoting apoptosis.

    \ \

    This entry contains YopJ and related proteins. It has been shown that YopJ is a serine/threonine acetyltransferase PUBMED:17412595. It acetylates the serine and threonine residues in the phosphorylation sites of MAPK kinases and nuclear factor kappaB, preventing their activation by phosphorylation and thereby inhibiting these signalling pathways PUBMED:17116858.

    \ \ \

    Serine and threonine acetylation is yet another complication to the control of signalling pathways and may be a widespread mode of biochemical regulation of endogenous processes in eukaryotic cells. Although at present there are no predicted eukaryotic orthologues of YopJ based on primary sequence analysis, it could well be that they have not yet been identified.

    \ ' '3733' 'IPR004970' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This is a group of cysteine peptidases which constitute MEROPS peptidase family C57 (clan CE). The type example is vaccinia virus I7 processing peptidase (vaccinia virus); protein I7 is expressed in the late phase of infection PUBMED:2835495.

    \ ' '3734' 'IPR001456' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This domain represents the potyvirus helper component protease found in genome polyproteins of potyviruses. It is a cysteine peptidase belonging to the MEROPS peptidase family C6 (clan CA).

    \ \

    The genome polyprotein contains: N-terminal peptidase belonging to MEROPS peptidase family S30 (protein P1), helper component protease belonging to MEROPS peptidase family C6 (HC-PRO), protein P3, 6KD protein (6K1), cytoplasmic inclusion protein (CI), 6KD protein 2 (6K2), genome-linked protein (VPG), nuclear inclusion protein A, nuclear inclusion protein B and coat protein (CP).

    \ \

    The helper component-proteinase is required for aphid transmission.

    \ ' '3735' 'IPR002704' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \ This group of cysteine peptidases belong to MEROPS peptidase family C7 (clan CA). These are found in fungi and viruses (Hypoviridae). They are involved in transmissible hypovirulence and may indicate the possible origins of hypovirulence-associated dsRNAs PUBMED:2009854.\ ' '3736' 'IPR005315' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belong to MEROPS peptidase family C8 (clan CA). The peptidases are encoded by the double-stranded viral RNAs belonging to the genus Hypovirus.

    \ ' '3737' 'IPR002620' '\

    The alphaviruses produce two mRNAs after infection: the genomic (49S) RNA, which is translated into the nonstructural (replicase) proteins, and the subgenomic (26S) RNA, which serves as the mRNA for the virion structural proteins. The long polyprotein comprises individual nonstructural proteins that are formed by proteolytic processing steps to give nsP1, nsP2, nsP3 and nsP4 PUBMED:3488539. This signature identifies non-structural protein 2 (nsP2) which has two reported activities:\

    \ ' '3738' 'IPR001818' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belong to the MEROPS peptidase families:

    \

    \ \

    The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA.

    Sequences having this domain are extracellular metalloproteases, such as collagenase and stromelysin, which degrade the extracellular matrix and are known as matrixins. They are zinc-dependent, calcium-activated proteases synthesised as inactive precursors (zymogens), which are proteolytically cleaved to yield the active enzyme PUBMED:2551898, PUBMED:2167841. All matrixins and related proteins possess 2 domains: an N-terminal domain, and a zinc-binding active site domain. The N-terminal domain peptide, cleaved during the activation step, includes a conserved PRCGVPDV octapeptide, known as the cysteine switch, whose Cys residue chelates the active site zinc atom, rendering the enzyme inactive. The active enzyme degrades components of the extracellular matrix, playing a role in the initial steps of tissue remodelling during morphogenesis, wound healing, angiogenesis and tumour invasion PUBMED:2551898, PUBMED:2167841.

    \ \ ' '3740' 'IPR011765' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The majority of the sequences in this entry are metallopeptidases and non-peptidase homologues belonging to MEROPS peptidase family M16 (clan ME), subfamilies M16A, M16B and M16C; they include:

    \ \ \

    These proteins do not share many regions of sequence similarity; the most noticeable is in the N-terminal section. This region includes a conserved histidine followed, two residues later, by a glutamate and another histidine. In pitrilysin, it has been shown PUBMED:7990931 that this H-x-x-E-H motif is involved in enzymatic activity; the two histidines bind zinc and the glutamate is necessary for catalytic activity. The proteins classified as non-peptidase homologues either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.

    \ ' '3741' 'IPR000819' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belong to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine).

    \ \

    Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear PUBMED:1555602, PUBMED:2395881. Leucine aminopeptidases cleave leucine residues from the N-terminus of polypeptide chains, but substantial rates are evident for all amino acids PUBMED:2395881.

    \ \

    The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another PUBMED:2395881. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape PUBMED:2395881. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices PUBMED:2395881. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core PUBMED:2395881. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer PUBMED:2395881. The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain PUBMED:2395881.

    \ ' '3742' 'IPR001548' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belong to the MEROPS peptidase family M2 (clan MA(E)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA. The catalytic residues and zinc ligands have been identified, the zinc ion being ligated to two His residues within the motif HEXXH, showing that the enzyme belongs to the E sub-group of metalloproteases PUBMED:7674922.

    \ \ \ \ Peptidyl-dipeptidase A (angiotensin-converting enzyme) is a mammalian enzyme responsible for cleavage of dipeptides from the C-termini of proteins, notably converting angiotensin I to angiotensin II PUBMED:7674922. The enzyme exists in two differentially transcribed forms, the most common of which is from lung endothelium; this contains two homologous domains that have arisen by gene duplication PUBMED:7674922. The testis-specific form contains only the C-terminal domain, arising from a duplicated promoter region present in intron 12 of the gene PUBMED:7674922. Both enzymatic forms are membrane proteins that are anchored by means of a C-terminal transmembrane domain. Both domains of the endothelial enzyme are active, but have differing kinetic constants PUBMED:7674922, PUBMED:1851160. A number of insect enzymes have been shown to be similar to peptidyl-dipeptidase A, these containing a single catalytic domain.

    \ ' '3743' 'IPR000994' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This entry contains proteins that belong to MEROPS peptidase family M24 (clan MG), which share a common structural fold, the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase, aminopeptidase P, prolidase, agropine synthase and creatinase. Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme PUBMED:8146141, PUBMED:12136144, PUBMED:8471602.

    \

    The entry also contains proteins that have lost catalytic activity, for example Spt16, which is a component of the FACT complex. The crystal structure of the N-terminal domain of Spt16, determined to a resolution of 2.1 A, reveals an aminopeptidase P fold whose enzymatic activity has been lost. This fold binds directly to histones H3-H4 through an interaction with their globular core domains, as well as with their N-terminal tails PUBMED:18579787.

    \

    The FACT complex is a stable heterodimer in Saccharomyces cerevisiae (Baker\'s yeast) comprising Spt16p and Pob3p. The complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin PUBMED:15987999; it also facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilising and then reassembling nucleosome structure PUBMED:12524332, PUBMED:12934006, PUBMED:18579787.

    \ ' '3744' 'IPR000395' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M27 (clan MA(E)). A number of the proteins have been classified as non-peptidase homologues as they have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases in the family.

    \ \

    There are seven antigenically distinct forms of botulinum neurotoxin, \ designated A, B, C1, D, E, F and G. The seven neurotoxins are potent\ protein toxins that inhibit neurotransmitter release from peripheral\ cholinergic synapses PUBMED:2160960. On binding to the neuronal synapses, the\ molecules are internalised and move by retrograde transport up the axon\ into the spinal cord, where they can move between post- and presynaptic\ neurons. The toxin inhibits neurotransmitter release by acting as a zinc\ endopeptidase that cleaves synaptic proteins\ such as synaptobrevins, syntaxin and SNAP-25 PUBMED:8897436.\ The protein toxins exist as disulphide-linked heterodimers of light and \ heavy chains. The light chain has the pharmacological activity, while the\ N- and C-termini of the heavy chain mediate channel formation and toxin\ binding PUBMED:2160960. The light chain exhibits a high level of sequence similarity\ to tetanus toxin (TeTx). Alignment of all characterised neurotoxin sequences\ reveals the presence of highly conserved amino acid domains interspersed\ with amino acid tracts with little overall similarity. The most divergent\ region corresponds to the C-terminal extremity of each toxin, which may\ reflect differences in specificity of binding to neurone acceptor sites PUBMED:1541280.

    \ ' '3745' 'IPR000787' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M29 (aminopeptidase T family, clan M-). The protein fold of the peptidase domain and the active site residues are not known for any members of the thermophilic metallo-aminopeptidase family.

    \ \ ' '3746' 'IPR001567' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M3 (clan MA(E)), subfamilies M3A and M3B. The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA.

    \ \ \

    The Thimet oligopeptidase family is a large family of archaeal, bacterial and eukaryotic oligopeptidases that cleave medium-sized peptides. The group contains:

    \ \ \ ' '3747' 'IPR001333' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M32 (carboxypeptidase Taq family, clan MA(E)). The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH.

    \ \

    Carboxypeptidase Taq is a zinc-containing thermostable metallopeptidase. It was originally discovered in and purified from Thermus aquaticus; optimal enzymatic activity occurs at 80 degrees Celsius. Although very little is known about this enzyme, it is thought either to be associated\ with a membrane or to be particle bound PUBMED:.

    \ ' '3748' 'IPR001384' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M35 (deuterolysin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA.

    \ \

    Deuterolysin is a microbial zinc-containing metalloprotease that shows\ some similarity to thermolysin PUBMED:1886621. The protein is expressed with a\ possible 19-residue signal sequence, a 155-residue propeptide, and an\ active peptide of 177 residues PUBMED:8049277. The latter contains an HEXXH motif\ towards the C-terminus, but the other zinc ligands are as yet undetermined\ PUBMED:1886621, PUBMED:8049277.

    \ ' '3749' 'IPR001842' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M36 (fungalysin family, clan MA(E)). The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH.

    \ \

    Fungalysin is produced by fungi, Aspergillus and other\ species, to aid degradation of host lung cell walls on infection. The\ enzyme is a 42 kDa single-chain protein, with a pH optimum of 7.5-8.0 and\ an optimal temperature of 60 degrees Celsius PUBMED:, PUBMED:14766908.

    \ ' '3750' 'IPR013856' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases constitutes the MEROPS peptidase family M4 (thermolysin family, clan MA(E)). The protein fold of the peptidase domain of thermolysin is the type example for members of the clan MA. The thermolysin family is composed only of secreted eubacterial endopeptidases. The zinc-binding residues\ are H-142, H-146 and E-166, with E-143 acting as the catalytic residue.\ Thermolysin also contains 4 calcium-binding sites, which contribute to its\ unusual thermostability. The family also includes enzymes from a number\ of pathogens, including Legionella and Listeria, and the protein pseudolysin,\ all with a substrate specificity for an aromatic residue in the P1\' position. Three-dimensional structure analysis has shown that the enzymes undergo\ a hinge-bend motion during catalysis. Pseudolysin has a broader\ specificity, acting on large molecules such as elastin and collagen,\ possibly due to its wider active site cleft PUBMED:7674922.

    \ ' '3751' 'IPR005072' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M44 (clan ME). The active site residues for members of this family and family M16 occur in the motif HXXEH. The type example is the vaccinia virus-type metalloendopeptidase G1 from vaccinia virus; it is a metalloendopeptidase expressed by many Poxviridae that appears to play a role in the maturation of viral proteins.

    \ ' '3752' 'IPR000755' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M15 (clan MD), subfamily M15D (VanX D-Ala-D-Ala dipeptidase).

    \ \

    The D-alanyl-D-alanine dipeptidase enzyme from Enterococcus faecalis is also known as the\ vancomycin resistance protein VanX, and hydrolyses D-ala-D-ala. It has a 250-fold differential in catalytic efficiency for hydrolysis of D-ala-D-ala versus D-ala-D-lactate. The latter therefore remains intact for subsequent incorporation into peptidoglycan precursors that terminate in the dipeptide D-ala-D-lactate rather than the dipeptide D-ala-D-ala, thereby preventing vancomycin from binding. The enzyme requires a metal cofactor, and is induced by vancomycin through regulation by VanS and VanR.

    \ ' '3753' 'IPR001915' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M48 (Ste24 endopeptidase family, clan M-); members of both subfamilies are represented. The members of this set of proteins are mostly described as probable protease htpX homologue or CAAX prenyl protease 1, which proteolytically removes the C-terminal three residues of farnesylated proteins. They are integral membrane proteins associated with the endoplasmic reticulum and Golgi, binding one zinc ion per subunit.

    \ \ \

    In Saccharomyces cerevisiae (Baker\'s yeast) Ste24p is required for the first NH2-terminal proteolytic processing event within the a-factor precursor, which takes place after COOH-terminal CAAX modification is complete. Ste24p contains multiple predicted membrane spans, a zinc metalloprotease motif (HEXXH), and a COOH-terminal ER retrieval signal (KKXX). The HEXXH protease motif is critical for Ste24p activity, since Ste24p fails to function when conserved residues within this motif are mutated.

    \ \

    The Ste24p homologues occur in a diverse group of organisms, including Escherichia coli, Schizosaccharomyces pombe (Fission yeast), Haemophilus influenzae, and Homo sapiens (Human), which indicates that the gene is highly conserved throughout evolution. Ste24p and the proteins related to it define a subfamily of proteins that are likely to function as intracellular, membrane-associated zinc metalloproteases PUBMED:9015299.

    \ ' '3754' 'IPR005317' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M49 (dipeptidyl-peptidase III family, clan M-). The predicted active site residues occur in the motif HEXXXH, which is unlike that in any other family. The dipeptidyl peptidase III aminopeptidases cleave dipeptides from the N-terminus of peptides consisting of four or more amino acids and have a broad specificity.

    \ ' '3755' 'IPR001570' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases constitutes the MEROPS peptidase family M4 (thermolysin family, clan MA(E)). The protein fold of the peptidase domain of thermolysin is the type example for members of the clan MA. The thermolysin family is composed only of secreted eubacterial endopeptidases. The zinc-binding residues\ are H-142, H-146 and E-166, with E-143 acting as the catalytic residue.\ Thermolysin also contains 4 calcium-binding sites, which contribute to its\ unusual thermostability. The family also includes enzymes from a number\ of pathogens, including Legionella and Listeria, and the protein pseudolysin,\ all with a substrate specificity for an aromatic residue in the P1\' position. Three-dimensional structure analysis has shown that the enzymes undergo\ a hinge-bend motion during catalysis. Pseudolysin has a broader\ specificity, acting on large molecules such as elastin and collagen,\ possibly due to its wider active site cleft PUBMED:7674922.

    \ ' '3756' 'IPR007035' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M55 (DppA aminopeptidase family, clan MN). The type example is Bacillus subtilis DppA, which is a binuclear zinc-dependent, D-specific aminopeptidase. The structure reveals that DppA is a new example of a self-compartmentalising protease, a family of proteolytic complexes. Proteasomes are the most extensively studied representatives of this family. The DppA enzyme is composed of identical 30 kDa subunits organised in a decamer with 52 point-group symmetry. A 20 A wide channel runs through the complex, giving access to a central chamber holding the active sites. The structure shows DppA to be a prototype of a new family of metalloaminopeptidases characterised by the SXDXEG key sequence PUBMED:11473256. The only known substrates are D-ala-D-ala and D-ala-gly-gly.

    \ ' '3757' 'IPR000013' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M7 (snapalysin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA.

    \ \ \

    With a molecular weight of around 16 kDa, Streptomyces extracellular neutral protease is one of the smallest known proteases PUBMED:7674922; it is capable of hydrolysing milk proteins PUBMED:7674922. The enzyme is synthesised as a proenzyme with a signal peptide, a propeptide and an active domain that contains the conserved HEXXH motif characteristic of metalloproteases. Although family M7 shows active site sequence similarity to other members, it differs in one major respect: the third zinc ligand appears to be an aspartate residue rather than the usual histidine.

    \ ' '3758' 'IPR001577' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M8 (leishmanolysin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA.

    \ \ Leishmanolysin is an enzyme found in eukaryotes, including Leishmania and related parasitic protozoa PUBMED:7674922. The endopeptidase is the most abundant protein on the cell surface during the promastigote stage of the parasite, and is attached to the membrane by a glycosylphosphatidylinositol anchor PUBMED:7674922. In the amastigote form, the parasite lives in lysosomes of host macrophages, producing a form of the protease that has an acidic pH optimum PUBMED:7674922. This differs from most other metalloproteases and may be an adaptation to the environment in which the organism survives PUBMED:7674922.

    \ ' '3759' 'IPR013510' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases constitutes the MEROPS peptidase family M9, subfamilies M9A and M9B (microbial collagenase, clan MA(E)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH PUBMED:7674922.

    \ \

    Microbial collagenases have been identified from bacteria of both the Vibrio and Clostridium genera. Collagenase is used during bacterial attack to degrade the collagen barrier of the host during invasion. Vibrio bacteria are non-pathogenic, and are sometimes used in hospitals to remove dead tissue from burns and ulcers. Clostridium histolyticum is a pathogen that causes gas gangrene; nevertheless, the isolated collagenase has been used to treat bed sores. Collagen cleavage occurs at Xaa+Gly bonds in Vibrio bacteria and at Yaa+Gly bonds in Clostridium collagenases PUBMED:.

    \ \

    Analysis of the primary structure of the gene product from Clostridium perfringens has revealed that the enzyme is produced with a stretch of 86 residues that contain a putative signal sequence PUBMED:8282691. Within this stretch is found PLGP, an amino acid sequence typical of collagenase substrates. This sequence may thus be implicated in self-processing of the collagenase PUBMED:8282691.

    \ ' '3760' 'IPR001967' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S11 (D-Ala-D-Ala carboxypeptidase A family, clan SE). The protein fold of the peptidase domain for members of this family resembles that of D-Ala-D-Ala-carboxypeptidase B, the type example for clan SE.

    \ \ Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S27) of serine protease have been identified, these being grouped into 6 clans (SA, SB, SC, SE, SF and SG) on the basis of structural similarity and other functional evidence. Structures are known for four of the clans (SA, SB, SC and SE): these appear to be totally unrelated, suggesting at least four evolutionary origins of serine peptidases and possibly many more PUBMED:7845208.

    \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C clans have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (SA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \

    Bacterial cell walls are complex structures containing amino acids and amino sugars, with alternating chains of N-acetylglucosamine and N-acetylmuramic acid units linked by short peptides PUBMED:7845208: the link peptide in Escherichia coli is L-alanyl-D-isoglutamyl-L-meso-diaminopimelyl-D-alanine. The chains are usually cross-linked between the carboxyl of D-alanine and the free amino group of diaminopimelate. During the synthesis of peptidoglycan, the precursor has the described tetrapeptide sequence with an added C-terminal D-alanine PUBMED:7845208.

    \

    D-Ala-D-Ala carboxypeptidase A is involved in the metabolism of cell components PUBMED:1741619; it is synthesised with a leader peptide to target it to the cell membrane PUBMED:7845208. After cleavage of the leader peptide, the enzyme is retained in the membrane by a C-terminal anchor. There are three families of serine-type D-Ala-D-Ala peptidase, which are also known as low molecular weight penicillin-binding proteins.

    \

    Family S11 contains only D-Ala-D-Ala peptidases, unlike families S12 and S13, which contain other enzymes, such as class C beta-lactamases and D-amino-peptidases PUBMED:7845208. Although these enzymes are serine proteases, some members of family S11 are partially inhibited by thiol-blocking agents PUBMED:1930140.

    \ ' '3761' 'IPR000667' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This family of serine peptidases belongs to MEROPS peptidase family S13 (D-Ala-D-Ala carboxypeptidase C, clan SE). The predicted active site residues for members of this family and family S12 occur in the motif SXXK.

    \

    D-Ala-D-Ala carboxypeptidase C is involved in the metabolism of cell components PUBMED:1741619; it is synthesised with a leader peptide to target it to the cell membrane PUBMED:7845208. After cleavage of the leader peptide, the enzyme is retained in the membrane by a C-terminal anchor PUBMED:7845208. There are three families of serine-type D-Ala-D-Ala peptidase (designated S11, S12 and S13), which are also known as low molecular weight penicillin-binding proteins PUBMED:7845208. Family S13 comprises D-Ala-D-Ala peptidases that have sufficient sequence similarity around their active sites to assume a distant evolutionary relationship to other clan members; members of the S13 family also bind penicillin and have D-amino-peptidase activity. Proteases of family S11 have exclusive D-Ala-D-Ala peptidase activity, while some members of S12 are class C beta-lactamases PUBMED:7845208.

    \ \ ' '3762' 'IPR000383' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This family of sequences are serine peptidases belonging to MEROPS peptidase family S15 (clan SC) PUBMED:7845208. The type example is X-Pro dipeptidyl-peptidase of Lactococcus lactis.

    \ \

    These proteins, which have similar specificity to mammalian dipeptidyl-peptidase IV, release N-terminal Xaa-Pro dipeptides. The penultimate residue must be proline. In L. lactis the proteins exist as cytoplasmic homodimers PUBMED:7845208.

    \ ' '3763' 'IPR001847' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S21 (assemblin family, clan 21).

    \ \

    A number of viral proteases have been discovered and their sequence similarity is very low. Studies with protease inhibitors suggest that the Herpesviridae protease is a serine protease belonging to either the trypsin-like or subtilisin-like families; it is not inhibited by inhibitors of Cys, Asp or metalloproteases.

    \ ' '3764' 'IPR019759' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \ ' '3767' 'IPR005151' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to the MEROPS peptidase family S41 (C-terminal processing peptidase family, clan SM). The members of this group include the tricorn protease of bacteria and archaea, and C-terminal peptidases with different substrate specificities in different species, including processing of the D1 protein of the photosystem II reaction centre in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein; some appear to be responsible for degrading oligopeptides, probably derived from the proteasome.

    \ ' '3768' 'IPR005319' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases, which includes HetR, is associated with heterocystous cyanobacteria and belongs to MEROPS peptidase family S48 (clan S-). HetR is a DNA-binding serine-type protease required for heterocyst differentiation in heterocystous cyanobacteria under conditions of nitrogen deprivation. Mutation of HetR from Anabaena sp. (strain PCC 7120) by site-specific mutagenesis of Ser-152 showed that this residue was one of the peptidase active site residues. It was suggested that peptidase activity might be needed for repression of HetR overproduction under conditions of nitrogen deprivation PUBMED:10692362. Modification of Cys-48 prevented disulphide-bond formation and homodimerisation of HetR and DNA-binding. The homodimer of HetR binds the promoter regions of hetR, hepA, and patS, suggesting a direct control of the expression of these genes by HetR. The pentapeptide RGSGR, which is present at the C terminus of PatS, blocks heterocyst formation, inhibits the DNA binding of HetR and prevents hetR up-regulation PUBMED:15051891.

    \ ' '3769' 'IPR005320' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S51 (clan PC(S)). The type example is dipeptidase E (alpha-aspartyl dipeptidase) from Escherichia coli. The family contains alpha-aspartyl dipeptidases (dipeptidase E) and cyanophycinases.

    \ \

    The three-dimensional structure of Salmonella typhimurium aspartyl dipeptidase (peptidase E) has been determined at 1.2 Å resolution. The structure of this 25-kDa enzyme consists of two mixed beta-sheets forming a V, flanked by six alpha-helices. The active site contains a Ser-His-Glu catalytic triad and is the first example of a serine peptidase/protease with a glutamate in the catalytic triad. The active site Ser is located on a strand-helix motif reminiscent of that found in alpha/beta-hydrolases, but the polypeptide fold and the organisation of the catalytic triad differ from those of the known serine proteases. This enzyme appears to represent a new example of convergent evolution of peptidase activity PUBMED:11106384.

    \ \

    Alpha-aspartyl dipeptidase hydrolyses dipeptides containing N-terminal aspartate residues, Asp-|-Xaa. It does not act on peptides with N-terminal Glu, Asn or Gln, nor does it cleave isoaspartyl peptides. In the cyanobacteria, cyanophycinase is an exopeptidase that catalyses the hydrolytic cleavage of multi-L-arginyl-poly-L-aspartic acid (cyanophycin; a water-insoluble reserve polymer) into aspartate-arginine dipeptides.

    \ ' '3770' 'IPR005321' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S58 (DmpA aminopeptidase family, clan PB(S)). The protein fold of the peptidase unit for members of this family resembles that of archaean proteasome subunit B, the type example of clan PB. The type example is aminopeptidase DmpA from Ochrobactrum anthropi. This family also contains proteins that have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases in the family.

    \

    L-aminopeptidase D-Ala-esterase/amidase (DmpA) from O. anthropi releases the N-terminal L- and/or D-Ala residues from peptide substrates. This is the only known enzyme to liberate N-terminal amino acids with both D and L stereospecificity. The active form of DmpA is an alpha-beta heterodimer, which results from a putative autocatalytic cleavage of an inactive precursor polypeptide. DmpA shows structural homology to N-terminal nucleophile (Ntn) hydrolase family members and may work by a similar catalytic mechanism; however, their secondary structure elements differ significantly PUBMED:10673442.

    \ ' '3771' 'IPR000209' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to the MEROPS peptidase families S8 (subfamilies S8A (subtilisin) and S8B (kexin)) and S53 (sedolisin), both of which are members of clan SB.

    \ \

    The subtilisin family is the second largest serine protease family characterised to date. Over 200 subtilases are presently known, more than 170 of them with complete amino acid sequences PUBMED:9070434. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses PUBMED:7845208. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase PUBMED:7845208, PUBMED:8439290. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity PUBMED:7845208, PUBMED:8439290. Some subtilisins are mosaic proteins, while others contain N- and C-terminal extensions that show no sequence similarity to any other known protein PUBMED:7845208. Based on sequence homology, a subdivision into six families has been proposed PUBMED:9070434.

    \ \

    The proprotein-processing endopeptidases kexin, furin and related enzymes form a distinct subfamily known as the kexin subfamily (S8B). These preferentially cleave C-terminally to paired basic amino acids. Members of this subfamily can be identified by subtly different motifs around the active site PUBMED:7845208, PUBMED:8439290. Members of the kexin family, along with endopeptidases R, T and K from the yeast Tritirachium and cuticle-degrading peptidase from Metarhizium, require thiol activation. This can be attributed to the presence of Cys-173 near to the active histidine PUBMED:8439290. Only one viral member of the subtilisin family is known, a 56-kDa protease from herpes virus 1, which infects the channel catfish PUBMED:7845208.

    \ \

    Sedolisins (serine-carboxyl peptidases) are proteolytic enzymes whose fold resembles that of subtilisin; however, they are considerably larger, with the mature catalytic domains containing approximately 375 amino acids. The defining features of these enzymes are a unique catalytic triad, Ser-Glu-Asp, as well as the presence of an aspartic acid residue in the oxyanion hole. High-resolution crystal structures have now been solved for sedolisin from Pseudomonas sp. 101, as well as for kumamolisin from a thermophilic bacterium, Bacillus sp. MN-32. Mutations in the human gene lead to a fatal neurodegenerative disease PUBMED:12673349.

    \ ' '3772' 'IPR001375' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This domain covers the active site serine of the serine peptidases belonging to MEROPS peptidase family S9 (prolyl oligopeptidase family, clan SC). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. Examples of protein families containing this domain are:

    \

    \ \

    These proteins belong to MEROPS peptidase families S9A, S9B and S9C.

    \ ' '3773' 'IPR005080' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    Metalloproteases are the most diverse of the four main types of protease, with more than 30 families identified to date PUBMED:7674922. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as abXHEbbHbc, where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    This group of metallopeptidases belongs to MEROPS peptidase family A25 (gpr protease family, clan AE). These are tetrameric proteases that make the rate-limiting first cut in the small, acid-soluble spore proteins (SASP) of Bacillus subtilis and related species during spore germination. The enzyme lacks clear homology to other known proteases. It processes its own amino end before becoming active to cleave SASPs.

    \ ' '3774' 'IPR001539' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The peptidase families associated with clan U- have an unknown catalytic mechanism, as the protein fold of the active site domain and the active site residues have not been reported.

    \

    This is a group of peptidases belonging to MEROPS peptidase family U32 (clan U-). The type example is collagenase (gene prtC) from Porphyromonas gingivalis (Bacteroides gingivalis) PUBMED:1317840, which is an enzyme that degrades type I collagen and that seems to require a metal cofactor. The product of PrtC is evolutionarily related to a number of uncharacterised proteins with a well-conserved region containing two cysteines.

    \ ' '3775' 'IPR005322' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of peptidases belongs to MEROPS peptidase family C69 (dipeptidase, clan PB). They are mainly dipeptidases PUBMED:8766699 and include dipeptidase A from Lactobacillus helveticus.

    \ \

    Comparative sequence and structural analysis, particularly to penicillin V acylase (MEROPS peptidase family C59), revealed a cysteine as the catalytic nucleophile as well as other conserved residues important for catalysis PUBMED:12717035. In general, the C69 family is variable in sequence and exhibits great diversity in substrate specificity, including enzymes such as choloylglycine hydrolases, acid ceramidases, isopenicillin N acyltransferases, and a subgroup of eukaryotic proteins with unclear function.

    \ \ ' '3776' 'IPR005081' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The peptidase families associated with clan U- have an unknown catalytic mechanism, as the protein fold of the active site domain and the active site residues have not been reported.

    \

    This group of peptidases belongs to the MEROPS peptidase family U4 (SpoIIGA peptidase family, clan U-).

    \ \

    Sporulation in bacteria such as Bacillus subtilis involves the formation of a polar septum, which divides the sporangium into a mother cell and a forespore. The sigma E factor, which is encoded within the spoIIG operon, is a cell-specific regulatory protein that directs gene transcription in the mother cell. Sigma E is synthesised as an inactive proprotein pro-sigma E, which is converted to the mature factor by the putative processing enzyme SpoIIGA PUBMED:11849534.

    \ ' '3777' 'IPR005073' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.
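
As a rough illustration of the motif description above, the following Python sketch scans a protein sequence for the minimal HEXXH pattern. The function name `find_hexxh` and the test sequence are hypothetical. The stricter abXHEbbHbc definition is not encoded, because the text does not list the exact residue sets for a, b and c. The only extra constraint applied is an assumption drawn from the statement about proline: the two X positions are not allowed to be proline.

```python
import re

# Minimal HEXXH pattern; the two X positions exclude proline, following the
# statement above that proline is never found in this site. The lookahead
# allows overlapping matches to be reported.
HEXXH = re.compile(r"(?=(HE[^P][^P]H))")


def find_hexxh(sequence):
    """Return (1-based start, matched text) for each HEXXH-like site."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(seq)]


# Hypothetical sequence fragment, used only to exercise the function.
print(find_hexxh("AAVTHELGHNLGAAHEPAH"))
```

A hit from this scan is only a candidate site: as the text notes, the bare HEXXH motif is relatively common, so a match alone does not establish that a protein is a metalloprotease.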

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of peptidases belongs to MEROPS peptidase family M74 (murein endopeptidase family, clan MD). The type example is murein endopeptidase from Escherichia coli (MepA). The entry represents a family of penicillin-insensitive murein endopeptidases involved in the removal of murein from the sacculus by cleaving the peptide bonds between neighbouring strands in mature murein. The crystal structure of MepA has been determined, revealing similarities to the D-Ala-D-Ala carboxypeptidases in MEROPS peptidase family M15 PUBMED:15292190.

    \ ' '3778' 'IPR002142' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S49 (protease IV family, clan S-). The predicted active site serine for members of this family occurs in a transmembrane domain.

    \ \

    The domain defines sequences in viruses, archaea, bacteria and plants. These sequences are variously annotated in the different taxonomic groups, examples are:

    \ \

    \ \

    This group also contains proteins classified as non-peptidase homologues that either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases. Related proteins, non-peptidase homologues and unclassified S49 members are also to be found in .

    \ \ ' '3779' 'IPR005082' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The peptidase families associated with clan U- have an unknown catalytic mechanism, as the protein fold of the active site domain and the active site residues have not been reported.

    \

    This group of peptidases belongs to MEROPS peptidase family U9 (phage prohead processing peptidase family, clan U-), which plays a role in the head assembly of bacteriophage T4.

    \

    The pathway of bacteriophage T4 head assembly begins with the formation of a prohead bound to the bacterial cell membrane which is later converted to the mature, DNA-containing head. During maturation, all but one of the prohead proteins are proteolytically processed by a phage-coded protease which is formed by autocatalytic cleavage of the product of gene 21 (gp21). Protease gp21 has been tentatively located in the centre of the prohead core PUBMED:3552886.

    \ ' '3780' 'IPR004279' '\ The perilipin family includes lipid droplet-associated protein (perilipin) and adipose differentiation-related protein (adipophilin). Perilipin is a modulator of adipocyte lipid metabolism and adipophilin is involved in the development and maintenance of adipose tissue. Other proteins belonging to this group include TIP47, a cargo selection device for mannose 6-phosphate receptor trafficking PUBMED:9590177.\ ' '3781' 'IPR002491' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attacking water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.
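
The conserved motifs listed above lend themselves to a simple sequence scan, sketched here in Python. The function name `abc_motif_positions` is hypothetical. The signature LSGGQ is taken from the text, whereas the Walker A pattern G-x(4)-G-K-[ST] is an assumption based on the commonly cited P-loop consensus and is not stated in this entry. Walker B, the Q-motif and the histidine loop are left out because the text gives no sequence for them.

```python
import re

# Walker A (P-loop) consensus G-x(4)-G-K-[ST]; this pattern is an assumption
# taken from the general literature, not from the text above.
WALKER_A = re.compile(r"G.{4}GK[ST]")
# ABC signature motif as given in the text.
SIGNATURE = re.compile(r"LSGGQ")


def abc_motif_positions(sequence):
    """Return 1-based start positions of the first Walker A and signature hits."""
    seq = sequence.upper()
    walker = WALKER_A.search(seq)
    signature = SIGNATURE.search(seq)
    return {
        "walker_a": walker.start() + 1 if walker else None,
        "signature": signature.start() + 1 if signature else None,
    }
```

Calling the function on a sequence returns None for any motif that is absent, so a caller can check that both motifs are present and that Walker A precedes the signature, as the text implies.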

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \

    Most bacterial importers employ a periplasmic substrate-binding protein (PBP) that delivers the ligand to the extracellular gate of the TM domains. These proteins bind their substrates selectively and with high affinity, which is thought to ensure the specificity of the transport reaction. Binding proteins in Gram-negative bacteria are present within the periplasm, whereas those in Gram-positive bacteria are tethered to the cell membrane via the acylation of a cysteine residue that is an integral component of a lipoprotein signal sequence. In planta expression of a high-affinity iron-uptake system involving the siderophore chrysobactin in Erwinia chrysanthemi 3937 contributes greatly to invasive growth of this pathogen on its natural host, African violets PUBMED:8596459. The cobalamin (vitamin B12) and the iron transport systems share many common attributes and probably evolved from the same origin PUBMED:12475936, PUBMED:15475351.

    \

    This entry represents the periplasmic-binding domain, which is composed of two subdomains, each consisting of a central beta-sheet and surrounding alpha-helices, linked by a rigid alpha-helix. The substrate binding site is located in a cleft between the two alpha/beta subdomains PUBMED:12468528.

    \ ' '3782' 'IPR001761' '\

    This family includes the periplasmic binding proteins and the LacI family transcriptional regulators. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar-based solutes. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain generally binds a sugar, which changes the DNA binding activity of the repressor domain (lacI) PUBMED:1583688, PUBMED:8638105.

    \ ' '3783' 'IPR002016' '\ Peroxidases are haem-containing enzymes that use hydrogen peroxide as the electron acceptor to catalyse a number of oxidative reactions. Most haem peroxidases follow the reaction scheme:\ \ \ \ \

    In this mechanism, the enzyme reacts with one equivalent of H2O2 to give [Fe4+=O]R\' (compound I). This is a two-electron oxidation/reduction reaction where H2O2 is reduced to water and the enzyme is oxidised. One oxidising equivalent resides on iron, giving the oxyferryl intermediate PUBMED:8062820, while in many peroxidases the porphyrin (R) is oxidised to the porphyrin pi-cation radical (R\'). Compound I then oxidises an organic substrate to give a substrate radical PUBMED:7922023.

    \ \

    Haem peroxidases include two superfamilies: one found in bacteria, fungi and plants, and a second found in animals. The first can be viewed as consisting of 3 major classes PUBMED:. Class I, the intracellular peroxidases, includes: yeast cytochrome c peroxidase (CCP), a soluble protein found in the mitochondrial electron transport chain, where it probably protects against toxic peroxides; ascorbate peroxidase (AP), the main enzyme responsible for hydrogen peroxide removal in chloroplasts and cytosol of higher plants PUBMED:; and bacterial catalase-peroxidases, exhibiting both peroxidase and catalase activities. It is thought that catalase-peroxidase provides protection to cells under oxidative stress PUBMED:1954228.

    \

    Class II consists of secretory fungal peroxidases: ligninases, or lignin peroxidases (LiPs), and manganese-dependent peroxidases (MnPs). These are monomeric glycoproteins involved in the degradation of lignin. In MnP, Mn2+ serves as the reducing substrate PUBMED:8167033. Class II proteins contain four conserved disulphide bridges and two conserved calcium-binding sites.

    \

    Class III consists of the secretory plant peroxidases, which have multiple tissue-specific functions: e.g., removal of hydrogen peroxide from chloroplasts and cytosol; oxidation of toxic compounds; biosynthesis of the cell wall; defence responses towards wounding; indole-3-acetic acid (IAA) catabolism; ethylene biosynthesis; and so on PUBMED:. Class III proteins are also monomeric glycoproteins, containing four conserved disulphide bridges and two calcium ions, although the placement of the disulphides differs from class II enzymes.

    \

    The crystal structures of a number of these proteins show that they share the same architecture - two all-alpha domains between which the haem group is embedded.

    \ ' '3784' 'IPR000028' '\

    Chloroperoxidase (CPO) is a versatile haem-containing enzyme that exhibits peroxidase, catalase and cytochrome P450-like activities in addition to catalyzing halogenation reactions PUBMED:8747463. Despite functional similarities with other haem enzymes, CPO folds into a novel tertiary structure dominated by eight helical segments. The catalytic base, required to cleave the peroxide O-O bond, is glutamic acid rather than histidine as in other peroxidases.

    \ ' '3785' 'IPR007223' '\

    Peroxin-13 is a component of the peroxisomal translocation machinery with Peroxin-14 and Peroxin-17. Both termini of Peroxin-13 are oriented to the cytosol. It is required for peroxisomal association of Peroxin-14 PUBMED:10882522. The proteins also contain an SH3 domain.

    \ ' '3786' 'IPR006966' '\

    Peroxin 3 (Pex3p), also known as peroxisomal biogenesis factor 3, has been identified and characterised as a peroxisomal membrane protein in yeasts and mammals PUBMED:14733948. Two putative peroxisomal membrane-bound Pex3p homologues have also been found in Arabidopsis thaliana PUBMED:14733948. They possess a membrane peroxisomal targeting signal. Pex3p is an integral membrane protein of peroxisomes, exposing its N- and C-terminal parts to the cytosol PUBMED:10848631. Pex3p is involved in peroxisome biosynthesis and integrity; it assembles membrane vesicles before the matrix proteins are translocated.

    \ \

    In humans, defects in PEX3 are the cause of peroxisome biogenesis disorders PUBMED:10968777, which include Zellweger syndrome (ZWS), neonatal adrenoleukodystrophy (NALD), infantile Refsum disease (IRD), and classical rhizomelic chondrodysplasia punctata (RCDP). These are peroxisomal disorders that are the result of proteins failing to be imported into the peroxisome.

    \ ' '3787' 'IPR004899' '\

    Bordetella pertussis is a Gram-negative, aerobic coccobacillus that causes pertussis (whooping cough), especially in young children PUBMED:2542937. Once present in the lungs, the bacterium attaches to ciliated pulmonary epithelial cells via a collection of outer membrane proteins, all of which are virulence factors.

    \

    Pertactin, or P69 protein, is one of these virulence factors. Pertactin and filamentous haemagglutinin have been identified as Bordetella adhesins PUBMED:1527510. Both proteins contain an Arg-Gly-Asp (RGD) motif that promotes binding to integrins, known to be important in cell mobility and development. The production of most Bordetella virulence factors (including pertactin) is controlled by a two-component signal transduction system, comprising the BvgA regulator and the BvgS sensor PUBMED:10943406. Pertactin shares a high level of similarity with other Bordetella adhesins, such as BrkA. The protein is first produced as a 93 kDa precursor. Upon secretion into the extracellular environment, a 30 kDa domain at the C-terminus remains in the outer membrane, while the mature 60.4 kDa pertactin molecule is released PUBMED:8609998.

    \

    The crystal structure of mature pertactin has been determined to 2.5A \ resolution by means of X-ray diffraction. The fold is characterised by a 16-stranded parallel beta-helix, with a V-shaped cross-section. Several between-strand amino-acid repeats form internal and external ladders. The helical structure is interrupted by several protruding loops that contain motifs associated with the activity of the protein. One such sequence - [GGXXP]5 - appears directly after the RGD motif, and may mediate interaction with epithelial cells. The C-terminal region of P.69 pertactin contains a [PQP]5 motif loop, which contains the major immunoprotective epitope PUBMED:8609998.

    \

    The superfamily also includes immunoglobulin A1 protease and adhesion penetration protein HAP.

    \ ' '3788' 'IPR003898' '\

    A large group of bacterial exotoxins are referred to as "A/B toxins", \ essentially because they are formed from two subunits PUBMED:8225592. The "A" subunit\ possesses enzyme activity, and is transferred to the host cell following a conformational change in the membrane-bound transport "B" subunit PUBMED:8225592.

    \

    Bordetella pertussis is the causative agent of whooping cough, and is a \ Gram-negative aerobic coccobacillus. Its major virulence factor is the pertussis \ toxin, an A/B exotoxin that mediates both colonisation and toxaemic stages\ of the disease PUBMED:3704651, PUBMED:2873570. Recombinant, inactive forms of the 5 subunits that make up the toxin have proven to be good vaccines.\ The S1 ("A") subunit of pertussis toxin causes the characteristic sound of \ the "whoop" in whooping cough. It achieves this through ADP-ribosylation of \ host Gi alpha-subunits, an adenylate cyclase inhibitor PUBMED:3704651, PUBMED:2873570. Uninhibited, this enzyme produces elevated levels of cAMP, leading to increased cell exudate and inflammation in the lungs PUBMED:2737291.

    \

    The crystal structure of pertussis toxin has been determined to 2.9A \ resolution PUBMED:8075982. The catalytic A-subunit (S1) shares structural similarity with other ADP-ribosylating bacterial toxins, although differences in the C-terminal portion explain its unique activation mechanism. Despite its\ heterogeneous subunit composition, the structure of the cell-binding\ B-oligomer (S2, S3, two copies of S4, and S5) resembles the symmetrical\ B-pentamers of the cholera and shiga toxin families, but it interacts\ differently with the A-subunit and there is virtually no sequence similarity between B-subunits of the different toxins.

    \ ' '3789' 'IPR020063' '\

    A large group of bacterial exotoxins are referred to as "A/B toxins", \ essentially because they are formed from two subunits PUBMED:8225592. The "A" subunit\ possesses enzyme activity, and is transferred to the host cell following a conformational change in the membrane-bound transport "B" subunit PUBMED:8225592.

    \

    Bordetella pertussis is the causative agent of whooping cough, and is a \ Gram-negative aerobic coccobacillus. Its major virulence factor is the pertussis \ toxin, an A/B exotoxin that mediates both colonisation and toxaemic stages\ of the disease PUBMED:3704651, PUBMED:2873570. Recombinant, inactive forms of the 5 subunits that make up the toxin have proven to be good vaccines. The S2 and S3 subunits of the toxin form part of the "B" moiety. They are responsible for binding the whole toxin to host cells prior to invasion, and are classed as adhesins PUBMED:2873570. S2 attaches to a host receptor called lactosylceramide. It has also been speculated that the S3 unit may preferentially bind phagocytes.

    \

    The crystal structure of pertussis toxin has been determined to 2.9A \ resolution PUBMED:8075982. The catalytic A-subunit (S1) shares structural similarity with other ADP-ribosylating bacterial toxins, although differences in the C-terminal portion explain its unique activation mechanism. Despite its\ heterogeneous subunit composition, the structure of the cell-binding\ B-oligomer (S2, S3, two copies of S4, and S5) resembles the symmetrical\ B-pentamers of the cholera and shiga toxin families, but it interacts\ differently with the A-subunit and there is virtually no sequence similarity between B-subunits of the different toxins. Two peripheral domains that are unique to the pertussis toxin B-oligomer share structural similarity with a calcium-dependent eukaryotic lectin, and reveal possible receptor-binding sites.

    \ ' '3790' 'IPR003683' '\

    This family consists of cytochrome b6/f complex subunit 5 (PetG). The cytochrome bf complex, found in green plants, eukaryotic algae and cyanobacteria, connects photosystem I to photosystem II in the electron transport chain, functioning as a plastoquinol:plastocyanin/cytochrome c6 oxidoreductase PUBMED:7493961. The purified complex from the unicellular alga Chlamydomonas reinhardtii contains seven subunits: four high molecular weight subunits (cytochrome f, Rieske iron-sulphur protein, cytochrome b6, and subunit IV) and three miniproteins (PetG, PetL, and PetX) PUBMED:7493968. Stoichiometry measurements are consistent with every subunit being present as two copies per b6/f dimer. The absence of PetG affects either the assembly or stability of the cytochrome bf complex in C. reinhardtii PUBMED:7493961.

    \ ' '3791' 'IPR007802' '\ This family consists of several Cytochrome B6-F complex subunit VI (PetL) proteins found in a number of plant species. PetL is one of the small subunits which make up the cytochrome b(6)f complex. PetL is not absolutely required for either the accumulation or for the function of cytochrome b6f; in its absence, however, the complex becomes unstable in vivo in aging cells and labile in vitro. It has been suggested that the N terminus of the protein is likely to lie in the thylakoid lumen PUBMED:11796719.\ ' '3792' 'IPR005497' '\

    PetN is a small hydrophobic protein, crucial for cytochrome b6-f complex assembly and/or stability. It is found in bacteria and plants. Cytochrome b6-f complex is composed of 4 large subunits: cytochrome b6, subunit IV (17 kDa polypeptide, petD), cytochrome f and the Rieske protein, as well as 4 small subunits: petG, petL, petM and petN. The complex functions as a dimer. The cytochrome b6-f complex mediates electron transfer between photosystem II (PSII) and photosystem I (PSI) PUBMED:14526088.

    \ ' '3793' 'IPR006708' '\

    Peroxisome(s) form an intracellular compartment, bounded by a typical lipid bilayer membrane. Peroxisome functions are often specialised by organism and cell type; two widely distributed and well-conserved functions are H2O2-based respiration and fatty acid beta-oxidation. Other functions include ether lipid (plasmalogen) synthesis and cholesterol synthesis in\ animals, the glyoxylate cycle in germinating seeds ("glyoxysomes"), photorespiration in leaves, glycolysis in trypanosomes ("glycosomes"), and methanol and/or amine oxidation and assimilation in some yeasts.

    \ \

    PEX genes encode the machinery ("peroxins") required to assemble the peroxisome. Membrane assembly and maintenance requires three of these (peroxins 3, 16, and 19) and may occur without the import of the matrix (lumen) enzymes. Matrix protein import follows a branched pathway of soluble recycling receptors, with one branch for each class of peroxisome targeting sequence (two are well characterised), and a common trunk for all. At least one of these receptors, Pex5p, enters and exits peroxisomes as it functions. Proliferation of the organelle is regulated by Pex11p. Peroxisome biogenesis is remarkably conserved among eukaryotes. A group of fatal, inherited neuropathologies are recognised as peroxisome biogenesis diseases.

    \ ' '3794' 'IPR006845' '\

    This region is the N-terminal part of a number of peroxisomal biogenesis proteins, including Pex2, Pex10 and Pex12, which contain two predicted transmembrane segments. The majority of these proteins have a C-terminal ring finger domain .

    \ ' '3795' 'IPR004258' '\

    Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cells (erythrocytes). The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see PUBMED:10885986.

    \ Severe Plasmodium falciparum malaria is characterised by excessive sequestration of infected and uninfected erythrocytes in the microvasculature of the affected organ. Rosetting, the adhesion of P. falciparum-infected erythrocytes to uninfected erythrocytes, is a virulent parasite phenotype associated with the occurrence of severe malaria PUBMED:9419207. The adhesive ligand P. falciparum erythrocyte membrane protein 1 (PfEMP1) is a rosetting protein that contains clusters of glycosaminoglycan-binding motifs.\ ' '3796' 'IPR000023' '\ The enzyme-catalysed transfer of a phosphoryl group from ATP is an\ important reaction in a wide variety of biological processes PUBMED:2953977. One\ enzyme that utilises this reaction is phosphofructokinase (PFK), which\ catalyses the phosphorylation of fructose-6-phosphate to fructose-1,6-\ bisphosphate, a key regulatory step in the glycolytic pathway PUBMED:12023862, PUBMED:7825568. \ PFK exists as a homotetramer in bacteria and mammals (where each monomer\ possesses 2 similar domains), and as an octamer in yeast (where there are\ 4 alpha- (PFK1) and 4 beta-chains (PFK2), the latter, like the mammalian\ monomers, possessing 2 similar domains PUBMED:7825568).

    PFK is ~300 amino acids in length, and structural studies of the\ bacterial enzyme have shown it comprises two similar (alpha/beta) lobes: one involved in\ ATP binding and the other housing both the substrate-binding site and the allosteric site (a regulatory binding site distinct from the active site, but that affects enzyme\ activity). The identical tetramer subunits adopt 2 \ different conformations: in a \'closed\' state, the bound magnesium ion\ bridges the phosphoryl groups of the enzyme products (ADP and fructose-1,6-\ bisphosphate); and in an \'open\' state, the magnesium ion binds only the ADP\ PUBMED:2975709, as the 2 products are now further apart. These conformations are\ thought to be successive stages of a reaction pathway that requires subunit\ closure to bring the 2 molecules sufficiently close to react PUBMED:2975709.

    \

    Deficiency in PFK leads to glycogenosis type VII (Tarui\'s disease), an\ autosomal recessive disorder characterised by severe nausea, vomiting,\ muscle cramps and myoglobinuria in response to bursts of intense or\ vigorous exercise PUBMED:7825568. Sufferers are usually able to lead a reasonably\ ordinary life by learning to adjust activity levels PUBMED:7825568.

    \ ' '3797' 'IPR004184' '\

    Pyruvate formate-lyase (also known as formate C-acetyltransferase) is an enzyme which converts acetyl-CoA and formate to CoA and pyruvate.\ In Escherichia coli, it uses a radical mechanism to reversibly cleave the C1-C2 bond of pyruvate using the Gly 734 radical and two cysteine residues (Cys 418, Cys 419) PUBMED:10504733.

    \ ' '3798' 'IPR002477' '\

    This entry represents the peptidoglycan binding domain (PGBD), as well as related domains that share the same structure. PGBD may have a general peptidoglycan binding function and has a core structure consisting of a closed, three-helical bundle with a left-handed twist. It is found at the N or C terminus of a variety of enzymes involved in bacterial cell wall degradation PUBMED:9555893, PUBMED:7121588, PUBMED:1683402. Examples are:

    \

    \

    Many of the proteins having this domain are as yet uncharacterised. However, some are known to belong to MEROPS peptidase family M15 (clan MD), subfamily M15A metallopeptidases. A number of the proteins belonging to subfamily M15A are non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.

    \

    Eukaryotic enzymes can contain structurally similar PGBD-like domains. Matrix metalloproteinases (MMP), which catalyse extracellular matrix degradation, have N-terminal domains that resemble PGBD. Examples are gelatinase A (MMP-2), which degrades type IV collagen PUBMED:10190290, stromelysin-1 (MMP-3), which plays a role in arthritis and tumour invasion PUBMED:12810425, PUBMED:12888258, and gelatinase B (MMP-9) secreted by neutrophils as part of the innate immune defence mechanism PUBMED:12950257. Several MMPs are implicated in cancer progression, since degradation of the extracellular matrix is an essential step in the cascade of metastasis PUBMED:11956636.

    \ ' '3799' 'IPR013078' '\

    Phosphoglycerate mutase () (PGAM) and bisphosphoglycerate mutase () \ (BPGM) are structurally related enzymes that catalyse reactions involving the transfer of phospho groups between the three carbon atoms of phosphoglycerate PUBMED:2847721, PUBMED:2831102, PUBMED:10958932. Both enzymes can catalyse three different reactions with different specificities: the isomerization of 2-phosphoglycerate (2-PGA) to 3-phosphoglycerate (3-PGA) with 2,3-diphosphoglycerate (2,3-DPG) as the primer of the reaction; the synthesis of 2,3-DPG from 1,3-DPG with 3-PGA as a primer; and the degradation of 2,3-DPG to 3-PGA (phosphatase activity).

    \

    In mammals, PGAM is a dimeric protein with two isoforms, the M (muscle) and B (brain) forms. In yeast, PGAM is a tetrameric protein.

    BPGM is a dimeric protein and is found mainly in erythrocytes where it plays a major role in regulating haemoglobin oxygen affinity as a consequence of controlling 2,3-DPG concentration. The catalytic mechanism of both PGAM and BPGM involves the formation of a phosphohistidine intermediate PUBMED:6294454.

    A number of other proteins contain this domain, including the bifunctional enzyme 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase PUBMED:2557623, which catalyses both the synthesis and the degradation of fructose-2,6-bisphosphate, and bacterial alpha-ribazole-5\'-phosphate phosphatase, which is involved in cobalamin biosynthesis PUBMED:7929373.

    \ ' '3800' 'IPR001672' '\

    Phosphoglucose isomerase () (PGI) PUBMED:6115414, PUBMED:1593646 is a dimeric enzyme that catalyses the reversible isomerization of glucose-6-phosphate and fructose-6-phosphate. PGI is involved in different pathways: in most higher organisms it is involved in glycolysis; in mammals it is involved in gluconeogenesis; in plants in carbohydrate biosynthesis; in some bacteria it provides a gateway for fructose into the Entner-Doudoroff pathway. The multifunctional protein, PGI, is also known as neuroleukin (a neurotrophic factor that mediates the differentiation of neurons), autocrine motility factor (a tumour-secreted cytokine that regulates cell motility), differentiation and maturation mediator and myofibril-bound serine proteinase inhibitor, and has different roles inside and outside the cell. In the cytoplasm, it catalyses the second step in glycolysis, while outside the cell it serves as a nerve growth factor and cytokine PUBMED:10653639.

    \

    PGI from Bacillus stearothermophilus has an open twisted alpha/beta structural motif consisting of two globular domains and two protruding parts. It has been suggested that the top part of the large domain together with one of the protruding loops might participate in inducing the neurotrophic activity PUBMED:10318897. The structure of rabbit muscle phosphoglucose isomerase complexed with various inhibitors shows that the enzyme is a dimer with two alpha/beta-sandwich domains in each subunit. The location of the bound D-gluconate 6-phosphate inhibitor leads to the identification of residues involved in substrate specificity. In addition, the positions of amino acid residues that are substituted in the genetic disease nonspherocytic hemolytic anemia suggest how these substitutions can result in altered catalysis or protein stability PUBMED:10653639, PUBMED:10770936.

    \ ' '3801' 'IPR001576' '\

    Phosphoglycerate kinase () (PGK) is an enzyme that catalyses the reversible transfer of a phosphoryl group from 1,3-diphosphoglycerate to ADP, forming 3-phosphoglycerate and ATP. In the second step of the second phase in glycolysis, 1,3-diphosphoglycerate is converted to\ 3-phosphoglycerate, forming one molecule of ATP. In the reverse reaction, one molecule of ADP is formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.

    \

    PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale \'hinge-bending\' conformational changes, similar to those found in hexokinase PUBMED:10593256. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded PUBMED:2124145.

    \

    Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man PUBMED:6689547.

    \

    This entry represents the full PGK enzyme.

    \ ' '3802' 'IPR005843' '\

    The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on their sugar substrates: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM) PUBMED:10506283. PGM () converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose PUBMED:15299905. PGM/PMM (; ) are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate PUBMED:16595672, PUBMED:14725765. Both PNGM () and PAGM () are involved in the biosynthesis of UDP-N-acetylglucosamine PUBMED:10913078, PUBMED:11004509.

    \

    Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme PUBMED:15238632.

    \

    The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.

    \ \

    This entry represents the C-terminal domain of alpha-D-phosphohexomutase enzymes.

    \ ' '3803' 'IPR005844' '\

    The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on their sugar substrates: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM) PUBMED:10506283. PGM () converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose PUBMED:15299905. PGM/PMM (; ) are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate PUBMED:16595672, PUBMED:14725765. Both PNGM () and PAGM () are involved in the biosynthesis of UDP-N-acetylglucosamine PUBMED:10913078, PUBMED:11004509.

    \

    Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme PUBMED:15238632.

    \

    The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.

    \ \

    This entry represents domain I found in alpha-D-phosphohexomutase enzymes. This domain has a 3-layer alpha/beta/alpha topology.

    \ ' '3804' 'IPR005845' '\

    The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on their sugar substrates: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM) PUBMED:10506283. PGM () converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose PUBMED:15299905. PGM/PMM (; ) are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate PUBMED:16595672, PUBMED:14725765. Both PNGM () and PAGM () are involved in the biosynthesis of UDP-N-acetylglucosamine PUBMED:10913078, PUBMED:11004509.

    \

    Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme PUBMED:15238632.

    \

    The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.

    \ \

    This entry represents domain II found in alpha-D-phosphohexomutase enzymes. This domain has a 3-layer alpha/beta/alpha topology.

    \ ' '3805' 'IPR005846' '\

    The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on their sugar substrates: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM) PUBMED:10506283. PGM () converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose PUBMED:15299905. PGM/PMM (; ) are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate PUBMED:16595672, PUBMED:14725765. Both PNGM () and PAGM () are involved in the biosynthesis of UDP-N-acetylglucosamine PUBMED:10913078, PUBMED:11004509.

    \

    Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme PUBMED:15238632.

    \

    The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.

    \ \

    This entry represents domain III found in alpha-D-phosphohexomutase enzymes. This domain has a 3-layer alpha/beta/alpha topology.

    \ ' '3806' 'IPR007686' '\ This family represents a family of bacterial phosphatidylglycerophosphatases (), known as PgpA. It appears that bacteria possess several phosphatidylglycerophosphatases, and thus, PgpA is not essential in Escherichia coli PUBMED:1309518.\ ' '3807' 'IPR005133' '\

    This is a family of small, transmembrane proteins believed to be\ components of Na+/H+ and K+/H+ antiporters. Members, including proteins designated\ MnhG from Staphylococcus aureus and PhaG from Rhizobium meliloti (Sinorhizobium meliloti), show some\ similarity to chain L of the NADH dehydrogenase I, which also translocates protons.

    \ ' '3808' 'IPR003513' '\ This is a family of proteins from single-stranded DNA bacteriophages. Scaffold proteins B and D are required for\ procapsid formation. Sixty copies of the internal scaffold protein B are found in the procapsid.\ ' '3809' 'IPR006531' '\

    This domain occurs in a family of phage (and bacteriocin) proteins related to the phage P2 V gene product, which forms the small spike at the tip of the tail PUBMED:7483254. Homologs in general are annotated as baseplate assembly protein V. At least one member is encoded within a region of Pectobacterium carotovorum (Erwinia carotovora) described as a bacteriocin, a phage tail-derived module able to kill bacteria closely related to the host strain.

    \ ' '3810' 'IPR005564' '\

    Major capsid protein E plays a role in the stabilisation of the condensed form of the DNA molecule in phage heads PUBMED:2522554.

    \ ' '3811' 'IPR006441' '\

    This family represents the major capsid protein component of the heads (capsids) of bacteriophage P2 and related phage including prophage. These sequences represent one of several analogous families lacking detectable sequence similarity. The gene encoding this component is typically located in an operon encoding the small and large terminase subunits, the portal protein and the prohead or maturation protease.

    \ ' '3812' 'IPR006516' '\

    This set of sequences represent a family of phage and plasmid replication proteins. In bacteriophage IKe and related phage, the full-length protein is designated gene II protein. A much shorter protein of unknown function, translated from a conserved in-frame alternative initiator, is designated gene X protein. Members of this family also include plasmid replication proteins.

    \ ' '3813' 'IPR003512' '\ This family contains the bacteriophage helix-destabilising protein, or single-stranded DNA binding protein, required for DNA synthesis. The protein binds to DNA in a highly cooperative manner without pronounced sequence specificity. In the presence of single-stranded DNA it binds cooperatively to form a helical protein-DNA complex. It prevents the conversion during synthesis of the single-stranded (progeny) viral DNA back into the double-stranded replicative form.\ ' '3814' 'IPR003514' '\ This is a family of proteins from single-stranded DNA bacteriophages. Protein F is the major capsid component, sixty\ copies of which are found in the virion. The virion is also composed of 60 copies of each of the G and J proteins, and 12 copies of the H protein.\ ' '3815' 'IPR005003' '\

    This is a repeat found in the tail fibres of many bacteriophage and homologous bacterial proteins.

    \ ' '3816' 'IPR005068' '\

    This repeat is found in the tail fibers of phage, for example protein K PUBMED:7676622, but bacterial homologues have also been identified. The repeats are about 40 residues long.

    \ ' '3817' 'IPR003515' '\ This is a family of proteins from single-stranded DNA bacteriophages. The G protein is a major spike protein involved in attachment to the bacterial host cell. The virion is composed of sixty copies of each of the F, G and J proteins, and 12 copies of the H protein. There are twelve spikes formed by five G proteins, each a tight beta barrel, and one H protein.\ ' '3818' 'IPR006479' '\

    This group of sequences represent one of more than 30 families of phage proteins, all lacking detectable homology with each other, known or believed to act as holins. Holins act in cell lysis by bacteriophage. Members of this family are found in phage PBSX and phage SPP1, among others.

    \ ' '3819' 'IPR006485' '\

    Phage proteins for bacterial lysis typically include a membrane-disrupting protein, or holin, and one or more cell wall degrading enzymes that reach the cell wall because of holin action. Holins are found in a large number of mutually non-homologous families.

    \ ' '3820' 'IPR007633' '\ Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the build-up of a holin oligomer which causes the lysis PUBMED:11459934.\ ' '3821' 'IPR006481' '\

    This group of sequences represents one of a large number of mutually dissimilar families of phage holins. Holins act against the host cell membrane to allow lytic enzymes of the phage to reach the bacterial cell wall. This family includes the product of the S gene of phage lambda.

    \ ' '3822' 'IPR006480' '\

    This group of sequences describes one of the many mutually dissimilar families of holins, phage proteins that act together with lytic enzymes in bacterial lysis. This family includes, besides phage holins, the protein TcdE/UtxA involved in toxin secretion in Clostridium difficile and related species PUBMED:11444771.

    \ ' '3823' 'IPR002104' '\

    Phage integrase proteins cleave DNA substrates by a series of staggered cuts, during which the protein becomes covalently linked to the DNA through a catalytic tyrosine residue at the carboxy end of the alignment PUBMED:9082984, PUBMED:9288963.

    \ \

    The catalytic site residues in CRE recombinase () are Arg-173, His-289, Arg-292 and Tyr-324.

    \ ' '3824' 'IPR004929' '\

    Many bacteriophages with Gram-negative hosts contain two auxiliary lysis genes, Rz and Rz1. These genes are nested, with Rz1 occupying the last third of Rz in a +1 reading frame. Both of these genes are required for host cell lysis if the outer membrane is stabilised by millimolar concentrations of divalent cations, but are otherwise unnecessary PUBMED:10628848. The Rz protein is believed to possess endopeptidase activity, while Rz1 encodes a prolipoprotein which, after cleavage by a signal peptidase, is located in the outer membrane. It has been suggested that these two proteins may form a complex which cleaves the oligopeptide crosslinks between glycosidic strands in the peptidoglycan and the Lpp lipoproteins of the outer bacterial membrane. For more information see PUBMED:10707065.

    \ \

    This entry represents the Rz protein. This family is not considered to be a peptidase according to the MEROPS database.

    \ ' '3825' 'IPR002196' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 24 comprises enzymes with only one known activity; lysozyme ().

    \ \

    This entry includes Bacteriophage lambda lysozyme and Escherichia coli endolysin PUBMED:3586019. Lysozyme helps to release mature phage particles from the cell wall by breaking down the peptidoglycan. The enzyme hydrolyses the 1,4-beta linkages between N-acetyl-D-glucosamine and N-acetylmuramic acid in peptidoglycan heteropolymers of prokaryotic cell walls. E. coli endolysin also functions in bacterial cell lysis and acts as a transglycosylase.\ The Bacteriophage T4 lysozyme structure contains 2 domains, the interface between which forms the active-site cleft. The N-terminus of the 2 domains undergoes a \'hinge-bending\' motion about an axis passing through the molecular waist PUBMED:3586019, PUBMED:2234094. This mobility is thought to be important in allowing access of substrates to the enzyme active site.

    \ ' '3826' 'IPR005563' '\

    The single-stranded RNA genome of bacteriophage MS2 is 3,569 nt long and contains 4 genes. Their products are necessary for phage\ maturation, encapsidation, lysis of the host, and phage RNA replication, respectively. The maturation protein is required for the typical attachment of the phage to the side of the bacterial pili. It accompanies the viral RNA into the cell.

    \ \ ' '3828' 'IPR006528' '\

    This group of sequences is identified by a region of about 110 amino acids found exclusively in phage-related proteins, internally or toward the C terminus. One member, gp7 of phage SPP1, appears to be involved in head morphogenesis.

    \ ' '3829' 'IPR006429' '\

    This group of sequences represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 kDa.

    \ ' '3830' 'IPR006428' '\

    This group of sequences represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 kDa.

    \ ' '3831' 'IPR006450' '\

    This group of sequences represents small (~100 amino acids) proteins found in phage and in putative prophage regions of a number of bacterial genomes. The function of these sequences is unknown.

    \ ' '3832' 'IPR006497' '\

    This set of protein sequences, defined by an N-terminal domain, represent phage lambda replication protein O and other homologous phage and prophage proteins.

    \ ' '3833' 'IPR007067' '\ This family represents the tail sheath protein Gp18 of Bacteriophage T4 and its homologues.\ ' '3834' 'IPR006724' '\

    This family contains a major tail protein from phage.

    \ ' '3835' 'IPR006487' '\

    This group of sequences represent members of the family of phage lambda minor tail protein L.

    \ ' '3836' 'IPR006848' '\ This family represents a number of putative transcription repressor proteins found in several Lactococcus bacteriophages. Horizontal transfer may account for the presence of similar proteins in Lactococcus PUBMED:11337471.\ ' '3837' 'IPR006498' '\

    The tails of some phage are contractile. These sequences represent the tail tube, or tail core, protein of the contractile tail of phage P2, and homologous proteins from other phage.

    \ ' '3838' 'IPR006516' '\

    This set of sequences represent a family of phage and plasmid replication proteins. In bacteriophage IKe and related phage, the full-length protein is designated gene II protein. A much shorter protein of unknown function, translated from a conserved in-frame alternative initiator, is designated gene X protein. Members of this family also include plasmid replication proteins.

    \ ' '3839' 'IPR004188' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases PUBMED:.

    \

    Phenylalanyl-tRNA synthetase from Thermus thermophilus has an alpha 2 beta 2 type quaternary structure and is one of the most complicated members of the synthetase family. Identification of phenylalanyl-tRNA synthetase as a member of class II aaRSs was based only on sequence alignment of the small alpha-subunit with other synthetases PUBMED:8199244. This is the N-terminal domain of phenylalanyl-tRNA synthetase.

    \ ' '3840' 'IPR003430' '\ Bacterial phenol hydroxylase () is a multicomponent enzyme that catabolises phenol and some of its methylated derivatives. This family contains both the P1 and P3 polypeptides of phenol hydroxylase and the alpha and beta chains of methane\ hydroxylase protein A. Methane hydroxylase protein A () is responsible for the initial oxygenation of methane to methanol in methanotrophs. It also catalyses the monohydroxylation of a variety of unactivated alkenes, alicyclic, aromatic and heterocyclic compounds. Also included in this family is toluene-4-monooxygenase system protein A (), which hydroxylates toluene to form p-cresol.\ ' '3841' 'IPR006756' '\ Under aerobic conditions, phenol is usually hydroxylated to catechol and degraded via the meta or ortho pathways. Two types of phenol hydroxylase are known: one is a multi-component enzyme; the other is a single-component monooxygenase. This region is found in both types of enzymes PUBMED:2254258, PUBMED:11571188.\ ' '3842' 'IPR006766' '\

    This entry represents a family of conserved plant proteins. A conserved region in these proteins was identified in a phosphate-induced protein of unknown function PUBMED:10189698.

    \ ' '3843' 'IPR013988' '\

    The PhnA protein family includes the uncharacterised Escherichia coli protein PhnA and its homologues. The E. coli phnA gene is part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage PUBMED:2155230. The protein is not related to the characterised phosphonoacetate hydrolase designated PhnA PUBMED:9300819.

    \ \

    This entry represents the C-terminal domain of PhnA.

    \ ' '3844' 'IPR003714' '\ PhoH is a cytoplasmic protein and predicted ATPase that is induced by phosphate starvation and belongs to the phosphate regulon (pho) in Escherichia coli PUBMED:8444794.\ ' '3845' 'IPR001200' '\ The outer and inner segments of vertebrate rod photoreceptor cells contain phosducin,\ a soluble phosphoprotein that complexes with the beta/gamma-subunits of the GTP-binding\ protein, transducin. Light-induced changes in cyclic nucleotide levels modulate the\ phosphorylation of phosducin by protein kinase A PUBMED:2203790. The protein is thought to participate in the regulation of\ visual phototransduction or in the integration of photoreceptor metabolism. Similar\ proteins have been isolated from the pineal gland and it is believed that the functional\ role of the protein is the same in both retina and pineal gland PUBMED:2210381.\ ' '3846' 'IPR001211' '\

    Phospholipase A2 () (PLA2) is a small lipolytic enzyme that releases fatty acids from the second carbon group of glycerol. It is involved in a number of physiologically important cellular processes, such as the liberation of arachidonic\ acid from membrane phospholipids PUBMED:7664098. It plays a pivotal role in the biosynthesis of prostaglandin and other\ mediators of inflammation. PLA2 has four to seven disulphide bonds and binds a calcium\ ion that is essential for activity. Within the active enzyme, the alpha amino group is\ involved in a conserved hydrogen-bonding network linking the N-terminal region to\ the active site. The side chains of two conserved residues, His and Asp, participate in\ the catalytic network.

    \ \

    Many PLA2\'s are widely distributed in snakes, lizards, bees and mammals. In mammals\ there are at least four forms: pancreatic, membrane-associated as well as two less\ well characterised forms. The venom of most snakes contains multiple forms of PLA2.\ Some of them are presynaptic neurotoxins which inhibit neuromuscular transmission by\ blocking acetylcholine release from the nerve termini.

    \

    Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee\ King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E.,\ Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species\ names have identical designations, they are discriminated from one another\ by adding one or more letters (as necessary) to each species designation.

    \

    The allergens in this family include allergens with the following designations: Api m 1.

    \ ' '3847' 'IPR007312' '\ This entry includes both bacterial phospholipase C enzymes () and eukaryotic acid phosphatases .\ ' '3848' 'IPR005984' '\

    Phospholamban (PLB) is a small protein (52 amino acids) that regulates the affinity of the cardiac sarcoplasmic reticulum Ca2+-ATPase (SERCA2a) for calcium. PLB is present in cardiac myocytes, in slow-twitch and smooth muscle and is expressed also in aorta endothelial cells in which it could play a role in tissue relaxation. The phosphorylation/dephosphorylation of phospholamban removes and restores, respectively, its inhibitory activity on SERCA2a. It has in fact been shown that phospholamban, in its non-phosphorylated form, binds to SERCA2a and inhibits this pump by lowering its affinity for Ca2+, whereas the phosphorylated form does not exert the inhibition. PLB is phosphorylated at two sites, namely at Ser-16 for a\ cAMP-dependent phosphokinase and at Thr-17 for a Ca2+/calmodulin-dependent phosphokinase, phosphorylation at Ser-16 being a prerequisite for the phosphorylation at Thr-17.

    The structure of a 36-amino-acid-long N-terminal fragment of human phospholamban phosphorylated at Ser-16 and Thr-17 and Cys36Ser mutated was determined from nuclear magnetic resonance data. The peptide assumes a conformation characterised by two alpha-helices connected by an irregular strand, which\ comprises the amino acids from Arg-13 to Pro-21. The proline is in a trans conformation. The two phosphate groups on Ser-16 and Thr-17 are shown to interact preferably with the side chains of Arg-14 and Arg-13, respectively PUBMED:12080135.

    \ \ ' '3849' 'IPR000224' '\ This entry contains phosphoprotein from vesiculoviruses, which are ssRNA negative-strand rhabdoviruses. It is\ known as the phosphoprotein or P protein PUBMED:9375014,\ PUBMED:9343167. This protein may be part of the RNA\ dependent RNA polymerase complex PUBMED:9375014. The\ phosphorylation states of this protein may regulate the transcription\ and replication complexes PUBMED:9343167.\ ' '3850' 'IPR000811' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 35 \ comprises enzymes with only one known activity; glycogen and starch phosphorylase ().

    \ \

    The main role of glycogen phosphorylase (GPase) is to provide phosphorylated glucose molecules (G-1-P) PUBMED:2182117. GPase is a highly regulated allosteric enzyme. The net effect of the regulatory site allows the enzyme to operate at a variety of rates; the enzyme is not simply regulated as "on" or "off", but rather it can be thought of as being set to operate at an ideal rate based on changing conditions in the cell. The most important allosteric effector is the phosphate molecule covalently attached to Ser14.\ This switches GPase from the b (inactive) state to the a (active) state. Upon phosphorylation, GPase attains about 80% of its Vmax. When the enzyme is not phosphorylated, GPase activity is practically non-existent at low AMP levels PUBMED:.

    \

    \ There is some apparent controversy as to the structure of GPase. All sources agree that the enzyme is multimeric, but they differ as to whether it is a tetramer or a dimer. Apparently, GPase (in the a\ form) forms tetramers in the crystal form. The consensus seems to be that, regardless of the a or b form, GPase functions as a dimer in vivo PUBMED:2667896. The GPase monomer is best described as consisting of two domains, an N-terminal domain and a C-terminal domain PUBMED:8798388. The C-terminal domain is often referred to as the catalytic domain. It consists of a beta-sheet core surrounded by layers of helical segments PUBMED:2667896. The vitamin cofactor pyridoxal phosphate (PLP) is covalently attached to the amino acid backbone. The N-terminal domain also consists of a central beta-sheet core and is surrounded by layers of helical segments. The N-terminal domain contains different allosteric effector sites to regulate the enzyme.

    \

    Bacterial phosphorylases follow the same catalytic mechanisms as their plant and animal counterparts, but differ considerably in terms of their substrate specificity and regulation. The catalytic domains are highly conserved while the regulatory sites are only poorly conserved. For maltodextrin phosphorylase from Escherichia coli the physiological role of the enzyme in the utilisation of maltodextrins is known in detail; that of all the other bacterial phosphorylases is still unclear. Roles in regulation of endogenous glycogen metabolism in periods of starvation, and sporulation, stress response or quick adaptation to changing environments are possible PUBMED:10077830.

    \ ' '3851' 'IPR000484' '\

    The photosynthetic apparatus in non-oxygenic bacteria consists of light-harvesting (LH) protein-pigment complexes LH1 and LH2, which use carotenoid and bacteriochlorophyll as primary donors PUBMED:11005826. LH1 acts as the energy collection hub, temporarily storing the energy before its transfer to the photosynthetic reaction centre (RC) PUBMED:15329728. Electrons are transferred from the primary donor via an intermediate acceptor (bacteriopheophytin) to the primary acceptor (quinone Qa), and finally to the secondary acceptor (quinone Qb), resulting in the formation of ubiquinol QbH2. RC uses the excitation energy to shuttle electrons across the membrane, transferring them via ubiquinol to the cytochrome bc1 complex in order to establish a proton gradient across the membrane, which is used by ATP synthetase to form ATP PUBMED:16931113, PUBMED:12872158, PUBMED:2676514.

    \ \

    The core complex is anchored in the cell membrane, consisting of one unit of RC surrounded by LH1; in some species there may be additional subunits PUBMED:11095707. RC consists of three subunits: L (light), M (medium), and H (heavy). Subunits L and M provide the scaffolding for the chromophore, while subunit H contains a cytoplasmic domain PUBMED:8027023. In Rhodopseudomonas viridis, there is also a non-membranous tetrahaem cytochrome (4Hcyt) subunit on the periplasmic surface.

    \ \

    This entry describes the photosynthetic reaction centre L and M subunits, and the homologous D1 (PsbA) and D2 (PsbD) photosystem II (PSII) reaction centre proteins from cyanobacteria, algae and plants. The D1 and D2 proteins only show approximately 15% sequence homology with the L and M subunits; however, the conserved amino acids correspond to the binding sites of the photochemically active cofactors. As a result, the reaction centres (RCs) of purple photosynthetic bacteria and PSII display considerable structural similarity in terms of cofactor organisation.

    \

    The D1 and D2 proteins occur as a heterodimer that forms the reaction core of PSII, a multisubunit protein-pigment complex containing over forty different cofactors, which are anchored in the cell membrane in cyanobacteria, and in the thylakoid membrane in algae and plants. Upon absorption of light energy, the D1/D2 heterodimer undergoes charge separation, and the electrons are transferred from the primary donor (chlorophyll a) via pheophytin to the primary acceptor quinone Qa, then to the secondary acceptor Qb, which, as in the bacterial system, culminates in the production of ATP. However, PSII has an additional function over the bacterial system. At the oxidising side of PSII, a redox-active residue in the D1 protein reduces P680, the oxidised tyrosine then withdrawing electrons from a manganese cluster, which in turn withdraws electrons from water, leading to the splitting of water and the formation of molecular oxygen. PSII thus provides a source of electrons that can be used by photosystem I to produce the reducing power (NADPH) required to convert CO2 to glucose PUBMED:12518057, PUBMED:14871485.

    \ \ ' '3852' 'IPR008170' '\

    This family contains phosphate regulatory proteins including PhoU. PhoU proteins are known to play a role in the regulation of phosphate uptake. The PhoU domain is composed of a three-helix bundle PUBMED:15716271. The PhoU protein contains two copies of this domain. The domain binds to an iron cluster via its conserved E/DXXXD motif. Deletion of PhoU activates constitutive expression of the phosphate ABC transporter and allows phosphate transport, but causes a growth defect, suggesting that the protein has some secondary function PUBMED:8226621.

    \ ' '3853' 'IPR012128' '\

    Cyanobacteria and red algae harvest light through water-soluble complexes, called phycobilisomes, which are attached to the outer face of the thylakoid membrane PUBMED:15238265. These complexes are capable of transferring the absorbed energy to the photosynthetic reaction centre with greater than 95% efficiency. Phycobilisomes contain various photosynthetic light harvesting proteins known as biliproteins, and linker proteins which help assemble the structure. The two main structural elements of the complex are a core located near the photosynthetic reaction centre, and rods attached to this core. Allophycocyanin is the major component of the core, while the rods contain phycocyanins, phycoerythrins and linker proteins. The rod biliproteins harvest photons, with the excitation energy being passed through the rods into the allophycocyanin in the core. Other core biliproteins subsequently pass this energy to chlorophyll within the thylakoid membrane.

    \ \

    This entry represents the alpha and beta subunits found in biliproteins from cyanobacteria and red algae. Structural studies indicate that the basic structural unit of most biliproteins is a heterodimer composed of these alpha and beta subunits PUBMED:10388620, PUBMED:11134927, PUBMED:11463658, PUBMED:7783202. The full protein is a ring-like trimer assembly of these heterodimers. Each subunit of the heterodimer has eight helices and binds chromophores through thioether bonds formed at particular cysteine residues. These chromophores, also known as bilins, are open-chain tetrapyrroles whose number and type vary with the particular biliprotein, e.g. R-phycoerythrin binds five phycoerythrobilins per heterodimer, while allophycocyanin binds two phycocyanobilins per heterodimer.

    \ \ ' '3854' 'IPR003431' '\

    Phytase () (phytate 3-phosphatase) is a secreted enzyme which hydrolyses phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity (PUBMED:9603817) and has been shown to have a six-bladed propeller folding architecture (PUBMED:10655618).

    \ ' '3855' 'IPR006918' '\

    In Arabidopsis thaliana (Mouse-ear cress) members of the family are all extracellular glycosyl-phosphatidyl inositol-anchored proteins (GPI-linked) PUBMED:15849274. The type example of the family is COBRA () and the family is generally annotated as COBRA-like (COBL). COBRA is involved in determining the orientation of cell expansion, probably by playing an important role in cellulose deposition. It may act by recruiting cellulose synthesizing complexes to discrete positions on the cell surface. Some members of this family are annotated as phytochelatin synthase, but these annotations are incorrect PUBMED:15375203.

    \ ' '3856' 'IPR013515' '\

    Phytochrome belongs to a family of plant photoreceptors that mediate physiological and \ developmental responses to changes in red and far-red light conditions PUBMED:1812812.\ The protein undergoes reversible photochemical conversion between a biologically-inactive \ red light-absorbing form and the active far-red light-absorbing form. Phytochrome is a \ dimer of identical 124 kDa subunits, each of which contains a linear tetrapyrrole \ chromophore, covalently-attached via a Cys residue.\

    \

    This domain represents a region specific to phytochrome proteins.

    \ ' '3857' 'IPR004964' '\ The phenazine biosynthesis proteins A and B are involved in the biosynthesis of this antibiotic. Phenazine is a nitrogen-containing heterocyclic molecule with important implications in virulence, competition and biological control.\ ' '3858' 'IPR000909' '\ Phosphatidylinositol-specific phospholipase C (), a eukaryotic intracellular enzyme, plays \ an important role in signal transduction processes PUBMED:1849017. It catalyzes the hydrolysis of \ 1-phosphatidyl-D-myo-inositol-4,5-bisphosphate into the second messenger molecules diacylglycerol \ and inositol-1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation \ and binding of regulatory proteins PUBMED:1419362, PUBMED:1319994, PUBMED:1335185. In mammals, there are at \ least 6 different isoforms of PI-PLC, which differ in their domain structure, their regulation, and their \ tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC. All eukaryotic PI-PLCs \ contain two regions of homology, sometimes referred to as the \'X-box\' and \'Y-box\'. The order of these two \ regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance\ between these two regions is only 50-100 residues but in the gamma isoforms one PH domain, two SH2 domains, \ and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been \ shown to be important for the catalytic activity. By profile analysis, we could show that sequences with \ significant similarity to the X-box domain occur also in prokaryotic and trypanosome PI-specific \ phospholipases C. Apart from this region, the prokaryotic enzymes show no similarity to their eukaryotic \ counterparts.\ ' '3859' 'IPR001711' '\

    Phosphatidylinositol-specific phospholipase C (), a eukaryotic intracellular enzyme, plays an important role in signal transduction processes PUBMED:1849017 (see ). It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol-1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins PUBMED:1419362, PUBMED:1319994, PUBMED:1335185.

    \

    In mammals, there are at least 6 different isoforms of PI-PLC, which differ in their domain structure, their regulation, and their tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC.

    \

    All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as \'X-box\' (see ) and \'Y-box\'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues but in the gamma isoforms one PH domain, two SH2 domains, and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been shown to be important for the catalytic activity. At the C-terminal of the Y-box, there is a C2 domain (see ) possibly involved in Ca-dependent membrane attachment.

    \ ' '3860' 'IPR003113' '\ This is the region of the p110 phosphatidylinositol 3-kinase (PI3-Kinase) that binds the p85 subunit.\ ' '3861' 'IPR000341' '\

    Phosphatidylinositol 3-kinase (PI3K) () is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. A subset of PI3Ks has the capacity to bind and be activated by the GTP-bound small GTPase p21Ras (Ras). PI3Ks are recognised as one of the principal effectors of Ras signalling to the cell-cycle control machinery.

    In the structure of the Ras-PI3K gamma complex, contacts between the two molecules are made primarily via the so-called switch I region of Ras and the PI3K RBD. The RBD fold comprises a five-stranded mixed beta-sheet, flanked by two alpha-helices. Interaction between Ras and the PI3K RBD is primarily polar in character and, as characterised by kinetic measurements, is reversible and transient PUBMED:12151228.

    \ ' '3862' 'IPR001263' '\

    Phosphatidylinositol 3-kinase (PI3-kinase) () is an enzyme\ that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol\ ring. The role of the accessory domain of phosphoinositide 3-kinase (PI3-kinase) \ is unclear. It may be involved in substrate presentation \ PUBMED:8248783.

    \ ' '3863' 'IPR003138' '\

    VP1, VP2, VP3 and VP4 are the four basic units that form the icosahedral coat of picornaviruses. Five symmetry-related N termini of coat protein VP4 form a ten-stranded, antiparallel beta barrel around the base of the icosahedral fivefold axis PUBMED:9083115.

    \ ' '3864' 'IPR000081' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin PUBMED:8164744, the type example for clan PA.

    \ \

    Picornaviral proteins are expressed as a single polyprotein\ which is cleaved by the viral 3C cysteine protease PUBMED:9460917. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.\

    \ ' '3865' 'IPR002527' '\

    Poliovirus infection leads to drastic alterations in membrane permeability late during infection. Proteins 2B and 2BC enhance membrane permeability PUBMED:9218794, PUBMED:8798506.

    \ ' '3866' 'IPR016147' '\

    Most Gram-negative bacteria possess a supramolecular structure - the pili - on their surface, which mediates attachment to specific receptors. Many interactive subunits are required to assemble pili, but their assembly only takes place after translocation across the cytoplasmic membrane. Periplasmic chaperones assist pili assembly by binding to the subunits, thereby preventing premature aggregation PUBMED:8670884, PUBMED:1683764. Pili chaperones are structurally, and possibly evolutionarily, related to the immunoglobulin superfamily PUBMED:1348692, PUBMED:17082819: they contain two globular domains, with a topology identical to an immunoglobulin fold.

    \

    This entry represents the N-terminal domain of pili assembly chaperone, and has a beta-sandwich fold consisting of seven strands in two sheets with a Greek key topology.

    \ ' '3867' 'IPR016148' '\

    Most Gram-negative bacteria possess a supramolecular structure - the pili - on their surface, which mediates attachment to specific receptors. Many interactive subunits are required to assemble pili, but their assembly only takes place after translocation across the cytoplasmic membrane. Periplasmic chaperones assist pili assembly by binding to the subunits, thereby preventing premature aggregation PUBMED:8670884, PUBMED:1683764. Pili chaperones are structurally, and possibly evolutionarily, related to the immunoglobulin superfamily PUBMED:1348692, PUBMED:17082819: they contain two globular domains, with a topology identical to an immunoglobulin fold.

    \

    This entry represents the C-terminal domain of pili assembly chaperone, and has a beta-sandwich fold consisting of eight strands in two sheets with a Greek key topology.

    \ ' '3868' 'IPR001082' '\ Pilin is a subunit of the pilus, a polar flexible filament, which consists\ of a single polypeptide chain arranged in a helical configuration of five\ subunits per turn. Gram-negative bacteria produce pilin which is characterised\ by the presence of a very short leader peptide of 6 to 7 residues, followed by\ a methylated N-terminal phenylalanine residue and by a highly conserved sequence\ of about 24 hydrophobic residues, of the NMePhe type pilin PUBMED:2898203, PUBMED:3118043.\ ' '3869' 'IPR007813' '\

    PilN is a plasmid-encoded lipoprotein that localises to the outer membrane of bacteria and is part of a thin pilus required only for liquid mating PUBMED:10686134.

    \ ' '3870' 'IPR007445' '\ PilO proteins are involved in the assembly of pilin. However, the precise function of this family of proteins is not known.\ ' '3871' 'IPR007446' '\ The PilP family are periplasmic proteins involved in the biogenesis of type IV pili PUBMED:11751821.\ ' '3872' 'IPR006787' '\

    This conserved region is found at the N-terminus of the member proteins. It is located adjacent and N-terminal to the pinin/SDK/memA domain . Members of this family have very varied localisations within the eukaryotic cell. Pinin is known to localise at the desmosomes and is implicated in anchoring intermediate filaments to the desmosomal plaque PUBMED:8922384, PUBMED:9447706. SDK2/3 is a dynamically localised nuclear protein thought to be involved in modulation of alternative pre-mRNA splicing PUBMED:12051732. MemA is a tumour marker preferentially expressed in human melanoma cell lines. A common feature of the members of this family is that they may all participate in regulating protein-protein interactions PUBMED:10645008.

    \ \ ' '3873' 'IPR015793' '\

    Pyruvate kinase () (PK) catalyses the final step in glycolysis PUBMED:2379684, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP: phosphoenolpyruvate + ADP = pyruvate + ATP.

    \ \

    The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have a single form; certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.

    \

    PK helps control the rate of glycolysis, along with phosphofructokinase () and hexokinase (). PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions PUBMED:12798932. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which triggers covalent modification of the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.

    \

    The structures of several pyruvate kinases from various organisms have been determined PUBMED:11960989, PUBMED:10751408. The protein comprises three or four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.

    \ \

    This entry represents the two barrel domains, the beta/alpha-barrel, and the beta-barrel inserted within it.

    \ ' '3874' 'IPR015794' '\

    Pyruvate kinase () (PK) catalyses the final step in glycolysis PUBMED:2379684, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP: phosphoenolpyruvate + ADP = pyruvate + ATP.

    \ \

    The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have a single form; certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.

    \

    PK helps control the rate of glycolysis, along with phosphofructokinase () and hexokinase (). PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions PUBMED:12798932. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which triggers covalent modification of the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.

    \

    The structures of several pyruvate kinases from various organisms have been determined PUBMED:11960989, PUBMED:10751408. The protein comprises three or four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.

    \ \

    This entry represents the 3-layer alpha/beta/alpha sandwich domain.

    \ ' '3875' 'IPR004171' '\ Members of this family are extremely potent competitive inhibitors of cAMP-dependent protein kinase activity. These proteins interact with the catalytic subunit of the enzyme after the cAMP-induced dissociation of its regulatory chains.\ ' '3876' 'IPR003102' '\ The nuclear factor CREB activates transcription of target genes in part through direct interactions with the KIX domain of the coactivator CBP in a phosphorylation-dependent manner. CBP and P300 bind to the pKID (phosphorylated kinase-inducible-domain) domain of CREB PUBMED:9413984.\ ' '3877' 'IPR001573' '\

    Cell signalling mediated via GPCRs (G-protein-coupled receptors) involves the assembly of receptors, G-proteins, effectors and downstream elements into complexes that approach in design \'solid-state\' signalling devices. Scaffold molecules, such as the AKAPs (A-kinase anchoring proteins), were discovered more than a decade ago and represent dynamic platforms, enabling multivalent signalling PUBMED:12546660. This family of functionally related proteins is classified on the basis of their ability to associate with the PKA holoenzyme inside cells. A shared property of most, if not all, AKAPs is the ability to form multivalent signal transduction complexes. \ \

    Each anchoring protein contains at least two functional motifs PUBMED:8968497. The conserved PKA binding motif forms an amphipathic helix of 14-18 residues that interacts with hydrophobic determinants located in the extreme N-terminus of the regulatory subunit dimer. The subcellular address of each AKAP is encoded by a unique targeting motif. Gravin, an autoantigen recognised by serum from myasthenia gravis patients, contains 3 repeats of this domain PUBMED:9000000.

    \ \

    The WSK motif is a short motif, named after three conserved residues found in the WXSXK motif, found in protein kinase A anchoring proteins.

    \ ' '3878' 'IPR017442' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \ Eukaryotic protein kinases PUBMED:12734000, PUBMED:7768349, PUBMED:1835513, PUBMED:1956325, PUBMED:3291115 are enzymes\ that belong to a very extensive family of proteins which share a conserved catalytic core common with\ both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the\ catalytic domain of protein kinases. In the N-terminal extremity of the catalytic domain there is a\ glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to be involved\ in ATP binding. In the central part of the catalytic domain there is a conserved aspartic acid residue\ which is important for the catalytic activity of the enzyme PUBMED:1862342. This entry includes protein kinases from eukaryotes and viruses and may include some bacterial hits too.\ ' '3879' 'IPR017892' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \ This domain is found in a large variety of protein kinases with different functions and dependencies. Protein kinase C, for example, is a calcium-activated, phospholipid-dependent serine- and threonine-specific enzyme. It is activated by diacylglycerol and, in turn, phosphorylates a range of cellular proteins. This domain is most often found associated with .\ ' '3880' 'IPR003187' '\

    Outer membrane phospholipase A (OMPLA) is an integral membrane phospholipase, which is present in many\ Gram-negative bacteria and has a broad substrate specificity . The role of OMPLA has been most thoroughly studied in Escherichia coli,\ where it participates in the secretion of bacteriocins. Bacteriocin release is triggered by a lysis\ protein (bacteriocin release protein or BRP), followed by a phospholipase dependent accumulation\ of lysophospholipids and free fatty acids in the outer membrane PUBMED:12615538. The reaction products enhance the\ permeability of the outer membrane, which allows the semispecific secretion of bacteriocins. One speculative function of OMPLA is related to organic solvent tolerance in bacteria.

    Structurally, it consists of a\ 12-stranded antiparallel beta-barrel with a convex and a flat side. The active site residues are exposed\ on the exterior of the flat face of the beta-barrel. The activity of the enzyme is regulated by reversible\ dimerisation. Dimer interactions occur exclusively in the\ membrane-embedded parts of the flat side of the beta-barrel, with polar residues embedded in an\ apolar environment forming the key interactions. The active site His and Ser residues are located at the exterior of the beta-barrel, at the outer\ leaflet side of the membrane. This location indicates that under normal conditions the substrate and\ the active site are physically separated, since in E. coli phospholipids are exclusively located in the\ inner leaflet of the outer membrane.

    \ ' '3881' 'IPR002642' '\ This family consists of lysophospholipase / phospholipase B and cytosolic phospholipase A2 which also has a C2 domain . Phospholipase B enzymes catalyse the release of fatty acids from lysophospholipids and are capable in vitro of hydrolyzing all phospholipids extractable from yeast cells PUBMED:8027085. Cytosolic phospholipase A2 associates with natural membranes in response to physiological increases in Ca2+ and selectively hydrolyses arachidonyl phospholipids PUBMED:8051052. The aligned region corresponds to the carboxy-terminal Ca2+-independent catalytic domain of the protein as discussed in PUBMED:8051052.\ ' '3882' 'IPR004126' '\ Proteins in this entry inhibit basic phospholipase A2 isozymes in snake venom PUBMED:9395334.\ ' '3883' 'IPR001010' '\ Thionins are small, basic plant proteins, 45 to 50 amino acids in length, which include three or four conserved disulphide linkages. The proteins are toxic to animal cells, presumably attacking the cell membrane and rendering it permeable: this results in the inhibition of sugar uptake and allows potassium and phosphate ions, proteins, and nucleotides to leak from cells PUBMED:3985614. Thionins are mainly found in seeds where they may act as a defence against consumption by animals. A barley (Hordeum vulgare) leaf thionin that is highly toxic to plant pathogens and is involved in the mechanism of plant defence against microbial infections has also been identified PUBMED:1377959. The hydrophobic protein crambin from the Abyssinian kale (Crambe abyssinica) is also a member of the thionin family PUBMED:3985614.\ ' '3884' 'IPR001896' '\ This family of membrane/coat proteins is found in a number of different ssRNA plant virus families that include Potexvirus, Hordeivirus and Carlavirus.\ ' '3885' 'IPR006904' '\

    These sequences are a family of uncharacterised hypothetical proteins restricted to eukaryotes. represents a sequence from Nicotiana tabacum (Common tobacco) which is upregulated in response to TMV infection.

    \ ' '3886' 'IPR007711' '\ Several plasmids with proteic killer gene systems have been reported. All of them encode a stable toxin and an unstable antidote. Upon loss of the plasmid, the less stable inhibitor is inactivated more rapidly than the toxin, allowing the toxin to be activated. The activation of those systems results in cell filamentation and cessation of viable cell production. It has been verified that both the stable killer and the unstable inhibitor of the systems are short polypeptides. This family corresponds to the toxin.\ ' '3887' 'IPR002596' '\ This family consists of conserved hypothetical proteins from Borrelia burgdorferi (Lyme disease spirochete)\ , some of which are putative plasmid partition proteins PUBMED:9695920.\ ' '3888' 'IPR007712' '\ Members of this family are involved in plasmid stabilisation. The exact molecular function of this protein is not known.\ ' '3889' 'IPR001101' '\

    Plectin may have a role in cross-linking intermediate filaments, in inter-linking intermediate filaments with microtubules and microfilaments and in anchoring intermediate filaments to the plasma and nuclear membranes. Plectin is recruited into hemidesmosomes, multiprotein complexes that facilitate adhesion of epithelia to the basement membrane, thereby providing linkage between the intracellular keratin filaments to the laminins of the extracellular matrix. Plectin binds to hemidesmosomes through association of its actin-binding domain with the first pair of fibronectin type III repeats and a small part of the connecting segment of the integrin-beta4 subunit, the latter (integrin-alpha6,beta4) acting as a receptor for the extracellular matrix component laminin-5.

    \

    The plectin repeat is also seen in the cell adhesion junction plaque proteins, desmoplakin, envoplakin, and bullous pemphigoid antigen. The domains in plakins show considerable sequence homology. The N-terminus consists of a plakin domain containing a number of subdomains with high alpha-helical content, while the central coiled-coil domain is composed of heptad repeats involved in the dimerisation of plakin, and the C-terminus contains one or more homologous repeat sequences referred to as plectin repeats PUBMED:14668477. This entry represents the plectin repeats found in the C-terminus of plakin proteins.

    \ ' '3890' 'IPR002510' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The peptidase families associated with clan U- have an unknown catalytic mechanism as the protein fold of the active site domain and the active site residues have not been reported.

    \

    This group of peptidases belongs to MEROPS peptidase family U62 (clan U-). The type example is microcin-processing peptidase 1 from Escherichia coli, which is the product of the gene PmbA. It has been suggested that the pmbA gene product acts to inhibit the interaction between the letD protein and the A subunit of DNA gyrase. The letA (ccdA) and letD (ccdB) genes of the F plasmid, located just outside the sequence essential for F-plasmid replication, contribute to stable maintenance of the plasmid in E. coli\ cells. The letD gene product acts to inhibit partitioning of chromosomal DNA and cell\ division by inhibiting DNA gyrase activity, whereas the letA gene product acts to reverse the\ inhibitory activity of the letD gene product PUBMED:8604133.

    \ \ It has also been proposed that PmbA facilitates the secretion of microcin B17 (MccB17) by completing its maturation PUBMED:2082149. Microcin B17 (MccB17) is a peptide antibiotic produced by E. coli strains harbouring plasmid pMccB17.\

    \ ' '3892' 'IPR005002' '\ This enzyme () is involved in the synthesis of the GDP-mannose and dolichol-phosphate-mannose required for a number of critical mannosyl transfer reactions.\ ' '3893' 'IPR005599' '\

    Members of this family are mannosyltransferase enzymes PUBMED:9576863, PUBMED:10954751. At least some members are localised in the endoplasmic reticulum and involved in GPI anchor biosynthesis PUBMED:12200473, PUBMED:12030331. In yeast, SMP3 (YOR149C) has been implicated in plasmid stability PUBMED:2005867.

    \ ' '3894' 'IPR004031' '\ Several vertebrate small integral membrane glycoproteins are evolutionarily related PUBMED:7499407, PUBMED:7499420, PUBMED:8996089, including eye lens specific membrane protein 20 \ (MP20 or MP19); epithelial membrane protein-1 (EMP-1), which is also known as tumor-associated\ membrane protein (TMP) or as squamous cell-specific protein Cl-20; epithelial membrane protein-2 \ (EMP-2), which is also known as XMP; epithelial membrane protein-3 (EMP-3), also known as YMP;\ and peripheral myelin protein 22 (PMP-22), which is expressed in many tissues but mainly by \ Schwann cells as a component of myelin of the peripheral nervous system (PNS). PMP-22 probably \ plays a role both in myelinization and in cell proliferation. Mutations affecting PMP-22 are \ associated with hereditary motor and sensory neuropathies such as Charcot-Marie-Tooth disease \ type 1A (CMT-1A) in human or the trembler phenotype in mice. The proteins of this family are \ about 160 to 173 amino acid residues in size, and contain four transmembrane segments. PMP-22, \ EMP-1, -2 and -3 are highly similar, while MP20 is more distantly related. This family also includes the claudins, which are components of tight junctions.\ ' '3895' 'IPR002569' '\

    Peptide methionine sulphoxide reductase (Msr) reverses the inactivation of many proteins due to the oxidation of critical methionine residues by reducing methionine sulphoxide, Met(O), to methionine PUBMED:10841552. It is present in most living organisms, and the cognate structural gene belongs to the so-called minimum gene set PUBMED:8994848, PUBMED:8816789.

    \ \

    The domains MsrA and MsrB reduce different epimeric forms of methionine sulphoxide. This group represents MsrA, the crystal structure of which has been determined in a number of organisms. In Mycobacterium tuberculosis, the MsrA structure has been determined to 1.5 Angstrom resolution PUBMED:12837786. \ \ In contrast to the three catalytic cysteine residues found in previously characterised MsrA structures, M. tuberculosis MsrA represents a class containing only two functional cysteine residues. The overall structure shows no resemblance to the structures of MsrB () from other organisms, though the active sites show approximate mirror symmetry. In each case, conserved amino acid motifs mediate the stereo-specific recognition and reduction of the substrate.

    \ \

    In a number of pathogenic bacteria including Neisseria gonorrhoeae, the MsrA and MsrB domains are fused; the MsrA being N-terminal to MsrB. This arrangement is reversed in Treponema pallidum. In N. gonorrhoeae and Neisseria meningitidis a thioredoxin domain is fused to the N-terminus. This may function to reduce the active sites of the downstream MsrA and MsrB domains.

    \ ' '3896' 'IPR003342' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Dolichyl-phosphate-mannose-protein mannosyltransferase proteins belong to the glycosyltransferase family 39 () and are responsible for O-linked glycosylation of proteins. They catalyse the reaction:

    \

    The transfer of mannose to seryl and threonyl residues of secretory proteins is catalyzed by a family of protein mannosyltransferases in Saccharomyces cerevisiae coded for by seven genes (PMT1-7). Protein O-glycosylation is essential for cell wall rigidity and cell integrity and this protein modification is vital for S. cerevisiae PUBMED:8918452.

    \ ' '3897' 'IPR005056' '\ The matrix proteins of Pneumovirus are transcriptional processivity and antitermination factors and play a crucial role in viral assembly.\ ' '3898' 'IPR004930' '\ This family is the Pneumovirus nucleocapsid protein. It is the most abundant protein in the virion and an important element in conferring helical symmetry on the nucleoprotein core as well as interacting with the M protein during virion formation.\ ' '3899' 'IPR005099' '\ This non-structural protein is one of two found in pneumoviruses. The protein is about 140 amino acids in length. The NS1 protein appears to be important for\ efficient replication but not essential PUBMED:10982380. The NS1 protein has been shown by yeast two-hybrid to interact with the viral P protein PUBMED:10949949. This protein is also known as\ the 1C protein. It has also been shown that NS1 can potently inhibit transcription and RNA replication PUBMED:9445048.\ ' '3900' 'IPR003487' '\

    This family represents a phosphoprotein from Paramyxoviridae, which could be a putative RNA polymerase alpha subunit that may function in template binding PUBMED:7996153.

    \ ' '3901' 'IPR000845' '\

    Phosphorylases in this entry include:

    \

    \ ' '3902' 'IPR004003' '\ Alanine dehydrogenases () and pyridine nucleotide transhydrogenase () have been\ shown to share regions of similarity PUBMED:8439307. Alanine dehydrogenase catalyzes the NAD-dependent\ reversible reductive amination of pyruvate into alanine. Pyridine nucleotide transhydrogenase catalyzes\ the reduction of NADP+ to NADPH with the concomitant oxidation of NADH to NAD+. This enzyme is located\ in the plasma membrane of prokaryotes and in the inner membrane of the mitochondria of eukaryotes. The\ transhydrogenation between NADH and NADP is coupled with the translocation of a proton across the\ membrane. In prokaryotes the enzyme is composed of two different subunits, an alpha chain (gene pntA)\ and a beta chain (gene pntB), while in eukaryotes it is a single chain protein. The sequences of alanine\ dehydrogenase from several bacterial species are related to those of the alpha subunit of bacterial\ pyridine nucleotide transhydrogenase and of the N-terminal half of the eukaryotic enzyme. The two most\ conserved regions correspond respectively to the N-terminal extremity of these proteins and to a central\ glycine-rich region which is part of the NAD(H)-binding site.\ ' '3903' 'IPR016033' '\

    DP2 is the large subunit of a two-subunit novel archaebacterial replicative DNA polymerase first characterised for Pyrococcus furiosus. The structure of DP2 appears to be organised as a ~950 residue component separated from a ~300 residue component by a ~150 residue intein. The other subunit, DP1, has sequence similarity to the eukaryotic DNA polymerase delta small subunit.

    \ \

    This entry represents the N-terminal ~950 residue component of DP2.

    \ ' '3904' 'IPR002914' '\

    Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.

    \

    The allergens in this family include allergens with the following designations: Lol p 5, Pha a 5, Phl p 5, Phl p 6, Phl p 11 and Poa p 9.
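
    The designation rule described in this entry (first three letters of the genus, a space, the first letter of the species name, a space, and an Arabic number) can be sketched in Python. This is only an illustrative sketch: the function name allergen_designation is hypothetical, the capitalisation follows the designations listed in this entry, and the extra disambiguating letters used when two species would collide are not handled.

```python
def allergen_designation(genus: str, species: str, number: int) -> str:
    """Build a designation from genus, species and allergen number.

    Format: first three letters of the genus, a space, the first letter
    of the species name, a space, and an Arabic number.
    Assumption: no disambiguating letters are added for colliding names.
    """
    genus_part = genus.strip()[:3].capitalize()
    species_part = species.strip()[:1].lower()
    return f"{genus_part} {species_part} {number}"


# Species named in this entry: Phleum pratense and Poa pratensis.
for genus, species, number in [("Phleum", "pratense", 5), ("Poa", "pratensis", 9)]:
    print(allergen_designation(genus, species, number))
```

    For the two species named in this entry, the sketch reproduces the designations Phl p 5 and Poa p 9 listed above.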

    \

    Grass pollen allergens are one of the major causes of type I allergies (including allergic rhinoconjunctivitis, allergic bronchial asthma and hayfever), afflicting 15-20% of a genetically predisposed population PUBMED:1702432. The predicted molecular masses of the known pollen allergen proteins range from 28.3 to 37.8kDa PUBMED:1702432. Northern analysis indicates that expression of the genes is confined to pollen tissue. A low level of similarity is observed between the Phl p 5 allergens and the N-terminal sequences of Poa pratensis (Kentucky bluegrass) p 9 proteins PUBMED:2051020 (see ).

    \

    The N-terminal region of P. pratensis p 9 has been shown to possess epitopes that cross-react with the acidic group V allergens of Phleum pratense (Common timothy) PUBMED:2051020. Comparison of amino acid sequences of recombinant P. pratensis p 9 proteins with those of Lol p 5 isoallergens revealed a low level of similarity between the N-terminal sequences of these proteins PUBMED:2051020. A C-terminal region (), conserved in P. pratensis p 9 allergens, appears to contain epitopes unique to these proteins PUBMED:2051020.

    \ ' '3905' 'IPR006041' '\

    Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee\ King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E.,\ Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of\ the first three letters of the genus; a space; the first letter of the\ species name; a space and an arabic number. In the event that two species\ names have identical designations, they are discriminated from one another\ by adding one or more letters (as necessary) to each species designation.

    \

    The allergens in this family include allergens with the following designations: Ole e 1.

    \ \

    A number of plant pollen proteins, whose biological function is not yet\ known, are structurally related PUBMED:8404906.\ These proteins are most probably secreted and consist of about 145 residues.\ There are six cysteines\ which are conserved in the sequence of these proteins. They seem to be\ involved in disulphide bonds.

    \ ' '3906' 'IPR017450' '\

    A subgroup of serine/threonine protein kinases, Polo or Polo-like kinases play multiple roles during the cell cycle. Polo kinases are required at several key points\ through mitosis, starting from control of the G2/M transition through phosphorylation of Cdc25C and mitotic cyclins. Polo kinases are characterised by an amino terminal catalytic domain, and a carboxy terminal non-catalytic domain consisting of three blocks of conserved\ sequences known as polo boxes which form one single functional domain PUBMED:9914175. The domain is named after its founding member encoded by the polo gene of Drosophila melanogaster PUBMED:1660828. This domain of around 70 amino acids has been found in species ranging from yeast to mammals. Polo boxes appear to mediate interaction with multiple proteins through protein:protein interactions; some but not all of these proteins are substrates for the kinase domain of the molecule PUBMED:12615979.

    \

    The crystal structure of the polo domain of the murine protein, Sak, is dimeric,\ consisting of two alpha-helices and two six-stranded beta-sheets PUBMED:12352953. The topology of one polypeptide subunit of the\ dimer consists of, from its N- to C-terminus, an extended strand segment, five beta-strands, one alpha-helix (A) and a\ C-terminal beta-strand. Beta-strands from one\ subunit form a contiguous antiparallel beta-sheet with beta-strands from the second subunit. The two beta-sheets pack with a\ crossing angle of 110 degrees, orienting the hydrophobic surfaces\ inward and the hydrophilic surfaces outward. Helix A, which is\ colinear with beta-strand 6 of the same polypeptide, buries a large\ portion of the non-overlapping hydrophobic beta-sheet surfaces.\ Interactions involving helices A comprise a majority of the\ hydrophobic core structure and also the dimer interface.

    \

    Point mutations in the Polo box of the budding yeast Cdc5 protein abolish the ability of overexpressed Cdc5 to interact with the spindle poles and to organise cytokinetic structures PUBMED:10594031.

    \ ' '3907' 'IPR003715' '\ The extracellular polysaccharide colanic acid (CA) is produced by species of the family Enterobacteriaceae. In Escherichia coli (strain K12) the CA cluster comprises 19 genes. The wzx gene encodes a protein with multiple transmembrane segments that may function in export of the CA repeat unit from the cytoplasm into the periplasm in a process analogous to O-unit export. The CA gene clusters may be involved in the export of polysaccharide from the cell PUBMED:8759852.\ ' '3908' 'IPR002646' '\

    This group includes nucleic acid-independent RNA polymerases, such as polynucleotide adenylyltransferase (), which adds the poly(A) tail to mRNA. This group also includes the tRNA nucleotidyltransferase that adds the CCA sequence to the 3\' end of the tRNA.

    \ ' '3909' 'IPR002507' '\ Reovirus nonstructural protein sigma NS exhibits a ssRNA-binding activity and is thought to be involved in assembling the reovirus mRNAs for genome replication and virion morphogenesis. Various studies have been carried out to localize the RNA-binding site PUBMED:9343167. They suggest that the first 11 amino acids of sigma NS, which are predicted to form an amphipathic alpha-helix, are important for both ssRNA binding and formation of complexes larger than 7-9 S.\

    A number of other studies have attempted to identify and characterise the RNA-binding activities of sigma NS. A study of the Avian orthoreovirus sigma NS protein suggests that it binds to single-stranded RNA in a nucleotide sequence non-specific manner and is functionally similar to its counterpart specified by mammalian reovirus PUBMED:9634083.

    \ ' '3910' 'IPR001746' '\ These occlusion proteins are major components of the virus occlusion bodies, large proteinaceous structures (polyhedra) that protect the virus from the outside environment for extended periods until they are ingested by insect larvae. They occur in various viruses including the single nucleocapsid nuclear polyhedrosis viruses and granuloviruses.\ ' '3911' 'IPR005031' '\ Members of this family of enzymes from Streptomyces spp. are involved in polyketide (linear poly-beta-ketones) synthesis.\ ' '3912' 'IPR002643' '\ This family consists of the DNA-binding protein or agnoprotein from various polyomaviruses. This protein is highly basic and can bind single stranded and double stranded DNA PUBMED:6262654. Mutations in the agnoprotein produce smaller viral plaques, hence its function is not essential for growth in tissue culture cells but something has slowed in the normal replication cycle PUBMED:3027418. There is also evidence suggesting that the agnogene and agnoprotein act as regulators of structural protein synthesis PUBMED:3027418.\ ' '3913' 'IPR000662' '\

    This entry represents the major capsid protein VP1 (viral protein 1) from Polyomaviruses, such as Murine polyomavirus (strain P16 small-plaque) (MPyV) PUBMED:9628860. Polyomaviruses are dsDNA viruses with no RNA stage in their life cycle. The virus capsid is composed of 72 icosahedral units, each of which is composed of five copies of VP1. The virus attaches to the cell surface by recognition of oligosaccharides terminating in alpha(2,3)-linked sialic acid. The capsid protein VP1 forms a pentamer. The complete capsid is composed of 72 VP1 pentamers, with a minor capsid protein, VP2 or VP3, inserted into the centre of each pentamer like a hairpin. This structure restricts the exposure of internal proteins during viral entry. Polyomavirus coat assembly is rigorously controlled by chaperone-mediated assembly. During viral infection, the heat shock chaperone hsc70 binds VP1 and co-localises it in the nucleus, thereby regulating capsid assembly PUBMED:12928495.

    \ ' '3914' 'IPR001070' '\

    This family includes the VP2 and VP3 internal coat proteins from Polyomaviruses, which are small dsDNA tumour viruses. Their capsids contain 360 copies of the VP1 proteins arranged in 72 pentamers. This capsid encloses the internal proteins VP2 and VP3, as well as the viral DNA. A single copy of VP2 or VP3 associates with each VP1 pentamer. A\ crystal structure shows that the C-terminal region of the VP2/VP3 protein\ interacts with the VP1 pentamer PUBMED:9628860.
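    The copy numbers quoted above follow from simple arithmetic, sketched below. The constant and function names are illustrative assumptions; only the figures (72 pentamers, five VP1 per pentamer, one VP2 or VP3 per pentamer) come from the text.

```python
# Capsid stoichiometry as stated in the text; all names are illustrative only.
VP1_PENTAMERS = 72        # pentamers per capsid
VP1_PER_PENTAMER = 5      # VP1 copies per pentamer
MINOR_PER_PENTAMER = 1    # one VP2 or VP3 per pentamer


def capsid_copy_numbers(pentamers: int = VP1_PENTAMERS) -> dict:
    """Return the number of VP1 and minor (VP2 or VP3) proteins per capsid."""
    return {
        "VP1": pentamers * VP1_PER_PENTAMER,
        "VP2_or_VP3": pentamers * MINOR_PER_PENTAMER,
    }


print(capsid_copy_numbers())
```

    With the default values this gives 360 copies of VP1 and 72 copies of the minor proteins, consistent with the description.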

    \ ' '3915' 'IPR000092' '\ A variety of isoprenoid compounds are synthesized by various organisms. For\ example, in eukaryotes the isoprenoid biosynthetic pathway is responsible for\ the synthesis of a variety of end products including cholesterol, dolichol,\ ubiquinone or coenzyme Q. In bacteria this pathway leads to the synthesis of\ isopentenyl tRNA, isoprenoid quinones, and sugar carrier lipids. Among the\ enzymes that participate in that pathway are a number of polyprenyl\ synthetase enzymes which catalyze a 1\'-4 condensation between 5-carbon isoprene\ units.\ It has been shown PUBMED:2198286, PUBMED:2089044, PUBMED:1826006, PUBMED:1303794, PUBMED:1495965 that all the above enzymes share some regions of\ sequence similarity. Two of these regions are rich in aspartic-acid residues\ and could be involved in the catalytic mechanism and/or the binding of the\ substrates.\ ' '3916' 'IPR002797' '\ Members of this family are integral membrane proteins PUBMED:8118055, and many are implicated in the production\ of polysaccharide. The family includes RfbX, part of the O antigen biosynthesis\ operon PUBMED:7517390, and SpoVB from Bacillus subtilis (),\ which is involved in spore cortex biosynthesis PUBMED:1744050.\ ' '3917' 'IPR003869' '\ This domain is found in diverse bacterial polysaccharide biosynthesis proteins including the CapD protein from Staphylococcus aureus PUBMED:7961465, the WalL protein, mannosyl-transferase PUBMED:9079898, and several putative epimerases. The CapD protein is required for biosynthesis of type 1 capsular polysaccharide.\ ' '3918' 'IPR003684' '\ This family consists of porins from the alpha subdivision of Proteobacteria; the members of this family are related to Gram-negative porins PUBMED:1370281. The porins form large aqueous channels in the cell membrane, allowing the selective entry of hydrophilic compounds; this so-called \'molecular sieve\' is found in the cell walls of Gram-negative bacteria.\ ' '3919' 'IPR000860' '\

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway PUBMED:16564539. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin PUBMED:17227226.

    \

    The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl-CoA and glycine (C4 pathway) using ALA synthase (), or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase () to charge a tRNA with glutamate, glutamyl-tRNA reductase () to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase () to catalyse a transamination reaction to produce ALA.

    \

    The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase, ) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase, ) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase () to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.

    \

    Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase (). To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase () PUBMED:11215515.

    \ \ \

    This entry represents hydroxymethylbilane synthase (or porphobilinogen deaminase, ), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the polymerisation of four PBG molecules into the tetrapyrrole structure, preuroporphyrinogen, with the concomitant release of four molecules of ammonia. This enzyme uses a unique dipyrromethane cofactor made from two molecules of PBG, which is covalently attached to a cysteine side chain. The tetrapyrrole product is synthesized in an ordered, sequential fashion, by initial attachment of the first pyrrole unit (ring A) to the cofactor, followed by subsequent additions of the remaining pyrrole units (rings B, C, D) to the growing pyrrole chain PUBMED:11215515. The link between the pyrrole ring and the cofactor is broken once all the pyrroles have been added. This enzyme is folded into three distinct domains that enclose a single, large active site that makes use of an aspartic acid as its one essential catalytic residue, acting as a general acid/base during catalysis PUBMED:12555854, PUBMED:1522882. A deficiency of hydroxymethylbilane synthase is implicated in the neuropathic disease, Acute Intermittent Porphyria (AIP) PUBMED:16935474.

    \ ' '3920' 'IPR000860' '\

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway PUBMED:16564539. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin PUBMED:17227226.

    \

    The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl-CoA and glycine (C4 pathway) using ALA synthase (), or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase () to charge a tRNA with glutamate, glutamyl-tRNA reductase () to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase () to catalyse a transamination reaction to produce ALA.

    \

    The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase, ) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase, ) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase () to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.

    \

    Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase (). To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase () PUBMED:11215515.

    \ \ \

    This entry represents hydroxymethylbilane synthase (or porphobilinogen deaminase, ), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the polymerisation of four PBG molecules into the tetrapyrrole structure, preuroporphyrinogen, with the concomitant release of four molecules of ammonia. This enzyme uses a unique dipyrromethane cofactor made from two molecules of PBG, which is covalently attached to a cysteine side chain. The tetrapyrrole product is synthesized in an ordered, sequential fashion, by initial attachment of the first pyrrole unit (ring A) to the cofactor, followed by subsequent additions of the remaining pyrrole units (rings B, C, D) to the growing pyrrole chain PUBMED:11215515. The link between the pyrrole ring and the cofactor is broken once all the pyrroles have been added. This enzyme is folded into three distinct domains that enclose a single, large active site that makes use of an aspartic acid as its one essential catalytic residue, acting as a general acid/base during catalysis PUBMED:12555854, PUBMED:1522882. A deficiency of hydroxymethylbilane synthase is implicated in the neuropathic disease, Acute Intermittent Porphyria (AIP) PUBMED:16935474.

    \ ' '3921' 'IPR000864' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    This family of proteinase inhibitors belongs to MEROPS inhibitor family I13, clan IG. They inhibit peptidases of the S1 () and S8 () families PUBMED:14705960. \ Potato inhibitor type I sequences are not solely restricted to potatoes but are found in other plant species, for example barley endosperm chymotrypsin inhibitor PUBMED:3106042 and pumpkin trypsin inhibitor. Exceptions are found in leeches, e.g. Hirudo medicinalis (Medicinal leech), but not other metazoa PUBMED:3519213. In general, the proteins have retained a specificity towards chymotrypsin-like and elastase-like proteases PUBMED:. Structurally these inhibitors are small (60 to 90 residues) and, in contrast with other families of protease inhibitors, they lack disulphide bonds. The inhibitor is a wedge-shaped molecule, its pointed edge formed by the protease-binding loop, which contains the scissile bond. The loop binds tightly to the protease active site, subsequent cleavage of the scissile bond causing inhibition of the enzyme PUBMED:3519213.

    \ \

    The inhibitors (designated type I and II) are \ synthesised in potato tubers, increasing in concentration as the tuber develops. Synthesis of the inhibitors throughout the plant is also induced by leaf damage; this systemic response being triggered by the release of a putative plant hormone PUBMED:.

    \ \

    Examples found in the bacteria and archaea are probable false positives.

    \ \ ' '3922' 'IPR001592' '\

    This protease is found in genome polyproteins of potyviruses. The genome polyprotein contains: N-terminal protein (P1), helper component protease\ (, HC-PRO), protein P3, 6KD protein (6K1), cytoplasmic inclusion protein (CI), 6KD protein 2 (6K2), genome-linked protein (VPG), nuclear inclusion protein A (), nuclear inclusion protein B () and coat protein (CP).\ The coat protein is at the C terminus of the polyprotein.

    \ ' '3923' 'IPR000327' '\

    POU proteins are eukaryotic transcription factors containing a bipartite DNA binding domain referred to as the POU domain. The acronym POU (pronounced \'pow\') is derived from the names of three mammalian transcription factors, the pituitary-specific Pit-1, the octamer-binding proteins Oct-1 and Oct-2, and the neural Unc-86 from Caenorhabditis elegans. POU domain genes have been identified in diverse organisms including nematodes, flies, amphibians, fish and mammals but have not yet been identified in plants and fungi. The various members of the POU family have a wide variety of functions, all of which are related to the function of the neuroendocrine system PUBMED:8462099 and the development of an organism PUBMED:11159814. Some other genes are also regulated, including those for immunoglobulin light and heavy chains (Oct-2) PUBMED:1967834, PUBMED:1967821, and trophic hormone genes, such as those for prolactin and growth hormone (Pit-1).

    \ \

    The POU domain is a bipartite domain composed of two subunits separated by a non-conserved region of 15-55 aa. The N-terminal subunit is known as the POU-specific (POUs) domain (), while the C-terminal subunit is a homeobox domain (). 3D structures of complexes including both POU subdomains bound to DNA are available. Both subdomains contain the structural motif \'helix-turn-helix\', which directly associates with the two components of bipartite DNA binding sites, and both are required for high affinity sequence-specific DNA-binding. The domain may also be involved in protein-protein interactions PUBMED:1628619. The subdomains are connected by a flexible linker PUBMED:11183772, PUBMED:8156594, PUBMED:9009203. In proteins a POU-specific domain is always accompanied by a homeodomain. Despite the lack of sequence homology, the 3D structure of POUs is similar to that of the bacteriophage lambda repressor and other members of the HTH_3 family PUBMED:11183772, PUBMED:8156594.

    \ \

    This entry represents the POU-specific subunit of the POU domain.

    \ ' '3924' 'IPR007755' '\ This is a family of conserved Chordopoxvirinae A11 family proteins. A conserved region spans the entire protein in the majority of family members.\ ' '3925' 'IPR006744' '\

    This family contains vaccinia virus protein A12 and its homologues. VVA12 is a virion protein though its function is unknown.

    \ ' '3926' 'IPR006932' '\

    This family represents the Poxvirus A22 protein, a Holliday junction resolvase that specifically cleaves and resolves four-way DNA Holliday junctions into linear duplex products.

    \ ' '3927' 'IPR007664' '\

    The poxvirus A28 protein is expressed at late times during the virus replication cycle and is a membrane component of the intracellular mature virion. Repression of A28 inhibits cell-to-cell spread, suggesting that all poxviruses use a common A28-dependent mechanism of cell penetration PUBMED:14963132. An N-terminal hydrophobic sequence, present in all poxvirus A28 orthologues, anchors the protein in the virion surface membrane so that most of it is exposed to the cytoplasm PUBMED:14963131.

    \ ' '3928' 'IPR006758' '\

    The A32 protein is thought to be an ATPase involved in viral DNA packaging PUBMED:8470370.

    \ ' '3929' 'IPR007032' '\

    These proteins are homologues of vaccinia virus A51.

    \ ' '3930' 'IPR007008' '\

    This is a family of poxvirus proteins of unknown function.

    \ ' '3931' 'IPR006834' '\ This is a family of Chordopoxvirus proteins composing one of the two subunits that make up VITF-3, a virally encoded complex necessary for intermediate stage transcription PUBMED:10077573.\ ' '3932' 'IPR006920' '\ This is a family of Chordopoxvirus A9 proteins. Chordopoxvirus belongs to the family Poxviridae and is the cause of vertebrate infections PUBMED:17848071.\ ' '3933' 'IPR007596' '\ The repeat is found in the A-type inclusion protein of the Poxvirus family PUBMED:2826668.\ ' '3934' 'IPR004966' '\ The Pox virus Ag35 surface protein is an envelope protein known as protein H5.\ ' '3936' 'IPR005004' '\

    This is a family of proteins expressed by members of the Poxviridae.

    \ ' '3937' 'IPR004967' '\ This family includes Poxvirus C7 and F8A proteins.\ ' '3938' 'IPR006791' '\

    This entry represents the Pox virus D2 proteins.

    \ ' '3939' 'IPR011608' '\

    Transcriptional antiterminators and activators containing phosphoenolpyruvate: sugar phosphotransferase system (PTS) regulation domains (PRDs) form a class of bacterial regulatory proteins whose activity is modulated by phosphorylation. These regulators stimulate the expression of genes and operons involved in carbohydrate metabolism.

    \ \

    PRD-containing proteins are involved in the regulation of catabolic operons in Gram-positive and Gram-negative bacteria PUBMED:1732212, PUBMED:9045813 and are often characterised by a short N-terminal effector domain that binds to either RNA (CAT-RBD for antiterminators, ) or DNA (for activators), and a duplicated PRD module which is phosphorylated on conserved histidines by the sugar phosphotransferase system (PTS) in response to the availability of carbon source. The phosphorylations are thought to modify the stability of the dimeric proteins and thereby the RNA- or DNA-binding activity of the effector domain PUBMED:11751049, PUBMED:11733988, PUBMED:11447120.

    \ \

    PRDs are characterised by the presence of a duplicated regulatory module of ~100 residues that can be reversibly phosphorylated on histidyl residues by the PTS. PRDs in transcriptional antiterminators and activators are PTS regulatory targets that are (de)phosphorylated in response to the availability of carbon sources PUBMED:9202047, PUBMED:9663674, PUBMED:11751049, PUBMED:11447120, PUBMED:15699035.

    \ \

    The PRD domain comprises one and often two highly conserved histidines. It forms a compact bundle comprising five helices (alpha1-alpha5). The core of the PRD module consists of two pairs of antiparallel helices making an angle of ~60 degrees. The first pair contains the antiparallel helices alpha1 and alpha4, while the second pair contains alpha2 and alpha5. The third helix (alpha3) is oriented perpendicularly to alpha5 at the periphery of the bundle. The helices are connected by loops of varying length PUBMED:11751049, PUBMED:11447120, PUBMED:15699035.

    \ ' '3940' 'IPR007660' '\

    This is a family of Chordopoxvirinae D3 proteins. The conserved region occupies the entire length of the D3 protein.

    \ ' '3941' 'IPR004968' '\

    This domain is found at the C terminus of phage P4 alpha protein and related proteins. Phage P4 DNA replication depends on the product of the alpha gene, which has origin recognition ability, DNA helicase activity, and DNA primase activity. The structure of the protein can be summarised as follows: The N terminus provides the primase activity, the central region is the helicase/nucleoside triphosphatase domain and the ori DNA recognition resides in the C-terminal 1/3 of the protein PUBMED:7635818.

    \ \ \ \

    The domain is also found at the C terminus of a number of proteins from orthopox viruses including vaccinia virus D5. D5 encodes a 90-kDa protein that is transiently expressed at early times after infection. It has a nucleoside triphosphatase activity which is independent of common nucleic acid cofactors and it can hydrolyze all the common ribo- and deoxyribonucleoside triphosphates to diphosphates in the presence of a divalent cation PUBMED:7636979.

    \ ' '3942' 'IPR006890' '\

    This entry represents a family of probable FAD-linked sulphydryl oxidases found in poxviruses.

    \ ' '3943' 'IPR007585' '\

    Protein E2 is encoded by pox viruses and its function is unknown.

    \ ' '3944' 'IPR006749' '\ This family contains fowlpox virus protein E6 and its homologues. The members of this family are functionally uncharacterised PUBMED:10729156.\ ' '3945' 'IPR005057' '\

    This is a protein family of unknown function.

    \ ' '3946' 'IPR007027' '\ These proteins belong to the poxvirus F11 family. They are early virus proteins.\ ' '3947' 'IPR005005' '\

    The vaccinia virus F12L gene encodes a 65 kDa protein that is expressed late during infection and is important for\ plaque formation, EEV production and virulence. The F12L protein\ is located on intracellular enveloped virus (IEV) particles, but is absent from immature virions, intracellular mature virus\ and cell-associated enveloped virus. F12L shows co-localization with endosomal compartments\ and microtubules and appears to play a role in the transport of IEV particles to the cell surface on microtubules PUBMED:11752717.

    \ ' '3948' 'IPR007675' '\

    Protein F15 is found in a number of Poxviruses.

    \ ' '3949' 'IPR006798' '\

    This entry represents the Poxvirus F16 proteins.

    \ ' '3950' 'IPR006854' '\ This is a family of poxvirus proteins required for virus morphogenesis. This protein is necessary for proteolytic processing of the major viral structural proteins, P4a and P4b PUBMED:1920628.\ ' '3951' 'IPR007678' '\

    Protein G5 is found in a number of Poxviruses.

    \ ' '3952' 'IPR006872' '\ This is a family of poxvirus late H7 proteins.\ ' '3953' 'IPR004969' '\

    Proteins in this group show homology to vaccinia virus I1L (Late) encoded protein.

    \ ' '3954' 'IPR006754' '\

    The 34-kDa protein encoded by the I3 gene of vaccinia virus is expressed at early and intermediate times postinfection and is\ phosphorylated on serine residues. I3 protein demonstrates a striking affinity for single-stranded, but not\ for double-stranded, DNA which suggests a role in DNA replication and/or repair. Electrophoretic mobility shift assays indicate that numerous I3 molecules can bind to a template,\ reflecting the stoichiometric interaction of I3 with DNA. Sequence analysis reveals that a pattern of aromatic and charged amino acids\ common to many replicative single-stranded DNA binding proteins (SSBs) is conserved in I3 PUBMED:9525612.

    \ ' '3955' 'IPR006803' '\

    This entry represents the Poxvirus protein I5.

    \ ' '3956' 'IPR007674' '\ Previously uncharacterised I6 protein binds tightly and with great specificity to the hairpin form of the viral telomeric sequence. This telomere binding protein is thought to play a role in the initiation of vaccinia virus genome replication and/or genome encapsidation PUBMED:11581377.\ ' '3958' 'IPR005006' '\

    This is a family of proteins expressed by members of the Poxviridae.

    \ ' '3959' 'IPR005007' '\

    This is a family of proteins expressed by members of the Poxviridae.

    \ ' '3960' 'IPR006083' '\

    Phosphoribulokinase (PRK) catalyses the ATP-dependent phosphorylation of \ ribulose-5-phosphate to ribulose-1,5-bisphosphate, a key step in the pentose phosphate \ pathway where carbon dioxide is assimilated by autotrophic organisms PUBMED:2175647. In \ general, plant enzymes are light-activated by the thioredoxin/ferredoxin system, while \ those from photosynthetic bacteria are regulated by a system that has an absolute \ requirement for NADH. Thioredoxin/ferredoxin regulation is mediated by the reversible\ oxidation/reduction of sulphydryl and disulphide groups.

    Uridine kinase (pyrimidine ribonucleoside kinase) is the rate-limiting enzyme in the pyrimidine\ salvage pathway. It catalyzes the following reaction:\

    Pantothenate kinase () catalyzes the rate-limiting step in the biosynthesis of coenzyme A, the conversion of pantothenate to D-4\'-phosphopantothenate in the presence of ATP.

    \ ' '3961' 'IPR006956' '\ This family includes variola (smallpox) and vaccinia virus L5 proteins. L5 is thought to contain a metal-binding region PUBMED:8383392.\ ' '3962' 'IPR005023' '\

    This entry represents the late protein H2 found in Vaccinia and other poxviruses. This protein is a highly conserved viral membrane protein found in all sequenced poxviruses, containing an N-terminal transmembrane domain and four conserved cysteines thought to be involved in the formation of intramolecular disulphide bonds PUBMED:15795260. H2 has been shown to be necessary for entry into the host cell and virus-induced cell-cell fusion, but is not required for virus morphogenesis or the attachment of virus particles to cells. It is part of an entry-fusion complex composed of eight viral membrane proteins PUBMED:16339313.

    \ ' '3963' 'IPR006971' '\ This family includes the M2 protein of unknown function from variola virus. \ ' '3964' 'IPR005009' '\ Vaccinia virus, the prototypic poxvirus, possesses a double-stranded DNA genome of 191,686 base pairs \ capable of encoding approximately 200 proteins. Virion enzymes produce mature viral mRNA with eukaryotic features,\ including a 5\' cap and a 3\' poly(A) tail. The vaccinia virus mRNA capping enzyme is a multifunctional protein with RNA triphosphatase, RNA guanylyltransferase, RNA\ (guanine-7) methyltransferase, and transcription termination factor activities. The protein is a heterodimer of 95- and 33-kDa\ subunits encoded by the vaccinia virus D1 and D12 genes, respectively. The capping reaction entails transfer of GMP from\ GTP to the 5\'-diphosphate end of mRNA via a covalent enzyme-(lysyl-GMP) intermediate.\ ' '3966' 'IPR004900' '\ The Poxvirus P35 protein is an immunodominant envelope protein.\ ' '3967' 'IPR005058' '\

    P4A is one of the most abundant structural proteins in the Vaccinia virion.

    \ ' '3968' 'IPR004972' '\

    This family is the Poxvirus P4B major core protein. It is a precursor for one of the two most abundant structural components of the virion (major core proteins 4A and 4B).

    \ ' '3969' 'IPR004976' '\ Poly(A) polymerase () catalyses template-independent extension of the 3\'-end of a DNA or RNA strand by one nucleotide at a time. The Poxvirus enzyme creates the 3\' poly(A) tail of mRNAs, and is a heterodimer of a catalytic and a regulatory subunit. This is the catalytic subunit.\ ' '3970' 'IPR004974' '\

    The Poxvirus RNA polymerase-associated transcription specificity factor Rap94 associates with RNA polymerase and may mediate binding of the core polymerase to VetF. It is required for transcription of early genes.

    \ ' '3971' 'IPR005008' '\

    This family represents the Poxvirus rifampicin resistance protein. The failure to isolate genotypic variants of Poxvirus family members encoding a predicted C-terminal truncated form of these proteins suggests that\ the C terminus of the molecule may be essential to protein function, and, in turn, that this function may be essential to viral\ replication. It has been proposed that possession of a\ gene encoding a member of this polypeptide family might represent a defining molecular characteristic of the Poxviridae PUBMED:8609479.

    \ ' '3972' 'IPR004973' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \

    The Poxvirus DNA-directed RNA polymerase () catalyses DNA-template-directed extension of the 3\'-end of an RNA strand by one nucleotide at a time. The enzyme consists of at least eight subunits; this is the 18 kDa subunit.

    \ ' '3973' 'IPR005059' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \

    The DNA-dependent RNA polymerase from vaccinia virions has a molecular weight of approximately 500 kDa and can be dissociated into putative subunits of 140, 137, 37, 35, 31, 22, and 17 kDa. This group represents a DNA-directed RNA polymerase, 35 kDa subunit.

    \ ' '3974' 'IPR007579' '\

    Poxvirus T4 protein is thought to be retained in the endoplasmic reticulum. M-T4 of myxoma virus () is thought to protect infected lymphocytes from apoptosis and modulate the inflammatory response to virus infection PUBMED:10544103. The N terminus is .

    \ ' '3975' 'IPR007580' '\

    Poxvirus T4 protein is thought to be secreted or retained in the endoplasmic reticulum if the protein also contains an additional C-terminal region (). M-T4 of myxoma virus () is thought to protect infected lymphocytes from apoptosis and modulate the inflammatory response to virus infection PUBMED:10544103.

    \ ' '3976' 'IPR004975' '\

    Late transcription factor VLTF-2 acts with RNA polymerase to initiate transcription from late gene promoters PUBMED:2344616.

    \ ' '3977' 'IPR005022' '\

    This family of proteins functions as a trans-activator of viral late genes.

    \ ' '3978' 'IPR007532' '\ The poxvirus early transcription factor (VETF), in addition to the viral RNA polymerase, is required for efficient transcription of early genes in vitro. VETF is a heterodimeric protein that binds specifically to early gene promoters. The heterodimer is composed of an 82 kDa (this family) subunit and a 70 kDa subunit.\ ' '3979' 'IPR007031' '\ Members of this family are approximately 26 kDa, and are involved in trans-activation of late transcription PUBMED:8523544.\ ' '3980' 'IPR007586' '\ The 25 kDa product of Vaccinia virus gene L4R is also known as VP8. VP8 is found in the cores of Vaccinia virions and is essential for the formation of transcriptionally competent viral particles. It binds both single-stranded and double-stranded DNA and RNA with similar affinities. Binding is thought to involve cooperative interactions between protein subunits. The protein is proteolytically cleaved during viral assembly at an Ala-Gly-Ala site. Possible roles for VP8 include packaging and maintaining the DNA genome in a transcribable configuration; binding ssDNA during transcription initiation; and cooperation with I8R protein to unwind early promoter regions. VP8 may also function in either transcription elongation or release of mRNA molecules from viral particles PUBMED:9321647.\ ' '3981' 'IPR007490' '\

    This family is the B22R protein from Poxviruses.

    \ ' '3982' 'IPR006163' '\

    Phosphopantetheine (or pantetheine 4\' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a \'swinging arm\' for the attachment of activated fatty acid and amino-acid groups PUBMED:5321311.

    The amino-terminal region of the ACP proteins is well defined and consists of four alpha helices arranged in a right-handed\ bundle held together by interhelical hydrophobic interactions. The Asp-Ser-Leu (DSL) motif is conserved in all of the ACP sequences, and the 4\'-PP prosthetic group is covalently linked\ via a phosphodiester bond to the serine residue. The DSL sequence is present at the amino terminus of helix II, a domain of the protein referred to as the recognition helix and which is responsible for the\ interaction of ACPs with the enzymes of type II fatty acid synthesis PUBMED:11825906.

    \ ' '3983' 'IPR003414' '\ Polyphosphate kinase (Ppk) () catalyzes the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules. It is a membrane protein and goes through an intermediate stage during the reaction where it is autophosphorylated with a phosphate group covalently linked to a basic amino acid residue through an N-P bond.\ ' '3984' 'IPR004259' '\

    This family includes the M1 phosphoprotein non-structural RNA polymerase alpha subunit () from various strains of Rabies virus PUBMED:2148206. The M1 phosphoprotein is thought to be a component of the active polymerase, and may be involved in template binding.

    \ ' '3985' 'IPR004168' '\

    PPAK is a repeated protein motif found in the PEVK (Pro-Glu-Val-Lys) domain of the titin protein and in a number of other proteins. Titin () is a giant elastic protein found in striated muscle that is a key component in the assembly and functioning of sarcomeres PUBMED:15507486. PPAK motifs (PPAK refers to the four amino acids found at the beginning of the motif) occur 60 times in human soleus titin PUBMED:11276084. PPAK motifs occur in groups of 2-12 that are separated by regions rich in glutamic acid (approximately 45%) and termed polyE segments. The charge fluctuation between the PPAK and polyE regions suggests ionic interactions between these segments and their involvement in the elastic function of titin.

    \ ' '3986' 'IPR002192' '\ This enzyme catalyses the reversible conversion of ATP to AMP, pyrophosphate and phosphoenolpyruvate (PEP) PUBMED:8610096. Residues at the N-terminus correspond to the transit peptide which is indispensable for the transport of the precursor protein into chloroplasts in plants PUBMED:2841317. This domain is present at the N-terminus of some PEP-utilizing enzymes.\ ' '3987' 'IPR002088' '\

    Protein prenylation is the posttranslational attachment of either a farnesyl group or a geranylgeranyl group via a thioether linkage (-C-S-C-) to a cysteine at or near the carboxyl terminus of the protein. Farnesyl and geranylgeranyl groups are polyisoprenes, unsaturated hydrocarbons with a multiple of five carbons; the chain is 15 carbons long in the farnesyl moiety and 20 carbons long in the geranylgeranyl moiety. There are three different protein prenyltransferases in humans: farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) share the same motif (the CaaX box) around the cysteine in their substrates, and are thus called CaaX prenyltransferases, whereas geranylgeranyltransferase 2 (GGT2, also called Rab geranylgeranyltransferase) recognises a different motif and is thus called a non-CaaX prenyltransferase. Protein prenyltransferases are currently known only in eukaryotes, but they are widespread, being found in vertebrates, insects, nematodes, plants, fungi and protozoa, including several parasites.

    Each protein consists of two subunits, alpha and beta; the alpha subunit of FT and GGT1 is encoded by the same gene, FNTA. The alpha subunit is thought to participate in a stable complex with the isoprenyl substrate; the beta subunit binds the peptide substrate. In the alpha subunits of both types of protein prenyltransferases, seven tetratricopeptide repeats are formed by pairs of helices that are stabilised by conserved intercalating residues. The alpha subunits of GGT2 in mammals and plants also have an immunoglobulin-like domain between the fifth and sixth tetratricopeptide repeat, as well as leucine-rich repeats at the carboxyl terminus. The functions of these additional domains in GGT2 are as yet undefined, but they are apparently not directly involved in the interaction with substrates and Rab escort proteins. The tetratricopeptide repeats of the alpha subunit form a right-handed superhelix, which embraces the (alpha-alpha)6 barrel of the beta subunit PUBMED:1622936.

    \ ' '3988' 'IPR003695' '\ Exopolyphosphate phosphatase (Ppx) and guanosine pentaphosphate phosphatase (GppA) belong to the sugar kinase/actin/hsp70 superfamily PUBMED:8212131.\ ' '3989' 'IPR007498' '\

    Paraquat is a superoxide radical-generating agent. The promoter for the pqiA gene is also inducible by other known superoxide generators PUBMED:7751275. This is predicted to be a family of integral membrane proteins, possibly located in the inner membrane. This family is related to NADH dehydrogenase subunit 2 ().

    \ ' '3990' 'IPR002496' '\ Phosphoribosyl-AMP cyclohydrolase catalyses the third step in the histidine biosynthetic pathway:\ \ It requires Zn2+ ions for activity PUBMED:9931020.\ ' '3991' 'IPR008179' '\

    Phosphoribosyl-ATP pyrophosphatase catalyses the second step in the histidine biosynthetic pathway:\ \ The Neurospora crassa enzyme also catalyzes the reactions of histidinol dehydrogenase () and phosphoribosyl-AMP cyclohydrolase ().

    \ ' '3992' 'IPR004895' '\ This family includes yeast hypothetical proteins and the uncharacterised rat prenylated rab acceptor protein PRA1.\ ' '3993' 'IPR001240' '\ Indole-3-glycerol phosphate synthase (IGPS) (see ) catalyzes the fourth step in the biosynthesis of tryptophan, the ring closure of 1-(2-carboxy-phenylamino)-1-deoxyribulose into indol-3-glycerol-phosphate. In some bacteria, IGPS is a single chain enzyme. In others, such as \ Escherichia coli, it is the N-terminal domain of a bifunctional enzyme that also catalyzes N-(5-phosphoribosyl)anthranilate isomerase \ (PRAI) activity, the third step of tryptophan biosynthesis. In fungi, IGPS is the central domain of a trifunctional enzyme that contains a PRAI C-terminal domain and a glutamine amidotransferase (GATase) N-terminal domain (see ).\

    Phosphoribosylanthranilate isomerase (PRAI) is monomeric and labile in most\ mesophilic microorganisms, but dimeric and stable in the hyperthermophile Thermotoga maritima (tPRAI) PUBMED:10745009. The comparison to the known 2.0 A structure of PRAI from Escherichia coli (ePRAI) shows that tPRAI has the complete TIM- or (beta alpha)8-barrel fold, whereas helix alpha5 in ePRAI is replaced by a loop. The subunits of tPRAI associate via the N-terminal faces of their central beta-barrels. Two long, symmetry-related loops that protrude reciprocally into cavities of the other subunit provide for multiple hydrophobic interactions. Moreover, the side chains of the N-terminal methionines and the C-terminal leucines of both subunits are immobilized in a hydrophobic cluster, and the number of salt bridges is increased in tPRAI. These features appear to be mainly responsible for the high thermostability of tPRAI PUBMED:9166771.

    \ ' '3994' 'IPR015810' '\

    The photosynthetic apparatus in non-oxygenic bacteria consists of light-harvesting (LH) protein-pigment complexes LH1 and LH2, which use carotenoid and bacteriochlorophyll as primary donors PUBMED:11005826. LH1 acts as the energy collection hub, temporarily storing the energy before its transfer to the photosynthetic reaction centre (RC) PUBMED:15329728. Electrons are transferred from the primary donor via an intermediate acceptor (bacteriopheophytin) to the primary acceptor (quinone Qa), and finally to the secondary acceptor (quinone Qb), resulting in the formation of ubiquinol QbH2. RC uses the excitation energy to shuffle electrons across the membrane, transferring them via ubiquinol to the cytochrome bc1 complex in order to establish a proton gradient across the membrane, which is used by ATP synthetase to form ATP PUBMED:16931113, PUBMED:12872158, PUBMED:2676514.

    \ \

    The core complex is anchored in the cell membrane, consisting of one unit of RC surrounded by LH1; in some species there may be additional subunits PUBMED:11095707. RC consists of three subunits: L (light), M (medium), and H (heavy). Subunits L and M provide the scaffolding for the chromophore, while subunit H contains a cytoplasmic domain PUBMED:8027023. In Rhodopseudomonas viridis, there is also a non-membranous tetrahaem cytochrome (4Hcyt) subunit on the periplasmic surface.

    \ \

    This entry represents the N-terminal domain of the photosynthetic reaction centre H subunit, which includes the transmembrane domain and part of the cytoplasmic domain PUBMED:10611277.

    \ ' '3995' 'IPR001330' '\

    The beta subunit of the farnesyltransferases is responsible for peptide binding.\ Squalene-hopene cyclase is a bacterial enzyme that catalyzes the cyclization of \ squalene into hopene, a key step in hopanoid (triterpenoid) metabolism PUBMED:9295270. \ Lanosterol synthase () (oxidosqualene-lanosterol cyclase) catalyzes the \ cyclization of (S)-2,3-epoxysqualene to lanosterol, the initial precursor of cholesterol, \ steroid hormones and vitamin D in vertebrates and of ergosterol in fungi PUBMED:8016864. \ Cycloartenol synthase () (2,3-epoxysqualene-cycloartenol cyclase) is a plant \ enzyme that catalyzes the cyclization of (S)-2,3-epoxysqualene to cycloartenol.

    \ ' '3996' 'IPR001108' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of aspartic peptidases belongs to MEROPS peptidase family A22 (presenilin family, clan AD): subfamily A22A, the type example being presenilin 1 from Homo sapiens (Human).

    \ \

    Presenilins are polytopic transmembrane (TM) proteins, mutations in which\ are associated with the occurrence of early-onset familial Alzheimer\'s\ disease, a rare form of the disease that results from a single-gene\ mutation PUBMED:9791530, PUBMED:9521418. \ The physiological functions of presenilins are unknown, but they may be related to developmental signalling, apoptotic signal transduction, or processing of selected proteins, such as the beta-amyloid precursor protein (beta-APP). There are a number of subtypes which belong to this presenilin family. That presenilin homologues have been identified in species that do not have an Alzheimer\'s disease correlate suggests that they may have functions unrelated to the disease, homologues having been identified in mouse, Drosophila melanogaster, Caenorhabditis elegans \ PUBMED:7566091 and other members of the eukarya including plants.

    \ ' '3997' 'IPR000817' '\

    Prion protein (PrP-c) PUBMED:2572197, PUBMED:1916104, PUBMED:2908696 is a small glycoprotein found in high \ quantity in the brain of animals infected with certain degenerative neurological diseases, such as \ sheep scrapie and bovine spongiform encephalopathy (BSE), and the human dementias Creutzfeldt-Jakob \ disease (CJD) and Gerstmann-Straussler syndrome (GSS). PrP-c is encoded in the host genome and is \ expressed both in normal and infected cells. During infection, however, the PrP-c molecule becomes \ altered (conformationally rather than at the amino acid level) to an abnormal isoform, PrP-sc. In detergent-treated brain extracts from infected individuals, fibrils\ composed of polymers of PrP-sc, namely scrapie-associated fibrils or prion rods, can be evidenced by electron microscopy. The precise function of the normal PrP isoform in healthy individuals remains unknown. Several results, mainly obtained in transgenic animals, indicate that PrP-c\ might play a role in long-term potentiation, in sleep physiology, in oxidative burst compensation (PrP can fix four Cu2+ through its octarepeat domain), in\ interactions with the extracellular matrix (PrP-c can bind to the precursor of the laminin receptor, LRP), in apoptosis and in signal transduction (costimulation of\ PrP-c induces a modulation of Fyn kinase phosphorylation) PUBMED:12354606.

    The normal isoform, PrP-c, is anchored at the cell membrane, in rafts, through a glycosyl phosphatidyl inositol (GPI); its half-life at the cell surface is 5 h, after which\ the protein is internalised through a caveolae-dependent mechanism and degraded in the endolysosome compartment. Conversion between PrP-c and PrP-sc\ likely occurs during the internalisation process.

    In humans, PrP is a 253 amino acid protein, which has a molecular weight of 35-36 kDa. It has two hexapeptides\ and repeated octapeptides at the N-terminus, a disulphide bond and is associated at the C-terminus with a GPI, which enables it to anchor to the external part of the\ cell membrane. The\ secondary structure of PrP-c is mainly composed of alpha-helices, whereas PrP-sc is mainly beta-sheets: transconformation of alpha-helices into beta-sheets has been\ proposed as the structural basis by which PrP acquires pathogenicity in TSEs. The three-dimensional structure shows the protein to be made of a globular domain which includes three alpha-helices and two small antiparallel beta-sheet\ structures, and a long flexible tail whose conformation depends on the biophysical parameters of the environment. Crystals of the globular domain of PrP\ have recently been obtained; their analysis suggests a possible dimerisation of the protein through the three-dimensional swapping of the C-terminal helix 3 and\ rearrangement of the disulphide bond.

    \ ' '3998' 'IPR004137' '\ Members of this family, also known as hybrid-cluster proteins, contain two Fe/S centres - a [4Fe-4S] cubane cluster, and a hybrid [4Fe-2S-2O] cluster. The physiological role of this protein is as yet unknown, although a role in nitrate/nitrite respiration has been suggested PUBMED:10651802.\ ' '3999' 'IPR001765' '\ Carbonic anhydrases () (CA) are zinc metalloenzymes which catalyze the reversible hydration of carbon dioxide.\ In Escherichia coli, CA (gene cynT) is involved in recycling carbon dioxide formed in the bicarbonate-dependent decomposition of cyanate by cyanase (gene cynS). By this action, it prevents the depletion of cellular bicarbonate PUBMED:1740425. In photosynthetic bacteria and plant chloroplast, CA is essential to inorganic carbon fixation PUBMED:1584776.\ Prokaryotic and plant chloroplast CA are structurally and evolutionarily related and form a family distinct from the one which groups the many different forms of eukaryotic CAs (see ).\ Hypothetical proteins yadF from Escherichia coli and HI1301 from Haemophilus influenzae also belong to this family.\ ' '4000' 'IPR002872' '\ The proline oxidase/dehydrogenase is responsible for the first step in the conversion of proline to glutamate for use as a carbon and nitrogen source. The enzyme requires FAD as a cofactor, and is induced by proline.\ It is found in combination with in bacteria.\ ' '4001' 'IPR002130' '\

    Cyclophilin PUBMED: is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA), but is also found in other organisms. It exhibits a peptidyl-prolyl cis-trans isomerase activity () (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides PUBMED:2186809. It is probable that CSA mediates some of its effects by forming a tight complex with cyclophilin that inhibits the phosphatase activity of calcineurin PUBMED:7514602, PUBMED:8117697. Cyclophilin A is a cytosolic and highly abundant protein. The protein belongs to a family of isozymes, including cyclophilins B and C, and natural killer cell cyclophilin-related protein PUBMED:1464374, PUBMED:8404888, PUBMED:7526121. Major isoforms have been found throughout the cell, including the ER, and some are even secreted. The sequences of the different forms of cyclophilin-type PPIases are well conserved.

    \
    Note: FKBPs, a family of proteins that bind the immunosuppressive drug FK506, are also PPIases, but their sequence is not at all related to that of cyclophilin (see ).

    \ ' '4002' 'IPR002097' '\

    Profilin is a small eukaryotic protein that binds to monomeric actin (G-actin) in a 1:1 ratio, thus preventing the polymerisation of actin into filaments (F-actin). It can also, in certain circumstances, promote actin polymerisation. Profilin also binds to polyphosphoinositides such as PIP2. Overall sequence similarity among profilins from organisms which belong to different phyla (ranging from fungi to mammals) is low, but the N-terminal region is relatively well conserved. That region is thought to be involved in the binding to actin.

    \ \

    A protein structurally similar to profilin is present in the genome of Variola virus and Vaccinia virus (gene A42R).

    \ \

    Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.

    \ \

    The allergens in this family include allergens with the following designations: Ara t 8, Bet v 2, Cyn d 12, Hel a 2, Mer a 1 and Phl p 11.

    \ ' '4003' 'IPR000128' '\

    Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner PUBMED:7899080, PUBMED:8165128. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity.

    \ \

    NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed "orphan" receptors. During the last decade, more than 300 NRs have been described, many of which are orphans that cannot easily be named under the current nomenclature. A new nomenclature system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.

    \ \

    The progesterone receptor consists of 3 functional and structural domains: an N-terminal (modulatory) domain; a DNA binding domain that mediates specific binding to target DNA sequences (ligand-responsive elements); and a hormone binding domain. The N-terminal domain is unique to the progesterone receptors and spans approximately the first 500 residues; the highly-conserved DNA-binding domain is smaller (around 65 residues) and occupies the central portion of the protein; and the hormone binding domain lies at the receptor C-terminus.

    \ ' '4004' 'IPR003146' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    The peptidases are synthesised as inactive molecules, zymogens, with propeptides that must be removed by proteolytic cleavage to activate the enzyme.\ Structural studies of carboxypeptidases A and B reveal the propeptide to\ exist as a globular domain, followed by an extended alpha-helix; this\ shields the catalytic site, without specifically binding to it, while the\ substrate-binding site is blocked by making specific contacts PUBMED:7674922, PUBMED:1548696.

    \ \

    Members of this propeptide family are found in the metallocarboxypeptidases: A1, A2 PUBMED:9384570, A3, A4, A5, A6, U, insect gut carboxypeptidase and B PUBMED:12162965, and are associated with peptidases belonging to MEROPS peptidase family M14A.

    \ \

    Carboxypeptidases are found in abundance in pancreatic secretions. The pro-segment moiety (activation peptide) accounts for up to a quarter of the total length of the peptidase.

    \ ' '4005' 'IPR016103' '\

    This entry represents a structural domain consisting of six helices in an irregular non-globular array; it also contains two small beta-hairpins. This domain is found in the RNA-binding fertility inhibitor FinO that represses the conjugative transfer of F-like plasmids in Escherichia coli. FinO blocks the translation of TraJ, a positive activator of transcription of gene thereby protecting it from degradation, and catalyses FinP-TraJ mRNA hybridization. Interactions between these two RNAs are predicted to block the TraJ ribosomal binding site. FinO is largely helical, binds to its highest affinity binding site within FinP as a monomer, and contains two distinct RNA binding regions PUBMED:10876242.

    \ \ \

    This entry also includes ProQ, which is required for full activation of the osmoprotectant transporter, ProP, in Escherichia coli.

    \ ' '4006' 'IPR003465' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    Members of the Pin2 family are proteinase inhibitors that belong to MEROPS inhibitor family I20, clan IA and are restricted to plants. They inhibit serine peptidases belonging to MEROPS peptidase family S1 PUBMED:14705960. They have a multidomain structure PUBMED:12446136, which permits circular permutation of the sequences. It has been shown that some naturally occurring Pin2 proteins have an \'ancestral\' circularly permuted structure PUBMED:11604534. Circular permutations/rearrangements of sequences have also been observed between species, such as favin from Vicia faba and the lectin concanavalin A from Canavalia ensiformis PUBMED:4506778, or amongst members of the plant aspartyl proteinases and human lung surfactant proteins PUBMED:7610480.

    \ \ \

    The Pin2 family of proteinase inhibitors is present in seeds, leaves and other organs. Perhaps the best known representatives are the wound-induced proteinase inhibitors PUBMED:11216843, PUBMED:11351092, which contain up to eight sequence-repeats (the \'IP repeats\'). The sequence of the IP repeats is quite variable; only the cysteines constituting the four disulphide bridges and a single proline residue are conserved throughout all the known repeat sequences. The structure of the proteinase inhibitor complex is known PUBMED:2494344.

    \ ' '4007' 'IPR000221' '\ Protamines are small, highly basic proteins, that substitute for histones in\ sperm chromatin during the haploid phase of spermatogenesis. They pack\ sperm DNA into a highly condensed, stable and inactive complex. There are\ two different types of mammalian protamine, called P1 and P2. P1 has been\ found in all species studied, while P2 is sometimes absent. There seems to be\ a single type of avian protamine whose sequence is closely related to that of\ mammalian P1 PUBMED:2808336.\ ' '4008' 'IPR000492' '\

    Protamines P1 and P2 form a family of small basic peptides that represent the major sperm proteins in placental mammals. In human and mouse, protamine P2 is one of the most abundant sperm proteins. Protamine 2 (PRM2) is a low molecular weight arginine-rich protein which is present in haploid spermatogenic cells of human, mouse and other primates. The protamine P2 gene codes for a P2 precursor, pro-P2, which is later processed by proteolytic cleavages in its N-terminal region to form the mature P2 protamines PUBMED:8513810.

    \

    Protamines substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis. They compact sperm DNA into a highly condensed, stable and inactive complex.

    \ ' '4009' 'IPR004931' '\ Prothymosin alpha and parathymosin are two ubiquitous small acidic nuclear proteins that are thought to be involved in cell cycle\ progression, proliferation, and cell differentiation PUBMED:10854063. \ \ ' '4010' 'IPR007738' '\ The homeobox gene Prox1 is expressed in a subpopulation of endothelial cells that, after budding from veins, gives rise to the mammalian lymphatic system PUBMED:11927535. Prox1 has been found to be an early specific marker for the developing liver and pancreas in the mammalian foregut endoderm PUBMED:12351178. This family contains an atypical homeobox domain.\ ' '4011' 'IPR004098' '\

    The splicing factor Prp18 is required for the second step of pre-mRNA\ splicing. PRP18 appears to\ be primarily associated with the U5 snRNP.

    \

    The structure of a large fragment of the Saccharomyces cerevisiae\ Prp18 is known PUBMED:10737784. This fragment is fully active in yeast splicing in vitro and\ includes the sequences of Prp18 that have been evolutionarily conserved.\ The core structure consists of five alpha-helices that adopt a novel fold. The\ most highly conserved region of Prp18, a nearly invariant stretch of 19 aa,\ forms part of a loop between two alpha-helices and may interact with the\ U5 small nuclear ribonucleoprotein particles PUBMED:10737784.

    \ ' '4012' 'IPR003434' '\

    This family consists of a conserved probable envelope protein, or ORF2, in Porcine reproductive and respiratory syndrome virus (PRRSV). Also in the family is a minor structural protein from lactate dehydrogenase-elevating virus.

    \ ' '4013' 'IPR000501' '\ The members of this family are associated with capsid intermediates during packaging of dsDNA viruses with no RNA stage in their replication cycle PUBMED:9696839. The protein may affect translocation of the virus glycoproteins to membranes, and is involved in capsid maturation.\ ' '4014' 'IPR003817' '\ Phosphatidylserine decarboxylase plays a pivotal role in the synthesis of phospholipid by the mitochondria. The substrate phosphatidylserine is synthesized extramitochondrially and must be translocated to the mitochondria prior to decarboxylation PUBMED:8407984. \ Phosphatidylserine decarboxylase is responsible for conversion of phosphatidylserine to phosphatidylethanolamine and plays a central role in the biosynthesis of aminophospholipids PUBMED:7890740.\ ' '4015' 'IPR007345' '\ Pyruvyl-transferases are involved in peptidoglycan-associated polymer biosynthesis. CsaB in Bacillus anthracis is necessary for the non-covalent anchoring of proteins containing an SLH (S-layer homology) domain to peptidoglycan-associated pyruvylated polysaccharides. WcaK and AmsJ are involved in the biosynthesis of colanic acid in Escherichia coli and of amylovoran in Erwinia amylovora PUBMED:10970841.\ ' '4016' 'IPR001280' '\

    Photosystem I (PSI) PUBMED:3333014 is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. PSI is found in the chloroplasts of plants and in cyanobacteria. The electron transfer components of the reaction centre of PSI are a primary electron donor P-700 (chlorophyll dimer) and five electron acceptors: A0 (chlorophyll), A1 (a phylloquinone) and three 4Fe-4S iron-sulphur centres: Fx, Fa, and Fb.

    \ \

    PsaA and psaB, two closely related proteins, are involved in the binding of P700, A0, A1, and Fx. psaA and psaB are both integral membrane proteins of 730 to 750 amino acids that seem to contain 11 transmembrane segments. The Fx 4Fe-4S iron-sulphur centre is bound by four cysteines; two of these cysteines are provided by the psaA protein and the two others by psaB. The two cysteines in both proteins are proximal and located in a loop between the ninth and tenth transmembrane segments. A leucine zipper motif seems to be present PUBMED:2186925 downstream of the cysteines and could contribute to dimerisation of psaA/psaB.

    \ \ ' '4017' 'IPR003685' '\

    PsaD is a small, extrinsic polypeptide located on the stromal side (cytoplasmic side in cyanobacteria) of the photosystem I reaction centre complex. It is required for native assembly of PSI reaction clusters and is implicated in the electrostatic binding of ferredoxin within the reaction centre PUBMED:9692933. PsaD forms a dimer in solution which is bound by PsaE; however, PsaD is monomeric in its native complexed PSI environment PUBMED:9692933.

    \ ' '4018' 'IPR003757' '\ The trimeric photosystem I of the cyanobacterium Synechococcus elongatus comprises 11 protein subunits. Subunit XI, PsaL, from plants and bacteria is one of the smaller subunits with only two transmembrane alpha helices. PsaL interacts closely with PsaI PUBMED:8901876.\ ' '4019' 'IPR001056' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight phosphoprotein PsbH found in PSII. The phosphorylation site of PsbH is located in the N-terminus, where reversible phosphorylation is light-dependent and redox-controlled. PsbH is necessary for the photoprotection of PSII, being required for: (1) the rapid degradation of photodamaged D1 core protein to prevent further oxidative damage to the PSII core, and (2) the insertion of newly synthesised D1 protein into the thylakoid membrane PUBMED:12909614. PsbH may also regulate the transfer of electrons from D2 (Qa) to D1 (Qb) in the reaction core.

    \ \ \ ' '4020' 'IPR003686' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein PsbI, which is tightly associated with the D1/D2 heterodimer in PSII. The function of PsbI is unknown, but it may be involved in the assembly, dimerisation or stabilisation of PSII dimers PUBMED:8544827.

    \ ' '4021' 'IPR002682' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein PsbJ found in PSII. PsbJ is one of the most hydrophobic proteins in the thylakoid membrane, and is located in a gene cluster with PsbE, PsbF and PsbL (PsbEFJL). Both PsbJ and PsbL are essential for proper assembly of the OEC. Mutations in PsbJ cause the light-harvesting antenna to remain detached from the PSII dimers PUBMED:14686923. In addition, both PsbJ and PsbL are involved in the unidirectional flow of electrons, where PsbJ regulates the forward electron flow from D2 (Qa) to the plastoquinone pool, and PsbL prevents the reduction of PSII by back electron flow from plastoquinol, protecting PSII from photo-inactivation PUBMED:14979726.

    \ ' '4022' 'IPR003687' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein PsbK found in PSII, where it is tightly associated with the antenna protein CP43 (PsbC). PsbK is required for accumulation of the PSII complex, and may participate in the assembly and stability of the PSII complex. In particular, PsbK may be involved in the binding of plastoquinone and in maintaining the dimeric organisation of PSII PUBMED:12939265, PUBMED:9632665.

    \ ' '4023' 'IPR003372' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein PsbL found in PSII. PsbL is located in a gene cluster with PsbE, PsbF and PsbJ (PsbEFJL). Both PsbL and PsbJ are essential for proper assembly of the OEC. Mutations in PsbL prevent the formation of both PSII core dimers and PSII-light harvesting complex PUBMED:14686923. In addition, both PsbL and PsbJ are involved in the unidirectional flow of electrons, where PsbJ regulates the forward electron flow from D2 (Qa) to the plastoquinone pool, and PsbL prevents the reduction of PSII by back electron flow from plastoquinol, protecting PSII from photo-inactivation PUBMED:14979726.

    \ ' '4024' 'IPR007826' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein PsbM found in PSII. PsbM is one of the most hydrophobic proteins in the thylakoid membrane. The function of this protein is unknown.

    \ ' '4025' 'IPR003398' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein PsbN found in PSII. PsbN may have a role in PSII stability; however, its actual function is unknown. PsbN does not appear to be essential for photoautotrophic growth or normal PSII function.

    \ ' '4026' 'IPR006814' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight intrinsic protein PsbR found in PSII, which is also known as the 10 kDa polypeptide. The PsbR gene is found only in the nucleus of green algae and higher plants. PsbR may provide a binding site for the extrinsic oxygen-evolving complex protein PsbP to the thylakoid membrane. PsbR has a transmembrane domain to anchor it to the thylakoid membrane, and a charged N-terminal domain capable of forming ion bridges with extrinsic proteins, allowing PsbR to act as a docking protein. PsbR may be a pH-dependent stabilising protein that functions at both donor and acceptor sides of PSII PUBMED:1697267.

    \ \ \ ' '4027' 'IPR001743' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein PsbT found in PSII, which is thought to be associated with the D1 (PsbA) - D2 (PsbD) heterodimer. PsbT may be involved in the formation and/or stabilisation of dimeric PSII complexes, because in the absence of this protein dimeric PSII complexes were found to be less abundant. Furthermore, although PsbT does not confer photo-protection, it is required for the efficient recovery of photo-damaged PSII PUBMED:11451956.

    \ ' '4028' 'IPR005610' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein Psb28 (PsbW) found in PSII, where it is a subunit of the oxygen-evolving complex. Psb28 appears to have several roles, including guiding PSII biogenesis and assembly, stabilising dimeric PSII PUBMED:10950961, and facilitating PSII repair after photo-inhibition PUBMED:9335523. There appear to be two classes of Psb28, class 1 being found predominantly in algae and cyanobacteria, and class 2 being found predominantly in plants. This entry represents class 1 Psb28.

    \ ' '4029' 'IPR006145' '\

    This entry represents several different pseudouridine synthases from family 3, including: RsuA (acts on small ribosomal subunit), RluA, RluB, RluC, RluD, RluE and RluF (act on large ribosomal subunit).

    \

    RsuA from Escherichia coli catalyses formation of pseudouridine at position 516 in 16S rRNA during assembly of the 30S ribosomal subunit PUBMED:11953756, PUBMED:16511038. RsuA consists of an N-terminal domain connected by an extended linker to the central and C-terminal domains. Uracil and UMP bind in a cleft between the central and C-terminal domains near the catalytic residue Asp 102. The N-terminal domain shows structural similarity to the ribosomal protein S4. Despite only 15% amino acid identity, the other two domains are structurally similar to those of the tRNA-specific psi-synthase TruA, including the position of the catalytic Asp. These results suggest that all four families of pseudouridine synthases share the same fold of their catalytic domain(s) and uracil-binding site.

    \ \

    RluB, RluC, RluD, RluE and RluF are homologous enzymes which each convert specific uridine bases in E. coli ribosomal 23S RNA to pseudouridine:

    \

    \ \

    RluD also possesses a second function related to proper assembly of the 50S ribosomal subunit that is independent of Psi-synthesis PUBMED:15078091, PUBMED:14659742. Both RluC and RluD have an N-terminal S4 RNA binding domain. Despite the conserved topology shared by RluC and RluD, the surface shape and charge distribution are very different.

    \ \

    Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are four distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an alpha+beta structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are PUBMED:10529181:

    \ \

    \ \ ' '4030' 'IPR001302' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    This entry represents subunit VIII (PsaI) of the photosystem I (PSI) reaction centre. PSI is located, along with photosystem II (PSII), in the thylakoid photosynthetic membranes of plants, green algae and cyanobacteria. The crystal structure of PSI from the thermophilic cyanobacterium Synechococcus elongatus (Thermosynechococcus elongatus) has 12 protein subunits and 127 cofactors comprising 96 chlorophylls, 2 phylloquinones, 3 4Fe4S clusters, 22 carotenoids, 4 lipids, and a putative calcium ion PUBMED:11418848. PsaI consists of a single transmembrane helix, and has a crucial role in aiding normal structural organization of PsaL within the PSI complex; the absence of PsaI alters PsaL organization, leading to a small, but physiologically significant, defect in PSI function PUBMED:7608190. PsaL encodes a subunit of PSI and is necessary for trimerisation of PSI. PsaL may constitute the trimer-forming domain in the structure of PSI PUBMED:8262256.

    \ ' '4031' 'IPR003375' '\

    PsaE is a 69 amino acid polypeptide from photosystem I present on the stromal side of the thylakoid membrane. The structure comprises a well-defined five-stranded beta-sheet similar to SH3 domains PUBMED:8193119. This subunit may form complexes with ferredoxin and ferredoxin-oxidoreductase in the photosystem I reaction centre.

    \ ' '4032' 'IPR003666' '\ Photosystem I (PSI) is an integral membrane protein complex that uses light energy to mediate electron transfer from\ plastocyanin to ferredoxin. Subunit III (or PsaF) is one of at least 14 different subunits that compose the photosystem I reaction centre (PSI-RC) PUBMED:8443351.\ ' '4033' 'IPR004928' '\

    Photosystem I, a membrane complex found in the chloroplasts of plants and cyanobacteria \ uses light energy to transfer electrons from plastocyanin to ferredoxin PUBMED:. \ The electron transfer components of the photosystem include the primary electron donor \ chlorophyll P-700 and 5 electron acceptors: chlorophyll (A0), phylloquinone (A1) and \ three 4Fe-4S iron-sulphur centres, designated Fx, Fa and Fb. The role of this protein, subunit VI or PsaH, may be in docking of the light harvesting \ complex I antenna to the core complex.

    \ ' '4034' 'IPR002615' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \ This family consists of the photosystem I reaction centre subunit IX or PsaJ from various organisms including Synechocystis sp. (strain PCC 6803), Pinus thunbergii (Green pine) and Zea mays (Maize).\ PsaJ () is a small 4.4 kDa, chloroplast-encoded, hydrophobic subunit of the photosystem I reaction complex whose function is not yet fully understood PUBMED:10220342. PsaJ can be cross-linked to PsaF () and has a single predicted transmembrane domain. It has a proposed role in maintaining PsaF in the correct orientation to allow for fast electron transfer from soluble donor proteins to P700+ PUBMED:10220342.\ ' '4035' 'IPR000549' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \ Photosystem I (PSI) PUBMED:3333014 is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. It is found in the chloroplasts of plants and cyanobacteria. PSI is composed of at least 14 different subunits, two of which, PSI-G (gene psaG) and PSI-K (gene psaK), are small hydrophobic proteins of about 7 to 9 kDa and are evolutionarily related PUBMED:8360180. Both seem to contain two transmembrane regions. Cyanobacteria contain only PSI-K.\ ' '4036' 'IPR000932' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the intrinsic antenna proteins CP43 (PsbC) and CP47 (PsbB) found in the reaction centre of PSII. These polypeptides bind to chlorophyll a and beta-carotene and pass the excitation energy on to the reaction centre PUBMED:12163077. This family also includes the iron-stress induced chlorophyll-binding protein CP43\' (IsiA), which evolved in cyanobacteria from a PSII protein to cope with light limitations and stress conditions. Under iron-deficient growth conditions, CP43\' associates with PSI to form a complex that consists of a ring of 18 or more CP43\' molecules around a PSI trimer, which significantly increases the light-harvesting system of PSI. IsiA can also provide photoprotection for PSII PUBMED:15301529.

    \ \ ' '4037' 'IPR007157' '\ This family includes PspA, a protein that suppresses sigma54-dependent transcription. The PspA protein, a negative regulator of the Escherichia coli phage shock psp operon, is produced when virulence factors are exported through secretins in many Gram-negative pathogenic bacteria; its homologue in plants, VIPP1, plays a critical role in thylakoid biogenesis, which is essential for photosynthesis. Activation of transcription by the enhancer-dependent bacterial sigma54-containing RNA polymerase occurs through ATP hydrolysis-driven protein conformational changes enabled by activator proteins that belong to the large AAA(+) mechanochemical protein family. It has been shown that PspA directly and specifically acts upon and binds to the AAA(+) domain of the PspF transcription activator PUBMED:12079332.\ ' '4038' 'IPR006924' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \ This small acidic protein is found in the 30S ribosomal subunit of cyanobacteria and plant plastids. In plants it has been named plastid-specific ribosomal protein 3 (PSRP-3), and in cyanobacteria it is named Ycf65. Plastid-specific ribosomal proteins may mediate the effects of nuclear factors on plastid translation. The acidic PSRPs are thought to contribute to protein-protein interactions in the 30S subunit, and are not thought to bind RNA PUBMED:10874039.\ ' '4039' 'IPR004277' '\ Phosphatidyl serine synthase is also known as serine exchange enzyme (). This family represents eukaryotic PSS I and II, membrane-bound proteins that catalyse the replacement of the head group of a phospholipid\ (phosphatidylcholine or phosphatidylethanolamine) by L-serine.\ ' '4040' 'IPR002505' '\

    This entry contains both phosphate acetyltransferase :\ \ \ and\ phosphate butyryltransferase :

    \ \ \ \

    These enzymes catalyse the\ transfer of an acetyl or butyryl group to orthophosphate.

    \ ' '4041' 'IPR001559' '\

    Synonym(s): Paraoxonase, A-esterase, Aryltriphosphatase, Phosphotriesterase, Paraoxon hydrolase

    \

    Bacteria such as Brevundimonas diminuta (Pseudomonas diminuta) harbour a plasmid that carries the gene for aryldialkylphosphatase () (PTE) (also known as parathion hydrolase). This enzyme has attracted interest because of its potential use in the detoxification of chemical waste and warfare agents and its ability to degrade agricultural pesticides such as parathion. It acts specifically on synthetic organophosphate triesters and phosphorofluoridates. It does not seem to have a naturally occurring substrate and may thus have optimally evolved for utilizing paraoxon.

    \ \

    Aryldialkylphosphatase belongs to a family PUBMED:9383406, PUBMED:9548740 of enzymes that possess a binuclear zinc metal centre at their active site. The two zinc ions are coordinated by six different residues, four of which are histidines. This family so far includes, in addition to the parathion hydrolase, the following proteins:

    \

    \ ' '4042' 'IPR000489' '\

    All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. Most microorganisms must synthesize folate de novo because they lack the active transport system of higher vertebrate cells that allows these organisms to use dietary folates. Proteins containing this domain include dihydropteroate synthase () as well as a group of methyltransferase enzymes including methyltetrahydrofolate, corrinoid iron-sulphur protein methyltransferase (MeTr) that catalyses a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation.

    \ \

    Dihydropteroate synthase () (DHPS) catalyses the condensation of 6-hydroxymethyl-7,8-dihydropteridine pyrophosphate with para-aminobenzoic acid to form 7,8-dihydropteroate. This is the second step in the three-step pathway leading from 6-hydroxymethyl-7,8-dihydropterin to 7,8-dihydrofolate. DHPS is the target of sulphonamides, which are substrate analogues that compete with para-aminobenzoic acid. Bacterial DHPS (gene sul or folP) PUBMED:2123867 is a protein of about 275 to 315 amino acid residues that is either chromosomally encoded or found on various antibiotic resistance plasmids. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a multifunctional folate synthesis enzyme (gene fas) PUBMED:1313386.

    \ ' '4043' 'IPR020090' '\ Several extracellular heparin-binding proteins involved in regulation of growth and differentiation belong to a new family of growth factors. These growth factors are highly related proteins of about 140 amino acids that contain 10 conserved cysteines probably involved in disulphide bonds, and include pleiotrophin PUBMED:15121180 (also known as heparin-binding growth-associated molecule HB-GAM, heparin-binding growth factor 8 HBGF-8, heparin-binding neurotrophic factor HBNF and osteoblast specific protein OSF-1); midkine (MK) PUBMED:15047154; retinoic acid-induced heparin-binding protein (RIHB) PUBMED:7796887; and pleiotrophic factors alpha-1 and -2 and beta-1 and -2 from Xenopus laevis, the homologs of midkine and pleiotrophin respectively. Pleiotrophin is a heparin-binding protein that has neurotrophic activity and has mitogenic activity towards fibroblasts. It is highly expressed in brain and uterus tissues, but is also found in gut, muscle and skin. It is thought to possess an important brain-specific function. Midkine is a regulator of differentiation whose expression is regulated by retinoic acid, and, like pleiotrophin, is a heparin-binding growth/differentiation factor that acts on fibroblasts and nerve cells.\ ' '4044' 'IPR020089' '\ Several extracellular heparin-binding proteins involved in regulation of growth and differentiation belong to a new family of growth factors. 
These growth factors are highly related proteins of about 140 amino acids that contain 10 conserved cysteines probably involved in disulphide bonds, and include pleiotrophin PUBMED:15121180 (also known as heparin-binding growth-associated molecule HB-GAM, heparin-binding growth factor 8 HBGF-8, heparin-binding neurotrophic factor HBNF and osteoblast specific protein OSF-1); midkine (MK) PUBMED:15047154; retinoic acid-induced heparin-binding protein (RIHB) PUBMED:7796887; and pleiotrophic factors alpha-1 and -2 and beta-1 and -2 from Xenopus laevis, the homologs of midkine and pleiotrophin respectively. Pleiotrophin is a heparin-binding protein that has neurotrophic activity and has mitogenic activity towards fibroblasts. It is highly expressed in brain and uterus tissues, but is also found in gut, muscle and skin. It is thought to possess an important brain-specific function. Midkine is a regulator of differentiation whose expression is regulated by retinoic acid, and, like pleiotrophin, is a heparin-binding growth/differentiation factor that acts on fibroblasts and nerve cells.\ ' '4045' 'IPR007482' '\

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPase; ) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation PUBMED:9818190, PUBMED:14625689. The PTP superfamily can be divided into four subfamilies PUBMED:12678841:

    \

    \

    Based on their cellular localisation, PTPases are also classified as:

    \

    \

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif PUBMED:9646865. Functional diversity between PTPases is endowed by regulatory domains and subunits.

    \ \

    This family includes the mammalian protein tyrosine phosphatase-like protein, PTPLA. A significant variation of PTPLA from other protein tyrosine phosphatases is the presence of proline instead of catalytic arginine at the active site. It is thought that PTPLA proteins have a role in the development, differentiation, and maintenance of a number of tissue types PUBMED:10644438.

    \ ' '4046' 'IPR007115' '\

    The complex organic chemistry involved in the transformation of GTP to tetrahydrobiopterin is catalysed by only three enzymes: GTP cyclohydrolase I, 6-pyruvoyltetrahydropterin synthase and sepiapterin reductase. Tetrahydrobiopterin is the cofactor for several aromatic amino acid monooxygenases and the nitric oxide synthases. 6-Pyruvoyl tetrahydropterin synthase (PTPS) PUBMED:8137809, a Zn-dependent metalloprotein for which the crystal structure is known, transforms dihydroneopterin triphosphate into 6-pyruvoyltetrahydropterin in the presence of Mg(II).

    \ \

    The enzyme is homohexameric, composed of a dimer of trimers. A transition metal binding site formed by the three histidine residues 23, 48 and 50 is present in each subunit, and bound Zn(II) is responsible for the enzymatic activity. Site-directed mutagenesis of each of these three histidine residues results in a complete loss of metal binding and enzymatic activity PUBMED:7563095, PUBMED:9165069.

    \ \

    The function of the bacterial branch of the sequence lineage appears not to have been established.

    \ ' '4047' 'IPR000109' '\

    This entry represents the POT (proton-dependent oligopeptide transport) family, whose members all appear to be proton-dependent transporters. The transport of peptides into cells is a well-documented biological phenomenon which is accomplished by specific, energy-dependent transporters found in a number of organisms as diverse as bacteria and humans. The POT family of proteins is distinct from the ABC-type peptide transporters and was uncovered by sequence analyses of a number of recently discovered peptide transport proteins PUBMED:7476181. These proteins seem to be mainly involved in the intake of small peptides with the concomitant uptake of a proton PUBMED:7817396.

    \ \

    These integral membrane proteins are predicted to comprise twelve\ transmembrane regions.

    \ ' '4048' 'IPR005698' '\

    The histidine-containing phosphocarrier protein (HPr) is a central component of the phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS), which transfers metabolic carbohydrates across the cell membrane in many bacterial species PUBMED:8246840, PUBMED:2197982. PTS catalyses the phosphorylation of incoming sugar substrates concomitant with their translocation across the cell membrane. The general mechanism of the PTS is as follows: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred to Enzyme I (EI) of the PTS, which in turn transfers it to the phosphoryl carrier protein (HPr) PUBMED:7853396, PUBMED:7704530. Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease complex (enzymes EII/EIII).

    \ \

    HPr PUBMED:1549615, PUBMED:7686067 is a small cytoplasmic protein of 70 to 90 amino acid residues. In some bacteria, HPr is a domain in a larger protein that includes an EIII(Fru) (IIA) domain and in some cases also the EI domain. A conserved histidine in the N-terminal section of HPr serves as an acceptor for the phosphoryl group of EI. In the central part of HPr, there is a conserved serine which (in Gram-positive bacteria only) is phosphorylated by an ATP-dependent protein kinase, a process which probably plays a regulatory role in sugar transport. The overall architecture of the HPr domain has been described as an open-faced beta-sandwich in which a beta-sheet is packed against three alpha-helices. Regulatory phosphorylation at the conserved Ser residue does not appear to induce large structural changes to the HPr domain, in particular in the region of the active site PUBMED:11054290, PUBMED:15713472.

    \ ' '4049' 'IPR002745' '\

    The final step of tRNA splicing in Saccharomyces cerevisiae (Baker\'s yeast) requires 2\'-phosphotransferase (Tpt1) to transfer the 2\'-phosphate from\ ligated tRNA to NAD, producing mature tRNA and ADP ribose-1\' \'-2\' \'-cyclic phosphate. Yeast and Mus musculus (Mouse) Tpt1 protein and bacterial KptA protein can catalyze the conversion of the\ generated intermediate to both product and the original substrate, suggesting that these enzymes\ likely use the same reaction mechanism. Step 1 of this reaction is strikingly similar to the\ ADP-ribosylation of proteins catalyzed by a number of bacterial toxins.

    KptA, a functional Tpt1\ protein homologue from Escherichia coli, is strikingly similar to yeast Tpt1 in its kinetic parameters, although\ E. coli is not known to have a 2\'-phosphorylated RNA substrate PUBMED:9915792, PUBMED:11705403.

    \ ' '4050' 'IPR001127' '\

    The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) PUBMED:8246840, PUBMED:2197982 is a major carbohydrate transport system in bacteria. The PTS catalyses the phosphorylation of incoming sugar substrates, which, coupled with their translocation across the cell membrane, makes the PTS a link between the uptake and metabolism of sugars.

    \ \

    The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred, via a signal transduction pathway, to enzyme I (EI), which in turn transfers it to a phosphoryl carrier, the histidine protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease, a membrane-bound complex known as enzyme 2 (EII), which transports the sugar into the cell. EII consists of at least three structurally distinct domains IIA, IIB and IIC PUBMED:1537788. These can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII).

    \ \

    The first domain (IIA or EIIA) carries the first permease-specific phosphorylation site, a histidine which is phosphorylated by phospho-HPr. The second domain (IIB or EIIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the sugar transported. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate concomitantly with the sugar uptake processed by the IIC domain. This third domain (IIC or EIIC) forms the translocation channel and the specific substrate-binding site.

    \ \

    An additional transmembrane domain IID, homologous to IIC, can be found in some PTSs, e.g. for mannose PUBMED:8246840, PUBMED:1537788, PUBMED:7815935, PUBMED:11361063.

    \ \ ' '4051' 'IPR002178' '\

    The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) PUBMED:8246840, PUBMED:2197982 is a major carbohydrate transport system in bacteria. The PTS catalyses the phosphorylation of incoming sugar substrates, which, coupled with their translocation across the cell membrane, makes the PTS a link between the uptake and metabolism of sugars.

    \ \

    The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred, via a signal transduction pathway, to enzyme I (EI), which in turn transfers it to a phosphoryl carrier, the histidine protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease, a membrane-bound complex known as enzyme 2 (EII), which transports the sugar into the cell. EII consists of at least three structurally distinct domains IIA, IIB and IIC PUBMED:1537788. These can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII).

    \ \

    The first domain (IIA or EIIA) carries the first permease-specific phosphorylation site, a histidine which is phosphorylated by phospho-HPr. The second domain (IIB or EIIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the sugar transported. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate concomitantly with the sugar uptake processed by the IIC domain. This third domain (IIC or EIIC) forms the translocation channel and the specific substrate-binding site.

    \ \

    An additional transmembrane domain IID, homologous to IIC, can be found in some PTSs, e.g. for mannose PUBMED:8246840, PUBMED:1537788, PUBMED:7815935, PUBMED:11361063.

    \ \ ' '4052' 'IPR018113' '\

    The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) PUBMED:8246840, PUBMED:2197982 is a major carbohydrate transport system in bacteria. The PTS catalyzes the phosphorylation of incoming sugar substrates concomitant with their translocation across the cell membrane. The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred to enzyme-I (EI) of PTS which in turn transfers it to a phosphoryl carrier protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease which consists of at least three structurally distinct domains (IIA, IIB, and IIC) PUBMED:1537788 which can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII).

    \ \

    The first domain (IIA) carries the first permease-specific phosphorylation site, a histidine, which is phosphorylated by phospho-HPr. The second domain (IIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the permease. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate in a process catalyzed by the IIC domain; this process is coupled to the transmembrane transport of the sugar.

    \

    This entry covers the phosphorylation site of EIIB domains.

    \ ' '4053' 'IPR003352' '\ The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The PTS catalyzes the phosphorylation of incoming sugar substrates concomitant with their translocation across the cell membrane. The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred to enzyme-I (EI) of PTS which in turn transfers it to a phosphoryl carrier protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease which consists of at least three structurally distinct domains (IIA, IIB, and IIC) PUBMED:1537788 which can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII). The IIC domain catalyzes the transfer of a phosphoryl group from IIB to the sugar substrate.\ ' '4054' 'IPR003188' '\

    The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) PUBMED:8246840, PUBMED:2197982 is a major carbohydrate transport system in bacteria. The PTS catalyses the phosphorylation of incoming sugar substrates, which, coupled with their translocation across the cell membrane, makes the PTS a link between the uptake and metabolism of sugars.

    \ \

    The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred, via a signal transduction pathway, to enzyme I (EI), which in turn transfers it to a phosphoryl carrier, the histidine protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease, a membrane-bound complex known as enzyme 2 (EII), which transports the sugar into the cell. EII consists of at least three structurally distinct domains IIA, IIB and IIC PUBMED:1537788. These can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII).

    \ \

    The first domain (IIA or EIIA) carries the first permease-specific phosphorylation site, a histidine which is phosphorylated by phospho-HPr. The second domain (IIB or EIIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the sugar transported. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate concomitantly with the sugar uptake processed by the IIC domain. This third domain (IIC or EIIC) forms the translocation channel and the specific substrate-binding site.

    \ \

    An additional transmembrane domain IID, homologous to IIC, can be found in some PTSs, e.g. for mannose PUBMED:8246840, PUBMED:1537788, PUBMED:7815935, PUBMED:11361063.

    \ \

    The lactose/cellobiose-specific family is one of four structurally and functionally distinct groups of IIA PTS system enzymes. Proteins of this family normally function as a homotrimer, stabilised by a centrally located metal ion PUBMED:9261069. Separation into subunits is thought to occur after phosphorylation.

    \ ' '4055' 'IPR005323' '\

    This domain is found in pullulanase (carbohydrate de-branching) proteins. It is found at either the N or the C terminus of the alpha-amylase active site region. This domain contains several conserved aromatic residues that are suggestive of a carbohydrate binding function.

    \ ' '4057' 'IPR004716' '\

    The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) PUBMED:8246840, PUBMED:2197982 is a major carbohydrate transport system in bacteria. The PTS catalyses the phosphorylation of incoming sugar substrates, which, coupled with their translocation across the cell membrane, makes the PTS a link between the uptake and metabolism of sugars.

    \ \

    The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred, via a signal transduction pathway, to enzyme I (EI), which in turn transfers it to a phosphoryl carrier, the histidine protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease, a membrane-bound complex known as enzyme 2 (EII), which transports the sugar into the cell. EII consists of at least three structurally distinct domains IIA, IIB and IIC PUBMED:1537788. These can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII).

    \ \

    The first domain (IIA or EIIA) carries the first permease-specific phosphorylation site, a histidine which is phosphorylated by phospho-HPr. The second domain (IIB or EIIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the sugar transported. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate concomitantly with the sugar uptake processed by the IIC domain. This third domain (IIC or EIIC) forms the translocation channel and the specific substrate-binding site.

    \ \

    An additional transmembrane domain IID, homologous to IIC, can be found in some PTSs, e.g. for mannose PUBMED:8246840, PUBMED:1537788, PUBMED:7815935, PUBMED:11361063.

    \ \

    The Man family is unique in several respects among PTS permease families:

    \

    \

    This family consists only of glucitol-specific transporters, and occurs in both Gram-negative and Gram-positive bacteria. The system in Escherichia coli consists of a IIA protein and a IIBC protein. This family is specific for the IIA component.

    \ ' '4058' 'IPR004720' '\

    Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families:

    \

    \

    The mannose permease of Escherichia coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This entry is specific for the IIB components of this family of PTS transporters PUBMED:12662934.

    \ ' '4059' 'IPR004896' '\ This protein is required for high-level transcription of the PUC operon. It is an integral membrane protein. The family includes other proteins from Rhodobacter, e.g. bacteriochlorophyll synthase.\ ' '4060' 'IPR001313' '\

    The Drosophila pumilio gene codes for an unusual protein that binds through the Puf\ domain that usually occurs as a tandem repeat of eight domains. The FBF-2 protein of\ Caenorhabditis elegans also has a Puf domain. Both proteins function as translational \ repressors in early embryonic development by binding sequences in the 3\' UTR of target \ mRNAs PUBMED:9393998, PUBMED:9404893. The same type of repetitive domain has been found in\ a number of other proteins from all eukaryotic kingdoms. The Puf proteins characterised to date have been reported to bind to 3\'-untranslated region (UTR) sequences encompassing a so-called UGUR tetranucleotide motif and thereby to repress gene expression by affecting mRNA translation or stability.

    \

    In Saccharomyces cerevisiae (Baker\'s yeast), five proteins, termed Puf1p to Puf5p, bear six to eight Puf repeats PUBMED:15024427. Puf3p binds nearly exclusively to cytoplasmic mRNAs that encode mitochondrial proteins; Puf1p and Puf2p interact preferentially with mRNAs encoding membrane-associated proteins; Puf4p preferentially binds mRNAs encoding nucleolar ribosomal RNA-processing factors; and Puf5p is associated with mRNAs encoding chromatin modifiers and components of the spindle pole body. This suggests the existence of an extensive network of RNA-protein interactions that coordinate the post-transcriptional fate of large sets of cytotopically and functionally related RNAs through each stage of their lifecycle.

    \ ' '4061' 'IPR003180' '\

    Methylpurine-DNA glycosylase (MPG) is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA PUBMED:18191412. Its action is induced by alkylating chemotherapeutics, as well as deaminated and lipid peroxidation-induced purine adducts PUBMED:17768096. MPG without an N-terminal extension excises hypoxanthine with one-third of the efficiency of full-length MPG under similar conditions, suggesting that its function may largely be attributable to the N-terminal extension PUBMED:17716976.

    \ ' '4062' 'IPR006628' '\

    The Pur protein family consists of four known members in humans and is strongly conserved throughout evolution. Pur-alpha is a highly conserved, sequence-specific DNA- and RNA-binding protein involved in diverse cellular and viral functions including transcription, replication, and cell growth. Pur-alpha has a modular structure with alternating three basic aromatic class I and two acidic leucine-rich class II repeats in the central region of the protein PUBMED:1448097.

    \ \ \

    In addition to its involvement in basic cellular function, Pur-alpha has been implicated in the development of blood cells and cells of the central nervous system; it has also been implicated in the inhibition of oncogenic transformation and, along with Pur-beta, in myelodysplastic syndrome progressing to acute myelogenous leukemia. Pur-alpha can influence viral interaction through functional associations, for example with the Tat protein and TAR RNA of HIV-1, and with large T-antigen and DNA regulatory regions of JC virus. JC virus causes opportunistic infections in the brains of certain HIV-1-infected individuals PUBMED:12894583.

    \ \ ' '4063' 'IPR003850' '\

    Phosphoribosylformylglycinamidine (FGAM) synthetase catalyses the fourth step in the de\ novo purine biosynthetic pathway PUBMED:15301532.

    \ \ \

    \ In eukaryotes and many bacterial systems (including Escherichia coli and\ Salmonella typhimurium), the FGAM synthetase is encoded\ by the large form of PurL (lgPurL), which contains an N-terminal ATPase\ domain and a C-terminal glutamine-binding domain. In\ archaeal and other bacterial systems, however, FGAM\ synthetase is encoded by separate genes, making it a\ multisubunit (rather than multidomain) enzyme. The protein is composed of the small form of PurL (smPurL), which is homologous to the ATPase domain of lgPurL; PurQ, which is homologous to the glutamine-binding domain of lgPurL; and PurS, whose function is not known.

    \ \

    This entry represents the PurS subunit of the multisubunit FGAM synthetase. Recent studies showed that disruption of the purS gene in Bacillus subtilis resulted in a purine auxotrophic phenotype, due to defective FGAM synthetase activity. Therefore, the PurS protein appears to be required for the function of the PurL and PurQ subunits of the FGAM synthetase, but the molecular mechanism for the functional role of PurS is currently not known. For additional information please see PUBMED:15301530, PUBMED:15301531.

    \ ' '4065' 'IPR000313' '\ Upon characterization of WHSC1, a gene mapping to the Wolf-Hirschhorn syndrome critical region and at its C-terminus similar to the Drosophila melanogaster ASH1/trithorax group proteins, a novel protein domain designated PWWP domain was identified PUBMED:9618163. The PWWP domain is named after a conserved Pro-Trp-Trp-Pro motif. It is present in proteins of nuclear origin and plays a role in cell growth and differentiation. Due to its position, the composition of amino acids close to the PWWP motif and the pattern of other domains present, it has been suggested that the domain is involved in protein-protein interactions PUBMED:10802047.\ ' '4066' 'IPR003379' '\ This domain represents a conserved region in pyruvate carboxylase (PYC), oxaloacetate decarboxylase alpha chain (OADA), and transcarboxylase 5s subunit. The domain is found adjacent to the HMGL-like domain and often close to the biotin_lipoyl domain of biotin-requiring enzymes.\ ' '4067' 'IPR004260' '\

    Pyrimidine dimer DNA glycosylases are enzymes responsible for initiating the base excision repair pathway, excising pyrimidine dimers by hydrolysis of the glycosylic bond of the 5\' pyrimidine, followed by the intra-pyrimidine phosphodiester bond PUBMED:11148051. One such enzyme is T4 endonuclease V, an enzyme responsible for the first step of a pyrimidine-dimer-specific excision-repair pathway PUBMED:2067549. Bacteriophage T4 that are deficient in these enzymes are extremely sensitive to UV.

    \ ' '4068' 'IPR020545' '\

    Aspartate carbamoyltransferase (aspartate transcarbamylase, ATCase) is an allosteric enzyme that plays a central role in the regulation of the pyrimidine pathway in bacteria. The holoenzyme is a dodecamer composed of six catalytic chains, each with an active site, and six regulatory chains lacking catalytic activity PUBMED:11323717. The catalytic subunits exist as a dimer of catalytic trimers, (c3)2, while the regulatory subunits exist as a trimer of regulatory dimers, (r2)3; the complete holoenzyme can therefore be represented as (c3)2(r2)3. The association of the catalytic subunits c3 with the regulatory subunits r2 is responsible for the establishment of positive co-operativity between catalytic sites for the binding of aspartate, and it dictates the pattern of allosteric response toward nucleotide effectors. ATCase from Escherichia coli is the most extensively studied allosteric enzyme PUBMED:7791626. The crystal structures of the T-state, the T-state with CTP bound, the R-state with N-phosphonacetyl-L-aspartate (PALA) bound, and the R-state with phosphonoacetamide plus malonate bound have been used in interpreting kinetic and mutational studies.

    \ \

    A high-resolution structure of E. coli ATCase in the presence of PALA (a bisubstrate analog) allows a detailed description of the binding at the active site of the enzyme and allows a detailed model of the tetrahedral intermediate to be constructed. The entire regulatory chain has been traced, showing that the N-terminal regions of the regulatory chains R1 and R6 are located in close proximity to each other and to the regulatory site. This portion of the molecule may be involved in the observed asymmetry between the regulatory binding sites as well as in the heterotropic response of the enzyme PUBMED:10651286. The C-terminal domain of the regulatory chains has a rubredoxin-like zinc-bound fold.

    \ \

    ATCase from Enterobacter agglomerans (Erwinia herbicola) (Pantoea agglomerans) differs from the other investigated enterobacterial ATCases by its absence of homotropic co-operativity toward the substrate aspartate and its lack of response to ATP which is an allosteric effector (activator) of this family of enzymes. Nevertheless, the E. herbicola ATCase has the same quaternary structure, two trimers of catalytic chains with three dimers of regulatory chains, (c3)2(r2)3, as other enterobacterial ATCases and shows extensive primary structure conservation PUBMED:10600394.

    \ ' '4069' 'IPR020542' '\

    Aspartate carbamoyltransferase (aspartate transcarbamylase, ATCase) is an allosteric enzyme that plays a central role in the regulation of the pyrimidine pathway in bacteria. The holoenzyme is a dodecamer composed of six catalytic chains, each with an active site, and six regulatory chains lacking catalytic activity PUBMED:11323717. The catalytic subunits exist as a dimer of catalytic trimers, (c3)2, while the regulatory subunits exist as a trimer of regulatory dimers, (r2)3; the complete holoenzyme can therefore be represented as (c3)2(r2)3. The association of the catalytic subunits c3 with the regulatory subunits r2 is responsible for the establishment of positive co-operativity between catalytic sites for the binding of aspartate, and it dictates the pattern of allosteric response toward nucleotide effectors. ATCase from Escherichia coli is the most extensively studied allosteric enzyme PUBMED:7791626. The crystal structures of the T-state, the T-state with CTP bound, the R-state with N-phosphonacetyl-L-aspartate (PALA) bound, and the R-state with phosphonoacetamide plus malonate bound have been used in interpreting kinetic and mutational studies.

    \ \

    A high-resolution structure of E. coli ATCase in the presence of PALA (a bisubstrate analog) allows a detailed description of the binding at the active site of the enzyme and allows a detailed model of the tetrahedral intermediate to be constructed. The entire regulatory chain has been traced, showing that the N-terminal regions of the regulatory chains R1 and R6 are located in close proximity to each other and to the regulatory site. This portion of the molecule may be involved in the observed asymmetry between the regulatory binding sites as well as in the heterotropic response of the enzyme PUBMED:10651286. The C-terminal domain of the regulatory chains has a rubredoxin-like zinc-bound fold.

    \ \

    ATCase from Enterobacter agglomerans (Erwinia herbicola) (Pantoea agglomerans) differs from the other investigated enterobacterial ATCases by its absence of homotropic co-operativity toward the substrate aspartate and its lack of response to ATP which is an allosteric effector (activator) of this family of enzymes. Nevertheless, the E. herbicola ATCase has the same quaternary structure, two trimers of catalytic chains with three dimers of regulatory chains, (c3)2(r2)3, as other enterobacterial ATCases and shows extensive primary structure conservation PUBMED:10600394.

    \

    This entry represents the C-terminal domain.

    \ ' '4070' 'IPR011576' '\

    Pyridoxamine 5\'-phosphate oxidase (PNPOx) is an FMN flavoprotein that catalyses the oxidation of pyridoxamine-5-P (PMP) and pyridoxine-5-P (PNP) to pyridoxal-5-P (PLP). This reaction serves as the terminal step in the de novo biosynthesis of PLP in Escherichia coli and as a part of the salvage pathway of this coenzyme in both E. coli and mammalian cells PUBMED:12686112, PUBMED:12824491. The binding sites for FMN and for substrate have been highly conserved throughout evolution.

    \

    This entry represents the FMN-binding domain present in pyridoxamine 5\'-phosphate oxidases, as well as in a number of proteins that have not been demonstrated to have enzymatic activity. The binding sites for FMN and for substrate have been highly conserved throughout evolution. The FMN-binding domain has a structure consisting of a beta-barrel with Greek key topology, and is related to the ferredoxin reductase-like FAD-binding domain. PNPOx has a different dimerisation mode than that found in flavin reductases, which also carry an FMN-binding domain with a similar topology.

    \ ' '4071' 'IPR002129' '\

    Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination PUBMED:8690703, PUBMED:7748903, PUBMED:15189147. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors PUBMED:17109392. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy PUBMED:16763894.

    \

    PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic PUBMED:15581583.

    \ \ \ A number of pyridoxal-dependent decarboxylases share regions of sequence similarity, particularly in the vicinity of a conserved lysine residue, which provides the attachment site for the pyridoxal-phosphate (PLP) group PUBMED:8181483, PUBMED:2124279. Among these enzymes are aromatic-L-amino-acid decarboxylase (L-dopa decarboxylase or tryptophan decarboxylase), which catalyses the decarboxylation of tryptophan to tryptamine PUBMED:8889823; tyrosine decarboxylase, which converts tyrosine into tyramine; and histidine decarboxylase, which catalyses the decarboxylation of histidine to histamine PUBMED:2300558. These enzymes belong to the group II decarboxylases PUBMED:8181483, PUBMED:8889823.\ ' '4072' 'IPR008162' '\

    Inorganic pyrophosphatase (PPase) PUBMED:2160278, PUBMED:1323891 is the enzyme responsible for the hydrolysis of pyrophosphate (PPi), which is formed principally as the product of the many biosynthetic reactions that utilise ATP. All known PPases require the presence of divalent metal cations, with magnesium conferring the highest activity. Among other residues, a lysine has been postulated to be part of or close to the active site. PPases have been sequenced from bacteria such as Escherichia coli (homohexamer), Bacillus PS3 (Thermophilic bacterium PS-3) and Thermus thermophilus, from the archaebacterium Thermoplasma acidophilum, from fungi (homodimer), from a plant, and from bovine retina. In yeast, a mitochondrial isoform of PPase has been characterised which seems to be involved in energy production and whose activity is stimulated by uncouplers of ATP synthesis.

    \

    The sequences of PPases share some regions of similarities, among which is a region that contains three conserved aspartates that are involved in the binding of cations.

    \ ' '4073' 'IPR003699' '\

    This entry represents the queuosine biosynthesis proteins QueA. Queuosine is a hypermodified nucleoside that usually occurs in the first position of the anticodon of tRNAs specifying the amino acids asparagine, aspartate, histidine, and tyrosine. The hypermodified nucleoside is found in bacteria and eukaryotes PUBMED:8347586. Queuosine is synthesized de novo exclusively in bacteria; for eukaryotes the compound is a nutrient factor. Queuosine biosynthesis protein, or S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (QueA) catalyses the formation of the 2,3-epoxy-4,5-dihydroxycyclopentane ring of the Q precursor epoxyqueuosine (oQ). S-adenosyl-L-methionine (AdoMet) reacts with 7-aminomethyl-7-deazaguanine of tRNA at position 34 to yield adenine, methionine, and a modified tRNA with oQ at position 34.

    \

    QueA consists of two domains: domain 1 has 3 layers alpha/beta/alpha, while domain 2 is a closed beta-barrel with Greek-key topology PUBMED:15822125.

    \ ' '4074' 'IPR000159' '\

    Proteins with this domain are mostly RasGTP effectors and include guanine-nucleotide releasing factor in mammals PUBMED:8987396. This factor stimulates the dissociation of GDP from the Ras-related RALA and RALB GTPases, which allows GTP binding and activation of the GTPases. It interacts with, and acts as an effector molecule for, R-ras, K-Ras and Rap PUBMED:7972015.

    \ \

    The domain is also present in a number of other proteins, among them the sexual differentiation protein in yeast that is essential for mating and meiosis, and yeast adenylate cyclase. These proteins contain repeated leucine-rich (LRR) segments.

    \ ' '4075' 'IPR018514' '\

    Regeneration of injured axons at neuromuscular junctions has been assumed\ to be regulated by extra-cellular factors that promote neurite outgrowth.\ A novel neurite outgrowth factor from chick denervated skeletal muscle has \ been cloned and characterised. The protein, termed neurocrescin (rabaptin),\ has been shown to be secreted in an activity-dependent fashion PUBMED:9427343.

    Rabaptin is a 100kDa coiled-coil protein that interacts with the GTP form of the small GTPase Rab5, a potent regulator of endocytic transport PUBMED:8521472. It is mainly cytosolic, but a fraction co-localises with Rab5 to early endosomes. Rab5 recruits rabaptin-5 to purified early endosomes in a\ GTP-dependent manner, demonstrating functional similarities with other members of the Ras superfamily. Immunodepletion of rabaptin-5 from cytosol strongly inhibits Rab5-dependent early endosome fusion. Thus, rabaptin-5 is a Rab effector required for membrane docking and fusion.

    \ ' '4076' 'IPR003021' '\ REC1 of Ustilago maydis plays a key role in regulating the genetic system\ of the fungus. REC1 mutants are very sensitive to UV light. Mutation\ leads to a complex phenotype with alterations in DNA repair, recombination,\ mutagenesis, meiosis and cell division PUBMED:8276878. The predicted product of the\ REC1 gene is a polypeptide of 522 amino acid residues with molecular mass \ 57kDa. The protein shows 3\'--5\' exonuclease activity, but only in cells\ over-expressing REC1 PUBMED:8276878. While it is distinguishable from the major\ bacterial nucleases, the protein has certain enzymatic features in common\ with epsilon, the proof-reading exonuclease subunit of Escherichia coli DNA polymerase\ III holoenzyme PUBMED:8276878.\ The rad1 gene of Schizosaccharomyces pombe comprises three exons and encodes\ a 37kDa protein that exhibits partial similarity to the REC1 gene of \ U. maydis PUBMED:7926829. The two genes share putative functional similarities\ in their respective organisms.\ ' '4077' 'IPR004582' '\

    To be effective as a mechanism that preserves genomic integrity, the DNA damage checkpoint must be\ extremely sensitive in its ability to detect DNA damage. In Saccharomyces cerevisiae the Ddc1/Rad17/Mec3 complex and Rad24 are DNA damage checkpoint components which may promote checkpoint\ activation by "sensing" DNA damage directly PUBMED:11691833. Rad24 shares sequence homology with RF-c, a protein that recognises DNA template/RNA primer hybrids during DNA replication. The\ Ddc1 complex has structural homology to proliferating-cell nuclear antigen (PCNA), which clamps onto\ DNA and confers processivity to DNA polymerases delta and epsilon. Rad24 is postulated to\ recognise DNA lesions and then recruit the Ddc1 complex to generate checkpoint signals.

    \ ' '4078' 'IPR006909' '\ This family represents a conserved C-terminal region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Rad21/Rec8 like proteins mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex PUBMED:11687503. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation PUBMED:10207075.\ ' '4079' 'IPR006910' '\ This domain represents a conserved N-terminal region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Rad21/Rec8 like proteins mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex PUBMED:11687503. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation PUBMED:10207075.\ ' '4080' 'IPR018325' '\

    Mutations in the nucleotide excision repair (NER) pathway can cause the xeroderma pigmentosum skin cancer predisposition syndrome. NER lesions are limited to one DNA strand, but otherwise they are chemically and structurally diverse, being caused by a wide variety of genotoxic chemicals and ultraviolet radiation. The xeroderma pigmentosum C (XPC) protein has a central role in initiating global-genome NER by recognising the lesion and recruiting downstream factors.

    \

    In NER in eukaryotes, DNA is incised on both sides of the lesion, resulting in the removal of a fragment ~25-30 nucleotides long. This is followed by repair synthesis and ligation. This reaction, in yeast, requires the damage binding factors Rad14, RPA, and the Rad4-Rad23 complex, the transcription factor TFIIH which contains the two DNA helicases Rad3 and Rad25, essential for creating a bubble structure, and the two endonucleases, the Rad1-Rad10 complex and Rad2, which incise the damaged DNA strand on the 5\'- and 3\'-side of the lesion, respectively PUBMED:10915862.

    \

    The crystal structure of the yeast XPC orthologue Rad4 bound to DNA containing a cyclobutane pyrimidine dimer lesion has been determined. The structure shows that Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. The expelled nucleotides of the undamaged strand are recognised by Rad4, whereas the two cyclobutane pyrimidine dimer-linked nucleotides become disordered. This indicates that the lesions recognised by Rad4/XPC thermodynamically destabilise the double helix in a manner that facilitates the flipping-out of two base pairs PUBMED:17882165.

    \

    Homologues of all the above mentioned yeast genes, except for RAD7, RAD16, and MMS19, have been identified in humans, and mutations in these human genes\ affect NER in a similar fashion as they do in yeast, with the exception of XPC, the human counterpart of yeast RAD4. Deletion of RAD4 causes the same high level\ of UV sensitivity as do mutations in the other class 1 genes, and rad4 mutants are completely defective in incision. By contrast, XPC is required for\ the repair of nontranscribed regions of the genome but not for the repair of the transcribed DNA strand.

    \

    This entry represents a domain with an ancient transglutaminase-like fold, which is specifically related to the peptide-N-glycanases (PNGases) that remove glycans from glycoproteins during their degradation PUBMED:11487565.

    \ ' '4081' 'IPR007232' '\ The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to Rad52. These proteins contain two helix-hairpin-helix motifs PUBMED:11914131.\ ' '4082' 'IPR007268' '\ Rad9 is required for transient cell-cycle arrests and transcriptional induction of DNA repair in response to DNA damage.\ ' '4083' 'IPR001405' '\

    This family was named initially with reference to the Escherichia coli radC102 mutation which suggested that RadC was involved in repair of DNA lesions PUBMED:11053371. However the relevant mutation has subsequently been shown to be in recG, where radC is in fact an allele of recG PUBMED:10224240. In addition all attempts to characterise a radiation-related function for RadC in Streptococcus pneumoniae failed, suggesting that it is not involved in repair of DNA lesions, in recombination during transformation, in gene conversion, nor in mismatch repair PUBMED:18556794.

    \ ' '4084' 'IPR006802' '\ This family includes the radial spoke head proteins RSP4 and RSP6 from Chlamydomonas reinhardtii, and several eukaryotic homologues, including mammalian RSHL1, the protein product of a familial ciliary dyskinesia candidate gene PUBMED:11237735.\ ' '4085' 'IPR004321' '\

    The variable portions of the genes encoding immunoglobulins and T cell receptors are assembled from component V, D, and J DNA segments by a site-specific recombination reaction termed V(D)J recombination. V(D)J recombination is targeted to specific sites on the chromosome by recombination signal sequences (RSSs) that flank antigen receptor gene segments. The RSS consists of a conserved heptamer (consensus, 5\'-CACAGTG-3\') and nonamer (consensus, 5\'-ACAAAAACC-3\') separated by a spacer of either 12 or 23 bp. Efficient recombination occurs between a 12-RSS and a 23-RSS, a restriction known as the 12/23 rule.

    \ \

    V(D)J recombination can be divided into two phases, DNA cleavage and DNA joining. DNA cleavage requires two lymphocyte-specific factors, the\ products of the recombination activating genes, RAG1 and RAG2, which together recognise the RSSs and create double strand breaks at the RSS-coding segment junctions PUBMED:11961538. RAG-mediated DNA cleavage occurs in a synaptic complex termed the paired complex, which is constituted from two distinct RSS-RAG complexes, a 12-SC and a 23-SC (where SC stands for signal complex). The DNA cleavage reaction involves two distinct enzymatic steps, initial nicking that creates a 3\'-OH between a coding segment and its RSS, followed by hairpin formation in which the newly created 3\'-OH attacks a phosphodiester bond on the opposite DNA strand. This generates a\ blunt, 5\' phosphorylated signal end containing all of the RSS elements, and a covalently sealed hairpin coding end.

    \ \

    The second phase of V(D)J recombination, in which broken DNA fragments are processed and joined, is less well characterised. Signal ends are typically joined\ precisely to form a signal joint, whereas joining of the coding ends requires the hairpin structure to be opened and typically involves nucleotide addition and deletion\ before formation of the coding joint. The factors involved in these processes include ubiquitously expressed proteins involved in the repair of DNA double strand\ breaks by nonhomologous end joining, terminal deoxynucleotidyl transferase, and Artemis protein.

    \ \

    In addition to their critical roles in RSS recognition and DNA cleavage, the RAG proteins may perform two distinct types of functions in the\ postcleavage phase of V(D)J. A structural function has been inferred\ from the finding that, after DNA cleavage in vitro, the DNA ends remain associated with the RAG proteins in a "four end" complex known as the cleaved signal\ complex. After release of the coding ends in vitro, and after coding joint formation in vivo, the RAG proteins remain in a\ stable signal end complex (SEC) containing the two signal ends. These postcleavage complexes may serve\ as essential scaffolds for the second phase of the reaction, with the RAG proteins acting to organise the DNA processing and joining events.

    \ \

    The second type of RAG protein-mediated postcleavage activity is the catalysis of phosphodiester bond hydrolysis and strand transfer reactions. The RAG proteins are capable of opening hairpin coding ends in vitro. The RAG proteins\ also show 3\' flap endonuclease activity that may contribute to coding end processing/joining and can utilise the\ 3\' OH group on the signal ends to attack hairpin coding ends (forming hybrid or open/shut joints) or virtually any DNA duplex (forming a transposition product).

    \ ' '4086' 'IPR006985' '\ The calcitonin-receptor-like receptor can function as either a calcitonin-gene-related peptide or an adrenomedullin receptor. The function of the receptor is modified by receptor activity modifying protein, or RAMP. RAMPs are single-transmembrane-domain proteins PUBMED:9620797.\ ' '4087' 'IPR000156' '\

    Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran Binding Protein 1 (RanBP1) has guanine nucleotide dissociation inhibitory (GDI) activity, specific for the GTP form of Ran, and also functions to stimulate Ran GTPase activating protein (GAP)-mediated GTP hydrolysis by Ran. RanBP1 thus helps to keep the gradient of RanGTP across the nuclear envelope high (GDI activity) or the cytoplasmic levels of RanGTP low (GAP cofactor) PUBMED:12019565.

    All RanBP1 proteins contain a Ran binding domain of approximately 150 amino acid residues. RanBP1 binds directly to RanGTP with high affinity.\ \ There are four sites of contact between Ran and the Ran binding domain. One of these involves binding of the C-terminal segment of Ran to a groove on the Ran binding domain that is analogous to the surface utilised in the EVH1-peptide interaction PUBMED:10404224. Nup358 contains four Ran binding domains. The structure of the first of these is known PUBMED:10078529.

    \ ' '4088' 'IPR004318' '\

    Members of this family are found in the parasite Babesia bigemina. Other rhoptry-associated proteins are found in Plasmodium falciparum but these do not belong to this family. Animal infection with B. bigemina may produce a pattern similar to human malaria PUBMED:10614497. Rhoptry organelles form part of the apical complex in apicomplexan parasites.

    \ \

    Rhoptry-associated proteins (RAPs) are antigenic, and generate partially protective immune responses in infected mammals. Thus RAPs are among the targeted vaccine antigens for babesial (and malarial) parasites. However, RAP-1 proteins are encoded by a multigene family; thus RAP-1 proteins are polymorphic, with B and T cell epitopes that are conserved among strains, but not across species PUBMED:9662706, PUBMED:9476795, PUBMED:9529082. Antibodies to B. bigemina RAP-1 may also be helpful in the serological detection of B. bigemina infections PUBMED:10364599.

    \ ' '4089' 'IPR013753' '\

    Many members of the Ras superfamily of GTPases have been implicated in the regulation of hematopoietic cells, with roles in growth,\ survival, differentiation, cytokine production, chemotaxis, vesicle-trafficking, and phagocytosis. The Ras superfamily of proteins now includes over 150 small GTPases (distinguished from the large,\ heterotrimeric GTPases, the G-proteins). It comprises six subfamilies, the Ras, Rho, Ran, Rab, Arf,\ and Kir/Rem/Rad subfamilies PUBMED:10712923. They exhibit remarkable overall amino acid identities, especially in the regions interacting with the guanine\ nucleotide exchange factors that catalyze their activation PUBMED:12384139.

    \ \ \ ' '4090' 'IPR001936' '\

    Ras proteins are membrane-associated molecular switches that bind GTP and GDP and slowly hydrolyze GTP to GDP PUBMED:1898771. This intrinsic GTPase activity of ras is stimulated by a family of proteins collectively known as \'GAP\' or GTPase-activating proteins PUBMED:1883874, PUBMED:7945277. As it is the GTP bound form of ras which is active, these proteins are said to be down-regulators of ras.

    \

    The Ras GTPase-activating proteins are quite large (from 765 residues for sar1 to 3079 residues for IRA2) but share only a limited (about 250 residues) region of sequence similarity, referred to as the \'catalytic domain\' or rasGAP domain.

    \

    Note: There are distinctly different GAPs for the rap and rho/rac subfamilies of ras-like proteins (reviewed in reference PUBMED:8259209) that do not share sequence similarity with ras GAPs.

    \ ' '4091' 'IPR000593' '\

    Ras GTPase-activating protein (rasGAP) is a major contributor to the down-regulation of ras by facilitating GTP hydrolysis of activated ras. In addition, GAP participates in the down-stream effector system of the ras signalling pathway. Abnormal signal transduction involving activated ras genes plays a major role in the development of a variety of tumours. Depending on the precise genetic alteration, its location within the gene and the effects it exerts on protein function, rasGAP can theoretically function as either an oncogene or as a tumour suppressor gene PUBMED:8738474.

    \ ' '4092' 'IPR001895' '\

    Ras proteins are membrane-associated molecular switches that bind GTP and GDP and slowly hydrolyze GTP to GDP PUBMED:1898771. The balance between the GTP bound (active) and GDP bound (inactive) states is regulated by the opposite action of proteins activating the GTPase activity and that of proteins which promote the loss of bound GDP and the uptake of fresh GTP PUBMED:8259209, PUBMED:15335949. The latter proteins are known as guanine-nucleotide dissociation stimulators (GDSs) (or also as guanine-nucleotide releasing (or exchange) factors (GRFs)). Proteins that act as GDS can be classified into at least two families, on the basis of sequence similarities, the CDC24 family (see ) and the CDC25 family.

    \

    The size of the proteins of the CDC25 family ranges from 309 residues (LTE1) to 1596 residues (sos). The sequence similarity shared by all these proteins is limited to a region of about 250 amino acids generally located in their C-terminal section (currently the only exceptions are sos and ralGDS, where this domain makes up the central part of the protein). This domain has been shown, in CDC25 and SCD25, to be essential for the activity of these proteins.

    \ ' '4093' 'IPR000651' '\ This domain is found in several guanine nucleotide exchange factors for Ras-like small GTPases, and lies N-terminal to the RasGef (Cdc25-like) domain. Proteins belonging to this family include guanine nucleotide dissociation stimulator, which stimulates the dissociation of GDP from the Ras-related RalA and RalB GTPases and allows GTP binding and activation of the GTPases; GTPase-activating protein (GAP) for Rho1 and Rho2, which is involved in the control of cellular morphogenesis; and the yeast cell division control protein, which promotes the exchange of Ras-bound GDP by GTP and controls the level of cAMP when the cell division cycle is triggered. Also included is the son of sevenless protein, which promotes the exchange of Ras-bound GDP by GTP during neuronal development.\ ' '4094' 'IPR002720' '\ Retinoblastoma-like and retinoblastoma-associated proteins may have a function in cell cycle regulation. They form a complex with adenovirus E1A and Simian virus 40 (SV40) large T antigen, and may bind and modulate the function of certain cellular proteins with which T and E1A compete for pocket binding. The proteins may act as tumor suppressors, and are potent inhibitors of E2F-mediated trans-activation. \ This domain has the cyclin fold PUBMED:8152925.\ \

    The crystal structure of the Rb pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other Rb-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B-box portion of the pocket; the A-box portion appears to be required for the stable folding of the B box (see ). Also highly conserved is the extensive A-B interface, suggesting that it may be an additional protein-binding site. The A and B boxes each contain the cyclin-fold structural motif, with the LxCxE-binding site on the B-box cyclin fold being similar to a Cdk2-binding site of cyclin A and to a TBP-binding site of TFIIB PUBMED:9495340.

    \ \

    The A and B boxes are found at the C-terminal end of the protein; the A-box is on the N-terminal side of the B-box.

    \ ' '4095' 'IPR002719' '\ Retinoblastoma-like and retinoblastoma-associated proteins may have a function in cell cycle regulation. They form a complex with adenovirus E1A and SV40 large T antigen, and may bind and modulate the function of certain cellular proteins with which T and E1A compete for pocket binding. The proteins may act as tumor suppressors, and are potent inhibitors of E2F-mediated trans-activation. \ This domain has the cyclin fold PUBMED:8152925.\ \

    The crystal structure of the Rb pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other Rb-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B-box portion of the pocket; the A-box portion (see ) appears to be required for the stable folding of the B box. Also highly conserved is the extensive A-B interface, suggesting that it may be an additional protein-binding site. The A and B boxes each contain the cyclin-fold structural motif, with the LxCxE-binding site on the B-box cyclin fold being similar to a Cdk2-binding site of cyclin A and to a TBP-binding site of TFIIB PUBMED:9495340.

    \ \

    The A and B boxes are found at the C-terminal end of the protein; the B-box is on the C-terminal side of the A-box.

    \ ' '4096' 'IPR003116' '\ This is the Ras-binding domain found in proteins related to Ras. It is found\ in association with the PE-bind and pkinase domains.\ ' '4097' 'IPR000238' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosome-binding factor A PUBMED:9422595 (gene rbfA) is a bacterial protein that\ associates with free 30S ribosomal subunits. It does not associate with 30S\ subunits that are part of 70S ribosomes or polysomes. It is essential for\ efficient processing of 16S rRNA.\ Ribosome-binding factor A is a protein of 13 to 15 kDa which is found in\ most prokaryotic organisms. A putative chloroplastic form seems to exist in\ plants.

    \ ' '4098' 'IPR007721' '\ The Escherichia coli high-affinity ribose-transport system consists of six proteins encoded by the rbs operon (rbsD, rbsA, rbsC, rbsB, rbsK and rbsR). Of the six components, RbsD is the only one whose function is unknown, although it is thought to play a critical role in PtsG-mediated ribose transport PUBMED:11320319. This family also includes FucU, a protein from the fucose biosynthesis operon that is presumably also involved in fucose transport by similarity to RbsD.\ ' '4099' 'IPR003435' '\

    The RbcX protein has been identified as having a possible chaperonin-like function PUBMED:9642201. The rbcX gene is juxtaposed to and cotranscribed with rbcL and rbcS encoding RubisCO in Anabaena sp. (strain CA / ATCC 33047). RbcX has been shown to possess a chaperonin-like function assisting correct folding of RubisCO in Escherichia coli expression studies and is needed for RubisCO to reach its maximal activity PUBMED:9171433.

    \ ' '4100' 'IPR000408' '\

    The regulator of chromosome condensation (RCC1) PUBMED:8480369 is a eukaryotic protein\ which binds to chromatin and interacts with ran, a nuclear GTP-binding\ protein , to promote the loss of bound GDP and the uptake of\ fresh GTP, thus acting as a guanine-nucleotide dissociation stimulator (GDS).\ The interaction of RCC1 with ran probably plays an important role in the\ regulation of gene expression.

    \ \

    RCC1, known as PRP20 or SRM1 in yeast, pim1 in fission yeast and BJ1 in\ Drosophila, is a protein that contains seven tandem repeats of a domain of\ about 50 to 60 amino acids. As shown in the following schematic\ representation, the repeats make up the major part of the length of the\ protein. Outside the repeat region, there is just a small N-terminal domain of\ about 40 to 50 residues and, in the Drosophila protein only, a C-terminal\ domain of about 130 residues.

    \
    \
    +----+-------+-------+-------+-------+-------+-------+-------+-------------+\
    |N-t.|Rpt. 1 |Rpt. 2 |Rpt. 3 |Rpt. 4 |Rpt. 5 |Rpt. 6 |Rpt. 7 | C-terminal  |\
    +----+-------+-------+-------+-------+-------+-------+-------+-------------+\
    
    \ The RCC1-type of repeat is also found in the X-linked retinitis pigmentosa\ GTPase regulator PUBMED:8817343. The RCC repeats form a beta-propeller\ structure.\ ' '4101' 'IPR007476' '\

    Members of the RdgC family may have exonuclease activity. RdgC is required for efficient pilin variation in Neisseria gonorrhoeae, suggesting that it may be involved in recombination reactions PUBMED:10655208. In Escherichia coli, RdgC is required for growth in recombination-deficient exonuclease-depleted strains. Under these conditions, RdgC may act as an exonuclease to remove collapsed replication forks, in the absence of the normal repair mechanisms PUBMED:8807285.

    \ ' '4102' 'IPR007855' '\

    This entry represents various eukaryotic RNA-dependent RNA polymerases (RDRP; ), such as RDRP-1, RDRP-2 and RDRP-6. These enzymes are involved in the amplification of regulatory microRNAs during post-transcriptional gene silencing PUBMED:12553882; they are also required for transcriptional gene silencing. Double-stranded RNA has been shown to induce gene silencing in diverse eukaryotes and by a variety of pathways PUBMED:16691418. These enzymes also play a role in the RNA interference (RNAi) pathway, which is important for heterochromatin formation, accurate chromosome segregation, centromere cohesion and telomere function during mitosis and meiosis. RDRP enzymes are highly conserved in most eukaryotes, but are missing in archaea and bacteria. The core catalytic domain of RDRP enzymes is structurally similar to the beta\' subunit of DNA-dependent RNA polymerases (DDRP); however, the other domains of DDRP show no similarity to those of RDRP.

    \ ' '4103' 'IPR013765' '\

    The recA gene product is a multifunctional enzyme that plays a role in homologous recombination, DNA repair and induction of the SOS response PUBMED:1896024. In homologous recombination, the protein functions as a DNA-dependent ATPase, promoting synapsis, heteroduplex formation and strand exchange between homologous DNAs PUBMED:1896024. RecA also acts as a protease cofactor that promotes autodigestion of the lexA product and phage repressors. The proteolytic inactivation of the lexA repressor by an activated form of recA may cause a derepression of the 20 or so genes involved in the SOS response, which regulates DNA repair, induced mutagenesis, delayed cell division and prophage induction in response to DNA damage PUBMED:1896024.

    RecA is a protein of about 350 amino-acid residues. Its sequence is very well conserved PUBMED:9187054, PUBMED:7592482, PUBMED:8587109 among eubacterial species. It is also found in the chloroplast of plants PUBMED:1518831. RecA-like proteins are found in archaea and diverse eukaryotic organisms, like fission yeast, mouse or human. In the filament\ visualised by X-ray crystallography, beta-strand 3, the loop C-terminal to beta-strand 2, and alpha-helix D of the core domain form one surface that packs against\ alpha-helix A and beta-strand 0 (the N-terminal domain) of an adjacent monomer during polymerisation [Lusetti and Cox, Annu. Rev. Biochem. 2002. 71:71-100.]. The core ATP-binding site domain is well conserved, with 14 invariant residues. It contains the nucleotide binding loop between beta-strand 1 and\ alpha-helix C. The Escherichia coli sequence GPESSGKT matches the consensus sequence of amino acids (G/A)XXXXGK(T/S) for the Walker A box (also\ referred to as the P-loop) found in a number of nucleoside triphosphate (NTP)-binding proteins. Another\ nucleotide binding motif, the Walker B box is found at beta-strand 4 in the RecA structure. The Walker B\ box is characterised by four hydrophobic amino acids followed by an acidic residue (usually aspartate). Nucleotide specificity and additional ATP binding interactions are contributed by the amino acid residues at beta-strand 2 and the loop C-terminal to that\ strand, all of which are greater than 90% conserved among bacterial RecA proteins.
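The Walker A consensus quoted above can be checked mechanically. The following is a minimal sketch, assuming the consensus (G/A)XXXXGK(T/S) is read as a plain regular expression over one-letter amino acid codes; the function name `find_walker_a` and the flanking residues in the usage note are illustrative assumptions, not part of the entry.

```python
import re

# Walker A (P-loop) consensus as quoted in the text: (G/A)XXXXGK(T/S).
# X is treated as "any residue", which is an assumption of this sketch.
WALKER_A = re.compile("[GA].{4}GK[TS]")


def find_walker_a(sequence):
    """Return (start, matched_text) for each non-overlapping Walker A match."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(sequence.upper())]


if __name__ == "__main__":
    # The Escherichia coli RecA segment quoted in the text.
    print(find_walker_a("GPESSGKT"))
```

Calling `find_walker_a` on a longer sequence returns zero-based offsets, so a hypothetical peptide that embeds GPESSGKT after four other residues would report the match at position 4. A consensus this short will also match unrelated proteins by chance, so a hit is only a starting point for inspection.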

    \ ' '4104' 'IPR003717' '\ The damage avoidance-tolerance pathway(s) requires functional recA, recF, recO, and recR genes, suggesting the mechanism to be daughter strand gap repair. The ruvABC genes or the recG gene is also required. The RecG pathway appears to be more active than the RuvABC pathway PUBMED:11073901. RecO may contain a mononucleotide-binding fold PUBMED:2544549.\ ' '4105' 'IPR015967' '\

    The bacterial protein RecR seems to play a role in a recombinational process\ of DNA repair PUBMED:2674903. It may act with RecF and RecO. RecR is a protein of about 200 amino acid residues. This entry represents a putative\ C4-type zinc finger found in the N-terminal section.

    \ ' '4106' 'IPR018330' '\

    All proteins in this family for which functions are known bind single-stranded DNA and are involved in the pairing of homologous DNA. RecT from Escherichia coli is a homotetramer which binds to single-stranded DNA and promotes the renaturation of complementary single-stranded DNA, and also plays a role in recombination. It is able to promote the annealing of complementary single\ DNA strands and can catalyze the formation of joint molecules PUBMED:12169595.

    \ ' '4107' 'IPR004612' '\ The Bacillus subtilis protein belonging to this family has been shown to be required for DNA recombination and repair.\ ' '4108' 'IPR003783' '\ RecX is a putative bacterial regulatory protein PUBMED:10869079. The gene encoding RecX is found downstream of recA, and it is suggested that the RecX protein might be a regulator of RecA activity by interaction with the RecA protein or filament PUBMED:10869079.\ ' '4109' 'IPR002861' '\ Extracellular matrix (ECM) proteins play an important role in early cortical development, specifically in the formation of neural connections and in controlling the cyto-architecture of the central nervous system. \ The product of the reeler gene in mouse is reelin, a large extracellular protein secreted by pioneer neurons that coordinates cell positioning during neurodevelopment PUBMED:9338784. F-spondin and mindin are a family of matrix-attached adhesion molecules that share structural similarities and overlapping domains of expression. \ Both F-spondin and mindin promote adhesion and outgrowth of hippocampal embryonic neurons and bind to a putative receptor(s) expressed on both hippocampal and sensory neurons PUBMED:10409509.\ \

    This domain of unknown function is found at the N terminus of reelin\ and F-spondin.

    \ ' '4110' 'IPR007337' '\

    Plasmids may be maintained stably in bacterial populations through the action of addiction modules, in which a toxin and antidote are encoded in a cassette on the plasmid. In any daughter cell that lacks the plasmid, the toxin persists and is lethal after the antidote protein is depleted. Toxin/antitoxin pairs are also found on main chromosomes, and likely represent selfish DNA. Sequences in the seed for this alignment all were found adjacent to toxin genes. Several toxin/antitoxin pairs may occur in a single species. \ RelE and RelB form a toxin-antitoxin system; RelE represses translation, probably through binding ribosomes PUBMED:11274135, PUBMED:12123459. RelB stably binds RelE, presumably deactivating it.

    \ ' '4111' 'IPR005516' '\

    Remorin binds both simple and complex galacturonides. The N-terminal region of remorin is proline rich, while the C-terminal region has been predicted to form a coiled-coil that is expected to interact with other macromolecules, most likely DNA. Functional similarities between the behavior of the proteins and viral proteins involved in intercellular communication have been noted PUBMED:8989883.

    \ ' '4112' 'IPR005518' '\

    Remorin binds both simple and complex galacturonides. The N-terminal region of remorin is proline rich, while the C-terminal region has been predicted to form a coiled-coil that is expected to interact with other macromolecules, most likely DNA. Functional similarities between the behavior of the proteins and viral proteins involved in intercellular communication have been noted PUBMED:8989883.

    \ ' '4113' 'IPR008257' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.
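The difference between the loose HEXXH motif and the more stringent abXHEbbHbc form can be illustrated with two regular expressions. This is a minimal sketch under stated assumptions: the residue sets used for "uncharged" and "hydrophobic" are illustrative choices rather than definitions taken from MEROPS, position a is left unrestricted because the text only says it is most often valine or threonine, and the peptides are toy sequences invented for the example.

```python
import re

# Loose zinc-binding motif from the text: H-E-X-X-H.
LOOSE = re.compile("HE..H")

# Assumed residue classes (illustrative only):
UNCHARGED = "[ACFGILMNQSTVWY]"  # excludes D, E, K, R, H and proline
HYDROPHOBIC = "[AFILMVWY]"

# Stricter motif from the text: a b X H E b b H b c.
STRICT = re.compile(
    "." + UNCHARGED + ".HE" + UNCHARGED + UNCHARGED + "H" + UNCHARGED + HYDROPHOBIC
)


def find_motif(pattern, sequence):
    """Return (start, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_ok = "GSVVAHELTHAVTDY"   # hypothetical peptide, satisfies both forms
    toy_bad = "GSVVAHEKTHAVTDY"  # hypothetical peptide, charged residue at a b position
    for name, seq in (("toy_ok", toy_ok), ("toy_bad", toy_bad)):
        print(name, find_motif(LOOSE, seq), find_motif(STRICT, seq))
```

The second toy peptide still matches the loose pattern but fails the strict one, which is the sense in which the longer definition is more stringent.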

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of peptidases belongs to the MEROPS peptidase family M19 (membrane dipeptidase family, clan MJ). The protein fold of the peptidase domain for members of this family resembles that of Klebsiella urease, the type example for clan MJ.

    \ \ \

    Renal dipeptidase (rDP) (), also known as microsomal dipeptidase,\ is a zinc-dependent metalloenzyme that hydrolyzes a wide range of dipeptides.\ It is involved in renal metabolism of glutathione and its conjugates. It is a\ homodimeric disulphide-linked glycoprotein attached to the renal brush border\ microvilli membrane by a GPI-anchor.\ \ A glutamate residue has recently been shown PUBMED:8097406, PUBMED:12144777 to be important for the\ catalytic activity of rDP.\ rDP seems to be evolutionarily related to hypothetical proteins in the PQQ\ biosynthesis operons of Acinetobacter calcoaceticus and Klebsiella pneumoniae.

    \ ' '4114' 'IPR002592' '\

    This family consists of the Reovirus sp. sigma 1 haemagglutinin, cell attachment protein. This glycoprotein is a minor capsid \ protein and also determines the serotype-specific humoral immune response.\ Sigma 1 consists of a fibrous tail and a globular head. The head has\ important roles in the cell attachment function of sigma 1 \ and is a determinant of the type-specific humoral immune response PUBMED:2398530. Reovirus sp. are part of the Orthoreovirus group of Reoviridae with a dsRNA genome. Also present in this family is Bacteriophage SF6 lysozyme .

    \ ' '4115' 'IPR007662' '\ Protein sigmaC in its native state was shown to be a homotrimer. It was demonstrated that the sigmaC subunits are not covalently bound via disulphide linkages and the formation of an intrachain disulphide bond between the two cysteine residues of the sigmaC polypeptide may have a negative effect on oligomer stability. The susceptibility of the trimer to pH, temperature, ionic strength, chemical denaturants and detergents indicates that hydrophobic interactions contribute much more to oligomer stability than do ionic interactions and hydrogen bonding PUBMED:11752709.\ ' '4116' 'IPR000153' '\

    Reoviruses are double-stranded RNA viruses that lack a membrane envelope. Their capsid is organised in two concentric icosahedral layers: an inner core and an outer capsid layer. The sigma1 protein is found in the outer capsid, and the sigma2 protein is found in the core. There are four other kinds of protein (besides sigma2) in the core, termed lambda 1-3 and mu2. Interactions between sigma2 and lambda 1 and lambda 3 are thought\ to initiate core formation, followed by mu2 and lambda2 PUBMED:9971813.

    \

    Sigma1 is a trimeric protein, and is positioned at the 12 vertices of the icosahedral outer capsid layer. Its N-terminal fibrous tail, arranged as a triple coiled coil,\ anchors it in the virion, and a C-terminal globular head interacts with the\ cellular receptor PUBMED:11438552. These two parts form by separate trimerization events.\ The N-terminal fibrous tail forms on the polysome, without the involvement\ of ATP or chaperones. The post-translational assembly of the C-terminal\ globular head involves the chaperone activity of Hsp90, which is associated\ with phosphorylation of Hsp90 during the process PUBMED:11438552. Sigma1 protein acts\ as a cell attachment protein, and determines viral virulence, pathways of\ spread, and tropism. Junctional adhesion molecule has been identified as a\ receptor for sigma1 PUBMED:11239401. In type 3 reoviruses, a small region, predicted to\ form a beta sheet, in the N-terminal tail was found to bind target cell surface\ sialic acid (i.e. sialic acid acts as a co-receptor) and promote apoptosis PUBMED:11287552.\ The sigma1 protein also binds to the lambda2 core protein PUBMED:9311901.

    \ ' '4117' 'IPR000989' '\ Replication proteins (rep) are involved in plasmid replication. The Rep protein binds to the plasmid \ DNA and nicks it at the double strand origin (dso) of replication. The 3\'-hydroxyl end created is \ extended by the host DNA replicase, and the 5\' end is displaced during synthesis. At the end of one \ replication round, Rep introduces a second single stranded break at the dso and ligates the ssDNA\ extremities generating one double-stranded plasmid and one circular ssDNA form. Complementary strand \ synthesis of the circular ssDNA is usually initiated at the single-stranded origin by the host RNA\ polymerase PUBMED:9570403.\ ' '4118' 'IPR007199' '\

    Replication factor-a protein 1 (RPA1) forms a multiprotein complex with RPA2 and RPA3 that binds single-stranded DNA and functions in the recognition of DNA damage for nucleotide excision repair. The complex binds to single-stranded DNA sequences participating in DNA replication in addition to those mediating transcriptional repression and activation, and stimulates the activity of cognate strand exchange protein Sep1. It cooperates with T-AG and DNA topoisomerase I to unwind template DNA containing the Simian Virus 40 origin of replication PUBMED:7753855.

    \ ' '4119' 'IPR002631' '\ This family consists of various bacterial plasmid replication (Rep) proteins. These proteins are essential for replication of plasmids. The Rep proteins are topoisomerases that nick the positive strand at the plus origin of replication and also at the single-strand conversion sequence PUBMED:2695401.\ ' '4120' 'IPR000525' '\ This entry represents the initiator of plasmid replication proteins, RepA and RepB. RepB possesses nicking-closing (topoisomerase I) like activity. It is also able to perform a strand transfer reaction on ssDNA that contains its target PUBMED:9180376, PUBMED:9618448. RepA is an Escherichia coli protein involved in plasmid replication. The RepA protein binds to DNA repeats that flank the repA gene PUBMED:8320218, PUBMED:3949778. A similar RepA family of proteins with wider distribution are the bacterial plasmid DNA replication initiator proteins (see ).\ ' '4121' 'IPR003491' '\ Plasmid replication is initiated by the replication initiation factor (REP). This family represents a probable topoisomerase that\ makes a sequence-specific single-stranded nick in the plasmid DNA at the origin of replication. Human proteins also belong to\ this family, including myelin transcription factor 2 and cerebrin-50 PUBMED:7735128.\ ' '4122' 'IPR006881' '\ This is a family of plasmid encoded proteins involved in plasmid replication. The role of RepA in the replication process is not clearly understood PUBMED:11914352.\ ' '4123' 'IPR001590' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M12, subfamily M12B (adamalysin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH PUBMED:7674922.

    \ \

    The adamalysins are zinc-dependent endopeptidases found in snake venom. There are some mammalian proteins such as ,\ and fertilin . Fertilin and closely related\ proteins appear to lack some active site residues and\ may not be active enzymes.

    \ \ \

    CD156 (also called ADAM8 () or MS2 human) has been implicated in extravasation of leukocytes. CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).\

    \ ' '4124' 'IPR007816' '\ This family includes both ResB and cytochrome c biogenesis proteins. Mutations in ResB indicate that they are essential for growth PUBMED:10844653. ResB is predicted to be a transmembrane protein PUBMED:10844653.\ ' '4125' 'IPR006935' '\

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    Type I restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. Type I enzymes have three different subunits - M (modification), S (specificity) and R (restriction) - that form multifunctional enzymes with restriction (), methylase () and ATPase activities PUBMED:15121719, PUBMED:12595133. The S subunit is required for both restriction and modification and is responsible for recognition of the DNA sequence specific for the system. The M subunit is necessary for modification, and the R subunit is required for restriction. These enzymes use S-Adenosyl-L-methionine (AdoMet) as the methyl group donor in the methylation reaction, and have a requirement for ATP. They recognise asymmetric DNA sequences split into two domains of specific sequence, one 3-4 bp long and another 4-5 bp long, separated by a nonspecific spacer 6-8 bp in length. Cleavage occurs a considerable distance from the recognition sites, rarely less than 400 bp away and up to 7000 bp away. Adenosyl residues are methylated, one on each strand of the recognition sequence. These enzymes are widespread in eubacteria and archaea. In enteric bacteria they have been subdivided into four families: types IA, IB, IC and ID.
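The bipartite layout of a type I recognition sequence (a 3-4 bp half, a 6-8 bp nonspecific spacer, a 4-5 bp half) can be expressed as a simple pattern search. This is a minimal sketch, not a description of how any enzyme or database performs recognition. The function names are hypothetical, only the given strand is scanned, and the example site AAC(N6)GTGC is the commonly cited EcoKI sequence, used here as an assumption that should be checked against a curated resource such as REBASE.

```python
import re


def type_i_site_pattern(half_a, half_b, spacer_len):
    """Build a regex for a bipartite site: half_a, spacer of any bases, half_b."""
    if not 3 <= len(half_a) <= 4:
        raise ValueError("first specific half should be 3-4 bp")
    if not 4 <= len(half_b) <= 5:
        raise ValueError("second specific half should be 4-5 bp")
    if not 6 <= spacer_len <= 8:
        raise ValueError("nonspecific spacer should be 6-8 bp")
    return re.compile(half_a + "[ACGT]{" + str(spacer_len) + "}" + half_b)


def find_sites(dna, pattern):
    """Return zero-based start positions of non-overlapping matches on the given strand."""
    return [m.start() for m in pattern.finditer(dna.upper())]


if __name__ == "__main__":
    # Assumed example site: AAC, six unspecified bases, GTGC.
    pattern = type_i_site_pattern("AAC", "GTGC", 6)
    print(find_sites("TTAACGATCGAGTGCTT", pattern))
```

Because the recognition sequences are asymmetric, a fuller search would also scan the reverse complement; that step is left out to keep the sketch short.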

    \

    Type III restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. Type III enzymes are hetero-oligomeric, multifunctional proteins composed of two subunits, Res and Mod. The Mod subunit recognises the DNA sequence specific for the system and is a modification methyltransferase; as such it is functionally equivalent to the M and S subunits of type I restriction endonuclease. Res is required for restriction, although it has no enzymatic activity on its own. Type III enzymes recognise short 5-6 bp long asymmetric DNA sequences and cleave 25-27 bp downstream to leave short, single-stranded 5\' protrusions. They require the presence of two inversely oriented unmethylated recognition sites for restriction to occur. These enzymes methylate only one strand of the DNA, at the N-6 position of adenosyl residues, so newly replicated DNA will have only one strand methylated, which is sufficient to protect against restriction. Type III enzymes belong to the beta-subfamily of N6 adenine methyltransferases, containing the nine motifs that characterise this family, including motif I, the AdoMet binding pocket (FXGXG), and motif IV, the catalytic region (S/D/N (PP) Y/F) PUBMED:15121719, PUBMED:12595133.

    \

    This entry represents the R subunit (HsdR) of type I restriction endonucleases (), the Res subunit of type III endonucleases (), and the B subunit of excinuclease ABC (uvrB) PUBMED:11178902, PUBMED:9628345, PUBMED:16125908.

    \ ' '4126' 'IPR006119' '\

    Site-specific recombination plays an important role in DNA rearrangement in prokaryotic organisms. Two types of site-specific recombination are known to occur:

    \
      \
    1. Recombination between inverted repeats resulting in the reversal of a DNA segment.
    2. Recombination between repeat sequences on two DNA molecules resulting in their cointegration, or between repeats on one DNA molecule resulting in the excision of a DNA fragment.
    \

    Site-specific recombination is characterised by a strand exchange mechanism that requires no DNA synthesis or high energy cofactor; the phosphodiester bond energy is conserved in a phospho-protein linkage during strand cleavage and re-ligation.

    \

    Two unrelated families of recombinases are currently known PUBMED:3011407. The first, called the \'phage integrase\' family, groups a number of bacterial phage and yeast plasmid enzymes. The second PUBMED:2896291, called the \'resolvase\' family, groups enzymes which share the following structural characteristics: an N-terminal catalytic and dimerization domain that contains a conserved serine residue involved in the transient covalent attachment to DNA, and a C-terminal helix-turn-helix DNA-binding domain .

    \ ' '4127' 'IPR001789' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems are composed of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Bipartite response regulator proteins are involved in a two-component signal transduction system in bacteria, and certain eukaryotes like protozoa, that functions to detect and respond to environmental changes PUBMED:7699720. These systems have been detected during host invasion, drug resistance, motility, phosphate uptake, osmoregulation, and nitrogen fixation, amongst others PUBMED:12015152. The two-component system consists of a histidine protein kinase environmental sensor that phosphorylates the receiver domain of a response regulator protein; phosphorylation induces a conformational change in the response regulator, which activates the effector domain, triggering the cellular response PUBMED:10966457. The domains of the two-component proteins are highly modular, but the core structures and activities are maintained.

    \

    The response regulators act as phosphorylation-activated switches to affect a cellular response, usually by transcriptional regulation. Most of these proteins consist of two domains, an N-terminal response regulator receiver domain, and a variable C-terminal effector domain with DNA-binding activity. This entry represents the response regulator receiver domain, which belongs to the CheY family, and receives the signal from the sensor partner in the two-component system.

    \ ' '4128' 'IPR004028' '\

    The Gag polyprotein directs the assembly and release of virus particles from infected cells. The Gag polyprotein has three domains required for activity: an N-terminal membrane-binding (M) domain that directs Gag to the plasma membrane, an interaction (I) domain involved in Gag aggregation, and a late assembly (L) domain that mediates the budding process PUBMED:10590103. During viral maturation, the Gag polyprotein is cleaved into major structural proteins by the viral protease, yielding the matrix, capsid, nucleoprotein, and some smaller peptides. In Rous sarcoma virus (RSV), the M domain consists of the first 85 residues of the matrix protein. However, unlike other Gag polyproteins, the M domain of RSV Gag is not myristylated, but retains full activity PUBMED:11070020. This domain forms an alpha helical bundle structure PUBMED:9642071.

    \

    This entry represents the M domain of the Gag polyprotein found in avian retroviruses. This entry also identifies Gag polyproteins from several avian endogenous retroviruses, which arise when one or more copies of the retroviral genome becomes integrated into the host genome PUBMED:14680291.

    \ \ ' '4129' 'IPR000625' '\

    REV is a viral anti-repression trans-activator protein, which appears to act post-transcriptionally PUBMED:2550674 to relieve negative repression of GAG and ENV production. It is a phosphoprotein PUBMED:2741343, PUBMED:2846891 whose state of phosphorylation is mediated by a specific serine kinase activity present in the nucleus PUBMED:2741343. REV accumulates in the nucleoli PUBMED:2656990.

    \ \ ' '4130' 'IPR000352' '\ Peptide chain release factors (RFs) are required for the termination of\ protein biosynthesis PUBMED:8821264. At present two classes of RFs can be distinguished.\ Class I RFs bind to ribosomes that have encountered a stop codon at their\ decoding site and induce release of the nascent polypeptide. Class II RFs are\ GTP-binding proteins that interact with class I RFs and enhance class I RF\ activity.\ In prokaryotes there are two class I RFs that act in a codon-specific manner\ PUBMED:2215213: RF-1 (gene prfA) mediates UAA and UAG-dependent termination while RF-2\ (gene prfB) mediates UAA and UGA-dependent termination. RF-1 and RF-2 are\ structurally and evolutionarily related proteins which have been shown PUBMED:1408743 to\ be part of a larger family.\ ' '4131' 'IPR007594' '\

    Asymmetric lipid distribution is a fundamental characteristic of biological lipid bilayers. One such example is the translocation of the Man5GlcNAc2-PP-Dol intermediate from the cytosolic side of the ER membrane to the lumen before the completion of the biosynthesis of Glc3Man9GlcNAc2-PP-Dol PUBMED:11807558. RFT1 encodes an evolutionarily conserved protein required for this translocation.

    \ ' '4132' 'IPR007668' '\ The RFX family is a family of winged-helix DNA-binding proteins. RFX1 is a regulatory factor essential for expression of MHC class II genes. This region is found N-terminal to the RFX DNA-binding region () in some mammalian RFX proteins, and is thought to activate transcription when associated with DNA. Deletion analysis has identified the region 233-351 in human RFX1 () as being required for maximal activation PUBMED:9278482.\ ' '4133' 'IPR003150' '\ RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer PUBMED:2253877.\ ' '4134' 'IPR004901' '\

    Alpha-1,4-glucan-protein synthase catalyses the reaction: .\ The enzyme has a possible role in the synthesis of cell wall polysaccharides in plants PUBMED:13677461. It is found associated with the cell wall, with the highest concentrations in the plasmodesmata. It is also located in the Golgi apparatus.

    \ ' '4135' 'IPR007739' '\ This family consists of a group of proteins which are related to the Streptococcal rhamnose-glucose polysaccharide assembly protein (RgpF). Rhamnan backbones are found in several O-polysaccharides found in phytopathogenic bacteria and are regarded as pathogenic factors PUBMED:12010977.\ ' '4136' 'IPR000342' '\

    RGS (Regulator of G Protein Signalling) proteins are multi-functional, GTPase-accelerating proteins that promote GTP hydrolysis by the alpha subunit of heterotrimeric G proteins, thereby inactivating the G protein and rapidly switching off G protein-coupled receptor signalling pathways PUBMED:10836135. Upon activation by GPCRs, heterotrimeric G proteins exchange GDP for GTP, are released from the receptor, and dissociate into free, active GTP-bound alpha subunit and beta-gamma dimer, both of which activate downstream effectors. The response is terminated upon GTP hydrolysis by the alpha subunit (), which can then bind the beta-gamma dimer (, ) and the receptor. RGS proteins markedly reduce the lifespan of GTP-bound alpha subunits by stabilising the G protein transition state.

    \

    All RGS proteins contain an \'RGS-box\' (or RGS domain), which is required for activity. Some small RGS proteins such as RGS1 and RGS4 are comprised of little more than an RGS domain, while others also contain additional domains that confer further functionality PUBMED:10987813. RGS domains can be found in conjunction with a variety of domains, including: DEP for membrane targeting (), PDZ for binding to GPCRs (), PTB for phosphotyrosine-binding (), RBD for Ras-binding (), GoLoco for guanine nucleotide inhibitor activity (), PX for phosphatidylinositol-binding (), PXA that is associated with PX (), PH for stimulating guanine nucleotide exchange (), and GGL (G protein gamma subunit-like) for binding G protein beta subunits () PUBMED:15090201. Those RGS proteins that contain GGL domains can interact with G protein beta subunits to form novel dimers that prevent G protein gamma subunit binding and G protein alpha subunit association, thereby preventing heterotrimer formation.

    \ ' '4137' 'IPR001903' '\ Different families of ssRNA negative-strand viruses contain glycoproteins responsible for forming spikes on the surface of the virion. The glycoprotein spike is made up of a trimer of glycoproteins. These proteins are frequently abbreviated to G protein. The channel formed by the glycoprotein spike is thought to function in a similar manner to the Influenza virus M2 protein channel, thus allowing a signal to pass across the viral membrane to signal for viral uncoating PUBMED:1660200, PUBMED:9000093.\ ' '4138' 'IPR005010' '\

    This is a family of phosphoproteins of unknown function expressed by Rhabdovirus.

    \ \ ' '4139' 'IPR006870' '\ M protein is involved in condensing and targeting the ribonucleoprotein (RNP) coil to the plasma membrane. M interacts specifically with the transmembrane spike protein (G) and it is important for the incorporation of G protein into budding virions PUBMED:9847327.\ ' '4140' 'IPR005060' '\

    The matrix (M) protein of Rabies virus (RV) plays a key role in both assembly and budding of progeny virions. A PPPY motif (PY motif or late-budding domain) is conserved in the M proteins. These PY motifs are important for virus budding and for mediating interactions with specific cellular proteins containing WW domains.

    \ ' '4141' 'IPR000448' '\ The Nucleocapsid (N) Protein is said to have a \'tight\' structure.\ The carboxyl end of the N-terminal domain possesses an RNA binding domain.\ Sequence alignments show 2 regions of reasonable conservation, \ approx. 64-103 and 201-329 PUBMED:9603315. A whole functional protein is required \ for encapsidation to take place PUBMED:9501055.\ ' '4142' 'IPR004902' '\ This is a family of Rhabdovirus nucleocapsid proteins. These proteins undergo phosphorylation.\ ' '4143' 'IPR003490' '\ Infectious hematopoietic necrosis virus (IHNV) is a member of the family Rhabdoviridae. The non-virion protein (NV) is coded\ for by one of the six genes of the IHNV genome PUBMED:8578857, but is absent in vesiculovirus-like rhabdovirus PUBMED:9010293.\ ' '4144' 'IPR011539' '\

    The Rel homology domain (RHD) is found in a family of eukaryotic transcription factors, which includes NF-kappaB, Dorsal, Relish, NFAT, among others. Some of these transcription factors appear to form multi-protein DNA-bound complexes PUBMED:9794820. Phosphorylation of the RHD appears to play a role in the regulation of some of these transcription factors, acting to modulate the expression of their target genes PUBMED:15516339. The RHD is composed of two immunoglobulin-like beta-barrel subdomains that grip the DNA in the major groove. The N-terminal specificity domain resembles the core domain of the p53 transcription factor, and contains a recognition loop that interacts with DNA bases; the C-terminal dimerisation domain contains the site for interaction with I-kappaB PUBMED:7830764.

    \ ' '4145' 'IPR000406' '\ The GDP dissociation inhibitor for rho proteins, rho GDI, regulates GDP/GTP\ exchange. The protein contains 204 amino acids, with a calculated Mr value\ of 23,421. Hydropathy analysis shows it to be largely hydrophilic, with a\ single hydrophobic region. Results of database searches suggest rho GDI is\ a novel protein, currently with no known homologue. \ The protein plays an important role in the activation of the superoxide\ (O2-)-generating NADPH oxidase of phagocytes. This process requires the\ interaction of membrane-associated cytochrome b559 with 3 cytosolic\ components: p47-phox, p67-phox and a heterodimer of the small G-protein\ p21rac1 and rho GDI PUBMED:8223583. The association of p21rac and GDI inhibits\ dissociation of GDP from p21rac, thereby maintaining it in an inactive form.\ The proteins are attached via a lipid tail on p21rac that binds to the\ hydrophobic region of GDI PUBMED:8796870. Dissociation of these proteins might be\ mediated by the release of lipids (e.g., arachidonate and phosphatidate)\ from membranes through the action of phospholipases PUBMED:8796870. The lipids may then\ compete with the lipid tail on p21rac for the hydrophobic pocket on GDI.\ ' '4146' 'IPR001763' '\

    Rhodanese, a sulphurtransferase involved in cyanide detoxification (see ), shares an evolutionary relationship with a large family of proteins PUBMED:9733650, including\

    \

    Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases PUBMED:8702871.

    \ ' '4147' 'IPR000198' '\ Members of the Rho family of small G proteins transduce signals from plasma-membrane\ receptors and control cell adhesion, motility and shape by actin cytoskeleton formation.\ Like all other GTPases, Rho proteins act as molecular switches, with an active\ GTP-bound form and an inactive GDP-bound form. The active conformation is promoted by\ guanine-nucleotide exchange factors, and the inactive state by GTPase-activating proteins\ (GAPs) which stimulate the intrinsic GTPase activity of small G proteins.\ This entry is a Rho/Rac/Cdc42-like GAP domain that is found in a wide variety of large,\ multi-functional proteins PUBMED:9009196.\ A number of structures are known for this family\ PUBMED:9009196, PUBMED:8962058, PUBMED:9262406.\ The domain is composed of seven alpha helices.\ This domain is also known as the breakpoint cluster region-homology (BH) domain.\ ' '4148' 'IPR000219' '\

    The Rho family GTPases Rho, Rac and CDC42 regulate a diverse array of cellular processes. Like all members of the Ras superfamily, the Rho proteins cycle between active GTP-bound and inactive GDP-bound conformational states.\ Activation of Rho proteins through release of bound GDP and subsequent\ binding of GTP, is catalysed by guanine nucleotide exchange factors (GEFs) in\ the Dbl family. The proteins encoded by members of the Dbl family share a\ common domain, presented in this entry, of about 200 residues (designated the Dbl homology or DH domain) that has been shown to encode a GEF activity specific for a number of Rho family members. In addition, all family members possess a second, shared domain designated the pleckstrin homology (PH) domain (). Trio and its homologue UNC-73 are unique within the Dbl family insomuch as they encode two distinct DH/PH domain modules. The PH domain is invariably located immediately C-terminal to the DH domain and this invariant topography suggests a functional interdependence between these two structural modules. Biochemical data have established the role of the conserved DH domain in Rho GTPase interaction and activation, and the role of the tandem PH domain in intracellular targeting and/or regulation of DH domain function. The DH domain of Dbl has been shown to mediate oligomerisation that is mostly homophilic in nature. In addition to the tandem DH/PH domains Dbl family GEFs contain diverse structural motifs like serine/threonine kinase, RBD, PDZ, RGS, IQ, REM, Cdc25, RasGEF, CH, SH2, SH3, EF, spectrin or Ig.

    \ \

    The DH domain is composed of three structurally conserved regions separated by\ more variable regions. It does not share significant sequence homology with\ other subtypes of small G-protein GEF motifs such as the Cdc25 domain and the\ Sec7 domain, which specifically interact with Ras and ARF family small GTPases, respectively, nor with other Rho protein interactive motifs, indicating that the Dbl family proteins are evolutionarily unique. The DH domain is composed of 11 alpha helices that are folded into a flattened, elongated alpha-helix bundle in which two of the three conserved regions, conserved region 1 (CR1) and conserved region 3 (CR3), are exposed near the centre of one surface. CR1 and CR3, together with a part of alpha-6 and the DH/PH junction site, constitute the Rho GTPase interacting pocket.

    \ ' '4149' 'IPR001826' '\

    RHS elements are proteins of non-essential function believed to play an important role in the natural ecology of the cell. The protein sequences comprise a highly conserved 141 kDa domain containing multiple tandem 22-residue repeats, followed by divergent C-terminal domains PUBMED:2403547, PUBMED:7934896. The 22-residue repeats contain a YD dipeptide, which is the most strongly conserved motif of the repeat.

    \ \ ' '4150' 'IPR001676' '\

    This domain occurs in the capsid proteins of picornaviruses, which are non-enveloped plus-strand ssRNA animal viruses with icosahedral capsids. They include rhinovirus (common cold) and poliovirus.

    \

    The atomic structure of echovirus 1 (a member of the enterovirus genus of the picornavirus family) has been determined using cryo-crystallography and refined to 3.55 A resolution PUBMED:10089503. The common structure is an 8-stranded beta sandwich which can have one or two extra strands.

    \ ' '4151' 'IPR003193' '\

    CD38, the HUGO gene name, is also called T10 or ADP-ribosyl cyclase/cyclic ADP-ribose hydrolase (). CD38 is a novel enzyme capable of catalysing multiple reactions, including NAD glycohydrolase, ADP-ribosyl cyclase, cyclic ADP ribose hydrolase and base-exchange activities. Two of the enzymatic products, cyclic ADP-ribose (cADPR) and nicotinic acid adenine dinucleotide phosphate (NAADP), are calcium messengers in a wide variety of cells from protist, plant, and mammal to human. CD38 is a positive and negative regulator of cell activation and proliferation, depending on the cellular environment. It is involved in adhesion between human lymphocytes and endothelial cells and is involved in the metabolism of two calcium messengers, cADPR and NAADP.

    \ \

    CD157 (also called BP-3/IF-7, BST-1 or Mo5) has ADP-ribosyl cyclase and cyclic ADP-ribose hydrolase activities. CD157 supports the growth of a pre-B cell line, DW34. Anti-CD157 mAb IF-7 has synergistic effects on anti-CD3-induced growth of T progenitor cells, and facilitates the development of [alpha][beta] TCR+ cells in foetal thymic organ culture system.

    \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).

    \ ' '4152' 'IPR002734' '\ This domain is\ found in the C-terminus of the bifunctional deaminase-reductase of Escherichia coli, Bacillus subtilis and other bacteria in combination with that catalyses the second and third steps in the biosynthesis of riboflavin, i.e., the deamination of 2,5-diamino-6-ribosylamino-4(3H)-pyrimidinone 5\'-phosphate (deaminase) and the subsequent reduction of the ribosyl side chain (reductase) PUBMED:9068650. The domain is also present in some HTP reductases from archaea and fungi.\ ' '4153' 'IPR007756' '\ This domain is about 85 residues in length and very rich in charged residues, hence the name RICH (Rich In CHarged residues). It is found in secreted proteins such as PspC , SpsA and IgA FC receptor from Streptococcus agalactiae. This domain could be involved in bacterial adherence or cell wall binding.\ ' '4154' 'IPR006175' '\

    This domain is found in an endoribonuclease that is active on single-stranded mRNA and inhibits protein synthesis by cleavage of mRNA PUBMED:10368157. Previously it was thought to inhibit protein synthesis initiation PUBMED:8530410. This endoribonuclease may also be involved in the regulation of purine biosynthesis PUBMED:10400702.

    \ ' '4155' 'IPR013509' '\

    Ribonucleotide reductase () PUBMED:3286319, PUBMED:8511586 catalyzes the reductive\ synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides\ the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their\ metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl\ radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12\ (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster\ and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in\ their genomes.

    \

    Ribonucleotide reductase is an oligomeric\ enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to\ 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small\ chain PUBMED:11875520.

    The reduction of ribonucleotides to deoxyribonucleotides involves the transfer of free radicals;\ the function of\ each metallocofactor is to generate an active site thiyl radical. This thiyl radical then initiates the nucleotide reduction\ process by hydrogen atom abstraction from the ribonucleotide PUBMED:9309223. The radical-based reaction involves five\ cysteines: two of these are located at adjacent anti-parallel strands in a\ new type of ten-stranded alpha/beta-barrel; two others reside at the\ carboxyl end in a flexible arm; and the fifth, in a loop in the centre of\ the barrel, is positioned to initiate the radical reaction PUBMED:8052308. There are several regions of similarity in the sequence of the large \ chain of prokaryotes, eukaryotes and viruses spread across 3 domains:\ an N-terminal domain common to the mammalian and bacterial enzymes; a\ C-terminal domain common to the mammalian and viral ribonucleotide \ reductases; and a central domain common to all three PUBMED:9309223.

    \ ' '4156' 'IPR000788' '\

    Ribonucleotide reductase () PUBMED:3286319, PUBMED:8511586 catalyzes the reductive\ synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides\ the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their\ metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl\ radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12\ (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster\ and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in\ their genomes.

    \

    Ribonucleotide reductase is an oligomeric\ enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to\ 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small\ chain PUBMED:11875520.

    The reduction of ribonucleotides to deoxyribonucleotides involves the transfer of free radicals;\ the function of\ each metallocofactor is to generate an active site thiyl radical. This thiyl radical then initiates the nucleotide reduction\ process by hydrogen atom abstraction from the ribonucleotide PUBMED:9309223. The radical-based reaction involves five\ cysteines: two of these are located at adjacent anti-parallel strands in a\ new type of ten-stranded alpha/beta-barrel; two others reside at the\ carboxyl end in a flexible arm; and the fifth, in a loop in the centre of\ the barrel, is positioned to initiate the radical reaction PUBMED:8052308. There are several regions of similarity in the sequence of the large \ chain of prokaryotes, eukaryotes and viruses spread across 3 domains:\ an N-terminal domain common to the mammalian and bacterial enzymes; a\ C-terminal domain common to the mammalian and viral ribonucleotide \ reductases; and a central domain common to all three PUBMED:9309223.

    \ ' '4157' 'IPR000358' '\

    Ribonucleotide reductase () PUBMED:3286319, PUBMED:8511586 catalyzes the reductive synthesis\ of deoxyribonucleotides from their corresponding ribonucleotides:\ \ It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a\ diiron-tyrosyl radical. Class II RNRs, found in bacteria,\ bacteriophage, algae and archaea, use coenzyme B12\ (adenosylcobalamin, AdoCbl). Class III RNRs, found in\ anaerobic bacteria and bacteriophage, use an FeS cluster and\ S-adenosylmethionine to generate a glycyl radical. Many\ organisms have more than one class of RNR present in their\ genomes.

    \

    Ribonucleotide reductase is an\ oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a\ small subunit (300 to 400 residues) - class II RNRs are less complex,\ using the small molecule B12 in place of the small chain PUBMED:11875520.\ The small chain binds two iron atoms PUBMED:2190093 (three Glu, one Asp, and two His are\ involved in metal binding) and contains an active site tyrosine radical. The\ regions of the sequence that contain the metal-binding residues and the active\ site tyrosine are conserved in ribonucleotide reductase small chain from\ prokaryotes, eukaryotes and viruses.\ We have selected one of these regions as a signature pattern. It contains the\ active site residue as well as a glutamate and a histidine involved in the\ binding of iron.

    \ ' '4158' 'IPR000026' '\

    Ribonuclease N1 (RNase N1) is a guanine-specific ribonuclease from fungi. RNase T1 and other bacterial RNases are related.

    \ \

    The enzyme hydrolyses the phosphodiester bonds in RNA and oligoribonucleotides PUBMED:8110767, resulting in 3\'-nucleoside monophosphates via 2\',3\'-cyclophosphate intermediates.

    \ ' '4159' 'IPR004664' '\

    Members of this entry include ribonuclease BN (rbn) from Escherichia coli and homologues from a number of bacteria, including the largely uncharacterised BrkB (Bordetella spp. resist killing by serum B) from Bordetella pertussis. Some members have an additional C-terminal domain. Paralogs from E. coli (yhjD) and Mycobacterium tuberculosis (Rv3335c) are part of a smaller, related subfamily that forms its own cluster. Ribonuclease BN is a homodimer in E. coli and does not contain a nucleic acid component. Bacteriophage T4 encodes several tRNAs that require this host ribonuclease for maturation. However, host tRNAs with the normal universal 3\' sequence of CCA do not appear to be substrates. The substrate specificity of RNase BN appears to be very narrow and its biological role is uncertain. It is one of five ribonucleases in E. coli for which any of the five can confer viability, with the order of efficacy being RNase T > RNase PH > RNase D > RNase II > RNase BN.

    \ ' '4160' 'IPR000100' '\ Ribonuclease P () (RNase P) PUBMED:1689306, PUBMED:1700778, PUBMED:1374553 is a site specific endonuclease that generates mature tRNAs by catalysing the removal of the 5\'-leader sequence from pre-tRNA to produce the mature 5\'-terminus. It can also cleave other RNA substrates such as 4.5S RNA. In bacteria RNase P is known to be composed of two components: a large RNA (about 400 base pairs) encoded by rnpB, and a small protein (119 to 133 amino acids) encoded by rnpA. The RNA moiety of RNase P carries the catalytic activity; the protein component plays an auxiliary, but essential, role in vivo by binding to the 5\'-leader sequence and broadening the substrate specificity of the ribozyme.\ The sequence of rnpA is not highly conserved, however there is, in the central\ part of the protein, a conserved basic region.\ ' '4161' 'IPR001568' '\

    The fungal ribonucleases T2 from Aspergillus oryzae, M from Aspergillus saitoi and Rh from Rhizopus niveus are structurally and functionally related 30 kDa glycoproteins PUBMED:2229029 that cleave the 3\'-5\' internucleotide linkage of RNA via a nucleotide 2\',3\'-cyclic phosphate intermediate ().

    \

    Two histidine residues have been shown PUBMED:2298207, PUBMED:1633875 to be involved in the catalytic mechanism of RNase T2 and Rh. These residues and the region around them are highly conserved in a number of other RNases that have been found to be evolutionarily related to these fungal enzymes.

    \ ' '4162' 'IPR002143' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain that interrupts the first domain. In Escherichia coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:8635468, PUBMED:8607874, groups:

    \ \ ' '4163' 'IPR001790' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    On the basis of sequence similarities the following prokaryotic and eukaryotic ribosomal proteins can be grouped:\

    \ ' '4165' 'IPR000911' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:2167467, PUBMED:, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown PUBMED:2483975 to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.

    \ ' '4166' 'IPR000911' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:2167467, PUBMED:, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown PUBMED:2483975 to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.

    \ ' '4167' 'IPR013823' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry represents the C-terminal domain of the large subunit ribosomal proteins known as the L7/L12 family. L7/L12 is present in each 50S subunit in four copies organised as two dimers. The L8 protein complex, consisting of two dimers of L7/L12 and L10 in Escherichia coli ribosomes, is assembled on the conserved region of 23S rRNA termed the GTPase-associated domain PUBMED:10488095. The L7/L12 dimer probably interacts with EF-Tu. L7 and L12 differ only in a single post-translational modification: the addition of an acetyl group to the N terminus of L7.

    \ ' '4168' 'IPR005822' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L13 is one of the proteins from the large ribosomal subunit PUBMED:8119894. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit.

    \ ' '4169' 'IPR001380' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The ribosomal protein L13e is widely found in vertebrates PUBMED:8198561, Drosophila melanogaster, plants and yeast, amongst others.

    \ ' '4170' 'IPR000218' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins, which have been grouped on the basis of sequence similarities PUBMED:. Based on amino-acid sequence homology, it is predicted that ribosomal protein L14 is a member of a recently identified family of structurally related RNA-binding proteins PUBMED:15299380. L14 is a protein of 119 to 137 amino-acid residues.

    \ ' '4171' 'IPR002784' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry includes the eukaryotic ribosomal protein L14, which binds to the 60S ribosomal subunit, and archaebacterial ribosomal protein L14E, which binds to the 50S ribosomal subunit.

    \ ' '4173' 'IPR000439' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities PUBMED:7733938. One of these families consists of:

    \ \
  • Mammalian L15.
  • Insect L15.
  • Plant L15.
  • Yeast YL10 (L13) (Rp15r).
  • Archaebacterial L15e.

    \ \

    These proteins have about 200 amino acid residues.

    \ ' '4174' 'IPR016180' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry represents a structural domain with an alpha/beta-hammerhead fold, where the beta-hammerhead motif is similar to that in barrel-sandwich hybrids. Domains of this structure can be found in ribosomal proteins L10e and L16.

    \ ' '4175' 'IPR000456' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L17 is one of the proteins from the large ribosomal subunit. Bacterial L17 is a protein of 120 to 130 amino-acid residues, while yeast YmL8 is twice as large (238 residues). The N-terminal half of YmL8 is colinear with the sequence of L17 from Escherichia coli.

    \ ' '4176' 'IPR005484' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family includes L18 from bacteria and L5 from eukaryotes. The ribosomal 5S RNA is the only known rRNA species to bind a ribosomal protein before its assembly into the ribosomal subunits PUBMED:8474444. In eukaryotes, the 5S rRNA molecule binds one protein species, a 34-kDa protein which has been implicated in the intracellular transport of 5S rRNA, while in bacteria it binds two or three different protein species PUBMED:8219074.

    \ ' '4177' 'IPR001857' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L19 is one of the proteins from the large ribosomal subunit PUBMED:8262035, PUBMED:1985969. In Escherichia coli, L19 is known to be located at the 30S-50S ribosomal subunit interface PUBMED:339951 and may play a role in the structure and function of the aminoacyl-tRNA binding site. It belongs to a family of ribosomal proteins, including L19 from bacteria and the chloroplasts of red algae.

    \

    L19 is a protein of 120 to 130 amino-acid residues.

    \ ' '4178' 'IPR000196' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry represents the ribosomal protein L19 from eukaryotes, as well as L19e from archaea PUBMED:10381320. L19/L19e is absent in bacteria. L19/L19e is part of the large ribosomal subunit, whose structure has been determined in a number of eukaryotic and archaeal species PUBMED:15184028. L19/L19e is a multi-helical protein consisting of two different 3-helical domains connected by a long, partly helical linker.

    \ ' '4179' 'IPR002171' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L2 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L2 is known to bind to the 23S rRNA and to have peptidyltransferase activity. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:1579444, PUBMED:, groups:

    \ \ ' '4180' 'IPR005813' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    L20 is a protein from the large (50S) subunit; in Escherichia coli it is known to\ bind directly to the 23S rRNA, and is required for ribosome assembly, but\ does not take part in protein synthesis. It belongs to a family of ribosomal\ proteins, including L20 from eubacteria, plant and algal chloroplasts and\ cyanelles PUBMED:.

    \ ' '4181' 'IPR001147' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The L21E family contains proteins from a number of eukaryotic\ and archaebacterial organisms, including mammalian L2, Entamoeba histolytica L21,\ Caenorhabditis elegans L21 (C14B9.7), Saccharomyces cerevisiae (Baker\'s yeast) L21E (URP1) and Haloarcula marismortui HL31.

    \ ' '4182' 'IPR001787' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L21 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L21 is known to bind to the 23S rRNA in the presence of L20. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:\

    \

    Bacterial L21 is a protein of about 100 amino-acid residues; the mature form of the spinach chloroplast L21 has 200 residues.

    \ ' '4183' 'IPR001063' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L22 is one of the proteins from the large ribosomal subunit.\ In Escherichia coli, L22 is known to bind 23S rRNA. It belongs to a family of\ ribosomal proteins which includes: bacterial L22; algal and plant chloroplast L22\ (in legumes L22 is encoded in the nucleus instead of the chloroplast); cyanelle L22;\ archaebacterial L22; mammalian L17; plant L17 and yeast YL17.

    \ ' '4184' 'IPR002671' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L22e forms part of the 60S ribosomal subunit PUBMED:1840484. This family is found in eukaryotes. Rattus norvegicus (Rat) L22 is related to ribosomal proteins from other eukaryotes and is identical in amino acid sequence to human EAP, the EBER 1 (Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) encoded RNA) associated protein PUBMED:7999786.

    \ ' '4185' 'IPR013025' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This domain is found in both eukaryotic L25 and prokaryotic and eukaryotic L23 proteins.

    \ ' '4186' 'IPR005633' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The N-terminal domain appears to be specific to the eukaryotic ribosomal proteins L25, L23, and L23a.

    \ ' '4187' 'IPR000988' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence \ similarities. One of these families PUBMED:8048931 consists of mammalian ribosomal protein L24; yeast\ ribosomal protein L30A/B (Rp29) (YL21); Kluyveromyces lactis ribosomal protein L30; Arabidopsis thaliana \ ribosomal protein L24 homolog; Haloarcula marismortui ribosomal protein HL21/HL22; and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ1201. These proteins have 60 to 160 amino-acid residues.

    \ ' '4188' 'IPR020055' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The bacterial ribosomal protein L25 is an RNA binding protein. Ribosomal protein L25\ shows homology to general stress proteins and glutaminyl-tRNA synthetases PUBMED:9799245.

    \ ' '4189' 'IPR001684' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    L27 is a protein from the large (50S) subunit; it is essential for ribosome function, but its exact role is unclear. It belongs to a family of ribosomal proteins, examples of which are found in bacteria, chloroplasts of plants and red algae and the mitochondria of fungi (e.g. MRP7 from yeast mitochondria). The schematic relationship between these groups of proteins is shown below.\

    \
    Bacterial L27           Nxxxxxxxxx\
    Algal L27               Nxxxxxxxxx\
    Plant L27          tttttNxxxxxxxxxxxxx\
    Yeast MRP7           tttNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
    \
    \'t\': transit peptide.\
    \'N\': N-terminal of mature protein.\
    

    \ ' '4190' 'IPR001141' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L27 is found in fungi, plants, algae and vertebrates\ PUBMED:8148381, PUBMED:8058833.\ The family has a specific signature at the C terminus.

    \ ' '4191' 'IPR001383' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The ribosomal L28 protein family includes proteins from bacteria\ and chloroplasts. The L24 protein from yeast, found in the large subunit of the mitochondrial ribosome, contains a region similar to the bacterial L28 protein.

    \ ' '4192' 'IPR002672' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L28e forms part of the 60S ribosomal subunit PUBMED:1840484. This family is found in eukaryotes. In rat there are 9 or 10 copies of the L28 gene. The L28 protein contains a possible internal duplication of 9 residues PUBMED:2207170.

    \ ' '4193' 'IPR001854' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L29 is one of the proteins from the large ribosomal subunit. L29 belongs to a family of ribosomal proteins of 63 to 138 amino-acid residues which, on the basis of sequence similarities PUBMED:, groups:\

    \ ' '4194' 'IPR002673' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L29e forms part of the 60S ribosomal subunit PUBMED:1840484. This family is found in eukaryotes. There are 20 to 22 copies of the L29 gene in Rattus norvegicus (Rat). Rat L29 is related to yeast ribosomal protein YL43 PUBMED:8484767.

    \ ' '4195' 'IPR002171' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L2 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L2 is known to bind to the 23S rRNA and to have peptidyltransferase activity. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:1579444, PUBMED:, groups:

    \ \ ' '4196' 'IPR000597' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L3 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L3 is known to\ bind to the 23S rRNA and may participate in the formation of the peptidyltransferase centre of the ribosome. It\ belongs to a family of ribosomal proteins which, on the basis of sequence similarities, includes bacterial, red algal, cyanelle, \ mammalian, yeast and Arabidopsis thaliana L3 proteins; archaeal Haloarcula marismortui\ HmaL3 (HL1), and yeast mitochondrial YmL9 PUBMED:1597181, PUBMED:1499563, PUBMED:2406244, PUBMED:.

    \ ' '4197' 'IPR000517' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L30 is one of the proteins from the large ribosomal subunit. L30 belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:1549461, groups bacterial and archaeal L30, yeast mitochondrial L33, and Drosophila melanogaster, Dictyostelium discoideum (Slime mold), fungal and mammalian L7 ribosomal proteins. Bacterial L30 proteins are small, about 60 residues long; archaeal L30 proteins have about 150 residues; and eukaryotic L7 proteins have about 250 to 270 residues.

    \

    This entry represents the core domain of prokaryotic L30 and eukaryotic L7.

    \ ' '4198' 'IPR002150' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L31 is one of the proteins from the large ribosomal subunit. L31 is a protein of 66 to 97 amino-acid residues which has so far been found only in bacteria and in some plant and algal chloroplasts.

    \ ' '4199' 'IPR000054' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial large subunit ribosomal\ proteins can be grouped on the basis of sequence similarities.\ These proteins have 87 to 128 amino-acid residues. This family consists of:\

  • Yeast L34
  • Archaeal L31 PUBMED:2207169
  • Plant L31
  • Mammalian L31 PUBMED:3816785

    \ ' '4200' 'IPR001515' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The L32e family consists of proteins that have 135 to 240 amino-acid residues.

    \ ' '4201' 'IPR002677' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L32p is part of the 50S ribosomal subunit. This family is found in both prokaryotes and eukaryotes. Ribosomal protein L32 of yeast binds to and regulates the splicing and the translation of the transcript of its own gene PUBMED:9121443.

    \ ' '4202' 'IPR001705' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L33 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L33 has been shown to be on the surface of the 50S subunit. L33 belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:1742360, PUBMED:8112583, PUBMED:, groups:\

    \

    L33 is a small protein of 49 to 66 amino-acid residues.

    \ ' '4203' 'IPR000271' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L34 is one of the proteins from the large subunit of the prokaryotic ribosome. It is a small basic protein of 44 to 51 amino-acid residues PUBMED:1461740. L34 belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups eubacterial L34, red algal chloroplast L34 and cyanelle L34.

    \ ' '4204' 'IPR008195' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial ribosomal proteins belong to the L34e\ family. These include vertebrate L34, mosquito L31 PUBMED:8049275, plant L34 PUBMED:8075394,\ yeast putative ribosomal protein YIL052c and archaebacterial L34e.

    \ ' '4205' 'IPR001780' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:\

    \

    These proteins have 87 to 110 amino-acid residues.

    \ ' '4206' 'IPR001706' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    L35 is a basic protein of 60 to 70 amino-acid residues from the large (50S) subunit PUBMED:3542048. Like many basic polypeptides, L35 completely inhibits ornithine decarboxylase when present unbound in the cell, but the inhibitory function is abolished upon its incorporation into ribosomes PUBMED:3542048. It belongs to a family of ribosomal proteins, including L35 from bacteria, plant chloroplast, red algae chloroplasts and cyanelles. In plants it is a nuclear encoded gene product, which suggests a chloroplast-to-nucleus relocation during the evolution of higher plants PUBMED:2271612.

    \ ' '4207' 'IPR000473' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \ Ribosomal protein L36 is the smallest protein from the large subunit of the prokaryotic ribosome. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED: can be grouped into: bacterial L36; algal and plant chloroplast L36; Cyanelle L36. L36 is a small basic and cysteine-rich protein of 37 amino-acid residues.\ \ ' '4208' 'IPR000509' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. The L36E ribosomal family consists of mammalian, Caenorhabditis elegans and Drosophila L36, Candida albicans L39, and yeast YL39 ribosomal proteins PUBMED:8484789.

    \ ' '4209' 'IPR002674' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This ribosomal protein is found in archaebacteria and eukaryotes PUBMED:2546769. Ribosomal protein L37 has a single zinc finger-like motif of the C2-C2 type PUBMED:8484768.

    \ ' '4210' 'IPR002675' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L38e forms part of the 60S ribosomal subunit PUBMED:1840484. This family is found in eukaryotes.

    \ ' '4211' 'IPR000077' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities.\ These proteins are very basic. About 50 residues long, they are the smallest\ proteins of eukaryotic-type ribosomes.

    \ ' '4212' 'IPR002136' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family includes ribosomal L4/L1 from eukaryotes and plants and L4 from bacteria. L4 from yeast has been shown to bind rRNA PUBMED:9838082. These proteins have 246 (plant) to 427 (human) amino acids.

    \ ' '4213' 'IPR001975' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family contains the L40 ribosomal protein from both prokaryotes and eukaryotes. Bovine ribosomal protein L40 has been identified as a secondary RNA binding protein PUBMED:3129699. L40 is fused to a ubiquitin protein PUBMED:7488009.

    \ ' '4214' 'IPR007836' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \ L41 associates with the ribonucleoprotein particles of the 60S subunit late in the ribosomal maturation process. L41 is encoded by the smallest known open reading frame and in yeast is composed of only 24 amino acids, 17 of which are arginine or lysine.\ ' '4215' 'IPR000552' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence\ similarities. One of these families consists of mammalian PUBMED:3396452, Trypanosoma brucei,\ Caenorhabditis elegans and fungal L44, and Haloarcula marismortui LA PUBMED:8504167.

    \ ' '4216' 'IPR002132' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:2198942, PUBMED:2016059, PUBMED:1840500, PUBMED:, groups:

    \ \

    L5 is a protein of about 180 amino-acid residues.

    \ ' '4217' 'IPR002132' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:2198942, PUBMED:2016059, PUBMED:1840500, PUBMED:, groups:

    \ \

    L5 is a protein of about 180 amino-acid residues.

    \ ' '4218' 'IPR000915' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes mammalian ribosomal protein L6 (L6 was previously known as TAX-responsive enhancer element binding protein 107); Caenorhabditis elegans ribosomal protein L6 (R151.3); Saccharomyces cerevisiae (Baker\'s yeast) ribosomal protein YL16A/YL16B; and Mesembryanthemum crystallinum (Common ice plant) ribosomal protein \ YL16-like. These proteins have 175 (yeast) to 287 (mammalian) amino acids.

    \ ' '4219' 'IPR005568' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding\ site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 contains two domains with almost identical folds, suggesting that it was derived by the duplication of an\ ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites PUBMED:8262035.

    \ ' '4220' 'IPR004038' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, ribosomal protein L30 from eukaryotes and archaebacteria, Gadd45 and MyD118 PUBMED:9151207.

    \ ' '4221' 'IPR020069' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L9 is one of the proteins from the large ribosomal subunit.\ In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs\ to a family of ribosomal proteins grouped on the basis of sequence similarities PUBMED:, PUBMED:8306963.

    \

    The crystal structure of Bacillus stearothermophilus L9 shows the 149-residue protein comprises two globular domains connected by a rigid linker PUBMED:12051860. Each domain contains an rRNA binding site, and the protein functions as a\ structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an alpha-helix and a three-stranded mixed\ parallel, anti-parallel beta-sheet packed against the central alpha-helix. The long central alpha-helix is exposed to solvent in the middle and participates in the\ hydrophobic cores of the two domains at both ends.

    \ ' '4222' 'IPR020070' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein L9 is one of the proteins from the large ribosomal subunit.\ In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs\ to a family of ribosomal proteins grouped on the basis of sequence similarities PUBMED:, PUBMED:8306963.

    \

    The crystal structure of Bacillus stearothermophilus L9 shows the 149-residue protein comprises two globular domains connected by a rigid linker PUBMED:12051860. Each domain contains an rRNA binding site, and the protein functions as a\ structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an alpha-helix and a three-stranded mixed\ parallel, anti-parallel beta-sheet packed against the central alpha-helix. The long central alpha-helix is exposed to solvent in the middle and participates in the\ hydrophobic cores of the two domains at both ends.

    \ ' '4224' 'IPR001848' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA PUBMED:9281425. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.

    \

    The small ribosomal subunit protein S10 consists of about 100 amino acid residues. In Escherichia coli, S10 is involved in binding tRNA to the ribosome, and also operates as a transcriptional elongation factor PUBMED:8021936. Experimental evidence PUBMED:9371771 has revealed that S10 has virtually no groups exposed on the ribosomal surface, and is one of the "split proteins": these are a discrete group that are selectively removed from 30S subunits under low salt conditions and are required for the formation of activated 30S reconstitution intermediate (RI*) particles. S10 belongs to a family of proteins PUBMED:2179947 that includes: bacteria S10; algal chloroplast S10; cyanelle S10; archaebacterial S10; Marchantia polymorpha and Prototheca wickerhamii mitochondrial S10; Arabidopsis thaliana mitochondrial S10 (nuclear encoded); vertebrate S20; plant S20; and yeast URP2.

    \ ' '4225' 'IPR001971' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \ Ribosomal protein S11 PUBMED:3191988 plays an essential role in selecting the correct tRNA in protein biosynthesis. It is located on the large lobe of the small ribosomal subunit. On the basis of sequence similarities, S11 belongs to a family of bacterial, archaeal and eukaryotic ribosomal proteins PUBMED:.\ ' '4226' 'IPR006032' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S12 is one of the proteins from the small ribosomal subunit.\ In Escherichia coli, S12 is known to be involved in the translation initiation\ step. It is a very basic protein of 120 to 150 amino-acid residues. S12\ belongs to a family of ribosomal proteins which are grouped on the basis of sequence\ similarities. This protein is known typically as S12 in bacteria, S23 in eukaryotes and as either S12 or S23 in the Archaea PUBMED:.

    \

    Bacterial S12 molecules contain a conserved aspartic acid residue which undergoes a novel post-translational modification, beta-methylthiolation, to form the corresponding 3-methylthioaspartic acid.

    \ ' '4227' 'IPR001892' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S13 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S13 is known to be involved in binding fMet-tRNA and, hence, in the initiation of translation. It is a basic protein of 115 to 177 amino-acid residues that contains three helices and a beta-hairpin in the core of the protein, forming a helix-two turns-helix (H2TH) motif, and a non-globular C-terminal extension. This family of ribosomal proteins is present in prokaryotes, eukaryotes and archaea PUBMED:1872840, PUBMED:.

    \ ' '4228' 'IPR001209' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    S14 is one of the proteins from the small ribosomal subunit.\ In Escherichia coli, S14 is known to be required for the assembly of 30S particles\ and may also be responsible for determining the conformation of 16S rRNA at the A site.\ It belongs to a family of ribosomal proteins PUBMED:8441676, PUBMED: that\ includes bacterial, algal and plant chloroplast, yeast mitochondrial, cyanelle and archaeal Methanococcus vannielii S14 proteins, as well as yeast mitochondrial MRP2,\ yeast YS29A/B and mammalian S29.

    \ ' '4229' 'IPR000589' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S15 is one of the proteins from the small ribosomal subunit. In Escherichia coli, this protein binds to 16S ribosomal RNA and functions at early steps in ribosome assembly. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:2263452, groups bacterial and plant chloroplast S15; archaeal Haloarcula marismortui HmaS15 (HS11); yeast mitochondrial S28; and mammalian, yeast, Brugia pahangi and Wuchereria bancrofti S13. S15 is a protein of 80 to 250 amino-acid residues.

    \ ' '4230' 'IPR000307' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:, groups: \

    \ \ \ S16 proteins have about 100 amino-acid residues.

    \ ' '4231' 'IPR000266' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The ribosomal proteins catalyse ribosome assembly and stabilise the rRNA, tuning the structure of the ribosome for optimal function. Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA PUBMED:9281425. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S17 is known to bind specifically to the 5\' end of 16S ribosomal RNA in Escherichia coli (primary rRNA binding protein), and is thought to be involved in the recognition of termination codons. Experimental evidence PUBMED:9371771 has revealed that S17 has virtually no groups exposed on the ribosomal surface.

    \ ' '4232' 'IPR001210' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped in this family of ribosomal proteins, S17e. They include vertebrate, Drosophila and Neurospora crassa (crp-3) S17 proteins, as well as yeast S17a (RP51A) and S17b (RP51B) and archaebacterial S17e PUBMED:3240863, PUBMED:2507396, PUBMED:6092944.

    \ ' '4233' 'IPR001648' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA PUBMED:9281425. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.

    \

    The small ribosomal subunit protein S18 is known to be involved in binding the aminoacyl-tRNA complex in Escherichia coli PUBMED:2647521, and appears to be situated at the tRNA A-site. Experimental evidence has revealed that S18 is well exposed on the surface of the E. coli ribosome, and is a secondary rRNA binding protein PUBMED:9371771. S18 belongs to a family of ribosomal proteins PUBMED:2179947 that includes: eubacterial S18; metazoan mitochondrial S18; algal and plant chloroplast S18; and cyanelle S18.

    \ \ ' '4234' 'IPR002222' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence PUBMED:9371771 has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins PUBMED:9371771, PUBMED:2044758 that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 (\'rig\' protein).

    \ ' '4235' 'IPR001266' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family includes a number of eukaryotic and archaebacterial ribosomal proteins: mammalian S19, Drosophila S19, Ascaris lumbricoides S19g (ALEP-1) and S19s, yeast YS16 (RP55A and RP55B), Aspergillus S16 and Haloarcula marismortui HS12.

    \ ' '4236' 'IPR001865' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal S2 proteins have been shown to belong to a family that includes 40S ribosomal subunit 40 kDa proteins, putative laminin-binding proteins, NAB-1 protein and a 29.3 kDa protein from Haloarcula marismortui PUBMED:1531984, PUBMED:8119397. The laminin-receptor proteins are thus predicted to be the eukaryotic homologues of the eubacterial S2 ribosomal proteins PUBMED:7899076.

    \ ' '4237' 'IPR002583' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family consists of bacterial (and chloroplast) examples of the ribosomal small subunit protein S20. Bacterial ribosomal protein S20 forms part of the 30S ribosomal subunit, and interacts with 16S rRNA.

    \ ' '4238' 'IPR001911' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA PUBMED:9281425. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S21 contains 55-70 amino acid residues, and has only been found in eubacteria to date, though it has been reported that plant chloroplasts and mammalian mitochondria contain ribosomal subunit protein S21. Experimental evidence has revealed that S21 is well exposed on the surface of the Escherichia coli ribosome PUBMED:9371771, and is one of the \'split proteins\': these are a discrete group that are selectively removed from 30S subunits under low salt conditions and are required for the formation of activated 30S reconstitution intermediate (RI*) particles.

    \ ' '4239' 'IPR001931' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 82 to 87 amino acids. The amino termini are all N alpha-acetylated. The N-terminal halves of the protein molecules are highly conserved in contrast to the carboxy-terminal parts PUBMED:3910104.

    \ ' '4240' 'IPR001976' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family contains the S24e ribosomal proteins from eukaryotes and archaebacteria. These proteins have 101 to 148 amino acids.

    \ ' '4241' 'IPR004977' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The S25 ribosomal protein is a component of the 40S ribosomal subunit.

    \ ' '4242' 'IPR000892' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families, the S26E family, includes mammalian S26 PUBMED:2993263; Octopus S26 PUBMED:2731467; Drosophila S26 (DS31) PUBMED:2928115; plant cytoplasmic S26; and fungal S26 PUBMED:7821815. These proteins have 114 to 127 amino acids.

    \ ' '4243' 'IPR002906' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family of ribosomal proteins consists mainly of the 40S ribosomal protein S27a, which is synthesized as a C-terminal extension of ubiquitin (CEP). The S27a domain comprises the C-terminal half of the protein. The synthesis of ribosomal proteins as extensions of ubiquitin promotes their incorporation into nascent ribosomes by a transient metabolic stabilisation and is required for efficient ribosome biogenesis PUBMED:2538753. The ribosomal extension protein S27a contains a basic region that is proposed to form a zinc finger; its fusion gene is proposed as a mechanism to maintain a fixed ratio between ubiquitin, which is necessary for degrading proteins, and ribosomes, a source of proteins PUBMED:2538756.

    \ ' '4244' 'IPR000592' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes mammalian, yeast, Chlamydomonas reinhardtii and Entamoeba histolytica S27, and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0250 PUBMED:8441676. These proteins have 62 to 87 amino acids. They contain, in their central section, a putative zinc-finger region of the type C-x(2)-C-x(14)-C-x(2)-C.
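
    The zinc-finger pattern above is written in PROSITE-style notation, where x(n) stands for any n residues. As a minimal sketch, assuming only literal residues and x(n) gaps need to be supported, it could be translated into a regular expression as follows. The function name prosite_to_regex and the toy sequence are hypothetical and are not part of InterPro.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern to a Python regex string.

    Handles only literal residues (e.g. "C") and "x(n)" gaps; other
    PROSITE syntax is deliberately out of scope for this sketch.
    """
    parts = []
    for element in pattern.split("-"):
        if element.startswith("x(") and element.endswith(")"):
            count = int(element[2:-1])  # number of arbitrary residues
            parts.append(".{" + str(count) + "}")
        else:
            parts.append(element)  # literal residue, e.g. "C"
    return "".join(parts)


ZINC_FINGER = prosite_to_regex("C-x(2)-C-x(14)-C-x(2)-C")

# Toy sequence (not a real S27 protein) containing exactly one match.
toy_sequence = "AA" + "C" + "GG" + "C" + "A" * 14 + "C" + "GG" + "C" + "AA"
match = re.search(ZINC_FINGER, toy_sequence)
print(ZINC_FINGER)
print(match.start(), match.end())
```

    A real scan would need the full PROSITE syntax (alternatives, exclusions, ranges), which this sketch does not attempt.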

    \ ' '4245' 'IPR000289' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. Examples are:

  • Mammalian S28 PUBMED:11875025
  • Plant S28 PUBMED:8278557
  • Fungal S33 PUBMED:1481571
  • Archaebacterial S28e

    These proteins have from 64 to 78 amino acids and a highly conserved C-terminal extremity region.

    \ ' '4246' 'IPR006846' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry is for the ribosomal protein S30.

    \ ' '4247' 'IPR001351' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S3 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S3 is known to be involved in the binding of initiator Met-tRNA. This family of ribosomal proteins includes S3 from bacteria, algae and plant chloroplast, cyanelle, archaebacteria, plant mitochondria, vertebrates, insects, Caenorhabditis elegans and yeast PUBMED:8036511. This entry represents the C-terminal domain.

    \ ' '4248' 'IPR008282' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S3 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S3 is known to be involved in the binding of initiator Met-tRNA. This family of ribosomal proteins includes S3 from bacteria, algae and plant chloroplast, cyanelle, archaebacteria, plant mitochondria, vertebrates, insects, Caenorhabditis elegans and yeast PUBMED:8036511. This entry represents the N-terminal domain.

    \ ' '4249' 'IPR001593' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of proteins that have from 220 to 250 amino acids.

    \ ' '4250' 'IPR001912' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S4 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S4 is known to bind directly to 16S ribosomal RNA. Mutations in S4 have been shown to increase translational error frequencies PUBMED:2041737, PUBMED:. S4 is a protein of 171 to 205 amino-acid residues (except for NAM9, which is much larger). The crystal structure of a bacterial S4 protein revealed a two-domain molecule. The first domain is composed of four helices in the known structure. The second domain is in the middle of the first one and displays some structural homology with the ETS DNA-binding domain PUBMED:9707415. This family includes small ribosomal subunit S4 from prokaryotes and S9 from animals.

    \ ' '4251' 'IPR013845' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes yeast S7 (YS6); archaeal S4e; and mammalian and plant cytoplasmic S4 PUBMED:2124517. Two highly similar isoforms of mammalian S4 exist, one encoded by a gene on chromosome Y and the other by a gene on chromosome X. These proteins have 233 to 264 amino acids.

    \ \

    This entry represents the central region of these proteins.

    \ ' '4252' 'IPR013810' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S5 is one of the proteins from the small ribosomal subunit, and is a protein of 166 to 254 amino-acid residues. In Escherichia coli, S5 is known to be important in the assembly and function of the 30S ribosomal subunit. Mutations in S5 have been shown to increase translational error frequencies. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities PUBMED:, PUBMED:2247072, groups bacterial, cyanelle, red algal chloroplast, archaeal and fungal mitochondrial S5; mammalian, Caenorhabditis elegans, Drosophila and plant S2; and yeast S4 (SUP44).

    \

    This entry represents the N-terminal domain of ribosomal protein S5, which has an alpha-beta(3)-alpha structure that folds into two layers, alpha/beta.

    \ ' '4253' 'IPR005324' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This is a family of proteins related to the 30S ribosomal protein S5P from Sulfolobus acidocaldarius. Ribosomal protein S5 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S5 is known to be important in the assembly and function of the 30S ribosomal subunit. Mutations in S5 have been shown to increase translational error frequencies.

    \ ' '4254' 'IPR000529' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S6 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S6 is known to bind together with S18 to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, red algal chloroplast and cyanelle S6 ribosomal proteins.

    \ ' '4255' 'IPR001377' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaebacterial ribosomal proteins have been grouped\ on the basis of sequence similarities. \ Ribosomal protein S6 is the major substrate of protein kinases in eukaryotic ribosomes PUBMED:8440735 and\ may play an important role in controlling cell growth and proliferation\ through the selective translation of particular classes of mRNA.

    \ ' '4256' 'IPR000235' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S7 is one of the proteins from the small ribosomal subunit.\ In Escherichia coli, S7 is known to bind directly to part of the 3\' end of 16S\ ribosomal RNA. It belongs to a family of ribosomal proteins which have been grouped on the\ basis of sequence similarities PUBMED:8338632, PUBMED:, PUBMED:8524651. The structure for S7 is known PUBMED:9331418.

    \ ' '4257' 'IPR000554' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities PUBMED:8371989.\ One of these families consists of Xenopus S8, and mammalian, insect and yeast S7. These proteins have about\ 200 amino acids.

    \ ' '4258' 'IPR000630' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S8 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S8 is known to bind\ directly to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence\ similarities PUBMED:, groups eubacterial, algal and plant chloroplast, cyanelle, archaebacterial and\ Marchantia polymorpha mitochondrial S8; mammalian and plant S15A; and yeast S22 (S24) ribosomal proteins.

    \ ' '4259' 'IPR000754' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Ribosomal protein S9 is one of the proteins from the small ribosomal subunit. It belongs to a\ family of ribosomal proteins which, on the basis of sequence similarities PUBMED:, PUBMED:2332055,\ groups bacterial; algal chloroplast; cyanelle and archaeal S9 proteins; and mammalian;\ plant; and yeast mitochondrial ribosomal S9 proteins. These proteins adopt a beta-alpha-beta fold similar to that found in numerous RNA/DNA-binding proteins, as well as in kinases from the GHMP kinase family PUBMED:8722013.

    \ ' '4260' 'IPR000056' '\ Ribulose-phosphate 3-epimerase () (also known as pentose-5-phosphate 3-epimerase or PPE) is the enzyme that converts D-ribulose 5-phosphate into D-xylulose 5-phosphate in Calvin\'s reductive pentose phosphate cycle. In Ralstonia eutropha (Alcaligenes eutrophus) two copies of the gene coding for PPE are known PUBMED:1429456, one is chromosomally encoded , the other one is on a plasmid . PPE has been found in a wide range of bacteria, archaebacteria, fungi and plants. All the proteins have from 209 to 241 amino acid residues. The enzyme has a TIM barrel structure.\ ' '4261' 'IPR002858' '\

    Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cells (erythrocytes). The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see PUBMED:10885986.

    \ Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading\ frames and the rif interspersed repetitive elements. Both families\ contain three predicted transmembrane segments. It has been proposed\ that stevor and rif are members of a larger superfamily that code\ for variant surface antigens PUBMED:9879895.\ ' '4262' 'IPR003117' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \

    In the absence of cAMP, Protein Kinase A (PKA) exists as an equimolar tetramer of regulatory (R) and catalytic (C) subunits PUBMED:11734894. In addition to its role as an inhibitor of the C subunit, the R subunit anchors the holoenzyme to specific intracellular locations and prevents the C subunit from entering the nucleus. All R subunits have a conserved domain structure consisting of the N-terminal dimerization domain, inhibitory region, cAMP-binding domain A and cAMP-binding domain B. R subunits interact with C subunits primarily through the inhibitory site. The cAMP-binding domains show extensive sequence similarity and bind cAMP cooperatively.

    \ \

    Two types of regulatory (R) subunits exist - types I and II - which differ in molecular weight, sequence, autophosphorylation capability, cellular location and tissue distribution. Types I and II were further sub-divided into alpha and beta subtypes, based mainly on sequence similarity. This entry represents types I-alpha, I-beta, II-alpha and II-beta regulatory subunits of PKA proteins. These subunits contain the dimerisation interface and binding site for A-kinase-anchoring proteins (AKAPs).

    \ ' '4263' 'IPR002676' '\

    The RimM protein is essential for efficient processing of 16S rRNA PUBMED:9422595. The RimM protein was shown to have affinity for free ribosomal 30S subunits but not for 30S subunits in the 70S ribosomes PUBMED:9422595.

    \ ' '4264' 'IPR015879' '\ Aromatic ring hydroxylating dioxygenases are multicomponent 1,2-dioxygenase complexes that convert closed-ring structures to non-aromatic cis-diols PUBMED:1885518. The complex has both hydroxylase and electron transfer components. The hydroxylase component is itself composed of two subunits: an alpha-subunit of about 50 kDa, and a beta-subunit of about 20 kDa. The electron transfer component is composed either of two subunits (a ferredoxin and a ferredoxin reductase) or of a single bifunctional ferredoxin/reductase subunit. Sequence analysis of hydroxylase subunits of ring hydroxylating systems (including toluene, benzene and naphthalene 1,2-dioxygenases) suggests they are derived from a common ancestor PUBMED:1885518. The alpha-subunit binds both a Rieske-like 2Fe-2S cluster and an iron atom: conserved Cys and His residues in the N-terminal region may provide 2Fe-2S ligands, while conserved His and Tyr residues may coordinate the iron. The beta subunit may be responsible for the substrate specificity of the dioxygenase system PUBMED:1885518.\ ' '4265' 'IPR001574' '\ A number of bacterial and plant toxins act by inhibiting protein synthesis in eukaryotic cells. The toxins of the shiga and ricin family inactivate 60S ribosomal subunits by an N-glycosidic cleavage which releases a specific adenine base from the sugar-phosphate backbone of 28S rRNA PUBMED:3276522, PUBMED:2714255, PUBMED:1742358. Members of the family include shiga and shiga-like toxins, and type I (e.g. trichosanthin and luffin) and type II (e.g. ricin, agglutinin and abrin) ribosome inactivating proteins (RIPs). All these toxins are structurally related. RIPs have been of considerable interest because of their potential use, conjugated with monoclonal antibodies, as immunotoxins to treat cancers. Further, trichosanthin has been shown to have potent activity against HIV-1-infected T cells and macrophages PUBMED:8066085. 
Elucidation of the structure-function relationships of RIPs has therefore become a major research effort. It is now known that RIPs are structurally related. A conserved glutamic acid residue has been implicated in the catalytic mechanism PUBMED:3357883; this lies near a conserved arginine, which also plays a role in catalysis PUBMED:8411176.\ ' '4266' 'IPR007040' '\

    This entry contains ribosome modulation factors (RMF). They associate with 70S ribosomes and convert them to a dimeric form (100S ribosomes), which appears during the transition from the exponential growth phase to the stationary phase of Escherichia coli cells PUBMED:8440252, PUBMED:2181444.

    \ \

    It has been proposed that RMF mediates the formation of a \'storage ribosome\', the 100S particle, in stationary phase by inactivating excess ribosomes to protect them from degradation and to maintain the required balance between the concentrations of ribosomes and protein synthesis factors in order to maintain translational elongation efficiency PUBMED:16088400, PUBMED:17185553.

    \ ' '4267' 'IPR005913' '\

    dTDP-4-dehydrorhamnose reductase () catalyzes the last of 4 steps in making dTDP-rhamnose, a precursor of LPS molecules such as core antigen and O-antigen.\

    \ \ ' '4268' 'IPR004278' '\ Caliciviruses are a small round-structured virus group defined by RNA-dependent RNA polymerase and capsid diversity.\ ' '4269' 'IPR001205' '\

    RNA-directed RNA polymerase (RdRp) () is an essential protein encoded in the genomes of all RNA containing viruses with no DNA stage PUBMED:2759231, PUBMED:8709232. It catalyses synthesis of the RNA strand complementary to a given RNA template, but the precise molecular mechanism remains unclear.\ The postulated RNA replication process is a two-step mechanism. First, the initiation step of RNA synthesis begins at or near the 3\' end of the RNA template by means of a primer-independent (de novo) mechanism. The de novo initiation consists in the addition of a nucleotide tri-phosphate (NTP) to the 3\'-OH of the first initiating NTP. During the following so-called elongation phase, this nucleotidyl transfer reaction is repeated with subsequent NTPs to generate the complementary RNA product PUBMED:11531403.

    \

    All the RNA-directed RNA polymerases, and many DNA-directed polymerases, employ a fold whose organisation has been likened to the shape of a right hand with three subdomains termed fingers, palm and thumb PUBMED:9309225. Only the palm subdomain, composed of a four-stranded antiparallel beta-sheet with two alpha-helices, is well conserved among all of these enzymes. In RdRp, the palm subdomain comprises three well conserved motifs (A, B and C). Motif A (D-x(4,5)-D) and motif C (GDD) are spatially juxtaposed; the Asp residues of these motifs are implicated in the binding of Mg2+ and/or Mn2+. The Asn residue of motif B is involved in selection of ribonucleoside triphosphates over dNTPs and thus determines whether RNA is synthesised rather than DNA PUBMED:10827187.\ The domain organisation PUBMED:9878607 and the 3D structure of the catalytic centre of a wide range of RdRps, even those with a low overall sequence homology, are conserved. The catalytic centre is formed by several motifs containing a number of conserved amino acid residues.

    \

    There are 4 superfamilies of viruses that cover all RNA containing viruses with no DNA stage:

    \ The RNA-directed RNA polymerases in the first of the above superfamilies can be divided into the following three subgroups:\

    \ \

    This entry represents RNA-directed RNA polymerase found in many positive strand RNA eukaryotic viruses. It is part of the genome polyprotein that contains other polypeptides such as coat proteins VP1 to VP4, core proteins P2A to P2C and P3A, genome-linked protein VPG and picornain 3C ().

    \ \

    Structural studies indicate that these proteins form the "right hand" structure found in all oligonucleotide polymerases, containing thumb, finger and palm domains, and also the additional bridging finger and thumb domains unique to RNA-directed RNA polymerases PUBMED:15306852, PUBMED:15296746.

    \ ' '4270' 'IPR001788' '\

    This entry represents RNA dependent RNA polymerases found in several types of viruses PUBMED:8269709, especially those with a tripartite genome (RNA1, RNA2 and RNA3) and an encapsidated subgenomic RNA (RNA4) from which the coat protein is expressed, such as Cucumber mosaic virus (strain NT9) (CMV). This entry contains the following proteins:

    \ \ ' '4271' 'IPR000605' '\

    Helicases have been classified in 5 superfamilies (SF1-SF5). All of the\ proteins bind ATP and, consequently, all of them carry the classical Walker A\ (phosphate-binding loop or P-loop) and Walker B\ (Mg2+-binding aspartic acid) motifs. Superfamily 3 consists of helicases\ encoded mainly by small DNA viruses and some large nucleocytoplasmic DNA\ viruses PUBMED:11689653, PUBMED:15037234. Small viruses are very dependent on the host-cell machinery to\ replicate. SF3 helicase in small viruses is associated with an origin-binding\ domain. By pairing a domain that recognises the ori with a helicase, the virus\ can bypass the host-cell-based regulation pathway and initiate its own\ replication. The protein binds to the viral ori leading to origin unwinding.\ Cellular replication proteins are then recruited to the ori and the viral DNA\ is replicated.

    \ \

    In SF3 helicases the Walker A and Walker B motifs are separated by spacers of\ rather uniform, and relatively short, length. In addition to the A and B\ motifs this family is characterised by a third motif (C) which resides between\ the B motif and the C-terminus of the conserved region. This motif consists of\ an Asn residue preceded by a run of hydrophobic residues PUBMED:2156730.

    \ \

    Several structures of SF3 helicases have been solved PUBMED:12774115. They\ all possess the same core alpha/beta fold, consisting of a five-stranded\ parallel beta sheet flanked on both sides by several alpha helices. In\ contrast to SF1 and SF2 helicases, which have RecA-like core folds, the strand\ connectivity within the alpha/beta core domain is that of AAA+ proteins PUBMED:15718137.\ The SF3 helicase proteins assemble into a hexameric ring.

    \ \

    Some proteins known to contain an SF3 helicase domain are listed below:\

    \

    \ \

    The entry represents the core alpha/beta fold of the SF3 helicase domain found predominantly in DNA viruses.

    \ ' '4272' 'IPR002092' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \

    This is a family of single chain polymerases, which\ are evolutionarily related, and which are related to the T3/T7 bacteriophage polymerases PUBMED:7526118.

    \ ' '4273' 'IPR011260' '\

    The core of the bacterial RNA polymerase (RNAP) consists of four subunits, two alpha, a beta and a beta\', which are conserved from bacteria to mammals. The alpha subunit (RpoA) initiates RNAP assembly by dimerising to form a platform on which the beta subunits can interact. The alpha subunit consists of an N-terminal domain (NTD) and a C-terminal domain (CTD), connected by a short linker. The NTD is essential for RNAP assembly, while the CTD is necessary for transcription regulation, interacting with transcription factors and promoter upstream elements. In Escherichia coli, the catabolite activator protein (CAP or CRP) was shown to exert its effect through its interactions with the CTD, where CAP binding to CTD promotes RNAP binding to promoter DNA, thereby stimulating transcription initiation at class I CAP-dependent promoters. At class II CAP-dependent promoters, the interaction of CAP with CTD is one of multiple interactions involved in activation PUBMED:12202833.

    \

    The CTD has a compact structure of four helices and two long arms enclosing its hydrophobic core, making its folding topology distinct from most other binding proteins. The upstream promoter element-binding site is formed from helices 1 and 4 PUBMED:7491496.

    \ ' '4274' 'IPR007759' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \ \ \

    The delta protein is a dispensable subunit of Bacillus subtilis RNA polymerase (RNAP) that has major effects on the biochemical properties of the purified enzyme. In the presence of delta, RNAP displays an increased specificity of transcription, a decreased affinity for nucleic acids, and an increased efficiency of RNA synthesis because of enhanced recycling PUBMED:10336502. The delta protein contains two distinct regions: an N-terminal domain and a glutamate and aspartate residue-rich C-terminal region PUBMED:7545758.

    \ ' '4275' 'IPR007224' '\

    The RNA polymerase I specific transcription initiation factor is a member of a multiprotein complex essential for the initiation of transcription by RNA polymerase I. Binding to the DNA template is dependent on the initial binding of other factors.

    \ ' '4276' 'IPR011261' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \

    RNA polymerase (RNAP) II, which is responsible for all mRNA synthesis in eukaryotes, consists of 12 subunits. Subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the archaeal RNAP D/L heterodimer, and to the prokaryotic RNAP alpha subunit (RpoA) homodimer. In each case, they play a key role in RNAP assembly by forming a platform on which the catalytic subunits (eukaryotic Rpb1/Rpb2, and prokaryotic beta/beta\') can interact PUBMED:11453250. These different subunits share regions of homology required for dimerisation. In eukaryotic Rpb11 and archaeal L subunits, the dimerisation domain consists of a contiguous Rpb11-like domain, whereas in eukaryotic Rpb3, archaeal D and bacterial RpoA subunits, the dimerisation domain consists of the Rpb11-like domain interrupted by an insert domain. In the prokaryotic alpha subunit, this dimerisation domain is the N-terminal domain PUBMED:9657722.

    \ ' '4277' 'IPR001529' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \

    In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides.\ It has been shown PUBMED:8265347, PUBMED:8417319 that small subunits of about 15 kDa, found in polymerase types I and II, are highly conserved. These proteins contain a probable zinc finger in their N-terminal region and a C-terminal zinc ribbon domain.

    \ ' '4278' 'IPR000268' '\ In eukaryotes, there are three different forms of DNA-dependent RNA polymerases transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides.\ Archaebacterial subunit N (gene rpoN) PUBMED:7597027 is a small protein of about 8 kDa; it\ is evolutionarily related PUBMED:8045907 to an 8.3 kDa component shared by all three forms of\ eukaryotic RNA polymerases (gene RPB10 in yeast and POLR2J in mammals) as well\ as to African swine fever virus (ASFV) protein CP80R PUBMED:11831707.\ \ There is a conserved region which is located at the\ N-terminal extremity of these polymerase subunits; this region contains two\ cysteines that bind a zinc ion PUBMED:10841539.\ ' '4280' 'IPR007073' '\ RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 7, represents a mobile module of the RNA polymerase. Domain 7 interacts with the lobe domain of Rpb2 PUBMED:8910400, PUBMED:11313498.\ ' '4281' 'IPR007644' '\

    RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the protrusion domain PUBMED:3116266. The other lobe, RNA polymerase Rpb2, domain 2, is nested within this domain.

    \ ' '4282' 'IPR007642' '\

    RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the lobe domain PUBMED:11313498. DNA has been demonstrated to bind to the concave surface of the lobe domain, which plays a role in maintaining the transcription bubble. Many of the bacterial members contain large insertions within this domain, a region known as dispensable region 1 (DRI).

    \ ' '4283' 'IPR007646' '\ RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 4 is also known as the external 2 domain PUBMED:11313498.\ ' '4284' 'IPR000783' '\

    Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH) PUBMED:10841538, PUBMED:10191143, PUBMED:1729711, PUBMED:10841537.

    \

    This entry represents prokaryotic subunit H and the C-terminal domain of eukaryotic RPB5, which share a two-layer alpha/beta fold, with a core structure of beta/alpha/beta/alpha/beta(2).

    \ ' '4285' 'IPR005571' '\

    Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH) PUBMED:10841538, PUBMED:10191143, PUBMED:1729711, PUBMED:10841537.

    \ \ \

    This entry represents the N-terminal domain of eukaryotic RPB5, which has a core structure consisting of 3 layers alpha/beta/alpha PUBMED:. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure PUBMED:10784442. This module is important for positioning the downstream DNA.

    \ ' '4286' 'IPR006110' '\

    In eukaryotes, there are three different forms of DNA-dependent RNA polymerases transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. A component of 14 to 18 kDa is shared by all three forms of eukaryotic RNA polymerases; it has been sequenced in budding yeast (gene RPB6 or RPO26), in Schizosaccharomyces pombe (Fission yeast) (gene rpb6 or rpo15), in human and in African swine fever virus (ASFV), and is evolutionarily related to the archaebacterial subunit K (gene rpoK). The archaebacterial protein is colinear with the C-terminal part of the eukaryotic subunit.

    \

    The structures of the omega subunit and RPB6, and the structures of the omega/beta\' and RPB6/RPB1 interfaces, suggest a molecular mechanism for the function of omega and RPB6 in promoting RNAP assembly and/or stability. The conserved regions of omega and RPB6 form a compact structural domain that interacts simultaneously with conserved regions of the largest RNAP subunit and with the C-terminal tail following a conserved region of the largest RNAP subunit. The second half of the conserved region of omega and RPB6 forms an arc that projects away from the remainder of the structural domain and wraps over and around the C-terminal tail of the largest RNAP subunit, clamping it in a crevice, and threading the C-terminal tail of the largest RNAP subunit through the narrow gap between omega and RPB6 PUBMED:11158566.

    \ ' '4287' 'IPR005576' '\

    The eukaryotic RNA polymerase subunits RPB4 and RPB7 form a heterodimer that reversibly associates with the RNA polymerase II core. Archaeal cells contain a single RNAP made up of about 12 subunits, displaying considerable homology to the eukaryotic RNAPII subunits. The RPB4 and RPB7 homologs are called subunits F and E, respectively, and have been shown to form a stable heterodimer. While the RPB7 homologue is reasonably well conserved, the similarity between the eukaryotic RPB4 and the archaeal F subunit is barely detectable PUBMED:11741548.

    \ ' '4288' 'IPR005570' '\ Rpb8 is a subunit common to the three yeast RNA polymerases, pol I, II and III. Rpb8 interacts with the largest subunit Rpb1, and with Rpb3 and Rpb11, two smaller subunits.\ ' '4289' 'IPR007832' '\ The family comprises a subunit specific to RNA Pol III, the tRNA specific polymerase. The C34 subunit of Saccharomyces cerevisiae RNA Pol III is part of a subcomplex of three subunits which have no counterpart in the other two nuclear RNA polymerases. This subunit interacts with TFIIIB70 and therefore participates in Pol III recruitment PUBMED:9312031.\ ' '4290' 'IPR007811' '\ This family comprises a specific subunit for Pol III, the tRNA specific polymerase.\ ' '4291' 'IPR005093' '\

    RNA-directed RNA polymerase (RdRp) is an essential protein encoded in the genomes of all RNA-containing viruses with no DNA stage PUBMED:2759231, PUBMED:8709232. It catalyses synthesis of the RNA strand complementary to a given RNA template, but the precise molecular mechanism remains unclear.\ The postulated RNA replication process is a two-step mechanism. First, the initiation step of RNA synthesis begins at or near the 3\' end of the RNA template by means of a primer-independent (de novo) mechanism. The de novo initiation consists of the addition of a nucleoside triphosphate (NTP) to the 3\'-OH of the first initiating NTP. During the following so-called elongation phase, this nucleotidyl transfer reaction is repeated with subsequent NTPs to generate the complementary RNA product PUBMED:11531403.

    \

    All the RNA-directed RNA polymerases, and many DNA-directed polymerases, employ a fold whose organisation has been likened to the shape of a right hand with three subdomains termed fingers, palm and thumb PUBMED:9309225. Only the palm subdomain, composed of a four-stranded antiparallel beta-sheet with two alpha-helices, is well conserved among all of these enzymes. In RdRp, the palm subdomain comprises three well conserved motifs (A, B and C). Motif A (D-x(4,5)-D) and motif C (GDD) are spatially juxtaposed; the Asp residues of these motifs are implicated in the binding of Mg2+ and/or Mn2+. The Asn residue of motif B is involved in selection of ribonucleoside triphosphates over dNTPs and thus determines whether RNA is synthesised rather than DNA PUBMED:10827187.\ The domain organisation PUBMED:9878607 and the 3D structure of the catalytic centre of a wide range of RdRps, even those with a low overall sequence homology, are conserved. The catalytic centre is formed by several motifs containing a number of conserved amino acid residues.

    \

    There are 4 superfamilies of viruses that cover all RNA containing viruses with no DNA stage:

    \ The RNA-directed RNA polymerases in the first of the above superfamilies can be divided into the following three subgroups:\

    \ \

    This is a family of Leviviridae RNA replicases.

    \ ' '4292' 'IPR002738' '\

    Members of this protein family are part of the ribonuclease P complex, which carries out endonucleolytic cleavage of RNA, removing the extra 5\' nucleotides from the tRNA precursor. This process is essential for tRNA processing.

    \ ' '4293' 'IPR002759' '\

    This family contains proteins found in some eukaryotes and archaebacteria that are related to yeast ribonuclease P. This enzyme is essential for tRNA processing generating 5\'-termini of mature tRNA\ molecules PUBMED:7731988. tRNA processing enzyme ribonuclease P (RNase P) consists of an RNA molecule associated with at least eight protein subunits, hPop1, Rpp14, Rpp20, Rpp25, Rpp29,\ Rpp30, Rpp38, and Rpp40 PUBMED:10024167.

    \ ' '4295' 'IPR001427' '\ Pancreatic ribonucleases (RNAse) are pyrimidine-specific endonucleases \ found in high quantity in the pancreas of certain mammals and of\ some reptiles PUBMED:3940901. Specifically, the enzymes are involved in endonucleolytic\ cleavage of 3\'-phosphomononucleotides and 3\'-phosphooligonucleotides ending\ in C-P or U-P with 2\',3\'-cyclic phosphate intermediates. Ribonuclease can\ unwind the DNA helix by complexing with single-stranded DNA; the complex\ arises by an extended multi-site cation-anion interaction between lysine\ and arginine residues of the enzyme and phosphate groups of the nucleotides.\ Other proteins belonging to the pancreatic RNAse family include: bovine\ seminal vesicle and brain ribonucleases; kidney non-secretory ribonucleases\ PUBMED:2734298; liver-type ribonucleases PUBMED:2611266; angiogenin, which induces vascularisation\ of normal and malignant tissues; eosinophil cationic protein PUBMED:2473157, a\ cytotoxin and helminthotoxin with ribonuclease activity; and frog liver\ ribonuclease and frog sialic acid-binding lectin.\ The sequence of pancreatic RNases contains four conserved disulphide bonds and\ three amino acid residues involved in the catalytic activity.\ ' '4296' 'IPR002156' '\

    The RNase H domain is responsible for hydrolysis of the RNA portion of RNA-DNA hybrids, and this activity requires the presence of divalent cations (Mg2+ or Mn2+) that bind its active site. This domain is a part of a large family of homologous RNase H enzymes of which the RNase HI protein from Escherichia coli is the best characterised PUBMED:9741851. Secondary structure predictions for the enzymes from E. coli, yeast, human liver and diverse retroviruses (such as Rous sarcoma virus and the Foamy viruses) supported, in every case, the five beta-strands (1 to 5) and four or five alpha-helices (A, B/C, D, E) that have been identified by crystallography in the RNase H domain of Human immunodeficiency virus 1 (HIV-1) reverse transcriptase and in E. coli RNase H PUBMED:10603172. Reverse transcriptase (RT) is a modular enzyme carrying polymerase and ribonuclease H (RNase H) activities in separable domains. RT converts the single-stranded RNA genome of a retrovirus into a double-stranded DNA copy for integration into the host genome. This process requires ribonuclease H as well as RNA- and DNA-directed DNA polymerase activities.

    \ \

    Retroviral RNase H is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. Bacterial RNase H catalyses endonucleolytic cleavage of RNA-DNA hybrids, yielding 5\'-phosphomonoesters.

    \ \

    The 3D structure of the RNase H domain from diverse bacteria and retroviruses\ has been solved PUBMED:2169648, PUBMED:8108376, PUBMED:1707186. All have four beta\ strands and four to five alpha helices. The E. coli RNase H1 protein\ binds a single Mg2+ ion cofactor in the active site of the enzyme. The\ divalent cation is bound by the carboxyl groups of four acidic residues,\ Asp-10, Glu-48, Asp-70, and Asp-134 PUBMED:8108376. The first three acidic residues are\ highly conserved in all bacterial and retroviral RNase H sequences.\

    \ ' '4297' 'IPR001900' '\

    This group of bacterial and eukaryotic proteins includes exoribonuclease II (RNase II), ribonuclease R (a bacterial 3\' --> 5\' exoribonuclease homologous to RNase II) and related sequences, both characterised and uncharacterised PUBMED:11948193, PUBMED:15604703, PUBMED:9829834.

    \ \

    The size of these proteins ranges from 644 residues (rnb) to 1250 (SSD1). While their sequences are highly divergent, they share a conserved domain in their C-terminal section PUBMED:9241229. It is possible that this domain plays a role in the exonuclease function.

    \ ' '4298' 'IPR003667' '\

    The rnf genes of Rhodobacter capsulatus, essential for nitrogen fixation, are thought to encode a system for electron transport to nitrogenase. The rnfABCDGEH operon comprises seven genes that show similarities in gene arrangement and deduced protein sequences to homologous regions in the genomes of Haemophilus influenzae and Escherichia coli. Four of the rnf gene products were found to be similar in sequence to components of an Na+-dependent NADH:ubiquinone oxidoreductase (NQR) from Vibrio alginolyticus PUBMED:9492268. The NQR-type enzyme of Klebsiella pneumoniae was shown to catalyse sodium-dependent NADH oxidation in the respiratory chain PUBMED:15063750.

    \ ' '4299' 'IPR004942' '\

    This family includes proteins that are about 100 amino acids long and have been shown to be related PUBMED:11084347. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions PUBMED:10402468. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility PUBMED:2464581. However, the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role.

    \ ' '4300' 'IPR000600' '\

    A family of bacterial proteins has been described which groups transcriptional repressors, sugar kinases and as yet uncharacterised open reading frames PUBMED:7952186. This family, known as ROK (Repressor, ORF, Kinase), includes the xylose operon repressor, xylR, from Bacillus subtilis, Lactobacillus pentosus and Staphylococcus xylosus; N-acetylglucosamine repressor, nagC, from Escherichia coli; glucokinase from Streptomyces coelicolor; fructokinase from Pediococcus pentosaceus, Streptococcus mutans and Zymomonas mobilis; allokinase and mlc from E. coli; and E. coli hypothetical proteins yajF and yhcI and the corresponding Haemophilus influenzae proteins. The repressor proteins (xylR and nagC) from this family possess an N-terminal region that is not present in the sugar kinases and that contains a helix-turn-helix DNA-binding motif.

    \ ' '4301' 'IPR006064' '\

    These enzymes correspond to Agrobacterium rolC and were characterised along with rolB. RolB and rolC were originally classified as glycoside hydrolase family 40 and 41 respectively. RolB has subsequently been shown PUBMED:8596628 to have tyrosine phosphatase activity.

    \ ' '4302' 'IPR000769' '\

    The Rop protein regulates plasmid DNA replication by modulating the initiation of transcription \ of the primer RNA precursor. Processing of the precursor, RNAII, is inhibited by hydrogen bonding \ of RNAII to its complementary sequence in RNAI. Rop increases the affinity of RNAI for RNAII and \ thus decreases the rate of replication initiation events. The 3D structure of Rop has been \ determined by X-ray crystallography and refined to 1.7 A resolution. The 63 amino acid protein is \ a homodimer, each monomer consisting almost entirely of two alpha-helices, the whole molecule \ forming a highly regular four-alpha-helix bundle PUBMED:3681971. This can be approximated by a \ four-stranded rope, with radius 7.0 A, a left-handed helical twist, and pitch 172.5 A. A very compact \ packing of side chains in the helix interfaces of the Rop coiled-coil structure is presumed to \ account for its high stability PUBMED:1841691. The overall details of the structure have been\ confirmed by proton NMR PUBMED:1841691, PUBMED:2223771.

    \ \ ' '4303' 'IPR001385' '\ Rotaviruses consist of three concentric protein shells. The intermediate\ (middle) protein layer contains VP6, the major internal structural protein. VP6 is the most \ abundant protein in the virion and is involved in virion assembly;\ it can interact with VP2, VP4 and VP7 PUBMED:9266993, PUBMED:8057471.\ ' '4304' 'IPR002512' '\ Rotaviruses are dsRNA viruses that appear to infect a wide range of mammals. The gene 11 product is a non-structural phosphoprotein designated as NS26, now more commonly annotated as non-structural protein 5 (NSP5) PUBMED:2548010.\ ' '4305' 'IPR003668' '\

    Rotavirus non-structural protein 2 (NSP2) is a basic protein which possesses RNA-binding activity and is essential for genome replication PUBMED:8380660. It may also be important for viral RNA packaging.

    \ ' '4306' 'IPR002148' '\ The proteins in this entry are variously described as either non-structural protein 1 (NSP1) or non-structural RNA-binding protein 53 (NS53). They are RNA-binding proteins that contain a characteristic cysteine-rich region PUBMED:8395125, PUBMED:9015101. They are made at low levels in infected cells, are a component of early replication, and are known to accumulate on the cytoskeleton of the infected cell.\ ' '4307' 'IPR006950' '\ This family comprises the 11 kDa non-structural proteins found in segment S11 of the Rotavirus genome. They may form part of a complex that is involved in the replication of the genome.\ ' '4308' 'IPR002873' '\

    This family consists of rotaviral non-structural RNA binding protein 34 (NS34 or NSP3). The NSP3 protein has been shown to bind viral RNA. The NSP3 protein consists of 3 conserved functional domains: a basic region which binds ssRNA, a region containing heptapeptide repeats mediating oligomerisation and a leucine zipper motif PUBMED:1326821. NSP3 may play a central role in replication and assembly of genomic RNA structures PUBMED:1326821. Rotaviruses have a dsRNA genome and are a major cause of acute gastroenteritis in the young of many species PUBMED:7871749.

    \ ' '4309' 'IPR002107' '\ This entry contains rotaviral non-structural protein 4 (NSP4) as well as related proteins: NSP5, NS28, and NCVP5. The final steps in the assembly of rotavirus occur in the lumen of the endoplasmic reticulum (ER). Targeting of the immature inner capsid particle (ICP) to this compartment is mediated by the cytoplasmic tail of NSP4, located in the ER membrane PUBMED:2548854, PUBMED:8887538.\ ' '4310' 'IPR007779' '\ Rotavirus particles consist of three concentric proteinaceous capsid layers. The innermost capsid (core) is made of VP2. The genomic RNA and the two minor proteins VP1 and VP3 are encapsidated within this layer PUBMED:8178489. The N terminus of rotavirus VP2 is necessary for the encapsidation of VP1 and VP3 PUBMED:9420216.\ ' '4311' 'IPR000297' '\

    Recommended name: Peptidylprolyl isomerase

    \

    Synonyms for proteins with this domain are: Peptidyl-prolyl cis-trans isomerase, PPIase, rotamase, cyclophilin, FKBP65

    \ \

    Peptidylprolyl isomerase is an\ enzyme that accelerates protein folding by catalyzing the cis-trans\ isomerization of proline imidic peptide bonds in oligopeptides PUBMED:2186809. It has been reported in bacteria and eukaryotes.

    \ ' '4312' 'IPR005090' '\

    Proteins in this group have homology with the RepC protein of Agrobacterium Ri and Ti plasmids PUBMED:7991675. They may be involved in plasmid replication and stabilisation functions.

    \ ' '4313' 'IPR004294' '\

    Carotenoids such as beta-carotene, lycopene, lutein and beta-cryptoxanthine are produced in plants and certain bacteria, algae and fungi, where they function as accessory photosynthetic pigments and as scavengers of oxygen radicals for photoprotection. They are also essential dietary nutrients in animals. Carotenoid oxygenases cleave a variety of carotenoids into a range of biologically important products, including apocarotenoids in plants that function as hormones, pigments, flavours, floral scents and defence compounds, and retinoids in animals that function as vitamins, visual pigments and signalling molecules PUBMED:14704328. Examples of carotenoid oxygenases include:

    \

    \ ' '4314' 'IPR003315' '\

    This entry represents the zinc-binding domain found in rabphilin Rab3A, as well as synaptotagmin-like proteins. The small G protein Rab3A plays an important role in the regulation of neurotransmitter release. The crystal structure of the small G protein Rab3A complexed with the effector domain of rabphilin-3A shows that the effector domain of rabphilin-3A contacts Rab3A in two distinct areas. The first interface involves the Rab3A switch I and switch II regions, which are sensitive to the nucleotide-binding state of Rab3A. The second interface consists of a deep pocket in Rab3A that interacts with a SGAWFF structural element of rabphilin-3A. Sequence and structure analysis, and biochemical data suggest that this pocket, or Rab complementarity-determining region (RabCDR), establishes a specific interaction between each Rab protein and its effectors. It has been suggested that RabCDRs could be major determinants of effector specificity during vesicle trafficking and fusion PUBMED:10025402.

    \ ' '4315' 'IPR007485' '\ The Escherichia coli family member has been named Rare lipoprotein B (RlpB). Thioglyceride and N-fatty acyl residues may be attached to the N-terminal cysteine, which is conserved in this family. RlpB is speculated to be involved in cell duplication PUBMED:3316191.\ ' '4317' 'IPR007175' '\ This family contains a ribonuclease P subunit of human and yeast. Other members of the family include the probable archaeal homologues. This subunit possibly binds the precursor tRNA PUBMED:11497433.\ ' '4318' 'IPR002661' '\

    The ribosome recycling factor or ribosome release factor (RRF) dissociates ribosomes from mRNA after termination of translation, and is essential for bacterial growth PUBMED:8183897. Thus ribosomes are \'recycled\' and ready for another round of protein synthesis.

    \ ' '4319' 'IPR000504' '\

    Many eukaryotic proteins containing one or more copies of a putative RNA-binding domain of about 90 amino acids are known to bind single-stranded RNAs PUBMED:3072706, PUBMED:3192525, PUBMED:3313012. The largest group of single strand RNA-binding proteins is the eukaryotic RNA recognition motif (RRM) family that contains an eight amino acid RNP-1 consensus sequence PUBMED:2470643, PUBMED:2467746. RRM proteins have a variety of RNA binding preferences and functions, and include heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing (SR, U2AF, Sxl), protein components of small nuclear ribonucleoproteins (U1 and U2 snRNPs), and proteins that regulate RNA stability and translation (PABP, La, Hu) PUBMED:3192525, PUBMED:3313012, PUBMED:2467746. The RRM in heterodimeric splicing factor U2 snRNP auxiliary factor (U2AF) appears to have two RRM-like domains with specialised features for protein recognition PUBMED:15231733. The motif also appears in a few single stranded DNA binding proteins.

    \

    The typical RRM consists of four anti-parallel beta-strands and two alpha-helices arranged in a beta-alpha-beta-beta-alpha-beta fold with side chains that stack with RNA bases. Specificity of RNA binding is determined by multiple contacts with surrounding amino acids. A third helix is present during RNA binding in some cases PUBMED:8290338. The RRM is reviewed in a number of publications PUBMED:1716386, PUBMED:15853797, PUBMED:16387655.

    \ ' '4320' 'IPR001737' '\

    This family of proteins includes rRNA adenine dimethylases (e.g. KsgA) and the Erythromycin resistance methylases (Erm).

    \ \

    The bacterial enzyme KsgA catalyses the transfer of a total of four methyl groups from S-adenosyl-l-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaea, and in eukaryotic organelles. Bacterial resistance to the aminoglycoside antibiotic kasugamycin involves inactivation of KsgA and resulting loss of the dimethylations, with modest consequences to the overall fitness of the organism. In contrast, the yeast ortholog, Dim1, is essential. In Saccharomyces cerevisiae (Baker\'s yeast), and presumably in other eukaryotes, the enzyme performs a vital role in pre-rRNA processing in addition to its methylating activity. The best conserved region in these enzymes is located in the N-terminal section and corresponds to a region that is probably involved in S-adenosyl methionine (SAM) binding.

    \ \

    The crystal structure of KsgA from Escherichia coli has been solved to a resolution of 2.1A. It bears a strong similarity to the crystal structure of ErmC\' from Bacillus stearothermophilus and a lesser similarity to the yeast mitochondrial transcription factor, sc-mtTFB PUBMED:15136037.

    \ \

    The Erm family of RNA methyltransferases, which methylate a single adenosine base in 23S rRNA, confer resistance to the MLS-B group of\ antibiotics. Despite their sequence similarity, the two enzyme families have strikingly different levels of regulation that remain to be elucidated. Other orthologs of this family include the yeast and Homo sapiens (Human) mitochondrial transcription factors (MTF1 and h-mtTFB respectively), which are nuclear encoded PUBMED:11567089. Human-mtTFB is able to stimulate transcription in vitro independently of its S-adenosylmethionine binding and rRNA methyltransferase activity PUBMED:12897151.

    \ ' '4321' 'IPR007448' '\ This family includes bacterial transcriptional regulators that are thought to act through an interaction with the conserved region 4 of the sigma(70) subunit of RNA polymerase. The Pseudomonas aeruginosa homologue, AlgQ, positively regulates virulence gene expression and is associated with the mucoid phenotype observed in P. aeruginosa isolates from cystic fibrosis patients.\ ' '4322' 'IPR005573' '\

    Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters PUBMED:9159523. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress.

    \ ' '4323' 'IPR005572' '\

    Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters PUBMED:9159523. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress.

    \ ' '4324' 'IPR007359' '\

    This bacterial family of integral membrane proteins represents a positive regulator of the sigma(E) transcription factor, namely RseC/MucC. The sigma(E) transcription factor is up-regulated by cell envelope protein misfolding, and regulates the expression of genes that are collectively termed ECF (devoted to Extra-Cellular Functions) PUBMED:9159522. In Pseudomonas aeruginosa, derepression of sigma(E) is associated with the alginate-overproducing phenotype characteristic of chronic respiratory tract colonization in cystic fibrosis patients. The mechanism by which RseC/MucC positively regulates the sigma(E) transcription factor is unknown. RseC is also thought to have a role in thiamine biosynthesis in Salmonella typhimurium PUBMED:9335303. In addition, this family also includes an N-terminal part of RnfF, a Rhodobacter capsulatus protein of unknown function that is essential for nitrogen fixation. This protein also contains a domain found in the ApbE protein, which is itself involved in thiamine biosynthesis.

    \ ' '4325' 'IPR004336' '\ The molecular structure and function of the NS2 protein is not known. However, mutants lacking the NS2 grow at\ slower rates when compared to the wild-type yet NS2 is not essential for viral replication PUBMED:9847328.\ ' '4326' 'IPR007568' '\ This family is comprised of fungal proteins with multiple transmembrane regions. RTA1 () is involved in resistance to 7-aminocholesterol PUBMED:8660468, while RTM1 () confers resistance to an unknown toxic chemical in molasses PUBMED:7672593. These proteins may bind to the toxic substance, and thus prevent toxicity. They are not thought to be involved in the efflux of xenobiotics PUBMED:8660468.\ ' '4327' 'IPR003432' '\ The bacterial replication terminator protein (RTP) plays a role in the termination of DNA replication by impeding replication fork movement. Two RTP dimers bind to the two inverted repeat regions at the termination site.\ ' '4328' 'IPR018504' '\

    Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior PUBMED:1558765. Four principal exotoxin secretion systems have been described. In the type II and IV secretion systems, toxins are first exported to the periplasm by way of a cleaved N-terminal signal sequence; a second set of proteins is used for extracellular transport (type II), or the C terminus of the exotoxin itself is used (type IV). Type III secretion involves at least 20 molecules that assemble into a needle; effector proteins are then translocated through this without need of a signal sequence. In the Type I system, a complete channel is formed through both membranes, and the secretion signal is carried on the C-terminus of the exotoxin.

    \

    The RTX (repeats in toxin) family of cytolytic toxins belongs to the Type I \ secretion system, and its members are important virulence factors in Gram-negative bacteria. As well as the C-terminal signal sequence, several glycine-rich\ repeats are also found. These are essential for binding calcium, and are critical for the biological activity of the secreted toxins PUBMED:8800842. All RTX toxin operons exist in the order rtxCABD, RtxA protein being the structural\ component of the exotoxin, both RtxB and D being required for its export from the bacterial cell; RtxC is an acyl-carrier-protein-dependent acyl-modification enzyme, required to convert RtxA to its active form PUBMED:10470043.

    \

    Escherichia coli haemolysin (HlyA) is often quoted as the model for RTX \ toxins. Recent work on its relative rtxC gene product HlyC PUBMED:9521785 has revealed that it provides the acylation aspect for post-translational modification of two internal lysine residues in the HlyA protein. To cause pathogenicity, the HlyA toxin must first bind Ca2+ ions to the set of glycine-rich repeats and then be activated by HlyC PUBMED:8808931. This has been demonstrated both in vitro and in vivo.

    \ \

    A number of the sequences in this family are metallopeptidases belonging to MEROPS peptidase family M10 (clan MA(M)), subfamily M10B: serralysin, epralysin and unassigned peptidases.

    \ ' '4329' 'IPR000685' '\

    Ribulose bisphosphate carboxylase (RuBisCO) PUBMED:6351728, PUBMED:12221984 catalyses the initial step in Calvin\'s reductive pentose phosphate cycle in plants as well as purple and green bacteria. It consists of a large catalytic unit and a small subunit of undetermined function. In plants, the large subunit is coded by the chloroplastic genome while the small subunit is encoded in the nuclear genome. Molecular activation of RuBisCO by CO2 involves the formation of a carbamate with the epsilon-amino group of a conserved lysine residue. This carbamate is stabilised by a magnesium ion. One of the ligands of the magnesium ion is an aspartic acid residue close to the active site lysine PUBMED:1969412.

    \ ' '4330' 'IPR017444' '\ Ribulose bisphosphate carboxylase (RuBisCO) PUBMED:6351728, PUBMED:12221984 catalyses the initial step in Calvin\'s reductive pentose phosphate cycle in plants as well as purple and green bacteria. It consists of a large catalytic unit and a small subunit of undetermined function. In plants, the large subunit is coded by the chloroplast genome while the small subunit is encoded in the nuclear genome. Molecular activation of RuBisCO by CO2 involves the formation of a carbamate with the epsilon-amino group of a conserved lysine residue. This carbamate is stabilised by a magnesium ion. One of the ligands of the magnesium ion is an aspartic acid residue close to the active site lysine PUBMED:1969412.\ ' '4331' 'IPR000894' '\ RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) is a bifunctional enzyme that catalyses \ both the carboxylation and oxygenation of ribulose-1,5-bisphosphate (RuBP) PUBMED:, thus \ fixing carbon dioxide as the first step of the Calvin cycle. RuBisCO is the major protein in the \ stroma of chloroplasts, and in higher plants exists as a complex of 8 large and 8 small subunits. \ The function of the small subunit is unknown PUBMED:3012537. While the large subunit is coded for by \ a single gene, the small subunit is coded for by several different genes, which are distributed in a \ tissue specific manner. They are transcriptionally regulated by light receptor phytochrome PUBMED:3010233, \ which results in RuBisCO being more abundant during the day when it is required.\ ' '4332' 'IPR003251' '\

    Rubrerythrin (Rr), found in anaerobic sulphate-reducing bacteria PUBMED:7830612, is a fusion protein containing an N-terminal diiron-binding\ domain and a C-terminal domain homologous to rubredoxin PUBMED:1657933. The physiological role of Rr has not been identified.

    \ \

    The 3-D structure of Desulphovibrio vulgaris rubrerythrin has been solved PUBMED:8646540. The structure reveals a tetramer of two-domain\ subunits. In each monomer, the N-terminal 146 residues form a four-alpha-helix bundle containing the diiron-oxo site (centre I), and the C-terminal 45 residues form a rubredoxin-like FeS4 domain.

    \ ' '4333' 'IPR013849' '\

    In prokaryotes, RuvA, RuvB, and RuvC process the universal DNA intermediate of homologous recombination, termed Holliday junction. The tetrameric DNA helicase RuvA specifically binds to the Holliday junction and facilitates the isomerization of the junction from the stacked folded configuration to the square-planar structure PUBMED:12408833. In the RuvA tetramer, each subunit consists of three domains, I, II and III, where I and II form the major core that is responsible for Holliday junction binding and base pair rearrangements of Holliday junction executed at the crossover point, whereas domain III regulates branch migration through direct contact with RuvB.

    \ \

    This entry represents domain I of RuvA, which has an OB-fold structure. This domain forms the RuvA tetramer contacts PUBMED:8832889.

    \ ' '4334' 'IPR002176' '\

    The Escherichia coli ruvC gene is involved in DNA repair and in the late step of RecE and RecF pathway recombination PUBMED:1661673. RuvC protein () cleaves cruciform junctions, which are formed by the extrusion of inverted repeat sequences from a super-coiled plasmid and which are structurally analogous to Holliday junctions, by introducing nicks into strands with the same polarity. The nicks leave a 5\'terminal phosphate and a 3\'terminal hydroxyl group which are ligated by E. coli or Bacteriophage T4 DNA ligases. Analysis of the cleavage sites suggests that DNA topology rather than a particular sequence determines the cleavage site. RuvC protein also cleaves Holliday junctions that are formed between gapped circular and linear duplex DNA by the function of RecA protein. The active form of RuvC protein is a dimer. This is mechanistically suited for an endonuclease involved in swapping DNA strands at the crossover junctions. It is inferred that RuvC protein is an endonuclease that resolves Holliday structures in vivo PUBMED:1661673.

    \

    RuvC is a small protein of about 20 kD. It requires and binds a magnesium ion. The structure of E. coli RuvC is a 3-layer alpha-beta sandwich containing a 5-stranded beta-sheet sandwiched between 5 alpha-helices PUBMED:8057369.

    \ ' '4335' 'IPR003035' '\ This domain is named RWP-RK after a conserved motif at the C terminus of the domain. The domain is found\ in algal minus dominance proteins as well as plant proteins involved in nitrogen-controlled development PUBMED:10647012.\ ' '4336' 'IPR000699' '\

    Ryanodine and Inositol 1,4,5-trisphosphate (IP3) receptors are intracellular Ca2+-release channels. They become activated upon binding of their respective ligands, Ca2+ and IP3, opening an integral Ca2+ channel. Ryanodine receptor activation is a key component of muscular contraction, their activation allowing release of Ca2+ from the sarcoplasmic reticulum. Mutations in the ryanodine receptor lead to malignant hyperthermia susceptibility and central core disease of muscle.

    \ ' '4337' 'IPR003032' '\ This domain is called RyR for Ryanodine receptor PUBMED:10664581. The domain is found in four copies in the ryanodine receptor. The function of this domain is unknown.\ ' '4338' 'IPR003029' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    The S1 domain was originally identified in ribosomal protein S1 but is found in a large number of RNA-associated proteins. The structure of the S1 RNA-binding domain from the Escherichia coli polynucleotide phosphorylase has been determined using NMR methods and consists of a five-stranded antiparallel beta barrel. Conserved residues on one face of the barrel and adjacent loops form the putative RNA-binding site PUBMED:9008164.

    \

    The structure of the S1 domain is very similar to that of cold shock proteins. This suggests that they may both be derived from an ancient nucleic acid-binding protein PUBMED:9008164.

    \

    More information about these proteins can be found at Protein of the Month: RNA Exosomes PUBMED:.

    \ ' '4339' 'IPR002133' '\

    S-adenosylmethionine synthetase (MAT, ) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP PUBMED:1696256. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.

    \

    In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family.

    \

    The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits,\ resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the\ structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex,\ and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance PUBMED:1213535.

    \ ' '4340' 'IPR002133' '\

    S-adenosylmethionine synthetase (MAT, ) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP PUBMED:1696256. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.

    \

    In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family.

    \

    The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits,\ resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the\ structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex,\ and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance PUBMED:1213535.

    \ ' '4341' 'IPR002133' '\

    S-adenosylmethionine synthetase (MAT, ) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP PUBMED:1696256. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.

    \

    In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family.

    \

    The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits,\ resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the\ structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex,\ and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance PUBMED:1213535.

    \ ' '4342' 'IPR003726' '\ S-methylmethionine: homocysteine methyltransferase from Escherichia coli accepts selenohomocysteine as a substrate. S-methylmethionine is an abundant plant product that can be utilised for methionine biosynthesis PUBMED:9882684. Human methionine synthase (5-methyltetrahydrofolate:L-homocysteine\ S-transmethylase; ) shares 53 and 63% identity with the E. coli and the presumptive Caenorhabditis elegans proteins, respectively, and contains all residues implicated in B12 binding to the E. coli protein PUBMED:9013615. Betaine--homocysteine S-methyltransferase () converts betaine and homocysteine to dimethylglycine and methionine, respectively. This reaction is also required for the irreversible oxidation of choline PUBMED:8798461.\ ' '4343' 'IPR006779' '\

    S1FA is an unusual small plant peptide of only 70 amino acids with a basic\ domain which contains a nuclear localization signal and a putative DNA binding helix. S1FA is highly conserved\ between dicotyledonous and monocotyledonous plants and may be a DNA-binding protein that specifically recognises the negative promoter element S1F PUBMED:7739894.

    \ ' '4344' 'IPR006380' '\

    This family of sequences represents sucrose phosphate phosphohydrolase (SPP) from plants and cyanobacteria PUBMED:11050182. SPP is a member of the Class IIB subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. SPP catalyzes the final step in the biosynthesis of sucrose, a critically important molecule for plants. Sucrose phosphate synthase (SPS), which catalyzes the prior step in the biosynthesis of sucrose, contains a domain which exhibits considerable similarity to SPP, albeit without conservation of the catalytic residues. The catalytic machinery of the synthase resides in another domain. It seems likely that the phosphatase-like domain is involved in substrate binding, possibly binding both substrates in a "product-like" orientation prior to ligation by the synthase catalytic domain.

    \ ' '4345' 'IPR013787' '\

    The calcium-binding domain found in S100 and CaBP-9k proteins is a subfamily of the EF-hand calcium-binding domain PUBMED:15284904. S100s are small dimeric acidic calcium and zinc-binding proteins abundant in the brain, with S100B playing an important role in modulating the proliferation and differentiation of neurons and glia cells PUBMED:15006498. S100 proteins have two different types of calcium-binding sites: a low affinity one with a special structure, and a \'normal\' EF-hand type high-affinity site.

    \

    Calbindin-D9k (CaBP-9k) also belongs to this family of proteins, but it does not form dimers. CaBP-9k is a cytosolic protein expressed in a variety of tissues. Although its precise function is unknown, it appears to be under the control of the steroid hormones oestrogen and progesterone in the female reproductive system PUBMED:16288660. In the intestine, CaBP-9k may be involved in calcium absorption by mediating intracellular diffusion PUBMED:12520541.

    \

    This entry represents a subdomain of the calcium-binding domain found in S100, CaBP-9k, and related proteins.

    \ \ \ ' '4346' 'IPR006454' '\

    These sequences represent one of several families of proteins associated with the formation of prokaryotic S-layers. Members of this family are found in archaeal species, including Pyrococcus horikoshii (split into two tandem reading frames), Methanocaldococcus jannaschii (Methanococcus jannaschii), and related species. Some local similarity can be found to other S-layer protein families.

    \ ' '4347' 'IPR006454' '\

    These sequences represent one of several families of proteins associated with the formation of prokaryotic S-layers. Members of this family are found in archaeal species, including Pyrococcus horikoshii (split into two tandem reading frames), Methanocaldococcus jannaschii (Methanococcus jannaschii), and related species. Some local similarity can be found to other S-layer protein families.

    \ ' '4348' 'IPR000858' '\

    In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles PUBMED:7672580. Most of the proteins within this family contain an apple-like domain (), which is predicted to possess protein- and/or carbohydrate-binding functions.

    \ ' '4349' 'IPR001673' '\

    Several Dictyostelium species have proteins that contain conserved repeats. These proteins have been variously described as \'extracellular matrix protein B\', \'cyclic nucleotide phosphodiesterase inhibitor precursor\', \'prestalk protein precursor\', \'putative calmodulin-binding protein CamBP64\', and \'cysteine-rich, acidic integral membrane protein precursor\' as well as \'hypothetical protein\'. The repeats are not confined to Dictyostelium spp.; they also occur in one of the conidiospore surface proteins of the Ascomycete Trichoderma harzianum (Hypocrea lixii).

    \ ' '4350' 'IPR000096' '\ The serum amyloid A (SAA) proteins comprise a family of vertebrate proteins\ that associate predominantly with high density lipoproteins (HDL) PUBMED:7504491, PUBMED:8188253. The\ synthesis of certain members of the family is greatly increased (as much as a\ 1000 fold) in inflammation; thus making SAA a major acute phase reactant.\ While the major physiological function of SAA is unclear, prolonged elevation\ of plasma SAA levels, as in chronic inflammation, however, results in a\ pathological condition, called amyloidosis, which affects the liver, kidney\ and spleen and which is characterised by the highly insoluble accumulation of\ SAA in these tissues.\ SAA are proteins of about 110 amino acid residues. The most highly conserved \ region is located in the central part of the sequence.\ ' '4351' 'IPR007545' '\ Lysine-oxoglutarate reductase/Saccharopine dehydrogenase (LOR/SDH) is a bifunctional enzyme. This conserved region is commonly found immediately N-terminal to saccharopine dehydrogenase conserved region () in eukaryotes PUBMED:9426595, PUBMED:9654071.\ ' '4352' 'IPR007226' '\ Toxoplasma gondii is a persistent protozoan parasite capable of infecting almost any warm-blooded vertebrate. The surface of T. gondii is coated with a family of developmentally regulated glycosylphosphatidylinositol (GPI)-linked proteins (SRSs), of which SAG1 is the prototypic member. SRS proteins mediate attachment to host cells and interface with the host immune response to regulate the virulence of the parasite. The structure of the immunodominant SAG1 antigen reveals a homodimeric configuration PUBMED:9418898. This family of surface antigens is found in other apicomplexans.\ ' '4353' 'IPR001636' '\

    Phosphoribosylaminoimidazole-succinocarboxamide synthase () (SAICAR synthetase) catalyzes the seventh step in the de novo purine biosynthetic pathway; the ATP-dependent conversion of 5\'-phosphoribosyl-5-aminoimidazole-4-carboxylic acid and aspartic acid to SAICAR PUBMED:1574589.

    \

    In bacteria (purC), fungi (ADE1) and plants (Pur7), SAICAR synthetase is a monofunctional protein; in animals it is the N-terminal domain of a bifunctional enzyme that also has phosphoribosylaminoimidazole carboxylase (AIRC) activity (see ).

    \ \ ' '4354' 'IPR003284' '\ Salmonella typhimurium contains a 90kb plasmid that is associated with\ virulence PUBMED:2164511. This plasmid encodes at least 6 genes needed by the bacterium for invading host macrophages during infection. These include the\ 70kDa mkaA protein PUBMED:1657882, a recognised virulence factor, and more recently described, four spv genes under the control of a regulator PUBMED:8483415. The spv genes are induced under carbon-poor conditions at a stationary phase of growth, and their expression is under the control of both the spvR regulator, and the katF locus in Salmonella. It has been proposed that individual spv proteins may be required at different time points during \ infection PUBMED:9234805.\

    SpvB is a 65kDa protein that has been localised to the bacterial cytoplasm \ PUBMED:9234805. Its expression peaks during early stationary phase, but declines as the latent phase of the infection is reached, suggesting a role in initiating virulence.

    \ ' '4355' 'IPR018003' '\

    This entry includes insecticidal toxin complex proteins (TcaA, TccA, TcbA, TcdA) from Photorhabdus luminescens subsp. laumondii and Xenorhabdus nematophilus (Achromobacter nematophilus) PUBMED:16509446, and virulence proteins from Salmonella typhimurium that are encoded on a 90kb plasmid.

    \

    P. luminescens and X. nematophilus are Gram-negative bacteria that form entomopathogenic symbioses with soil nematodes. The bacteria are found in the gut of entomopathogenic nematodes that invade and kill insects. When the nematode invades an insect host, the bacteria are released into the insect haemocoel (the open circulatory system); both the bacteria and the nematode then undergo multiple rounds of replication, which kills the insect host. Mapping of the insecticidal toxin loci and studies on knockout mutants in P. luminescens showed that deletion of either tca or tcd loci dramatically reduced toxicity, while the double mutant tca/tcd abolished toxicity PUBMED:10383860. However, the biology of toxin action is unclear, as is the species range of insects against which the toxins are active.

    \

    S. typhimurium contains a 90kb plasmid that is associated with\ virulence. This plasmid encodes at least 6 genes needed by the\ bacterium for invading host macrophages during infection. These include\ the 70kDa mkaA protein PUBMED:2164511, a recognised virulence factor, and more recently described, four spv genes under the control of a regulator PUBMED:8483415.

    \

    Deletion studies on the virulence plasmid have shown that an open reading \ frame encoding a 28kDa protein was needed for successful invasion of the \ host. This protein, designated mkfA PUBMED:2164511, VRP4 PUBMED:2696057 or VirA PUBMED:1657882 by different\ groups, is utilised by the microbe upon entry into macrophages, although the \ exact mechanism is unclear.

    \ ' '4356' 'IPR002657' '\

    This family of proteins is found in both prokaryotes and eukaryotes. They are related to the human bile acid:sodium symporters, which are transmembrane proteins functioning in the liver in the uptake of bile acids from portal blood plasma, a process mediated by the co-transport of Na+ PUBMED:1961729.

    \

    In yeast, overexpression of the ACR3 gene confers an arsenite- but not an arsenate-resistance phenotype PUBMED:9234670.

    \ ' '4357' 'IPR003519' '\

    Salmonella typhimurium contains a 90kb plasmid that is associated with\ virulence. This plasmid encodes at least 6 genes needed by the \ bacterium for invading host macrophages during infection. These include \ the 70kDa mkaA protein PUBMED:2164511, a recognised virulence factor.

    \

    Deletion studies into the virulence plasmid have shown that an open reading\ frame encoding a 28kDa protein was needed for successful invasion of the \ host. This protein, designated mkfA PUBMED:2164511, VRP4 PUBMED:2696057 or VirA PUBMED:1657882 by different\ groups, is utilised by the microbe upon entry into macrophages, although the\ exact mechanism is unclear.

    \ ' '4358' 'IPR001985' '\

    S-adenosylmethionine decarboxylase (AdoMetDC) PUBMED:10378277 catalyzes the removal of the carboxylate group of S-adenosylmethionine to form S-adenosyl-5\'-3-methylpropylamine which then acts as the n-propylamine group donor in the synthesis of the polyamines spermidine and spermine from putrescine.

    \

    The catalytic mechanism of AdoMetDC involves a covalently-bound pyruvoyl group. This group is post-translationally generated by a self-catalyzed intramolecular proteolytic cleavage reaction between a glutamate and a serine. This cleavage generates two chains, beta (N-terminal) and alpha (C-terminal). The N-terminal serine residue of the alpha chain is then converted by nonhydrolytic serinolysis into a pyruvoyl group.

    \ ' '4359' 'IPR003119' '\

    Saposins are small lysosomal proteins that serve as activators of various\ lysosomal lipid-degrading enzymes PUBMED:7595087. They probably act by isolating the\ lipid substrate from the membrane surroundings, thus making it more \ accessible to the soluble degradative enzymes. All mammalian saposins\ are synthesized as a single precursor molecule (prosaposin) which contains\ four Saposin-B domains, yielding the active saposins after proteolytic\ cleavage, and two Saposin-A domains that are removed in the activation\ reaction. \ The Saposin-B domains also occur in other \ proteins, many of them active in the lysis of membranes PUBMED:8003971, PUBMED:8868085. The saposin A-type domain may play a role in targeting, as propeptides containing the saposin A-type domain of the C-terminus of prosaposin and of the N-terminal part of pulmonary surfactant-associated protein B are involved in the transport to the lysosome and to secretory granules (lamellar bodies, which are lysosomal-like organelles), respectively PUBMED:8702672.

    \ ' '4360' 'IPR004297' '\ Members of this family are found in Solanaceae spp. plants, a taxonomic group (family) that includes pepper and tobacco\ plant species. Synthesis of these proteins is induced by Tobacco mosaic virus and salicylic acid PUBMED:1477404; indeed they\ are thought to be involved in the development of systemic acquired resistance (SAR) after an initial hypersensitive\ response to microbial infection PUBMED:1477404, PUBMED:10888849. SAR is characterised by long-lasting resistance to infection by a wide range of\ pathogens, extending to plant tissues distant from the initial infection site PUBMED:10888849.\ ' '4361' 'IPR006875' '\ The dystrophin glycoprotein complex (DGC) is a membrane-spanning complex that links the interior cytoskeleton to the extracellular matrix in muscle. The sarcoglycan complex is a subcomplex within the DGC and is composed of several muscle-specific, transmembrane proteins (alpha-, beta-, gamma-, delta- and zeta-sarcoglycan). The sarcoglycans are asparagine-linked glycosylated proteins with single transmembrane domains. This family contains beta, gamma and delta members PUBMED:12107060, PUBMED:12189167.\ ' '4362' 'IPR005011' '\ This family of proteins appears to contain a leucine zipper PUBMED:10887110 and may therefore be a family of transcription factors.\ ' '4363' 'IPR001448' '\

    Small, acid-soluble spore proteins (SASP or ASSP) are proteins bound to the spore DNA of bacteria of the genera Bacillus, Thermoactinomyces, and\ Clostridium PUBMED:3059997, PUBMED:1569005. They are double-stranded DNA-binding\ proteins that cause DNA to change to an A-like conformation. They protect the\ DNA backbone from chemical and enzymatic cleavage and are thus involved in\ the dormant spore\'s high resistance to UV light. SASP are degraded in the first\ minutes of spore germination and provide amino acids for both new protein\ synthesis and metabolism.

    \

    There are two distinct families of SASP: the alpha/beta type and the gamma-\ type. Alpha/beta SASP are small proteins of about sixty to seventy amino acid\ residues that are generally coded by a multigene family. The N terminus of\ alpha/beta SASP contains the site which is cleaved by a SASP-\ specific protease that acts during germination, while the C terminus is probably involved in DNA binding.

    \ ' '4364' 'IPR006341' '\

    This is a family of small, glutamine- and asparagine-rich peptides that store amino acids in the spores of Bacillus subtilis and related bacteria. Most members of the family have two copies of the spore protease (GPR) cleavage motif, typically EFASE in this family, separating three low-complexity repeats.

    \ ' '4365' 'IPR005597' '\

    The monomer of the Satellite tobacco mosaic virus (STMV) protein contains a "jelly-roll" motif. The narrow end of the jelly roll forms fivefold contacts about a Ca2+ ion. Electron density maps suggest that double-helical RNA segments are associated with each coat protein dimer PUBMED:8553559.

    \ ' '4366' 'IPR004333' '\ The SBP plant protein domain is a sequence\ specific DNA-binding domain PUBMED:8569690. Proteins with this domain probably function as transcription factors involved in the control of\ early flower development. The domain contains 10 conserved cysteine and histidine residues that probably are zinc\ ligands.\ ' '4367' 'IPR000914' '\

    Bacterial high affinity transport systems are involved in active transport of solutes across the cytoplasmic membrane. The protein components of these traffic systems include one or two transmembrane protein components, one or two membrane-associated ATP-binding proteins and a high affinity periplasmic \ solute-binding protein. The latter are thought to bind the substrate in the vicinity of the inner membrane, and to transfer it to a complex of inner membrane proteins for concentration into the cytoplasm. In Gram-positive bacteria, which are surrounded by a single membrane and therefore have no periplasmic region, \ the equivalent proteins are bound to the membrane via an N-terminal lipid anchor. These homologous proteins do not play an integral role in the transport process per se, but probably serve as receptors to trigger or initiate translocation of the solute through the membrane by binding to external sites of the integral membrane proteins of the efflux system. In addition, at least some solute-binding proteins function in the initiation of sensory transduction pathways. On the basis of sequence similarities, the vast majority of these solute-binding proteins can be grouped PUBMED:8336670 into eight families or clusters, which generally correlate with the nature of the solute bound. Family 5 currently includes periplasmic oligopeptide-binding proteins (oppA) of Gram-negative bacteria and homologous lipoproteins in Gram-positive bacteria (oppA, amiA or appA); periplasmic dipeptide-binding proteins of Escherichia coli (dppA) and Bacillus subtilis (dppE); periplasmic murein peptide-binding protein of E. coli (mppA); periplasmic peptide-binding proteins sapA of E. coli, Salmonella typhimurium and Haemophilus influenzae; periplasmic nickel-binding protein (nikA) of E. coli;\ haem-binding lipoprotein (hbpA or dppA) from H. influenzae; lipoprotein xP55 from Streptomyces lividans; and hypothetical proteins from H. influenzae (HI0213) and Rhizobium sp. (strain NGR234) symbiotic plasmid (y4tO and y4wM).

    \ ' '4368' 'IPR018389' '\

    This family of proteins is involved in binding extracellular solutes for transport across the bacterial cytoplasmic membrane. This family includes a C4-dicarboxylate-binding protein DctP PUBMED:16262798, PUBMED:1809844 and the sialic acid-binding protein SiaP. The structure of the SiaP receptor has revealed an overall topology similar to ATP binding cassette ESR (extracytoplasmic solute receptors) proteins PUBMED:16702222. Upon binding of sialic acid, SiaP undergoes domain closure about a hinge region and kinking of an alpha-helix hinge component PUBMED:16702222.

    \ ' '4369' 'IPR006127' '\

    This is a family of ABC transporter metal-binding lipoproteins. An example is the periplasmic zinc-binding protein TroA that interacts with an ATP-binding cassette transport system in Treponema pallidum and plays a role in the transport of zinc across the cytoplasmic membrane. Related proteins are found in both Gram-positive and Gram-negative bacteria.

    \ ' '4370' 'IPR007500' '\

    This is a domain of unknown function found at the N-terminus of proteins involved in cell wall development and nitric oxide protection.

    \ \

    ScdA is required for normal cell growth and development; mutants have an increased level of peptidoglycan cross-linking and aberrant cellular morphology, suggesting a role for ScdA in cell wall metabolism PUBMED:9308171.

    \ \

    NorA1, NorA2, and YtfE are involved in the nitric oxide response. NorA1 and NorA2, which are similar to YtfE, are co-transcribed with the membrane-bound nitric oxide (NO) reductases. The genes appear to be involved in NO protection but their function is unknown PUBMED:11069685, PUBMED:15546870.\

    \ ' '4371' 'IPR006160' '\ Members of this family may be short chain fatty acid transporters although there has been no experimental characterisation of this function.\ ' '4372' 'IPR007575' '\ Members of this entry have only been identified in species of the Streptomyces genus. Two family members are known to be part of gene clusters involved in the synthesis of polyketide-based spore pigments, homologous to clusters involved in the synthesis of polyketide antibiotics. The function of this protein is unknown, but it has been speculated to contain a NAD(P) binding site PUBMED:8344517.\ ' '4373' 'IPR003782' '\

    This family is involved in biogenesis of respiratory and photosynthetic systems. In yeast the SCO1 protein is specifically required for a post-translational step in the accumulation of subunits 1 and 2 of cytochrome c oxidase (COX-I and COX-II) PUBMED:1944230. It is a mitochondrion-associated cytochrome c oxidase assembly factor.

    \

    The purple nonsulphur photosynthetic eubacterium Rhodobacter capsulatus is a versatile organism that can obtain cellular energy by several means, including the capture of light energy for photosynthesis as well as the use of light-independent respiration, in which molecular oxygen serves as a terminal electron acceptor. The SenC protein is required for optimal cytochrome c oxidase activity in aerobically grown R. capsulatus cells and is involved in the induction of structural polypeptides of the light-harvesting and reaction centre complexes PUBMED:7592491.

    \ ' '4374' 'IPR003033' '\

    This domain is involved in binding sterols. The human sterol carrier protein 2 (SCP2) is a basic protein that is believed to participate in the intracellular transport of cholesterol and various other lipids PUBMED:8243660. The unc-24 protein of Caenorhabditis elegans contains a domain similar to part of two ion channel regulators (the erythrocyte integral membrane protein stomatin and the C. elegans neuronal protein MEC-2) juxtaposed to a domain similar to nonspecific lipid transfer protein (nsLTP; also called sterol carrier protein 2) PUBMED:8667025.

    \ \ ' '4375' 'IPR001991' '\

    It has been shown PUBMED:8031825 that integral membrane proteins that mediate the uptake\ of a wide variety of molecules with the concomitant uptake of sodium ions\ (sodium symporters) can be grouped, on the basis of sequence and functional\ similarities, into a number of distinct families. One of these families PUBMED:1279699 is\ known as the sodium:dicarboxylate symporter family (SDF).

    \ \

    Re-uptake of neurotransmitters from the synapse is thought to be an important mechanism for terminating their action, by removing these chemicals from the synaptic cleft and transporting them into presynaptic nerve terminals and surrounding neuroglia. This removal is also believed to prevent them from accumulating to neurotoxic levels PUBMED:1448170, PUBMED:1280334.

    \ \

    The structure of these transporter proteins has been variously reported to\ contain from 8 to 10 transmembrane (TM) regions, although 10 now seems to\ be the accepted value.

    \ \

    Members of the family include several mammalian excitatory amino acid transporters and a number of bacterial transporters. They vary with regard to their dependence on transport of sodium and other ions.

    \ ' '4376' 'IPR004235' '\

    Scytalone dehydratase is a member of the group of enzymes involved in fungal melanin biosynthesis. It was first identified in a phytopathogenic fungus, Magnaporthe grisea (Rice blast fungus), which causes rice blast disease. Scytalone dehydratase is a molecular target of inhibitor design efforts aimed at protecting rice plants from fungal disease PUBMED:9922139, PUBMED:14716498.

    \ ' '4377' 'IPR005130' '\

    L-serine dehydratase is found as a heterodimer of alpha and beta chains or as a fusion of the two chains in a single protein. This enzyme catalyses the deamination of serine to form pyruvate and is part of the gluconeogenesis pathway.

    \ ' '4378' 'IPR005131' '\ L-serine dehydratase is found as a heterodimer of alpha and beta chains or as a fusion of the two chains in a single protein. This enzyme catalyses the deamination of serine\ to form pyruvate and is part of the gluconeogenesis pathway.\ ' '4379' 'IPR000701' '\

    Succinate dehydrogenase (SDH) is a membrane-bound complex of two main components: a membrane-extrinsic component composed of an FAD-binding flavoprotein and an iron-sulphur protein, and a hydrophobic component composed of a cytochrome b and a membrane anchor protein.

    \ \

    The cytochrome b component is a mono-haem transmembrane protein PUBMED:1447196, PUBMED:8152421, PUBMED:7616569 belonging to a family that includes:

    \ \

    \ \

    These cytochromes are proteins of about 130 residues that comprise three transmembrane regions. There are two conserved histidines which may be involved in binding the haem group.

    \ ' '4380' 'IPR004027' '\ The SEC-C motif is found at the C-terminus of the SecA protein, in the middle of some SWI2 ATPases and also solo in several proteins. The motif is predicted to chelate zinc with the CXC and C[HC] pairs that constitute the most conserved feature of the motif. It is predicted to be a potential nucleic acid binding domain.\ ' '4381' 'IPR001619' '\

    Sec1-like molecules have been implicated in a variety of eukaryotic\ vesicle transport processes including neurotransmitter release by exocytosis PUBMED:8769846. They regulate\ vesicle transport by binding to a t-SNARE from the syntaxin family. This binding is thought\ to prevent formation of the SNARE complex, a protein complex required for membrane fusion.\ Whereas Sec1 molecules are essential for neurotransmitter release and other secretory\ events, their interaction with syntaxin molecules seems to represent a negative regulatory\ step in secretion PUBMED:10903948.

    \ ' '4382' 'IPR005606' '\

    Sec20 is a membrane glycoprotein associated with the secretory pathway.

    \ ' '4383' 'IPR005609' '\

    This family consists of homologues of Sec61beta - a component of the Sec61/SecYEG protein secretory system. The domain is found in eukaryotes and archaea and is possibly homologous to the bacterial SecG.

    \ ' '4384' 'IPR004728' '\ Members of the NSCC2 family have been sequenced from various yeast, fungal and animal species including Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. These proteins are the Sec62 proteins, believed to be associated with the Sec61 and Sec63 constituents of the general protein secretory systems of yeast microsomes. They are also the non-selective cation (NS) channels of the mammalian cytoplasmic membrane. The yeast Sec62 protein has been shown to be essential for cell growth. The mammalian NS channel proteins have been implicated in platelet-derived growth factor (PDGF)-dependent single channel current in fibroblasts. These channels are essentially closed in serum-deprived tissue-culture cells and are specifically opened by exposure to PDGF. These channels are reported to exhibit equal selectivity for Na+, K+ and Cs+ with low permeability to Ca2+, and no permeability to anions.\ ' '4385' 'IPR000904' '\ The SEC7 domain was named after the first protein found to contain such a region PUBMED:3042778. \ It has been shown to be linked with guanine nucleotide exchange function PUBMED:9072969, PUBMED:9442017.\ The 3D structure of the domain displays several alpha-helices PUBMED:9653114. It was found to be \ associated with other domains involved in guanine nucleotide exchange (e.g., CDC25, Dbl) in mammalian \ factors PUBMED:9868368.\ ' '4386' 'IPR011130' '\

    The SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. This domain has been found to cross-link to preproteins, which is thought to indicate a role in preprotein binding. The pre-protein cross-linking domain is comprised of two sub domains that are inserted within the ATPase domain PUBMED:12242434.

    \ ' '4387' 'IPR003708' '\

    Secretion across the inner membrane in some Gram-negative bacteria occurs\ via the preprotein translocase pathway. Proteins are produced in the \ cytoplasm as precursors, and require a chaperone subunit to direct them to \ the translocase component PUBMED:2202721. From there, the mature proteins are either \ targeted to the outer membrane, or remain as periplasmic proteins. The \ translocase protein subunits are encoded on the bacterial chromosome.

    \

    \ The translocase itself comprises 7 proteins, including a chaperone protein\ (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and \ SecG), and two additional membrane proteins that promote the release of \ the mature peptide into the periplasm (SecD and SecF) PUBMED:2202721. The chaperone \ protein SecB PUBMED:11336818 is a highly acidic homotetrameric protein that exists\ as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains \ preproteins in an unfolded state after translation, and targets these to \ the peripheral membrane protein ATPase SecA for secretion PUBMED:10418149.

    \

    \ Recently, the tertiary structure of Haemophilus influenzae SecB was resolved\ by means of X-ray crystallography to 2.5A PUBMED:11101901. The chaperone comprises four\ chains, forming a tetramer, each chain of which has a simple alpha+beta fold\ arrangement. While one binding site on the homotetramer recognises unfolded\ polypeptides by hydrophobic interactions, the second binds to SecA through\ the latter\'s C-terminal 22 residues.

    \ ' '4388' 'IPR003335' '\

    Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase\ pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to\ the translocase component PUBMED:2202721. From there, the mature proteins are either targeted to the outer\ membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial\ chromosome.

    \

    \ The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral\ membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of\ the mature peptide into the periplasm (SecD and SecF) PUBMED:2202721. The chaperone protein SecB PUBMED:11336818 is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm.\ SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane\ protein ATPase SecA for secretion PUBMED:10418149. Together with SecY and SecG, SecE forms a multimeric\ channel through which preproteins are translocated, using both proton motive forces and ATP-driven secretion. The\ latter is mediated by SecA. The structure of the\ Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic\ domains PUBMED:12167867. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15\ transmembrane helices. \

    \

    This family consists of various prokaryotic SecD and SecF protein export membrane proteins. The SecD and SecF equivalents of the\ Gram-positive bacterium Bacillus subtilis are jointly present in one polypeptide,\ denoted SecDF, that is required to maintain a high capacity for protein secretion.\ Unlike the SecD subunit of the pre-protein translocase of E. coli, SecDF\ of B. subtilis was not required for the release of a mature secretory protein from\ the membrane, indicating that SecDF is involved in earlier translocation steps PUBMED:9694879.\ Comparison with SecD and\ SecF proteins from other organisms revealed the presence of 10 conserved\ regions in SecDF, some of which appear to be important for SecDF function.\ Interestingly, the SecDF protein of B. subtilis has 12 putative transmembrane\ domains. Thus, SecDF does not only show sequence similarity but also structural\ similarity to secondary solute transporters PUBMED:9694879.

    \ ' '4389' 'IPR001901' '\

    Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase\ pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to\ the translocase component PUBMED:2202721. From there, the mature proteins are either targeted to the outer\ membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial\ chromosome.\

    \

    The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral\ membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of\ the mature peptide into the periplasm (SecD and SecF) PUBMED:2202721. The chaperone protein SecB PUBMED:11336818 is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm.\ SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane\ protein ATPase SecA for secretion PUBMED:10418149. SecE, part of the main \ SecYEG translocase complex, is ~106 residues in length, and spans the \ inner membrane of the Gram-negative bacterial envelope. Together with\ SecY and SecG, SecE forms a multimeric channel through which preproteins\ are translocated, using both proton motive forces and ATP-driven secretion. The latter is mediated by SecA.

    \ \

    In eukaryotes, the evolutionarily related protein sec61-gamma plays a role in protein translocation through the endoplasmic reticulum; it is part of a trimeric complex that also consists of sec61-alpha and beta PUBMED:8107851. Both secE and sec61-gamma are small proteins of about 60 to 90 amino acids that contain a single transmembrane region at their C-terminal extremity (Escherichia coli secE is an exception, in that it possesses an extra N-terminal segment of 60 residues that contains two additional transmembrane domains) PUBMED:9393849.

    \ ' '4390' 'IPR004692' '\

    Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase\ pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to\ the translocase component PUBMED:2202721. From there, the mature proteins are either targeted to the outer\ membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial\ chromosome.\

    \

    The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral\ membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of\ the mature peptide into the periplasm (SecD and SecF) PUBMED:2202721. The chaperone protein SecB PUBMED:11336818 is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm.\ SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane\ protein ATPase SecA for secretion PUBMED:10418149. Together with\ SecY and SecG, SecE forms a multimeric channel through which preproteins\ are translocated, using both proton motive forces and ATP-driven secretion. The latter is mediated by SecA.

    \ \

    SecG has two transmembrane\ domains, both of which contribute to the recognition of preprotein signal\ sequences by the translocation complex PUBMED:7650029. The protein also undergoes\ membrane topology inversion when coupled to the SecA cycle PUBMED:11445571.

    \ \ ' '4391' 'IPR006940' '\

    Securin, also known as pituitary tumour-transforming gene product, is a regulatory protein which plays a central role in chromosome stability, in the p53/TP53 pathway, and in DNA repair. It probably acts by blocking the action of key proteins; for example, during mitosis it blocks Separase/ESPL1 function, preventing the proteolysis of the cohesin complex and the subsequent segregation of the chromosomes. At the onset of anaphase, it is ubiquitinated, leading to its destruction and to the liberation of ESPL1. Its function is, however, not limited to an inhibitory activity, since it is required to activate ESPL1. The negative regulation of the transcriptional activity and related apoptosis activity of TP53 may explain the strong transforming capability of the protein when it is overexpressed. Over-expression of securin is associated with a number of tumours, and it has been proposed that this may be due to erroneous chromatid separation leading to chromosome gain or loss PUBMED:10411507.

    \ ' '4392' 'IPR002208' '\

    Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase\ pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to\ the translocase component PUBMED:2202721. From there, the mature proteins are either targeted to the outer\ membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial\ chromosome.\

    \

    The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral\ membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of\ the mature peptide into the periplasm (SecD and SecF) PUBMED:2202721. The chaperone protein SecB PUBMED:11336818 is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm.\ SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane\ protein ATPase SecA for secretion PUBMED:10418149. The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes\ interacting through the extensive cytoplasmic domains PUBMED:12167867. Each membrane is composed of dimers of SecYEG. The\ monomeric complex contains 15 transmembrane helices.

    \

    The eubacterial secY protein PUBMED:1406280 interacts with the signal sequences of secretory proteins as well as with two other components of the protein translocation system: secA and secE. SecY is an integral plasma membrane protein of 419 to 492 amino acid residues that apparently contains 10 transmembrane (TM), 6 cytoplasmic and 5 periplasmic regions.

    \

    Cytoplasmic regions 2 and 3, and TM domains 1, 2, 4, 5, 7 and 10 are well conserved: the conserved cytoplasmic regions are believed to interact with cytoplasmic secretion factors, while the TM domains may participate in protein export PUBMED:2110998. Homologs of secY are found in archaebacteria PUBMED:1764515. SecY is also encoded in the chloroplast genome of some algae PUBMED:1544427 where it could be involved in a prokaryotic-like protein export system across the two membranes of the chloroplast endoplasmic reticulum (CER) which is present in chromophyte and cryptophyte algae.

    \ ' '4393' 'IPR006722' '\

    Sedlin is a 140 amino-acid protein with a putative role in endoplasmic reticulum-to-Golgi transport. Several\ missense mutations and deletion mutations in the SEDL gene, which result in protein truncation by frame shift, are responsible for\ spondyloepiphyseal dysplasia tarda, a progressive skeletal disorder (OMIM:313400) PUBMED:11349230.

    \ ' '4394' 'IPR000389' '\ A number of small hydrophilic plant seed proteins are structurally related.\ These proteins contain from 83 to 153 amino acid residues and may play a role\ PUBMED:12231998, PUBMED:8492809 in equipping the seed for survival, maintaining a minimal level of\ hydration in the dry organism and preventing the denaturation of cytoplasmic\ components. They may also play a role during imbibition by controlling water\ uptake.\ ' '4395' 'IPR018319' '\ In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3-prime or 5-prime non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. \

    This family describes SelA. A close homologue of SelA is found in Helicobacter pylori, but all other required elements are missing and the protein is shorter at the N-terminus than SelA from other species. The trusted cut-off is set above the score generated for Helicobacter pylori putative SelA.

    \ ' '4397' 'IPR007672' '\ SelP is the only known eukaryotic selenoprotein that contains multiple selenocysteine (Sec) residues, and accounts for more than 50% of the selenium content of rat and human plasma PUBMED:10775431. It is thought to be glycosylated PUBMED:11168591. SelP may have antioxidant properties. It can attach to epithelial cells, and may protect vascular endothelial cells against peroxynitrite toxicity PUBMED:10775431. The high selenium content of SelP suggests that it may be involved in selenium intercellular transport or storage PUBMED:11168591. The promoter structure of bovine SelP suggests that it may be involved in countering heavy metal intoxication, and may also have a developmental function PUBMED:9358058. The N-terminal region always contains one Sec residue, and this is separated from the C-terminal region (9-16 sec residues) by a histidine-rich sequence PUBMED:11168591. The large number of Sec residues in the C-terminal portion of SelP suggests that it may be involved in selenium transport or storage. However, it is also possible that this region has a redox function PUBMED:11168591.\ ' '4398' 'IPR007671' '\ SelP is the only known eukaryotic selenoprotein that contains multiple selenocysteine (Sec) residues, and accounts for more than 50% of the selenium content of rat and human plasma PUBMED:10775431. It is thought to be glycosylated PUBMED:11168591. SelP may have antioxidant properties. It can attach to epithelial cells, and may protect vascular endothelial cells against peroxynitrite toxicity PUBMED:10775431. The high selenium content of SelP suggests that it may be involved in selenium intercellular transport or storage PUBMED:11168591. The promoter structure of bovine SelP suggests that it may be involved in countering heavy metal intoxication, and may also have a developmental function PUBMED:9358058. The N-terminal region of SelP can exist independently of the C-terminal region. 
Zebrafish selenoprotein Pb lacks the C-terminal Sec-rich region, and a protein encoded by the rat SelP gene and lacking this region has also been reported PUBMED:11168591. The N-terminal region contains a conserved SecxxCys motif, which is similar to the CysxxCys found in thioredoxins. It is speculated that the N-terminal region may adopt a thioredoxin fold and catalyse redox reactions PUBMED:11168591. The N-terminal region also contains a His-rich region, which is thought to mediate heparin binding. Binding to heparan proteoglycans could account for the membrane binding properties of SelP PUBMED:10775431.\ ' '4399' 'IPR002579' '\

    Peptide methionine sulphoxide reductase (Msr) reverses the inactivation of many proteins due to the oxidation of critical methionine residues by reducing methionine sulphoxide, Met(O), to methionine PUBMED:10841552. It is present in most living organisms, and the cognate structural gene belongs to the so-called minimum gene set PUBMED:8994848, PUBMED:8816789.

    \ \

    The MsrA and MsrB domains reduce different epimeric forms of methionine sulphoxide. This group represents MsrB, the crystal structure of which has been determined to 1.8 A PUBMED:11938352. The overall structure shows no resemblance to the structures of MsrA from other organisms, though the active sites show approximate mirror symmetry. In each case, conserved amino acid motifs mediate the stereo-specific recognition and reduction of the substrate. Unlike the MsrA domain, the MsrB domain activates the cysteine or selenocysteine nucleophile through a unique Cys-Arg-Asp/Glu catalytic triad. The collapse of the reaction intermediate most likely results in the formation of a sulphenic or selenenic acid moiety. Regeneration of the active site occurs through a series of thiol-disulphide exchange steps involving another active site Cys residue and thioredoxin.

    \ \

    In a number of pathogenic bacteria, including Neisseria gonorrhoeae, the MsrA and MsrB domains are fused; the MsrA being N-terminal to MsrB. This arrangement is reversed in Treponema pallidum. In N. gonorrhoeae and Neisseria meningitidis, a thioredoxin domain is fused to the N-terminus. This may function to reduce the active sites of the downstream MsrA and MsrB domains.

    \ ' '4400' 'IPR001627' '\

    The Sema domain occurs in semaphorins, which are a large family of secreted and transmembrane proteins, some of which function as repellent signals during axon guidance. Sema domains also occur in a hepatocyte growth factor receptor, in SEX protein PUBMED:9875845 and in viral proteins.

    \ \

    CD100 (also called SEMA4D) is associated with PTPase and serine kinase activity. CD100 increases PMA, CD3 and CD2 induced T cell proliferation, increases CD45 induced T cell adhesion, induces B cell homotypic adhesion and down-regulates B cell expression of CD23.

    \

    \ The Sema domain is characterised by a conserved set of cysteine residues, which form four disulphide bonds to stabilise the structure. The Sema domain fold is a variation of the beta propeller topology, with seven blades radially arranged around a central axis. Each blade contains a four-stranded (strands A to D) antiparallel beta sheet. The inner strand of each blade (A) lines the channel at the centre of the propeller, with strands B and C of the same repeat radiating outward, and strand D of the next repeat forming the outer edge of the blade. The large size of the Sema domain is not due to a single inserted domain but results from the presence of additional secondary structure elements inserted in most of the blades. The Sema domain uses a \'loop and hook\' system to close the circle between the first and the last blades. The blades are constructed sequentially with an N-terminal beta-strand closing the circle by providing the outermost strand (D) of the seventh (C-terminal) blade. The beta-propeller is further stabilised by an extension of the N-terminus, providing an additional, fifth beta-strand on the outer edge of blade 6 PUBMED:12925274, PUBMED:12958590, PUBMED:15167892.

    \

    CD molecules are leucocyte antigens on cell surfaces. CD antigen nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/). \

    \ ' '4401' 'IPR000534' '\ The semialdehyde dehydrogenase family is found in N-acetyl-glutamate semialdehyde dehydrogenase (ArgC), which is involved in arginine biosynthesis, and aspartate-semialdehyde dehydrogenase PUBMED:10369777, an enzyme involved in the biosynthesis of various amino acids from aspartate. This family is also found in yeast and fungal Arg5,6 protein, which is cleaved into the enzymes N-acetyl-gamma-glutamyl-phosphate reductase and acetylglutamate kinase. These are also involved in arginine biosynthesis. All proteins in this entry contain a NAD binding region of semialdehyde dehydrogenase.\ ' '4402' 'IPR012280' '\ This domain contains N-acetyl-glutamate semialdehyde dehydrogenase (ArgC), which is involved in arginine biosynthesis, and aspartate-semialdehyde dehydrogenase PUBMED:10369777, an enzyme involved in the biosynthesis of various amino acids from aspartate. It also contains the yeast and fungal Arg5,6 protein, which is cleaved into the enzymes N-acetyl-gamma-glutamyl-phosphate reductase and acetylglutamate kinase. These are also involved in arginine biosynthesis. All proteins in this entry contain a dimerisation domain of semialdehyde dehydrogenase.\ ' '4403' 'IPR005621' '\

    The binding of SeqA protein to hemimethylated GATC sequences is important in the negative modulation of chromosomal initiation at oriC, and in the formation of SeqA foci necessary for Escherichia coli chromosome segregation PUBMED:11457824. SeqA tetramers are able to aggregate or multimerize in a reversible, concentration-dependent manner PUBMED:11457824. Apart from its function in the control of DNA replication, SeqA may also be a specific transcription factor PUBMED:11442835. The C-terminal domain binds DNA, binding to fully methylated and hemimethylated GATC sequences at oriC. The structure of the C-terminal domain consists of seven alpha-helices and a three-stranded beta-sheet.

    \ ' '4404' 'IPR007455' '\ Serglycin is the most prevalent proteoglycan produced in haemopoietic cells. Serglycin is a proteinase resistant secretory granule proteoglycan PUBMED:2261494.\ ' '4405' 'IPR001563' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S10 (clan SC). The type example is carboxypeptidase Y from Saccharomyces cerevisiae (Baker\'s yeast) PUBMED:7845208.

    All known carboxypeptidases are either metallo carboxypeptidases or serine carboxypeptidases. The catalytic activity of the serine carboxypeptidases, like that of the trypsin family serine proteases, is provided by a charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, which is itself hydrogen-bonded to a serine PUBMED:2324088. The sequences surrounding the active site serine and histidine residues are highly conserved in all the serine carboxypeptidases.

    \ ' '4406' 'IPR000215' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain, either by interaction with a second peptidase or by autocatalytic cleavage, activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    Serpins (SERine Proteinase INhibitors) PUBMED:14705960, PUBMED:2690952, PUBMED:8417965 belong to MEROPS inhibitor family I4, clan ID. Serpins are proteins that are primarily known as irreversible serine protease inhibitors active against S1, S8 and C14 peptidases. There are both extra- and intra-cellular serpins, which are found in all groups of organisms with the notable exception of fungi PUBMED:11116082, PUBMED:12411597. In contrast to "rigid" proteinase inhibitors, such as those of the Kunitz or Kazal families, the serpins are metastable proteins (active-state proteins) which interact with their substrate and irreversibly trap the acyl intermediate as a result of a major conformational change PUBMED:11116079; they are best described as suicide substrate inhibitors. The common structure of these proteins is a multi-domain fold containing a bundle of 8 or 9 alpha helices and a beta sandwich formed by 3 beta sheets. The reactive centre loop (RCL) is found in the C-terminal part of these proteins.

    \ \

    Serpins and their homologues are a group of high molecular weight (40 to 50 kDa) structurally related proteins involved in a number of fundamental biological processes such as blood coagulation, complement activation, fibrinolysis, angiogenesis, inflammation, tumour suppression and hormone transport. All known serpins have been classified into 16 clades and 10 orphan sequences; the vertebrate serpins can be conveniently classified into six sub-groups PUBMED:11116082. In human plasma they represent approximately 2% of the total protein, of which 70% is alpha-1-antitrypsin. On the basis of strong sequence similarities, a number of proteins with no known inhibitory activity also belong to this family; these include angiotensinogen, corticosteroid-binding globulin and thyroxin-binding globulin PUBMED:12824063.

    \ ' '4407' 'IPR015866' '\

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while in class II reactions the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine and threonine belong to class II.

    \

    This entry represents the N-terminal domain of seryl-tRNA synthetase, which consists of two helices in a long alpha-hairpin. Seryl-tRNA synthetase exists as a monomer and belongs to class IIa PUBMED:7540217.

    \ ' '4410' 'IPR001477' '\ The mumps virus SH protein is a membrane protein that is not essential for virus growth PUBMED:8918542. Its function is unknown.\ ' '4411' 'IPR000980' '\

    The Src homology 2 (SH2) domain is a protein domain of about 100 amino-acid residues first identified as a conserved sequence region between the oncoproteins Src and Fps PUBMED:3025655. Similar sequences were later found in many other intracellular signal-transducing proteins PUBMED:1377638. SH2 domains function as regulatory modules of intracellular signalling cascades by interacting with high affinity with phosphotyrosine-containing target peptides in a sequence-specific and strictly phosphorylation-dependent manner; they recognise between 3 and 6 residues C-terminal to the phosphorylated tyrosine in a fashion that differs from one SH2 domain to another PUBMED:7883800, PUBMED:15335710, PUBMED:14731533, PUBMED:7531822. They are found in a wide variety of protein contexts, e.g. in association with catalytic domains of phospholipase C-gamma (PLC-gamma) and the non-receptor protein tyrosine kinases; within structural proteins such as fodrin and tensin; and in a group of small adaptor molecules, e.g. Crk and Nck. The domains are frequently found as repeats in a single protein sequence and will then often bind both mono- and di-phosphorylated substrates.

    \ \

    The structure of the SH2 domain belongs to the alpha+beta class, its overall shape forming a compact flattened hemisphere. The core structural elements comprise a central hydrophobic anti-parallel beta-sheet, flanked by 2 short alpha-helices. The loop between strands 2 and 3 provides many of the binding interactions with the phosphate group of its phosphopeptide ligand, and is hence designated the phosphate binding loop. The phosphorylated ligand binds perpendicular to the beta-sheet and typically interacts with the phosphate binding loop and a hydrophobic binding pocket that interacts with a pY+3 side chain. The N- and C-termini of the domain are close together in space and on the opposite face from the phosphopeptide binding surface, and it has been speculated that this has facilitated their integration into surface-exposed regions of host proteins PUBMED:11911873.

    \ ' '4412' 'IPR001452' '\ SH3 (src Homology-3) domains are small protein modules containing approximately 50 amino acid residues PUBMED:15335710, PUBMED:11256992. They are found in a great variety of intracellular or membrane-associated proteins PUBMED:1639195, PUBMED:14731533, PUBMED:7531822, for example in a variety of proteins with enzymatic activity, in adaptor proteins that lack catalytic sequences and in cytoskeletal proteins, such as fodrin and yeast actin binding protein ABP-1. \

    The SH3 domain has a characteristic fold which consists of five or six beta-strands arranged as two tightly packed anti-parallel beta sheets. The linker regions may contain short helices. The surface of the SH3 domain bears a flat, hydrophobic ligand-binding pocket which consists of three shallow grooves defined by conserved aromatic residues, in which the ligand adopts an extended left-handed helical arrangement. The ligand binds with low affinity but this may be enhanced by multiple interactions. The region bound by the SH3 domain is in all cases proline-rich and contains PXXP as a core-conserved binding motif. The function of SH3 domains is not well understood, but they may mediate many diverse processes such as increasing the local concentration of proteins, altering their subcellular location and mediating the assembly of large multiprotein complexes PUBMED:7953536.

    \ ' '4413' 'IPR006993' '\

    This family of proteins, which contains SH3BGRL3, is functionally uncharacterised. SH3BGRL3 is a highly conserved small protein, which is widely expressed and shows significant similarity to glutaredoxin 1 (GRX1) of Escherichia coli, which is predicted to belong to the thioredoxin superfamily. However, SH3BGRL3 lacks both of the conserved cysteine residues that characterise the enzymatic active site of GRX. This structural feature raises the possibility that SH3BGRL3 and its homologues could function as endogenous modulators of GRX activity PUBMED:11444877.

    \ ' '4414' 'IPR005327' '\

    The small hydrophobic integral membrane protein, SH (previously designated 1A) is found to have a variety of glycosylated forms PUBMED:1413513, PUBMED:2374008. This protein is a component of the mature respiratory syncytial virion PUBMED:1413513 where it may form complexes and appears to play a structural role.

    \ ' '4415' 'IPR006151' '\

    This entry contains shikimate and quinate dehydrogenases, as well as glutamyl-tRNA reductases.

    \

    Shikimate 5-dehydrogenase catalyses the conversion of shikimate to 5-dehydroshikimate PUBMED:12906831, PUBMED:12837789. This reaction is part of the shikimate pathway, which is involved in the biosynthesis of aromatic amino acids PUBMED:15012217. Quinate 5-dehydrogenase catalyses the conversion of quinate to 5-dehydroquinate. This reaction is part of the quinate pathway, in which quinic acid is exploited as a source of carbon in prokaryotes and microbial eukaryotes. The shikimate and quinate pathways share two metabolites, 3-dehydroquinate and dehydroshikimate.

    \

    Glutamyl-tRNA reductase catalyzes the first step of tetrapyrrole biosynthesis in plants, archaea and most bacteria. The dimeric enzyme has an unusual V-shaped architecture where each monomer consists of three domains linked by a long \'spinal\' alpha-helix. The central catalytic domain specifically recognises the glutamate moiety of the substrate PUBMED:16228559.

    \ ' '4416' 'IPR001085' '\ Synonym(s): Serine hydroxymethyltransferase, Serine aldolase, Threonine aldolase\

    Serine hydroxymethyltransferase (SHMT) is a pyridoxal phosphate (PLP) dependent enzyme and belongs to the aspartate aminotransferase superfamily (fold type I) PUBMED:10828359. The pyridoxal-P group is attached to a lysine residue around which the sequence is highly conserved in all forms of the enzyme PUBMED:8305478. The enzyme carries out interconversion of serine and glycine using PLP as the cofactor. SHMT catalyses the transfer of a hydroxymethyl group from N5,N10-methylene tetrahydrofolate to glycine, resulting in the formation of serine and tetrahydrofolate. Both eukaryotic and prokaryotic SHMT enzymes form tight obligate homodimers and the mammalian enzyme forms a homotetramer PUBMED:10828359, PUBMED:11877399. PLP dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalysed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into five major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), D-amino acid superfamily (fold type IV) and glycogen phosphorylase family (fold type V) PUBMED:8112347, PUBMED:7748903.

    \

    In vertebrates, glycine hydroxymethyltransferase exists in a cytoplasmic and a mitochondrial form, whereas only one form is found in prokaryotes.

    \ ' '4417' 'IPR007001' '\ This domain represents the high-similarity N-terminal constant region shared by shufflon proteins. Shufflon proteins are created as a result of a clustered inversion region. The proteins retain a constant N-terminal domain, with different C-terminal domains.\ ' '4418' 'IPR004124' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Sialidases hydrolyse alpha-(2->3)-, alpha-(2->6)- and alpha-(2->8)-glycosidic linkages of terminal sialic residues in oligosaccharides, glycoproteins, glycolipids, colominic acid and synthetic substrates. Sialidases may act as pathogenic factors in microbial infections PUBMED:2034213.

    \

    The 1.8 A structure of trans-sialidase from leech (Macrobdella decora) in complex with 2-deoxy-2,3-didehydro-NeuAc has been solved. The refined model, comprising residues 81-769, has a catalytic beta-propeller domain, an N-terminal lectin-like domain and an irregular beta-stranded domain inserted into the catalytic domain PUBMED:9562562.

    \ ' '4419' 'IPR005328' '\

    Serotype M1 group A Streptococcus strains cause epidemic waves of human infections. This family includes the Sic protein (streptococcal inhibitor of complement), an extracellular protein that inhibits human complement PUBMED:10426317. The exact mechanism of inhibition has not been completely elucidated, but Sic is incorporated into the complement membrane-attack complex (C5b-C9) responsible for target killing. Preliminary analysis of variation in the sic gene in M1 group A streptococcal strains identified a level of polymorphism far exceeding that of other genes in these organisms. Selection of new Sic structural variants on mucosal surfaces generates a very large pool of subclones in the course of epidemic waves. This process may help to sustain and enlarge the epidemic waves.

    \ ' '4420' 'IPR002078' '\

    Some bacterial regulatory proteins activate the expression of genes from promoters recognised by core RNA polymerase associated with the alternative sigma-54 factor. These have a conserved domain of about 230 residues involved in the ATP-dependent interaction with sigma-54 PUBMED:8407777, PUBMED:2041769. About half of the proteins in which this domain is found (algB, dctD, flbD, hoxA, hupR1, hydG, ntrC, pgtA and pilR) belong to signal transduction two-component systems PUBMED:2694934 and possess a domain that can be phosphorylated by a sensor-kinase protein in their N-terminal section. Almost all of these proteins possess a helix-turn-helix DNA-binding domain in their C-terminal section. The domain which interacts with the sigma-54 factor has an ATPase activity. This may be required to promote a conformational change necessary for the interaction PUBMED:1534752. The domain contains an atypical ATP-binding motif A (P-loop) as well as a form of motif B. The two ATP-binding motifs are located in the N-terminal section of the domain.

    \ ' '4421' 'IPR000394' '\

    Sigma factors PUBMED:3052291 are bacterial transcription initiation factors that promote the attachment of the core RNA polymerase to specific initiation sites and are then released. They alter the specificity of promoter recognition. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes. With regard to sequence similarity, sigma factors can be grouped into two classes: the sigma-54 and sigma-70 families. The sigma-70 family has many different sigma factors. The sigma-54 family consists exclusively of sigma-54 factor PUBMED:2517036, PUBMED:7934866, required for the transcription of promoters that have a characteristic -24 and -12 consensus recognition element but which are devoid of the typical -10, -35 sequences recognised by the major sigma factors. The sigma-54 factor is also characterised by its interaction with ATP-dependent positive regulatory proteins that bind to upstream activating sequences.\ Structurally, sigma-54 factors consist of three distinct regions:

    \

      \
    1. A relatively well conserved N-terminal glutamine-rich region of about 50 residues that contains a potential leucine zipper motif.
    2. A region of variable length which is not well conserved.
    3. A well conserved C-terminal region of about 350 residues that contains a second potential leucine zipper, a potential DNA-binding \'helix-turn-helix\' motif and a perfectly conserved octapeptide whose function is not known.

    \ ' '4422' 'IPR007046' '\ This domain makes a direct interaction with the core RNA polymerase, to form an enhancer-dependent holoenzyme PUBMED:10894718. The centre of this domain contains a very weak similarity to a helix-turn-helix motif, which may represent a DNA binding domain.\ ' '4423' 'IPR007634' '\ This DNA-binding domain is based on peptide fragmentation data. This domain is proximal to DNA in the promoter/holoenzyme complex. Furthermore, this region contains a putative helix-turn-helix motif. At the C terminus, there is a highly conserved region known as the RpoN box, which is the signature of the sigma-54 proteins PUBMED:10894718.\ ' '4424' 'IPR007631' '\

    The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes.

    With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into sub-regions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase, and sub-region 4.2, which seems to harbor a DNA-binding \'helix-turn-helix\' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors PUBMED:3092189, PUBMED:1597408. \

    \ The domain is found in the primary vegetative sigma factor. The function of this domain is unclear, and it can be removed without apparent loss of function PUBMED:8858155, PUBMED:11931761.\ ' '4425' 'IPR007127' '\

    The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes.

    With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into sub-regions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase, and sub-region 4.2, which seems to harbor a DNA-binding \'helix-turn-helix\' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors PUBMED:3092189, PUBMED:1597408. \

    \

    This entry represents region 1.1, which modulates DNA binding by regions 2 and 4 when sigma is not bound to the core RNA polymerase PUBMED:9927430, PUBMED:10613885. Region 1.1 is also involved in promoter binding.

    \ ' '4426' 'IPR009042' '\

    The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes.

    With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into sub-regions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase, and sub-region 4.2, which seems to harbor a DNA-binding \'helix-turn-helix\' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors PUBMED:3092189, PUBMED:1597408. \

    \ ' '4427' 'IPR007624' '\

    The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes.

    With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into sub-regions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase, and sub-region 4.2, which seems to harbor a DNA-binding \'helix-turn-helix\' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors PUBMED:3092189, PUBMED:1597408. \

    \ Region 3 forms a discrete, compact three-helical domain within the sigma factor. Region 3 is not normally involved in the recognition of promoter DNA, but in some specific bacterial promoters containing an extended -10 promoter element, residues within region 3 play an important role. Region 3 is primarily involved in binding the core RNA polymerase in the holoenzyme PUBMED:11931761.\ ' '4428' 'IPR004317' '\

    Reoviruses are double-stranded RNA viruses that lack a membrane envelope. Their capsid is organised in two concentric icosahedral layers: an inner core and an outer capsid layer. The sigma1 protein is found in the outer capsid, and the sigma2 protein is found in the core. There are four other kinds of protein (besides sigma2) in the core, termed lambda 1-3 and mu2. Interactions between sigma2 and lambda 1 and lambda 3 are thought\ to initiate core formation, followed by mu2 and lambda2 PUBMED:9971813.

    \

    Sigma1 is a trimeric protein, and is positioned at the 12 vertices of the icosahedral outer capsid layer. Its N-terminal fibrous tail, arranged as a triple coiled coil,\ anchors it in the virion, and a C-terminal globular head interacts with the\ cellular receptor PUBMED:11438552. These two parts form by separate trimerization events.\ The N-terminal fibrous tail forms on the polysome, without the involvement\ of ATP or chaperones. The post-translational assembly of the C-terminal\ globular head involves the chaperone activity of Hsp90, which is associated\ with phosphorylation of Hsp90 during the process PUBMED:11438552. Sigma1 protein acts\ as a cell attachment protein, and determines viral virulence, pathways of\ spread, and tropism. Junctional adhesion molecule has been identified as a\ receptor for sigma1 PUBMED:11239401. In type 3 reoviruses, a small region, predicted to\ form a beta sheet, in the N-terminal tail was found to bind target cell surface\ sialic acid (i.e. sialic acid acts as a co-receptor) and promote apoptosis PUBMED:11287552.\ The sigma1 protein also binds to the lambda2 core protein PUBMED:9311901.

    \ ' '4429' 'IPR003478' '\ The reoviral gene S1 encodes haemagglutinin (sigma 1 protein), an outer capsid protein and a major factor in determining virus-host cell interactions. Sigma 1s is one of two translation products of the S1 gene.\ ' '4430' 'IPR004693' '\ Marine diatoms such as Cylindrotheca fusiformis encode at least six silicon transport protein homologues which exhibit similar size and topology. One characterised member of the family (Sit1) functions in the energy-dependent uptake of either silicic acid [Si(OH)4] or silicate [Si(OH)3O-] by a Na+ symport mechanism. The system is found in marine diatoms, which make their "glass houses" out of silicon.\ ' '4431' 'IPR006886' '\ This is a family of higher eukaryotic proteins. SIN was identified as a protein that interacts specifically with SXL (sex lethal) in a yeast two-hybrid assay. The interaction is mediated by one of the SXL RNA-binding domains PUBMED:10521666.\ ' '4432' 'IPR007037' '\

    This entry includes the vibriobactin utilization protein viuB, which is involved in the removal of iron from iron-vibriobactin complexes, as well as several hypothetical proteins.

    \ ' '4433' 'IPR003000' '\ These sequences represent the Sir2 family of NAD+-dependent deacetylases. Silent Information Regulator protein of Saccharomyces cerevisiae (Sir2p) is one of several factors critical\ for silencing at least three loci. Among them, it is unique because\ it silences the rDNA as well as the mating type loci and telomeres. Sir2p\ interacts in a complex with itself and with Sir3p and Sir4p, two proteins that\ are able to interact with nucleosomes. In addition Sir2p also interacts with\ ubiquitination factors and/or complexes PUBMED:9214640.\ Unlike Sir3p and Sir4p, for which no homologues are known, Sir2p is part of a\ multigene family in yeast, the homologues being HST1, HST2, HST3 and HST4. \ \ \ Highly conserved structural homologues also occur in other organisms ranging from bacteria to man and plants. Proteins of this family have been proposed to play a role in\ silencing, chromosome stability and ageing PUBMED:7498786. In addition, an in\ vitro ADP ribosyltransferase activity has been associated with Escherichia coli and\ human members of this family PUBMED:10381378.\ Homologues of Sir2 share a core domain including the GAG and NID motifs and a\ putative C4 Zinc finger. The regions containing these three conserved motifs\ are individually essential for Sir2 silencing function, as are the four\ cysteines PUBMED:10473645. In addition, the conserved residues HG next to the putative Zn\ finger have been shown to be essential for the ADP ribosyltransferase activity\ PUBMED:10381378. \ \ \ Sir2-like enzymes catalyze a reaction in which the cleavage of NAD(+) and histone and/or protein\ deacetylation are coupled to the formation of O-acetyl-ADP-ribose, a novel metabolite. 
The dependence of the reaction on both\ NAD(+) and the generation of this potential second messenger offers new clues to understanding the function and regulation of nuclear,\ cytoplasmic and mitochondrial Sir2-like enzymes PUBMED:12517451.\ ' '4434' 'IPR007360' '\ SirB up-regulates Salmonella typhimurium invasion gene transcription. It is, however, not essential for the expression of these genes. Its function is unknown PUBMED:10322010.\ ' '4435' 'IPR001347' '\ The SIS (Sugar ISomerase) domain is a phosphosugar-binding domain PUBMED:10203754 found in \ many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found\ in proteins that regulate the expression of genes involved in synthesis of phosphosugars\ possibly by binding to the end-product of the pathway.\ ' '4436' 'IPR007857' '\ The human homologue of Saccharomyces cerevisiae Skb1 (Shk1 kinase-binding protein 1) is a protein methyltransferase PUBMED:10531356. These proteins seem to play a role in Jak signalling.\ ' '4437' 'IPR000623' '\

    Shikimate kinase () catalyses the fifth step in the biosynthesis of aromatic amino acids from chorismate (the so-called shikimate pathway) PUBMED:7612934. The enzyme catalyses the following reaction:

    \ \ \ \

    The protein is found in bacteria (gene aroK or aroL), plants and fungi (where\ it is part of a multifunctional enzyme that catalyses five consecutive steps in this pathway). In 1994, the 3D structure of shikimate kinase was predicted to be very close to that of adenylate kinase, suggesting a functional similarity as well as an evolutionary relationship PUBMED:7703851. This prediction has since been confirmed experimentally. The protein is reported to possess an alpha/beta fold, consisting of a central sheet of five parallel beta-strands flanked by alpha-helices. Such a topology is very similar to that of adenylate kinase PUBMED:9600856.

    \ ' '4438' 'IPR004015' '\

    SKIP (SKI-interacting protein) is an essential spliceosomal component and transcriptional coregulator, which may provide regulatory coupling of transcription initiation and splicing PUBMED:15052407. SKIP was identified in a yeast 2-hybrid screen, where it was shown to interact with both the cellular and viral forms of SKI through the highly conserved region on SKIP known as the SNW domain PUBMED:11522815. SKIP is now known to interact with a number of other proteins as well. SKIP potentiates the activity of important transcription factors, such as vitamin D receptor, CBF1 (RBP-Jkappa), Smad2/3, and MyoD. It works with Ski in overcoming pRb-mediated cell cycle arrest, and it is targeted by the viral transactivators EBNA2 and E7 PUBMED:10644367.

    \

    This entry represents the SNW domain.

    \ ' '4439' 'IPR004903' '\

    Clostridium difficile is the most common cause of antibiotic-associated diarrhoea. It makes a crystalline surface protein layer (S-layer), which is encoded by the slpA gene. Its product is cleaved to give two mature peptides that associate to form the layer. The larger peptide is derived from the C-terminal portion of the precursor and is relatively conserved. The smaller peptide is derived from the N-terminal portion of the precursor and is the dominant antigen PUBMED:16388033. The precursors of the two proteins are called P36 and P37. The slpA gene is strongly transcribed during the entire growth phase as a bicistronic transcript PUBMED:12867455.

    \ \

    The bacterial S-layer forms a regular structure, composed of a monolayer of one glycoprotein on the surfaces of many prokaryotic species. S-layers fulfil different functions, such as serving as attachment structures for extracellular enzymes and acting as major virulence determinants for pathogenic species. Lactobacillus acidophilus, which originates from the human pharynx, possesses an S-layer PUBMED:8522531.

    \ ' '4440' 'IPR004658' '\ Slp superfamily members are present in the Gram-negative gamma proteobacteria Escherichia coli (which also contains a close paralog), Haemophilus influenzae and Pasteurella multocida and Vibrio cholerae. The known members of the family to date share a motif LX[GA]C near the N-terminus, which is compatible with the possibility that the protein is modified into a lipoprotein with Cys as the new N-terminus. Slp from E. coli is known to be a lipoprotein of the outer membrane and to be expressed in response to carbon starvation.\ ' '4441' 'IPR008258' '\

    Bacterial lytic transglycosylases degrade murein via cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid and N-acetylglucosamine, with the concomitant formation of a 1,6-anhydrobond in the muramic acid residue. There are both soluble (Slt enzymes) and membrane-bound (Mlt enzymes) lytic transglycosylases that differ in size, sequence, activity, specificity and location. The multi-domain structure of the 70 kDa soluble lytic transglycosylase Slt70 is known PUBMED:10452894. Slt70 has 3 distinct domains, each rich in alpha helices: an N-terminal superhelical U-shaped domain (U-domain; ), a superhelical linker domain (L-domain, ), and a C-terminal catalytic domain (). Both the U- and L-domain share a similar superhelical structure. These two domains are connected, and together form a closed ring with a large central hole; the catalytic domain is packed on top of, and interacts with, this ring. The catalytic domain has a lysozyme-like fold.

    \

    This entry represents the catalytic domain, which is structurally conserved in some membrane-bound lytic glycosylases and in bacteriophage transglycosylases, even though their sequences can differ considerably PUBMED:8203016. The most conserved part of this domain is its N-terminal extremity that contains two conserved serines and a glutamate, which have been shown PUBMED:8107871 to be involved in the catalytic mechanism. This family is distantly related to .

    \ ' '4442' 'IPR003189' '\ This family represents the B subunit of shiga-like toxin (SLT or verotoxin) produced by some strains of Escherichia coli associated with hemorrhagic colitis and hemolytic uremic syndrome. SLTs are composed of one enzymatic A subunit and five cell binding B subunits.\ ' '4443' 'IPR007236' '\ The SlyX protein has no known function. It is short, less than 80 amino acids, and its gene is found close to the slyD gene. The SlyX protein has a conserved PPH(Y/W) motif at its C terminus. The protein may be a coiled-coil structure.\ ' '4445' 'IPR003488' '\

    The SMF family, of DNA processing chain A, dprA, are a group of bacterial proteins. In Helicobacter pylori, dprA is required for natural chromosomal and plasmid transformation PUBMED:10640603. DprA has been shown to bind cooperatively to single-stranded DNA (ssDNA) and to interact with RecA. In the process, DprA-RecA-ssDNA filaments are produced and these filaments catalyse the homology-dependent formation of joint molecules. While the Escherichia coli SSB protein limits access of RecA to ssDNA, DprA alleviates this barrier. It is proposed that DprA is a new member of the recombination-mediator protein family, dedicated to natural bacterial transformation PUBMED:17803906.

    \ ' '4446' 'IPR005120' '\

    Nonsense-mediated mRNA decay (NMD) is a surveillance mechanism by which eukaryotic cells detect and degrade transcripts containing premature termination codons. Three \'up-frameshift\' proteins, UPF1, UPF2 and UPF3, are essential for this process in organisms ranging from yeast and humans to plants PUBMED:11368911. \ \ Exon junction complexes (EJCs) are deposited ~24 nucleotides upstream of exon-exon junctions after splicing. Translation causes displacement of the EJCs; however, premature translation termination upstream of one or more EJCs triggers the recruitment of UPF1, UPF2 and UPF3 and activates the NMD pathway PUBMED:12718880, PUBMED:15048104.

    \ \

    This family contains UPF3. \ The crystal structure of the complex between human UPF2 and UPF3b, which are, respectively, a MIF4G (middle portion of eIF4G) domain and an RNP domain (ribonucleoprotein-type RNA-binding domain) has been determined to 1.95A. The protein-protein interface is mediated by highly conserved charged residues in UPF2 and UPF3b and involves the beta-sheet surface of the UPF3b ribonucleoprotein (RNP) domain, which is generally used by these domains to bind nucleic acids. In UPF3b the RNP domain does not bind RNA, whereas the UPF2 construct and the complex do. It is clear that some RNP domains have evolved for specific protein-protein interactions rather than as nucleic acid binding modules PUBMED:15004547.

    \ ' '4447' 'IPR007011' '\

    Late embryogenesis abundant (LEA) proteins accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. LEA proteins have been grouped into at least six families on the basis of sequence similarity. Although significant similarity has not been detected between the members of the different classes, a unifying and outstanding feature of these proteins is their high hydrophilicity and high percentage of glycines. Amino acid sequence analysis allows one to predict that these proteins exist primarily as random coils. This property has been confirmed in a few cases with purified proteins and is supported by the fact that proteins of this type do not coagulate upon heating. LEA protein families have been identified in a wide range of different plant species to the extent that they can be considered ubiquitous in plants. Moreover, it has been shown that members of at least one of the LEA protein families, the so-called dehydrins, are present in a range of photosynthetic organisms, including lower plants, algae, and cyanobacteria. In addition, similar proteins, the hydrophilins, are induced in a variety of different taxa of non-photosynthetic organisms in response to osmotic stress. All of these proteins have a high hydrophilicity index, generally greater than 1.0 PUBMED:10681550.

    \ \

    This conserved region identifies a set of plant seed maturation proteins described as LEA D34.

    \ ' '4449' 'IPR007450' '\

    This is a bacterial outer membrane lipoprotein, possibly involved in maintaining the structural integrity of the cell envelope PUBMED:9973334. The lipid attachment site is a conserved N-terminal cysteine residue sometimes found adjacent to the OmpA domain ().

    \ ' '4450' 'IPR000037' '\

    This entry represents SsrA-binding protein (aka small protein B or SmpB), which is a unique RNA-binding protein that is conserved throughout the bacterial kingdom and is an essential component of the SsrA quality-control system. Tight recognition of codon-anticodon pairings by the ribosome ensures the accuracy and fidelity of protein synthesis. In eubacteria, translational surveillance and ribosome rescue are performed by the \'tmRNA-SmpB\' system (transfer messenger RNA-small protein B). SmpB binds specifically to the ssrA RNA (tmRNA) and is required for stable association of ssrA with ribosomes. SsrA RNA recognises ribosomes stalled on defective messages and acts to mediate the addition of a short peptide tag to the C-terminus of the partially synthesised nascent polypeptide chain. Within a stalled ribosome, SmpB interacts with the three universally conserved bases G530, A1492 and A1493 that form the 30S subunit decoding centre, in which canonical codon-anticodon pairing occurs PUBMED:19132006. The SsrA-tagged protein is then degraded by C-terminal-specific proteases. Formation of an SmpB-SsrA complex appears to be critical in mediating SsrA activity after aminoacylation with alanine but prior to the transpeptidation reaction that couples this alanine to the nascent chain PUBMED:10393194. The SmpB protein has functional and structural similarities with initiation factor 1, and is proposed to be a functional mimic of the pairing between a codon and an anticodon.

    \ ' '4451' 'IPR002625' '\ This family includes the Smr (Small MutS Related) proteins,\ and the C-terminal region of the MutS2 protein. It has been suggested that this domain interacts with the MutS1 () protein in the case of Smr proteins and with the N-terminal MutS related region of MutS2 PUBMED:10431172.\ ' '4452' 'IPR000928' '\

    SNAP-25 (synaptosome-associated protein 25 kDa) proteins are components of SNARE complexes, which are proposed to account for the specificity of membrane fusion and to\ directly execute fusion by forming a tight complex (the SNARE or core\ complex) that brings the synaptic vesicle and plasma membranes\ together. The SNAREs constitute a large family of proteins that\ are characterised by 60-residue sequences known as SNARE motifs (),\ which have a high propensity to form coiled coils and often precede\ carboxy-terminal transmembrane regions. The synaptic core complex is formed by four SNARE motifs (two from\ SNAP25 and one each from synaptobrevin and syntaxin 1) that are\ unstructured in isolation but form a parallel four-helix bundle on\ assembly. The crystal structure of the core complex revealed\ that the helix bundle is highly twisted and contains several salt bridges on\ the surface, as well as layers of interior hydrophobic residues.\ However, a polar layer in the centre of the complex is formed by three\ glutamines (two from SNAP25 and one from syntaxin 1) and one arginine\ (from synaptobrevin) PUBMED:12154365.

    \

    Members \ of the SNAP-25 family contain a cluster of cysteine residues that can be palmitoylated for membrane attachment \ PUBMED:8226991.

    \ ' '4453' 'IPR006021' '\

    Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has multidomain organization PUBMED:9003410. The human cellular coactivator p100 contains four repeats, each of which is a SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100\'s single-stranded DNA-binding function PUBMED:9041650. A variety of proteins including many that are still uncharacterised belong to this group.

    \ ' '4454' 'IPR000175' '\

    Neurotransmitter transport systems are integral to the release, re-uptake and recycling of neurotransmitters at synapses. High affinity transport proteins found in the plasma membrane of presynaptic nerve terminals and glial cells are responsible for the removal from the extracellular space of released-transmitters, thereby terminating their actions PUBMED:15336049. Plasma membrane neurotransmitter transporters fall into two structurally and mechanistically distinct families. The majority of the transporters constitute an extensive family of homologous proteins that derive energy from the co-transport of Na+ and Cl-, in order to transport neurotransmitter molecules into the cell against their concentration gradient. The family has a common structure of 12 presumed transmembrane helices and includes carriers for gamma-aminobutyric acid (GABA), noradrenaline/adrenaline, dopamine, serotonin, proline, glycine, choline, betaine and taurine. They are structurally distinct from the second more-restricted family of plasma membrane transporters, which are responsible for excitatory amino acid transport. The latter couple glutamate and aspartate uptake to the cotransport of Na+ and the counter-transport of K+, with no apparent dependence on Cl- PUBMED:8811182. In addition, both of these transporter families are distinct from the vesicular neurotransmitter transporters PUBMED:8103691, PUBMED:7823024.

    Sequence analysis of the Na+/Cl- neurotransmitter superfamily reveals that it can be divided into four subfamilies, these being transporters for monoamines, the amino acids proline and glycine, GABA, and a group of orphan transporters PUBMED:9779464.

    \ ' '4455' 'IPR000330' '\

    This domain is found in proteins involved in a variety of processes including transcription regulation (e.g., SNF2, STH1, brahma, MOT1), DNA repair (e.g., ERCC6, RAD16, RAD5), DNA recombination (e.g., RAD54), and chromatin unwinding (e.g., ISWI) as well as a variety of other proteins with little functional information (e.g., lodestar, ETL1) PUBMED:7651832, PUBMED:14729263. SNF2 functions as the ATPase component of the SNF2/SWI multisubunit complex, which utilises energy derived from ATP hydrolysis to disrupt histone-DNA interactions, resulting in the increased accessibility of DNA to transcription factors.

    \

    Proteins that contain this domain appear to be distantly related to the\ DEAX box helicases , however\ no helicase activity has ever been demonstrated for these proteins.

    \ ' '4456' 'IPR006939' '\ SNF5 is a component of the yeast SWI/SNF complex, which is an ATP-dependent nucleosome-remodelling complex that regulates the transcription of a subset of yeast genes. SNF5 is a key component of all SWI/SNF-class complexes characterised so far PUBMED:10325430. This family consists of the conserved region of SNF5, including a direct repeat motif. SNF5 is essential for the assembly, promoter targeting and chromatin remodelling activity of the SWI-SNF complex PUBMED:11390659. SNF5 is also known as SMARCB1, for SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin, subfamily b, member 1, and also INI1 for integrase interactor 1. Loss-of-function mutations in SNF5 are thought to contribute to oncogenesis in malignant rhabdoid tumours (MRTs) PUBMED:9671307.\ ' '4457' 'IPR002161' '\

    Members of this family are involved in the pyridoxine biosynthetic pathway PUBMED:8955308, PUBMED:9791124. The regulation of cellular growth and proliferation in response to environmental cues is\ critical for development and the maintenance of viability in all organisms. In unicellular\ organisms, such as the budding yeast Saccharomyces cerevisiae (Baker\'s yeast), growth and proliferation\ are regulated by nutrient availability.

    \ ' '4458' 'IPR001424' '\

    Superoxide dismutases are ubiquitous metalloproteins that prevent damage\ by oxygen-mediated free radicals by catalysing the dismutation of superoxide\ into molecular oxygen and hydrogen peroxide PUBMED:2751312. Superoxide is a normal \ by-product of aerobic respiration and is produced by a number of reactions, \ including oxidative phosphorylation and photosynthesis. The dismutase\ enzymes have a very high catalytic efficiency due to the attraction of\ superoxide to the ions bound at the active site PUBMED:1463506, PUBMED:3891411.

    \

    There are three forms of superoxide dismutase, depending on the metal cofactor: \ Cu/Zn (which binds both copper and zinc), Fe and Mn types. The Fe and Mn\ forms are similar in their primary, secondary and tertiary structures, but \ are distinct from the Cu/Zn form PUBMED:2263641. Prokaryotes and protists contain Mn,\ Fe or both types, while most eukaryotic organisms utilise the Cu/Zn type.

    \ ' '4459' 'IPR019831' '\

    Superoxide dismutases (SODs) () catalyse the conversion of superoxide radicals to molecular oxygen. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one PUBMED:3315461, PUBMED:3345848, PUBMED:1556751. This family includes both single metal-binding SODs and cambialistic SOD, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.

    \

    The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist PUBMED:9537987. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2) PUBMED:9931259.

    \ \

    This entry represents the N-terminal domain of Manganese/iron superoxide dismutase.

    \ ' '4460' 'IPR019832' '\

    Superoxide dismutases (SODs) () catalyse the conversion of superoxide radicals to molecular oxygen. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one PUBMED:3315461, PUBMED:3345848, PUBMED:1556751. This family includes both single metal-binding SODs and cambialistic SOD, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.

    \

    The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist PUBMED:9537987. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2) PUBMED:9931259.

    \ \

    This entry represents the C-terminal domain of Manganese/iron superoxide dismutase.

    \ ' '4461' 'IPR018142' '\ Somatostatin inhibits the release of the pituitary growth hormone, somatotropin and inhibits the release of glucagon and insulin from the pancreas of fasted animals. Cortistatin is a cortical neuropeptide with neuronal depressant and sleep-modulating properties PUBMED:8622767.\ ' '4462' 'IPR001852' '\

    Snz1p is a highly conserved protein involved in growth arrest in Saccharomyces cerevisiae (Baker\'s yeast) PUBMED:8955308. Sor1 (singlet oxygen resistance) is essential in pyridoxine (vitamin B6)\ synthesis in Cercospora nicotianae and Aspergillus flavus. Pyridoxine\ quenches singlet oxygen at a rate comparable to that of vitamins C and E, two of the most highly efficient biological antioxidants, suggesting a previously unknown role for pyridoxine in\ active oxygen resistance PUBMED:10430950.

    \ ' '4463' 'IPR003127' '\

    Sorbin is an active peptide present in the digestive tract, where it has pro-absorptive and anti-secretory effects in different parts of the intestine, including the ability to decrease VIP (vasoactive intestinal peptide) and cholera toxin-induced secretion. It is expressed in some intestinal and pancreatic endocrine tumours in humans PUBMED:10704721.

    \

    Sorbin-homology domains are found in adaptor proteins such as vinexin, CAP/ponsin and argBP2, which regulate various cellular functions, including cell adhesion, cytoskeletal organisation, and growth factor signalling PUBMED:11937713. In addition to the sorbin domain, these proteins contain three SH3 (src homology 3) domains. The sorbin homology domain mediates the interaction of vinexin and CAP with flotillin, which is crucial for the localisation of SH3-binding proteins to the lipid raft, a region of the plasma membrane rich in cholesterol and sphingolipids that acts to concentrate certain signalling molecules. The sorbin homology domain of adaptor proteins may mediate interactions with the lipid raft that are crucial to intracellular communication PUBMED:11481476.

    \ \ ' '4464' 'IPR005754' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belong to MEROPS peptidase family C60 (clan C-) and include the members of both subfamilies of sortases. The Staphylococcus aureus sortase is a transpeptidase that attaches surface proteins to the cell wall; it cleaves between the Gly and Thr of the LPXTG motif and catalyses the formation of an amide bond between the carboxyl-group of threonine and the amino-group of the cell-wall peptidoglycan PUBMED:10427003. Sortase homologues are found in almost all Gram-positives, a single Gram-negative (Shewanella putrefaciens) and an archaean (Methanobacterium thermoautotrophicum), where cell wall LPXTG-mediated decoration has not been reported PUBMED:11401711, PUBMED:14572546.

    \ \

    Surface proteins not only promote interaction between the invading pathogen and animal tissues, but also provide ingenious strategies for bacterial escape from the host\'s immune response. In the case of S. aureus protein A, immunoglobulins are captured on the microbial surface and camouflage bacteria during the invasion of host tissues. S. aureus mutants lacking the srtA gene fail to anchor and display some surface proteins and are impaired in the ability to cause animal infections. Sortase acts on surface proteins that are initiated into the secretion (Sec) pathway and have their signal peptide removed by signal peptidase. The S. aureus genome encodes two sets of sortase and secretion genes. It is conceivable that S. aureus has evolved more than one pathway for the transport of 20 surface proteins to the cell wall envelope.

    \ \ ' '4465' 'IPR006917' '\ This family represents a group of putative haem-binding proteins PUBMED:10640688. It includes archaeal and bacterial homologues.\ ' '4466' 'IPR006279' '\

    These sequences represent the delta subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Rhizobium loti (Mesorhizobium loti) and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate PUBMED:3790506.

    \

    Bacterial sarcosine oxidases have been isolated from over a dozen different organisms and fall into two major classes: (1) a monomeric form that contains only covalent flavin and (2) a heterotetrameric (alpha, beta, gamma, delta) form that contains both covalent and noncovalent flavin. This entry represents the heterotetrameric form.

    \ ' '4467' 'IPR007375' '\ Sarcosine oxidase is a hetero-tetrameric enzyme that contains both covalently bound FMN and non-covalently bound FAD and NAD+. This enzyme catalyzes the oxidative demethylation of sarcosine to yield glycine, H2O2, and 5,10-CH2-tetrahydrofolate (H4folate) in a reaction requiring H4folate and O2 PUBMED:11330998, PUBMED:7543100.\ ' '4468' 'IPR004865' '\

    \ The Sp100 and promyelocytic leukemia proteins (PML) are constituents of nuclear domains, known as nuclear dots (NDs or NBs - nuclear bodies or PML bodies), and are both covalently modified by the small ubiquitin-related protein SUMO-1. NBs play a role in autoimmunity, virus infections, and in the etiology of acute promyelocytic leukemia PUBMED:10212234.

    \ \

    A functional nuclear localization signal and an NB-targeting region that coincides with an Sp100 homodimerization domain have been mapped. Sequences similar to the Sp100 homodimerization/ND-targeting region occur in several other proteins, which include the autoimmune regulator proteins (AIRE) and numerous other transiently or permanently localised proteins. PML is expressed as a family of isoforms (PML I-VII) as a result of alternative splicing, most of which are found in the nucleus. Although there are many other functions of PML NBs in a wide range of cellular pathways, there is accumulating evidence that they represent preferential targets for viral infections and that PML plays a role in the mechanism of the antiviral action of interferon PUBMED:17343971.

    \ \ \ \

    The Sp100 domain is usually found at the amino terminus of proteins that contain a SAND domain .

    \ ' '4469' 'IPR004261' '\ The Hepatitis E virus (HEV) structural protein 2 has a high basic amino acid content suggesting that it may play a role in viral genomic RNA encapsidation.\ ' '4470' 'IPR002954' '\ The Salmonella typhimurium Surface Presentation of Antigens M gene (SpaM)\ is one of 12 that form a cluster responsible for invasion properties PUBMED:8404849.\ The gene product is required for entry by the bacterium into epithelial\ cells, and is thus considered to be a virulence factor PUBMED:8404849. Other Spa genes \ in the cluster are related to invasion (Inv) genes in similar Salmonella \ and Shigella species PUBMED:7752894, and flagella biosynthesis genes in Helicobacter pylori PUBMED:10066464.\ \

    A homologue of this protein has been found recently in Salmonella enterica\ PUBMED:9068645. The protein, named InvI, is required by the organism to gain access to\ mammalian epithelial cells, and cellular mutants (InvI-) failed to\ successfully infect these cells. It has also been found that the inv-spa \ loci of this particular species encode for a type III protein secretion\ system, essential in the bacterium\'s host cell invasion process PUBMED:8751894.

    \ ' '4471' 'IPR003066' '\ The Salmonella typhimurium surface presentation of antigens N/invasion \ protein J gene (SpaN/InvJ) is one of 12 that form a cluster responsible for \ invasion properties. The gene product is required for entry by the \ bacterium into epithelial cells, and is thus considered to be a virulence \ factor PUBMED:8404849. Other Spa genes in the cluster are related to invasion (Inv) genes in similar Salmonella and Shigella species PUBMED:9068645, and to flagella biosynthesis genes in Helicobacter pylori PUBMED:10066464.\

    Functional analysis of the gene product from SpaN/InvJ has revealed the\ protein to have a molecular weight of 36.4 kDa PUBMED:7752894. It is required by the organism to gain access to mammalian epithelial cells, and cellular mutants (InvJ-) fail to infect these cells successfully.\ It has also been found that the inv-spa loci of Salmonella species \ encode a type III protein secretion system, essential to the bacterium\'s\ host cell invasion process PUBMED:8751894. Surprisingly, type III-secreted proteins\ lack the customary signal sequence characteristic of most bacterial\ secretory peptides PUBMED:7752894.\

    \ ' '4472' 'IPR007653' '\ Translocation of polypeptide chains across the endoplasmic reticulum membrane is triggered by signal sequences. During translocation of the nascent chain through the membrane, the signal sequence of most secretory and membrane proteins is cleaved off. Cleavage occurs by the signal peptidase complex (SPC), which consists of four subunits in yeast and five in mammals. This family is described as similar to microsomal signal peptidase 23 kDa subunit. Found in eukaryotes PUBMED:8632014, PUBMED:9148931.\ ' '4473' 'IPR007806' '\ This family is found in proteins involved in transferring a group of integrating conjugative DNA elements, such as pSAM2 from Streptomyces ambofaciens during mating PUBMED:8366038. Their precise role is not known.\ ' '4474' 'IPR005523' '\

    This domain is currently found in Streptomyces bacteria, in a set of bacterial proteins with no known function. Most proteins contain two copies of this domain.

    \ ' '4475' 'IPR002017' '\

    Spectrin repeats PUBMED:8266097 are found in several proteins involved in cytoskeletal structure. These include spectrin alpha and beta subunits PUBMED:12672815, PUBMED:15062087, alpha-actinin PUBMED:10481917 and dystrophin. The spectrin repeat forms a three-helix bundle. The second helix is interrupted by proline in some sequences. The repeats are defined by a characteristic tryptophan (W) residue at position 17 in helix A and a leucine (L) at 2 residues from the carboxyl end of helix C.

    \ ' '4476' 'IPR001045' '\ Synonym(s): Spermidine aminopropyltransferase\

    A group of polyamine biosynthetic enzymes involved in the fifth (last) step in the\ biosynthesis of spermidine from arginine and methionine, which includes: \ spermidine synthase (), \ spermine synthase () and \ putrescine N-methyltransferase () PUBMED:9517003.

    \

    The Thermotoga maritima spermidine synthase monomer consists of two domains:\ an N-terminal domain composed of six beta-strands, and a Rossmann-like C-\ terminal domain PUBMED:11731804. The larger C-terminal catalytic core domain\ consists of a seven-stranded beta-sheet flanked by nine alpha helices. This\ domain resembles a topology observed in a number of nucleotide and\ dinucleotide-binding enzymes, and in S-adenosyl-L-methionine (AdoMet)-\ dependent methyltransferases (MTases) PUBMED:11731804.

    \ ' '4477' 'IPR004169' '\ This family of spider neurotoxins is thought to consist of calcium ion channel inhibitors.\ ' '4478' 'IPR003671' '\

    Spindlin (Spin) and Ssty were first identified for their involvement in gametogenesis. Spindlin was identified as a maternal transcript present in the unfertilised egg and early embryo, and was subsequently shown to interact with the spindle apparatus during oogenesis, and may therefore be important for mitosis PUBMED:9053325. In addition, spindlin appears to be a target for cell cycle-dependent phosphorylation, and as such may play a role in cell cycle regulation during the transition from gamete to embryo PUBMED:11806826. Ssty is a multi-copy, Y-linked spermatogenesis-specific transcript that appears to be required for normal spermatogenesis PUBMED:15020475. Ssty may play an analogous role to spindlin in sperm cells, namely during the transition from sperm cells to early embryo, and in mitosis.

    \ ' '4479' 'IPR007880' '\ This family consists of Spiralin proteins found in spiroplasma bacteria. Spiroplasmas are helically shaped pathogenic bacteria related to the mycoplasmas. The surface of spiroplasma bacteria is crowded with the membrane-anchored lipoprotein spiralin, whose structure and function are unknown, although its cellular role is thought to be structural and mechanical rather than catalytic PUBMED:11988221.\ ' '4480' 'IPR004084' '\

    Spo11 is a meiosis-specific protein in yeast that covalently binds to DNA\ double-strand breaks (DSBs) during the early stages of meiosis PUBMED:10534401. These DSBs initiate homologous recombination, which is required for chromosomal \ segregation and generation of genetic diversity during meiosis. Mouse and human homologues of Spo11 have been cloned and characterised. The proteins are 82% identical and share ~25% identity with other family members. Mouse Spo11 has been localised to chromosome 2H4, and human SPO11 to chromosome 20q13.2-q13.3, a region amplified in some breast and ovarian tumours PUBMED:10534401.

    Similarity between SPO11 and archaebacterial TOP6A proteins points to \ evolutionary specialisation of a DNA-cleavage function for meiotic recombination PUBMED:10622720. Note that the yeast SPO11 protein shares far less similarity to other SPO11 proteins than the human and mouse homologues do to each other.

    \ ' '4481' 'IPR007727' '\ This family of proteins includes Spo12 from Saccharomyces cerevisiae. The Spo12 protein plays a regulatory role in two of the most fundamental processes of biology, mitosis and meiosis, and yet its biochemical function remains elusive PUBMED:11729145. Spo12 is a nuclear protein PUBMED:11278742. Spo12 is a component of the FEAR (Cdc fourteen early anaphase release) regulatory network, which promotes Cdc14 release from the nucleolus during early anaphase PUBMED:11832211. The FEAR network comprises the polo kinase Cdc5, the separase Esp1, the kinetochore-associated protein Slk19, and Spo12 PUBMED:11832211.\ ' '4482' 'IPR005605' '\

    Saccharomyces cerevisiae (Baker\'s yeast) Spo7 is an integral nuclear/ER membrane protein of unknown function, required for normal nuclear envelope morphology and sporulation PUBMED:9822591.

    \ ' '4483' 'IPR001543' '\ Proteins in this group are involved in a secretory pathway responsible for the surface presentation of invasion plasmid antigen needed for the entry of Salmonella and other species into mammalian cells\ PUBMED:1447979, PUBMED:8885278. They could play a role in preserving the translocation competence of the IPA antigens and are required for secretion of the three IPA proteins PUBMED:1312536.\ ' '4484' 'IPR007730' '\

    This 70 residue domain is composed of two 35 residue repeats that are found in bacterial proteins involved in sporulation and cell division, such as FtsN, CwlM and RlpA. This repeat might be involved in binding peptidoglycan. FtsN is an essential cell division protein with a simple bitopic topology: a short N-terminal cytoplasmic segment fused to a large carboxy periplasmic domain through a single transmembrane domain. The repeats lie at the periplasmic C-terminus, which has an RNP-like fold PUBMED:15101973. FtsN localises to the septum ring complex. The CwlM protein is a cell wall hydrolase, where the C-terminal region, including the repeats, determines substrate specificity PUBMED:1495475. RlpA is a rare lipoprotein A protein that may be important for cell division. Its N-terminal cysteine may be attached to thioglyceride and N-fatty acyl residues PUBMED:3316191.

    \ ' '4487' 'IPR004761' '\

    Amino acid permeases are integral membrane proteins involved in the transport\ of amino acids into the cell. A number of such proteins have been found to be\ evolutionarily related PUBMED:3146645, PUBMED:2687114, PUBMED:8382989.\ These proteins seem to contain up to 12 transmembrane segments. The best conserved region\ in this family is located in the second transmembrane segment.

    \

    Spore germination protein (amino acid permease) is involved in the response to the germinative mixture of L-asparagine, glucose, fructose and potassium ions (AGFK). These proteins could be amino acid transporters.

    \ ' '4488' 'IPR001537' '\ The spoU gene of Escherichia coli codes for a protein that shows strong similarities to previously characterised 2\'-O-methyltransferases PUBMED:9321663, PUBMED:8265370. The Pet56 protein of Saccharomyces cerevisiae has been shown to be required for ribose methylation at a universally conserved nucleotide in the peptidyl transferase centre of the mitochondrial large ribosomal RNA (21S rRNA). Cells reduced in this activity were deficient in formation of functional large subunits of the mitochondrial ribosome. The Pet56 protein catalyzes the site-specific formation of 2\'-O-methylguanosine on in vitro transcripts of both mitochondrial 21S rRNA and E. coli 23S rRNA providing evidence for an essential modified nucleotide in rRNA PUBMED:8266080.\ ' '4489' 'IPR005562' '\

    Members of this family are all transcribed from the spoVA operon. These proteins are poorly characterised, but are thought to be involved in dipicolinic acid transport into the developing forespore during sporulation PUBMED:11751839.

    \ ' '4490' 'IPR007170' '\ This is a stage V sporulation protein G. It is essential for sporulation and specific to stage V sporulation in Bacillus megaterium and Bacillus subtilis PUBMED:1373326. In B. subtilis, expression decreases after 30-60 minutes of cold shock PUBMED:8755892.\ ' '4491' 'IPR007390' '\ One of the family members is Bacillus subtilis stage V sporulation protein R, which is involved in spore cortex formation PUBMED:8144469. Little is known about cortex biosynthesis, except that it depends on several sigma E controlled genes, including spoVR PUBMED:8982457.\ ' '4492' 'IPR007347' '\ In Bacillus subtilis this protein interferes with sporulation at an early stage and this inhibitory effect is overcome by SpoIIB and SpoVG. SpoVS seems to play a positive role in allowing progression beyond stage V of sporulation. Null mutations in the spoVS gene block sporulation at stage V, impairing the development of heat resistance and coat assembly PUBMED:7559352.\ ' '4493' 'IPR001641' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned to clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of aspartic peptidases belongs to MEROPS peptidase family A9 (spumapepsin family, clan AA).

    \ \

    Foamy viruses are single-stranded enveloped retroviruses that have been noted to infect monkeys, cats and humans. In the human virus, the aspartic protease is encoded by the retroviral gag gene PUBMED:2451755, and in monkeys by the pol gene PUBMED:1647358. At present, the virus has not been proven to cause any particular disease. However, studies have shown Human foamy virus causes neurological disorders in infected mice PUBMED:9549727. It is not clear whether the Foamy virus/spumavirus proteases share a common evolutionary origin with other aspartic proteases.

    \ ' '4494' 'IPR005530' '\

    A short repeat found in a small family of membrane-bound proteins. This repeat contains a conserved SPW motif in the first of two transmembrane helices.

    \ ' '4495' 'IPR002060' '\ Squalene synthase (farnesyl-diphosphate farnesyltransferase) (SQS) and Phytoene synthase (PSY) share a number of functional similarities. These similarities are also reflected at the level of their primary structure PUBMED:8294001, PUBMED:8474436, PUBMED:8250898. In particular three well conserved regions are shared by\ SQS and PSY; they could be involved in substrate binding and/or the catalytic\ mechanism. \

    SQS catalyzes the conversion of two molecules of farnesyl diphosphate (FPP) into squalene. It is the first committed step in the cholesterol biosynthetic pathway. The reaction carried out by SQS is catalyzed in two separate steps: the first is a head-to-head condensation of the two molecules of FPP to form presqualene diphosphate; this intermediate is then rearranged in a NADP-dependent reduction, to form squalene.\ \ SQS is found in eukaryotes. In yeast it is encoded by the ERG9 \ gene, in mammals by the FDFT1 gene. SQS seems to be membrane-bound.

    \

    PSY catalyzes the conversion of two molecules of geranylgeranyl diphosphate (GGPP) into phytoene. It is the second step in the biosynthesis of carotenoids from isopentenyl diphosphate. The reaction carried out by PSY is catalyzed in two separate steps: the first is a head-to-head condensation of the two molecules of GGPP to form prephytoene diphosphate; this intermediate is then rearranged to form phytoene.\ \ PSY is found in all organisms that synthesize carotenoids: plants and \ photosynthetic bacteria as well as some non-photosynthetic bacteria and \ fungi. In bacteria PSY is encoded by the gene crtB. In plants PSY is localized in the chloroplast.

    \ ' '4496' 'IPR000737' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    The squash inhibitors form one of a number of serine proteinase inhibitor families. They belong to MEROPS inhibitor family I7, clan IE. They are generally annotated as either trypsin or elastase inhibitors (MEROPS peptidase family S1). The proteins, found exclusively in the seeds of the Cucurbitaceae, e.g. Citrullus lanatus (watermelon), Cucumis sativus (cucumber), Momordica charantia (balsam pear), are approximately 30 residues in length and contain 6 Cys residues, which form 3 disulphide bonds PUBMED:2914611. The inhibitors function by being taken up by a serine protease (such as trypsin),\ which cleaves the peptide bond between Arg/Lys and Ile residues in the N-terminal portion of the protein PUBMED:1731946, PUBMED:2914611. Structural studies have shown that the inhibitor has an ellipsoidal shape, and is largely composed of beta-turns PUBMED:2914611. The fold and Cys connectivity\ of the proteins resemble those of potato carboxypeptidase A inhibitor PUBMED:1731946.

    \ \ ' '4497' 'IPR000344' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class a (Sra) from the Sra superfamily PUBMED:15618405. Sra receptors contain 6-7 hydrophobic, putative transmembrane, regions and can be distinguished from other 7TM GPCR receptors by their own characteristic TM signatures.

    \ ' '4498' 'IPR002184' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class b (Srb) from the Sra superfamily PUBMED:15618405. Srb receptors contain 6-8 hydrophobic, putative transmembrane, regions and can be distinguished from other 7TM GPCR receptors by their own characteristic TM signatures.

    \ ' '4499' 'IPR001190' '\

    The egg peptide speract receptor is a transmembrane glycoprotein PUBMED:8140623. Other members of this family include the macrophage\ scavenger receptor type I (a membrane glycoprotein implicated in the pathologic\ deposition of cholesterol in arterial walls during atherogenesis), an enteropeptidase\ and T-cell surface glycoprotein CD5 (may act as a receptor in regulating T-cell\ proliferation).

    \ ' '4500' 'IPR002100' '\ Human serum response factor (SRF) is a ubiquitous nuclear protein important\ for cell proliferation and differentiation. SRF function is essential\ for transcriptional regulation of numerous growth-factor-inducible genes,\ such as the c-fos oncogene and muscle-specific actin genes. \ A core domain of around 90 amino acids is sufficient for the activities\ of DNA-binding, dimerisation and interaction with accessory factors. Within\ the core is a DNA-binding region, designated the MADS box PUBMED:7637780, that is\ highly similar to many eukaryotic regulatory proteins: among these are\ MCM1, the regulator of cell type-specific genes in budding yeast; DSRF,\ a Drosophila trachea development factor; the MEF2 family of myocyte-specific enhancer factors; and the Agamous and Deficiens families of\ plant homeotic proteins. \

    Proteins belonging to the MADS family function as dimers, the primary\ DNA-binding element of which is an anti-parallel coiled coil of two\ amphipathic alpha-helices, one from each subunit. The DNA wraps around\ the coiled coil allowing the basic N-termini of the helices to fit into\ the DNA major groove. The chain extending from the helix N-termini reaches\ over the DNA backbone and penetrates into the minor groove. A 4-stranded,\ anti-parallel beta-sheet packs against the coiled-coil face opposite the\ DNA and is the central element of the dimerisation interface.\ The MADS-box domain is commonly found associated with K-box region see

    \ ' '4501' 'IPR000609' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class g (Srg) from the Srg superfamily PUBMED:18050473, PUBMED:9582190. Srg receptors contain seven hydrophobic, putative transmembrane, regions and can be distinguished from other 7TM GPCR receptors by their own characteristic TM signatures.

    \ ' '4502' 'IPR003210' '\

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5\' and 3\' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    \

    This entry represents the 14 kDa SRP14 component. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP PUBMED:7730321.

    \ ' '4503' 'IPR002778' '\

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5\' and 3\' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    \

    This entry represents the SRP19 subunit. The SRP19 protein is unstructured but forms a compact core domain and two extended RNA-binding loops upon binding the signal recognition particle (SRP) RNA PUBMED:17434535.

    \ ' '4504' 'IPR000992' '\ It has recently been shown PUBMED:1304897 that three yeast proteins, two of which are known to be induced \ by various stress conditions, are structurally related and are probably part of a larger family. These \ proteins include cold-shock inducible protein TIR1 (also known as serine-rich protein 1, SRP1), which is \ induced by glucose PUBMED:3139887 and cold shock PUBMED:7746155; temperature-shock inducible protein 1 \ (SRP2) PUBMED:7746155; seripauperins, which are closely related proteins of about 13 kDa (120 to 124 residues) \ and are generally encoded at the extremity of yeast chromosomes (e.g. PAU1, PAU2, PAU3, PAU4, PAU5, PAU6, \ YBR301w, YGL261c, YGR294w, YHL046c, YIL176c, YIR041w and YKL224c) PUBMED:7926827; and hypothetical proteins \ YIL011w, YJR150c and YJR151c. These proteins all seem to start with a putative signal sequence followed by \ a conserved domain of about 90 residues. In TIR1, TIR2, TIP1, YIL011w, YJR150c and YJR151c, this domain is \ followed by a repetitive serine- and alanine-rich region absent in the other members of this family.\ ' '4505' 'IPR007718' '\ This presumed domain is found at the C terminus of the Saccharomyces cerevisiae SRP40 protein and its homologues. SRP40/nopp40 is a chaperone involved in nucleocytoplasmic transport. SRP40 is also a suppressor of a mutant AC40 subunit of RNA polymerases I and III.\ ' '4506' 'IPR006972' '\ SseC is a secreted protein that forms a complex together with SseB and SseD on the surface of Salmonella typhimurium. All these proteins are secreted by the type III secretion system PUBMED:1156700. Many mucosal pathogens use type III secretion systems for the injection of effector proteins into target cells. SseB, SseC and SseD are inserted into the target cell membrane, where they form a small pore or translocon PUBMED:1156700, PUBMED:11580752.
In addition to SseC, this family includes the bacterial secreted proteins PopB, PepB, YopB and EspD, which are thought to be directly involved in pore formation and in the type III secretion system translocon.\ ' '4507' 'IPR001734' '\

    Sodium/substrate symport (or co-transport) is a widespread mechanism of solute transport across cytoplasmic membranes of pro- and eukaryotic cells. Thereby the\ energy stored in an inwardly directed electrochemical sodium gradient (sodium motive force, SMF) is used to drive solute accumulation against a concentration\ gradient. The SMF is generated by primary sodium pumps (e.g. sodium/potassium ATPases, sodium translocating respiratory chain complexes) or via the action of\ sodium/proton antiporters. Sodium/substrate transporters are grouped in different families based on sequence similarities PUBMED:1965458, PUBMED:8031825.

    \

    One of these families, known as the sodium:solute symporter family (SSSF), contains over a hundred members of pro- and eukaryotic origin PUBMED:12354616. The average hydropathy plot for SSSF proteins predicts 11 to 15 putative transmembrane domains (TMs) in alpha-helical conformation. A secondary structure model of PutP from Escherichia coli suggests the protein contains 13 TMs with the N-terminus located\ on the periplasmic side of the membrane and the C-terminus facing the cytoplasm. The results support the idea of a common topological motif for members of the SSSF. Transporters with a C-terminal extension are proposed to have\ an additional 14th TM.

    \

    An ordered binding model of sodium/substrate transport suggests that sodium binds to\ the empty transporter first, thereby inducing a conformational alteration which increases the affinity of the transporter for the solute. The formation of the ternary\ complex induces another structural change that exposes sodium and substrate to the other side of the membrane. Substrate and sodium are released and the empty\ transporter re-orientates in the membrane allowing the cycle to start again.

    \ ' '4508' 'IPR006776' '\

    The precise function of SsgA is unknown. It is an acidic, cytosolic protein which has been found to be essential for spore formation, and to stimulate cell division in Streptomyces coelicolor PUBMED:11004161.

    \ ' '4509' 'IPR000691' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    The Streptomyces family of bacteria produce a number of proteinase inhibitors, which belong to MEROPS inhibitor family I16, clan IY. They are characterised by their strong activity towards subtilisin (MEROPS peptidase family S8, ) and are collectively known as Streptomyces subtilisin inhibitors (SSI). Some SSI also inhibit trypsin, chymotrypsin (MEROPS peptidase family S1, ) and griselysin (MEROPS peptidase family M4, ) PUBMED:14705960. Mutation of the active site residue can influence\ inhibition specificity PUBMED:1908859. SSI is a homodimer, each monomer containing 2 anti-parallel beta-sheets\ and 2 short alpha-helices. Protease binding induces the widening of a channel-like structure, in which\ hydrophobic side-chains are sandwiched between 2 lobes PUBMED:6387152. Loss of the C-terminal tetrapeptide\ VFAF drastically reduces the inhibitory effect of the proteins when there is less than one molecule of\ inhibitor present per molecule of enzyme. This implies that the tetrapeptide is necessary to maintain the\ correct 3D fold PUBMED:6993452. Structural similarities between the primary and secondary contact loops of SSI,\ and the ovomucoid and pancreatic secretory trypsin inhibitor family suggest evolution of the 2 families from\ a common ancestor PUBMED:6387152.

    \ \ \ ' '4510' 'IPR007198' '\

    Ssl1-like proteins are 40 kDa subunits of the transcription factor II H complex. This domain is often found associated with the C2H2 type Zn-finger ().

    \ ' '4511' 'IPR007481' '\

    Escherichia coli stringent starvation protein B (SspB) is thought to enhance the specificity of degradation of tmRNA-tagged proteins by the ClpXP protease. The tmRNA tag, also known as ssrA, is an 11-aa peptide that is added to the C terminus of proteins stalled during translation and targets them for degradation by ClpXP and ClpAP. SspB is a cytoplasmic protein that specifically binds to residues 1-4 and 7 of the tag. Binding of SspB enhances degradation of tagged proteins by ClpX, and masks sequence elements important for ClpA interactions, inhibiting degradation by ClpA PUBMED:11535833. However, more recent work has cast doubt on the importance of SspB in wild-type cells PUBMED:11810257. SspB is encoded in an operon whose synthesis is stimulated by carbon, amino acid, and phosphate starvation. SspB may play a special role during nutrient stress, for example by ensuring rapid degradation of the products of stalled translation, without causing a global increase in degradation of all ClpXP substrates PUBMED:11009422.

    \ ' '4512' 'IPR000969' '\

    Human structure-specific recognition protein, SSRP1, PUBMED:1372440 binds specifically to DNA modified with the anti-cancer drug cisplatin. An 81kDa protein is predicted, containing several highly-charged domains and a stretch of 75 residues that share 47% identity with a portion of the high mobility group (HMG) protein HMG1. This HMG box probably constitutes the structure recognition element for cisplatin-modified DNA, the \ probable recognition motif being the local duplex unwinding and bending that occurs on formation of intra-strand cross-links PUBMED:1372440. SSRP1 is the human homologue of a recently identified mouse protein that binds to recombination signal sequences PUBMED:1678855. These sequences have been postulated to form stem-loop structures, further implicating local bends and unwinding in DNA as a recognition target for HMG-box proteins. A Drosophila melanogaster cDNA encoding an HMG-box-containing protein has also been isolated PUBMED:7688122, PUBMED:8479916. This protein shares 50% sequence identity with human SSRP1. In vitro binding studies using Drosophila SSRP showed that the protein binds to single-stranded DNA and RNA, with highest affinity for nucleotides G and U. Comparison of the predicted amino acid sequences among SSRP family \ members reveals 48% identity, with structural conservation in the C-terminus of the HMG box, as well as domains of highly charged residues. The most highly conserved regions lie in the poorly understood N-terminus, suggesting that this portion of the protein is critical for its function PUBMED:8479916.

    \ \ \

    This entry contains Pob3 , which is a subunit of the heterodimeric yeast FACT complex (Spt16p ()-Pob3p) PUBMED:15987999. The FACT complex facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilising and then reassembling nucleosome structure PUBMED:12524332, PUBMED:12934006.

    \ \ ' '4513' 'IPR006811' '\ The highly conserved and essential protein Ssu72 has intrinsic phosphatase activity and plays an essential role in the transcription cycle. Ssu72 was originally identified in a yeast genetic screen as an enhancer of a defect caused by a mutation in the transcription initiation factor TFIIB PUBMED:12660165. It binds to TFIIB and is also involved in mRNA elongation. Ssu72 is further involved in both poly(A) dependent and independent termination. It is a subunit of the yeast cleavage and polyadenylation factor (CPF), which is part of the machinery for mRNA 3\'-end formation. Ssu72 is also essential for transcription termination of snRNAs PUBMED:12453421.\ ' '4514' 'IPR007311' '\ The ST7 (for suppression of tumorigenicity 7) protein is thought to be a tumour suppressor. The molecular function of this protein is uncertain.\ ' '4515' 'IPR004978' '\

    Stanniocalcin (STC) is a calcium- and phosphate-regulating hormone produced in bony fish by the corpuscles of Stannius,\ which are located close to the kidney. It is a major antihypercalcemic hormone in fish. Recent results\ suggest that the biological repertoires of STCs in mammals will be considerably larger than in fish and may not be limited to\ mineral metabolism.

    \ ' '4516' 'IPR003120' '\ This family consists of transcription factors related to STE and is found associated with the C2H2 zinc finger in some proteins.\ ' '4517' 'IPR006123' '\

    Staphylococcal enterotoxins and streptococcal pyrogenic exotoxins constitute a family of biologically and structurally related toxins produced by Staphylococcus aureus and Streptococcus pyogenes PUBMED:2679358, PUBMED:2185544. These toxins share the ability to bind to the major histocompatibility complex proteins of their hosts. A more distant relative of the family is the S. aureus toxic shock syndrome toxin, which shares only a low level of sequence similarity with this group.

    All of these toxins share a similar two-domain fold (N and C-terminal domains) with a long alpha-helix in the middle of the molecule, a\ characteristic beta-barrel known as the "oligosaccharide/oligonucleotide fold" at the N-terminal domain and a beta-grasp motif at the C-terminal domain. Each superantigen possesses slightly different binding mode(s) when it interacts with MHC class II molecules or the T-cell receptor PUBMED:9514739.

    The beta-grasp domain has some structural similarities to the beta-grasp motif present in immunoglobulin-binding\ domains, ubiquitin, 2Fe-2S ferredoxin and translation initiation factor 3 as identified by the SCOP database.

    \ ' '4518' 'IPR006173' '\

    Staphylococcus aureus is a Gram-positive coccus that grows in clusters or\ pairs, and is the major cause of nosocomial infections due to its multiple \ antibiotic resistant nature PUBMED:3782090. Patients who are immunocompromised (e.g., \ those suffering from third degree burns or chronic illness) are at risk \ from deep staphylococcal infections, such as osteomyelitis and pneumonia.\ Most skin infections are also caused by this bacterium.

    \ \ Many virulence mechanisms are employed by Staphylococci to induce \ pathogenesis: these can include polysaccharide capsules and exotoxins PUBMED:3782090.\ One of the major virulence exotoxins is toxic shock syndrome toxin (TSST),\ which is secreted by the organism upon successful invasion. It causes a\ major inflammatory response in the host via superantigenic properties,\ and is the causative agent of toxic shock syndrome.

    \

    The structure of the TSST protein was originally determined to 2.5A by means\ of X-ray crystallography PUBMED:8107781. The N- and C-terminal domains both contain\ regions involved in MHC class II association; the C-terminal domain is also\ implicated in binding the T-cell receptor. Overall, the structure \ resembles that of Staphylococcal enterotoxin B (SEB), but differs in its\ N-terminus and in the degree to which a long central helix is covered by \ surface loops PUBMED:8268150. The region around the carboxyl end of this helix is \ proposed to govern the superantigenic properties of TSST. An adjacent\ region along this helix is thought to be critical in the ability of TSST\ to induce toxic shock syndrome. Most recently, the structures of five \ mutants of TSST have been determined to 1.95A PUBMED:9194182. The mutations are in \ the central alpha-helix, and allow mapping of portions of TSST involved in\ superantigenicity and lethality.

    \ ' '4519' 'IPR001443' '\ Staphylocoagulase is an extracellular protein produced by several\ strains of Staphylococcus aureus that specifically forms a complex with\ prothrombin PUBMED:3481366, PUBMED:2587230. This complex, named staphylothrombin, can clot fibrinogen without\ any proteolytic cleavage of prothrombin.\ The C terminus of staphylocoagulase contains a tandem repeat which does not seem to be \ required for the procoagulant activity.\ ' '4520' 'IPR004093' '\ Staphylokinases and streptokinases are not proteases. They are involved in plasminogen activation. The three-dimensional structure of streptokinase is believed to contain two independently folded domains, each homologous to serine proteases PUBMED:6760891.\ ' '4521' 'IPR013800' '\

    The STAT protein (Signal Transducers and Activators of Transcription) family contains transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors, hence they act as signal transducers in the cytoplasm and transcription activators in the nucleus PUBMED:12039028. Binding of these factors to cell-surface receptors leads to receptor autophosphorylation at a tyrosine, the phosphotyrosine being recognised by the STAT SH2 domain, which mediates the recruitment of STAT proteins from the cytosol and their association with the activated receptor. The STAT proteins are then activated by phosphorylation via members of the JAK family of protein kinases, causing them to dimerise and translocate to the nucleus, where they bind to specific promoter sequences in target genes. In mammals, STATs comprise a family of seven structurally and functionally related proteins: Stat1, Stat2, Stat3, Stat4, Stat5a, Stat5b and Stat6. STAT proteins play a critical role in regulating innate and acquired host immune responses. Dysregulation of at least two STAT signalling cascades (i.e. Stat3 and Stat5) is associated with cellular transformation.

    \ \

    Signalling through the JAK/STAT pathway is initiated when a cytokine binds to its corresponding receptor. This leads to conformational changes in the cytoplasmic portion of the receptor, initiating activation of receptor associated members of the JAK family of kinases. The JAKs, in turn, mediate phosphorylation at the specific receptor tyrosine residues, which then serve as docking sites for STATs and other signalling molecules. Once recruited to the receptor, STATs also become phosphorylated by JAKs, on a single tyrosine residue. Activated STATs dissociate from the receptor, dimerise, translocate to the nucleus and bind to members of the GAS (gamma activated site) family of enhancers.

    \ \

    The seven STAT proteins identified in mammals range in size from 750 to 850 amino acids. The chromosomal distribution of these STATs, as well as the identification of STATs in more primitive eukaryotes, suggest that this family arose from a single primordial gene. STATs share structurally and functionally conserved domains including: an N-terminal domain that strengthens interactions between STAT dimers on adjacent DNA-binding sites; a coiled-coil STAT domain that is implicated in protein-protein interactions; a DNA-binding domain with an immunoglobulin-like fold similar to p53 tumour suppressor protein; an EF-hand-like linker domain connecting the DNA-binding and SH2 domains; an SH2 domain () that acts as a phosphorylation-dependent switch to control receptor recognition and DNA-binding; and a C-terminal transactivation domain PUBMED:9630226. The crystal structure of the N-terminus of Stat4 reveals a dimer. The interface of this dimer is formed by a ring-shaped element consisting of five short helices. Several studies suggest that this N-terminal dimerisation promotes cooperativity of binding to tandem GAS elements and with the transcriptional coactivator CBP/p300.

    \ \

    This entry represents the all-alpha helical domain, which consists of four long helices arranged in a bundle with a left-handed twist (coiled-coil), which in turn forms a right-handed superhelix.

    \ ' '4522' 'IPR013801' '\

    The STAT protein (Signal Transducers and Activators of Transcription) family contains transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors, hence they act as signal transducers in the cytoplasm and transcription activators in the nucleus PUBMED:12039028. Binding of these factors to cell-surface receptors leads to receptor autophosphorylation at a tyrosine, the phosphotyrosine being recognised by the STAT SH2 domain, which mediates the recruitment of STAT proteins from the cytosol and their association with the activated receptor. The STAT proteins are then activated by phosphorylation via members of the JAK family of protein kinases, causing them to dimerise and translocate to the nucleus, where they bind to specific promoter sequences in target genes. In mammals, STATs comprise a family of seven structurally and functionally related proteins: Stat1, Stat2, Stat3, Stat4, Stat5a, Stat5b and Stat6. STAT proteins play a critical role in regulating innate and acquired host immune responses. Dysregulation of at least two STAT signalling cascades (i.e. Stat3 and Stat5) is associated with cellular transformation.

    \ \

    Signalling through the JAK/STAT pathway is initiated when a cytokine binds to its corresponding receptor. This leads to conformational changes in the cytoplasmic portion of the receptor, initiating activation of receptor associated members of the JAK family of kinases. The JAKs, in turn, mediate phosphorylation at the specific receptor tyrosine residues, which then serve as docking sites for STATs and other signalling molecules. Once recruited to the receptor, STATs also become phosphorylated by JAKs, on a single tyrosine residue. Activated STATs dissociate from the receptor, dimerise, translocate to the nucleus and bind to members of the GAS (gamma activated site) family of enhancers.

    \ \

    The seven STAT proteins identified in mammals range in size from 750 to 850 amino acids. The chromosomal distribution of these STATs, as well as the identification of STATs in more primitive eukaryotes, suggest that this family arose from a single primordial gene. STATs share structurally and functionally conserved domains including: an N-terminal domain that strengthens interactions between STAT dimers on adjacent DNA-binding sites; a coiled-coil STAT domain that is implicated in protein-protein interactions; a DNA-binding domain with an immunoglobulin-like fold similar to p53 tumour suppressor protein; an EF-hand-like linker domain connecting the DNA-binding and SH2 domains; an SH2 domain () that acts as a phosphorylation-dependent switch to control receptor recognition and DNA-binding; and a C-terminal transactivation domain PUBMED:9630226. The crystal structure of the N-terminus of Stat4 reveals a dimer. The interface of this dimer is formed by a ring-shaped element consisting of five short helices. Several studies suggest that this N-terminal dimerisation promotes cooperativity of binding to tandem GAS elements and with the transcriptional coactivator CBP/p300.

    \ \

    This entry represents the DNA-binding domain, which has an immunoglobulin-like structural fold.

    \ ' '4523' 'IPR013799' '\

    The STAT protein (Signal Transducers and Activators of Transcription) family contains transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors, hence they act as signal transducers in the cytoplasm and transcription activators in the nucleus PUBMED:12039028. Binding of these factors to cell-surface receptors leads to receptor autophosphorylation at a tyrosine, the phosphotyrosine being recognised by the STAT SH2 domain, which mediates the recruitment of STAT proteins from the cytosol and their association with the activated receptor. The STAT proteins are then activated by phosphorylation via members of the JAK family of protein kinases, causing them to dimerise and translocate to the nucleus, where they bind to specific promoter sequences in target genes. In mammals, STATs comprise a family of seven structurally and functionally related proteins: Stat1, Stat2, Stat3, Stat4, Stat5a, Stat5b and Stat6. STAT proteins play a critical role in regulating innate and acquired host immune responses. Dysregulation of at least two STAT signalling cascades (i.e. Stat3 and Stat5) is associated with cellular transformation.

    \ \

    Signalling through the JAK/STAT pathway is initiated when a cytokine binds to its corresponding receptor. This leads to conformational changes in the cytoplasmic portion of the receptor, initiating activation of receptor associated members of the JAK family of kinases. The JAKs, in turn, mediate phosphorylation at the specific receptor tyrosine residues, which then serve as docking sites for STATs and other signalling molecules. Once recruited to the receptor, STATs also become phosphorylated by JAKs, on a single tyrosine residue. Activated STATs dissociate from the receptor, dimerise, translocate to the nucleus and bind to members of the GAS (gamma activated site) family of enhancers.

    \ \

    The seven STAT proteins identified in mammals range in size from 750 to 850 amino acids. The chromosomal distribution of these STATs, as well as the identification of STATs in more primitive eukaryotes, suggest that this family arose from a single primordial gene. STATs share structurally and functionally conserved domains including: an N-terminal domain that strengthens interactions between STAT dimers on adjacent DNA-binding sites; a coiled-coil STAT domain that is implicated in protein-protein interactions; a DNA-binding domain with an immunoglobulin-like fold similar to p53 tumour suppressor protein; an EF-hand-like linker domain connecting the DNA-binding and SH2 domains; an SH2 domain () that acts as a phosphorylation-dependent switch to control receptor recognition and DNA-binding; and a C-terminal transactivation domain PUBMED:9630226. The crystal structure of the N-terminus of Stat4 reveals a dimer. The interface of this dimer is formed by a ring-shaped element consisting of five short helices. Several studies suggest that this N-terminal dimerisation promotes cooperativity of binding to tandem GAS elements and with the transcriptional coactivator CBP/p300.

    \ \

    This entry represents the N-terminal domain, which is responsible for protein interactions. This domain has a multi-helical structure that can be subdivided into two structural sub-domains.

    \ ' '4524' 'IPR005575' '\

    Statherin functions biologically to inhibit the nucleation and growth of calcium phosphate minerals. The N-terminus of statherin is highly charged, the glutamic acids of which have been shown to be important in the recognition of hydroxyapatite PUBMED:1313424.

    \ ' '4525' 'IPR000956' '\ Stathmin is a ubiquitous phosphorylated protein thought to act as an intracellular relay for diverse \ regulatory pathways PUBMED:2358074, functioning through a variety of second messengers. Its phosphorylation \ and gene expression are regulated throughout development PUBMED:8344928 and in response to extracellular \ signals regulating cell proliferation, differentiation and function PUBMED:2745432. Stathmin, and the \ related proteins SCG10 and XB3, contain an N-terminal domain (XB3 contains an additional N-terminal \ hydrophobic region), a 78 amino acid coiled-coil region, and a short C-terminal domain.\ ' '4526' 'IPR000366' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \ \

    Little is known about the structure and function of the mating factor\ receptors, STE2 and STE3. It is believed, however, that they are integral\ membrane proteins that may be involved in the response to mating factors\ on the cell membrane PUBMED:16453635, PUBMED:3001640, PUBMED:2836861. The amino acid sequences of both receptors\ contain high proportions of hydrophobic residues grouped into 7 domains,\ in a manner reminiscent of the rhodopsins and other receptors believed to\ interact with G-proteins. However, while a similar 3D framework has been\ proposed to account for this, there is no significant sequence similarity\ either between STE2 and STE3, or between these and the rhodopsin-type\ family: the receptors thus bear their own unique \'7TM\' signatures.

    \ \ ' '4527' 'IPR001499' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    Little is known about the structure and function of the mating factor\ receptors, STE2 and STE3. It is believed, however, that they are integral\ membrane proteins that may be involved in the response to mating factors\ on the cell membrane PUBMED:16453635, PUBMED:3001640, PUBMED:2836861. The amino acid sequences of both receptors\ contain high proportions of hydrophobic residues grouped into 7 domains,\ in a manner reminiscent of the rhodopsins and other receptors believed to\ interact with G-proteins. However, while a similar 3D framework has been\ proposed to account for this, there is no significant sequence similarity\ either between STE2 and STE3, or between these and the rhodopsin-type\ family: the receptors thus bear their own unique \'7TM\' signatures.

    \

    The STE3 gene of Saccharomyces cerevisiae (Baker\'s yeast) encodes the cell-surface receptor that binds the\ 13-residue lipopeptide a-factor. Several related fungal pheromone receptor\ sequences are known: these include pheromone B alpha 1 and B alpha 3, and\ pheromone B beta 1 receptors from Schizophyllum commune; pheromone receptor\ 1 from Ustilago hordei; and pheromone receptors 1 and 2 from Ustilago maydis.\ Members of the family share about 20% sequence identity.

    \ ' '4529' 'IPR006969' '\ This family represents the Stig1 cysteine rich plant protein. The tobacco stigma-specific gene, STIG1, is developmentally regulated and expressed specifically in the stigmatic secretory zone. Pistils of transgenic STIG1-barnase tobacco plants undergo normal development, but lack the stigmatic secretory zone and are female sterile. Pollen grains are unable to penetrate the surface of the ablated pistils. Application of stigmatic exudate from wild-type pistils to the ablated surface increases the efficiency of pollen tube germination and growth and restores the capacity of pollen tubes to penetrate the style PUBMED:8039494. The function of STIG1 is unknown.\ ' '4530' 'IPR007882' '\

    Neurons contain abundant subsets of highly stable microtubules that resist de-polymerising conditions such as exposure to the cold. Stable microtubules are thought to be essential for neuronal development, maintenance, and function. STOP is a major factor responsible for the intriguing stability properties of neuronal microtubules and is important for synaptic plasticity. STOPs (for stable tubule only polypeptides) are calmodulin-binding and calmodulin-regulated proteins which, in mammals, are encoded by a single gene but exhibit\ substantial cell specific variability due to mRNA splicing and alternative promoter use. STOP microtubule stabilising activity has been ascribed to two classes of new bifunctional calmodulin- and\ microtubule-binding motifs, with distinct microtubule binding properties in vivo. STOPs seem to be restricted to vertebrates and are composed of a conserved domain split by the apparent insertion of\ variable sequences that are completely unrelated among species PUBMED:14567673.

    N-STOP (for neuronal adult STOP) contains two repeat domains. The central repeat domain is composed of five repeated sequences of 46\ amino acids. These sequences are almost completely identical, exhibiting an unusual degree of conservation of the repeat motif, compared to repeated sequences in other microtubule-associated proteins. The\ carboxy-terminal repeat domain is composed of 28 imperfect repeats of an 11 amino acid consensus sequence. Upstream of the carboxy-terminal repeat domain, rat N-STOP contains a highly basic sequence (called\ the "KR domain" after its high content in lysine and arginine residues) and a so-called "linker domain" located between the central repeat domain and the KR domain. To date, two splicing variants of STOP, E-STOP and F-STOP, have been characterised in rodents. Knowledge of STOP function and properties may help in the neuroleptic treatment of illnesses such as schizophrenia, currently thought to result from synaptic defects PUBMED:12231625.

    \ ' '4531' 'IPR018119' '\

    This entry represents a conserved region found in strictosidine synthase (), a key enzyme in alkaloid biosynthesis. It catalyses the Pictet-Spengler stereospecific condensation of tryptamine with secologanin to form strictosidine PUBMED:18081287. The structure of the native enzyme from the Indian medicinal plant Rauvolfia serpentina (Serpentwood) (Devilpepper) represents the first example of a six-bladed four-stranded beta-propeller fold from the plant kingdom PUBMED:18280746.

    \ ' '4532' 'IPR006270' '\

    The sequences represented in this group are identified by a domain which consists of the N-terminal half of a family of Streptococcal proteins that contain a signal peptide and then up to five repeats of a region that includes a His-X-X-His-X-His (histidine triad) motif. Additional copies of the repeats are found in more poorly conserved regions. Members of this family from Streptococcus pneumoniae are suggested to cleave human C3, and the member PhpA has been shown in vaccine studies to be a protective antigen in mice PUBMED:11349048.

    \ ' '4533' 'IPR003674' '\

    N-linked glycosylation is a ubiquitous protein modification, and is essential for viability in eukaryotic cells. A lipid-linked core-oligosaccharide is assembled at the membrane of the endoplasmic reticulum and transferred to selected asparagine residues of nascent polypeptide chains by the oligosaccharyl transferase (OTase) complex PUBMED:7588624.

    \ \

    This family consists of the oligosaccharyl transferase STT3 subunit and related proteins. The STT3 subunit is part of the oligosaccharyl transferase (OTase) complex of proteins and is required for its activity PUBMED:7588624.

    \ ' '4534' 'IPR005145' '\

    The function of this domain is unknown, it is found in and its relatives. It is found C-terminal to the .

    \ ' '4535' 'IPR006070' '\

    The YrdC family of hypothetical proteins are widely distributed in eukaryotes and prokaryotes and occur as: (i) independent proteins, (ii) with C-terminal extensions, and (iii) as domains in larger proteins, some of which are implicated in regulation PUBMED:12211024. The YrdC protein, which consists solely of this domain, forms an alpha/beta twisted open-sheet structure composed of seven alpha helices and seven beta strands PUBMED:11206077. YrdC from Escherichia coli preferentially binds to double-stranded RNA and DNA. YrdC is predicted to be an rRNA maturation factor, as deletions in its gene lead to immature ribosomal 30S subunits and, consequently, fewer translating ribosomes PUBMED:15716138. Therefore, YrdC may function by keeping an rRNA structure needed for proper processing of 16S rRNA, especially at lower temperatures. Sua5 is an example of a multi-domain protein that contains an N-terminal YrdC-like domain and a C-terminal Sua5 domain. Sua5 was identified in Saccharomyces cerevisiae (Baker\'s yeast) as a suppressor of a translation initiation defect in the cytochrome c gene and is required for normal growth in yeast; however its exact function remains unknown PUBMED:18004774. HypF is involved in the synthesis of the active site of [NiFe]-hydrogenases PUBMED:12377778.

    \ ' '4536' 'IPR000368' '\ Sucrose synthases catalyse the synthesis of sucrose\ in the following reaction:\ \ This family includes the bulk of the sucrose synthase\ protein. However the carboxyl terminal region of the\ sucrose synthases belongs to the glycosyl transferase\ family . This enzyme is found mainly in plants\ but also appears in bacteria.\ ' '4537' 'IPR003808' '\

    This family consists of the SufE-related proteins. These have been implicated in Fe-S metabolism and export PUBMED:11251816.

    \ ' '4538' 'IPR007768' '\ SUFU, encoding the human ortholog of Drosophila suppressor of fused, appears to have a conserved role in the repression of Hedgehog signalling. SUFU exerts its repressor role by physically interacting with GLI proteins in both the cytoplasm and the nucleus PUBMED:12150819. SUFU has been found to be a tumour-suppressor gene that predisposes individuals to medulloblastoma by modulating the SHH signalling pathway PUBMED:12068298.\ ' '4539' 'IPR007324' '\

    This probable domain is found in bacterial transcriptional regulators such as DeoR and SorC. One of these proteins, , has an N-terminal helix-turn-helix that binds to DNA. This domain is probably the ligand regulator binding region. SorC is regulated by sorbose and other members of this family are likely to be regulated by other sugar substrates.

    \ ' '4540' 'IPR005828' '\

    Recent genome-sequencing data and a wealth of biochemical and molecular genetic investigations have revealed the occurrence of dozens of families of primary and secondary transporters. Two such families have been found to occur ubiquitously in all classifications of living organisms. These are the ATP-binding cassette (ABC) superfamily and the major facilitator superfamily (MFS), also called the uniporter-symporter-antiporter family. While ABC family permeases are in general multicomponent primary active transporters, capable of transporting both small molecules and macromolecules in response to ATP hydrolysis, the MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients. Although well over 100 families of transporters have now been recognised and classified, the ABC superfamily and MFS account for nearly half of the solute transporters encoded within the genomes of microorganisms. They are also prevalent in higher organisms. The importance of these two families of transport systems to living organisms can therefore not be overestimated PUBMED:9529885.

    \ \

    The MFS was originally believed to function primarily in the uptake of sugars but subsequent studies revealed that drug efflux systems, Krebs cycle metabolites, organophosphate:phosphate exchangers, oligosaccharide:H+ symport permeases, and bacterial aromatic acid permeases were all members of the MFS. These observations suggested that the MFS is far more widespread in nature and far more diverse in function than had been thought previously. 17 subgroups of the MFS have been identified PUBMED:9529885.

    \ \

    Evidence suggests that the MFS permeases arose by a tandem intragenic duplication event in the early prokaryotes. This event generated a 12-transmembrane-spanner (TMS) protein topology from a primordial 6-TMS unit. Surprisingly, all currently recognised MFS permeases retain the two six-TMS units within a single polypeptide chain, although in 3 of the 17 MFS families, an additional two TMSs are found PUBMED:8987357. Moreover, the well-conserved MFS specific motif between TMS2 and TMS3 and the related but less well conserved motif between TMS8 and TMS9 PUBMED:1970645 prove to be a characteristic of virtually all of the more than 300 MFS proteins identified.

    \ \ ' '4541' 'IPR001950' '\ In Saccharomyces cerevisiae (Baker\'s yeast), SUI1 is a translation initiation factor that functions in concert with eIF-2 and the initiator tRNA-Met in directing the ribosome to the proper start site of translation PUBMED:1729602. SUI1 is a protein of 108 residues. Close homologs of SUI1 have been found PUBMED:7904817 in mammals, insects and plants. SUI1 is also evolutionarily related to hypothetical proteins from Escherichia coli (yciH), Haemophilus influenzae (HI1225) and Methanococcus vannielii.\ ' '4542' 'IPR004596' '\

    All proteins in this family for which the functions are known are cell division inhibitors. In Escherichia coli, SulA is one of the SOS regulated genes. Accumulation of SulA causes rapid cessation of cell division and the appearance of long, non-septate filaments. In the presence of GTP, SulA binds a polymerisation-competent form of ftsZ in a 1:1 ratio, thus inhibiting ftsZ polymerisation and therefore preventing it from participating in the assembly of the Z ring. This mechanism prevents the premature segregation of damaged DNA to daughter cells during cell division. The expression of SulA is repressed by LexA. The N-terminus of SulA may be involved in recognising the cell division apparatus.

    \ ' '4543' 'IPR011547' '\

    A number of proteins involved in the transport of sulphate across a membrane,\ as well as some as yet uncharacterised proteins, have been shown PUBMED:8140616, PUBMED:7616962 to be evolutionarily related.\ These proteins are:\

    \

    These proteins are highly hydrophobic and seem to contain about 12 transmembrane domains.

    \ \ ' '4544' 'IPR005556' '\

    This family of proteins is restricted to the fungi, specifically the Saccharomycetales and Schizosaccharomycetales. In Saccharomyces cerevisiae (Baker\'s yeast) they have been termed the SUN gene family, whose products display high homology in their 258 amino acid C-terminal domain. SIM1, UTH1 and NCA3 (the founding members that give the SUN family its name), together with SUN4, are involved in different cellular processes: DNA replication, ageing, mitochondrial biogenesis and the cell septation process (SUN4) PUBMED:10870102.

    \ \

    NCA3 (Nuclear Control of ATPase) is one of the two nuclear genes involved in the control of mitochondrial expression of subunits 6 and 8 of the Fo-F1 ATP synthase in Saccharomyces cerevisiae (Baker\'s yeast). Mutations in either NCA2 () or NCA3 dramatically lower the level of the co-transcript encoding subunits 6 and 8 PUBMED:7723016, PUBMED:7586026. Also, since NCA3 is one of the four S. cerevisiae genes of the SUN family, other SUN family genes were tested; however, only UTH1 (but not SUN4 or SIM1) was found to interfere with mitochondrial biogenesis PUBMED:10683261.

    \ ' '4545' 'IPR005100' '\ This short region of similarity is found in two tandem copies in Supt5 proteins that are involved in chromatin regulation. The function of this region is unknown.\ ' '4546' 'IPR002828' '\

    This entry represents a SurE-like structural domain with a 3-layer alpha/beta/alpha topology that bears some topological similarity to the N-terminal domain of the glutaminase/asparaginase family. This domain is found in the stationary phase survival protein SurE, a metal ion-dependent phosphatase found in eubacteria, archaea and eukaryotes. In Escherichia coli,\ SurE also has activity as a nucleotidase and exopolyphosphatase, and may be involved in the stress response PUBMED:17561111. E. coli cells with mutations in the surE gene survive poorly in stationary phase PUBMED:11709173. The structures of SurE homologues have been determined from Thermotoga maritima PUBMED:11524683 and the archaeon Pyrobaculum aerophilum PUBMED:12595266. The T. maritima SurE homologue has phosphatase activity that is inhibited by vanadate or tungstate, both of which bind adjacent to the divalent metal ion.

    \

    This domain is found in acid phosphatases (), 5\'-nucleotidases (), 3\'-nucleotidases () and exopolyphosphatases ().

    \ ' '4547' 'IPR002994' '\

    The surfeit locus 1 gene (SURF1 or surf-1) encodes a conserved protein of\ about 300 amino-acid residues that seems to be involved in the biogenesis of\ cytochrome c oxidase PUBMED:9843204. Vertebrate SURF1 is evolutionarily related to yeast\ protein SHY1. There seem to be two transmembrane regions in these proteins,\ one in the N-terminal, the other in the C-terminal.\ Rickettsia prowazekii protein RP733 is also a member of this protein family.

    \ ' '4548' 'IPR002995' '\

    The surfeit locus gene SURF4 (or surf-4) encodes a conserved integral eukaryotic membrane protein of about 270 to 300 amino-acid residues that seems to be located in the endoplasmic reticulum PUBMED:7540914.

    \ ' '4549' 'IPR002566' '\ This family includes a number of bacterial surface antigens expressed on the surface of pathogens. The Anaplasma marginale surface proteins are targets of protective immune responses but are antigenically polymorphic PUBMED:8063397, PUBMED:8294020.\ ' '4550' 'IPR000436' '\

    Sushi domains, also known as complement control protein (CCP) modules or short consensus repeats (SCR), exist in a wide\ variety of complement and adhesion proteins. \ The structure is known for this domain;\ it is based on a beta-sandwich arrangement, with one\ face made up of three beta-strands hydrogen-bonded to form a triple-stranded region at its\ centre and the other face formed from two separate beta-strands PUBMED:1829116.

    \ \

    CD21 (also called C3d receptor, CR2, Epstein Barr virus receptor or EBV-R) is the receptor for EBV and for C3d, C3dg and iC3b. Complement components may activate B cells through CD21. CD21 is part of a large signal-transduction complex that also involves CD19, CD81, and Leu13.

    \ \

    Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Complement decay-accelerating factor (Antigen CD55) belongs to the Cromer blood group system and is associated with Cr(a), Dr(a), Es(a), Tc(a/b/c), Wd(a), WES(a/b), IFC and UMC antigens. Complement receptor type 1 (C3b/C4b receptor) (Antigen CD35) belongs to the Knops blood group system and is associated with Kn(a/b), McC(a), Sl(a) and Yk(a) antigens.

    \ \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).\

    \ \ ' '4551' 'IPR007233' '\ Synbindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses. Syndecan-2 induces spine formation by recruiting intracellular vesicles toward postsynaptic sites through the interaction with synbindin PUBMED:11018053. \ ' '4552' 'IPR002013' '\

    Synaptic vesicles are recycled with remarkable speed and precision in nerve terminals. A major recycling pathway involves clathrin-mediated endocytosis at endocytic zones located around sites of release. Different \'accessory\' proteins linked to this pathway have been shown to alter the shape and composition of lipid membranes, to modify membrane-coat protein interactions, and to influence actin polymerisation. These include the GTPase dynamin, the lysophosphatidic acid acyl transferase endophilin, and the phosphoinositide phosphatase synaptojanin PUBMED:9851978.

    \ \

    The recessive suppressor of secretory defect in yeast Golgi and yeast\ actin function belongs to this family. This protein may be involved in the coordination of the activities of the secretory pathway and the actin cytoskeleton.

    \ \

    Human synaptojanin which may be localised on coated endocytic intermediates in nerve terminals also belongs to this family.

    \ ' '4553' 'IPR001359' '\ Synapsins are neuronal phosphoproteins that coat synaptic vesicles, bind to several \ elements of the cytoskeleton (including actin filaments), and are believed to function in \ the regulation of neurotransmitter release PUBMED:2117454, PUBMED:10578110. The synapsin family currently \ includes the highly related synapsin I and II. Both synapsins exist in two alternatively \ spliced variants, IA and IB and IIA and IIB, that only differ at the C-terminus. \ It also includes synapsin III.\ ' '4554' 'IPR001359' '\ Synapsins are neuronal phosphoproteins that coat synaptic vesicles, bind to several \ elements of the cytoskeleton (including actin filaments), and are believed to function in \ the regulation of neurotransmitter release PUBMED:2117454, PUBMED:10578110. The synapsin family currently \ includes the highly related synapsin I and II. Both synapsins exist in two alternatively \ spliced variants, IA and IB and IIA and IIB, that only differ at the C-terminus. \ It also includes synapsin III.\ ' '4555' 'IPR001388' '\

    Synaptobrevin is an intrinsic membrane protein of small synaptic vesicles PUBMED:2560644, specialised secretory organelles of neurons that actively accumulate neurotransmitters and participate in their calcium-dependent release by exocytosis. Vesicle function is mediated by proteins in their membranes, although the precise nature of the protein-protein interactions underlying this is still uncertain PUBMED:1976629. Synaptobrevin may play a role in the molecular events underlying neurotransmitter release and vesicle recycling and may be involved in the regulation of membrane flow in the nerve terminal, a process mediated by interaction with low molecular weight GTP-binding proteins PUBMED:8406010. Synaptic vesicle-associated membrane proteins (VAMPs) from Torpedo californica (Pacific electric ray) and SNC1 from yeast are related to synaptobrevin.

    \ ' '4557' 'IPR008253' '\

    The ~130-residue MAL and related proteins for vesicle trafficking and membrane\ link (MARVEL) domain is a module with a four transmembrane-helix architecture\ that has been identified in proteins of the myelin and lymphocyte (MAL),\ physins, gyrins and occludin families. All described MARVEL domain-containing\ proteins are consistent with the M-shaped topology: four transmembrane-helix\ region architecture with cytoplasmic N- and C-terminal regions. Their function\ could be related to cholesterol-rich membrane apposition events in a variety\ of cellular processes, such as biogenesis of vesicular transport carriers or\ tight junction regulation PUBMED:12468223.

    \ \ \ ' '4558' 'IPR001050' '\

    The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions PUBMED:1335744, PUBMED:8370471.

    \ \ \

    Structurally, these proteins consist of four separate domains:

    \

    \ \

    The proteins known to belong to this family are:

    \ \

    \ \

    Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts, strongly, not only with fatty acyl groups but also the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion PUBMED:9582338, PUBMED:11456484.

    \ ' '4559' 'IPR001058' '\

    Synucleins are small, soluble proteins expressed primarily in neural tissue and in certain tumors PUBMED:9750188, PUBMED:11806835. The family includes three known proteins: alpha-synuclein, beta-synuclein, and gamma-synuclein. All synucleins have in common a highly conserved alpha-helical lipid-binding motif with similarity to the class-A2 lipid-binding domains of the exchangeable apolipoproteins PUBMED:10952980.

    \

    Synuclein family members are not found outside vertebrates, although they have some conserved structural similarity with plant \'late-embryo-abundant\' proteins. The alpha- and beta-synuclein proteins are found primarily in brain tissue, where they are seen mainly in presynaptic terminals PUBMED:7857654, PUBMED:7877458. The gamma-synuclein protein is found primarily in the peripheral nervous system and retina, but its expression in breast tumors is a marker for tumor progression PUBMED:9044857.\ Normal cellular functions have not been determined for any of the synuclein proteins,\ although some data suggest a role in the regulation of membrane stability and/or turnover.\ Mutations in alpha-synuclein are associated with rare familial cases of early-onset Parkinson\'s\ disease, and the protein accumulates abnormally in Parkinson\'s disease, Alzheimer\'s disease,\ and several other neurodegenerative illnesses PUBMED:11433374.

    \ ' '4560' 'IPR000643' '\ Iodothyronine deiodinase () (DI) PUBMED:, PUBMED:7592917 is the vertebrate enzyme responsible for the deiodination of\ the prohormone thyroxine (T4 or 3,5,3\',5\'-tetraiodothyronine) into the biologically active hormone T3\ (3,5,3\'-triiodothyronine) and of T3 into the inactive metabolite T2 (3,3\'-diiodothyronine). All known DI are\ proteins of about 250 residues that contain a selenocysteine at their active site. Three types of DI are\ known; type II is essential for providing the brain with the appropriate levels of T3 during the critical\ period of development, and type III is essential for the regulation of thyroid hormone inactivation during\ embryological development.\ ' '4561' 'IPR005601' '\

    Irreversible binding of T-even bacteriophages to Escherichia coli is mediated by the short tail fibres, which serve as inextensible stays during DNA injection. Short tail fibres are exceptionally stable elongated trimers of gene product 12 (gp12), a 56 kDa protein. The N-terminal region of gp12 is important for phage attachment, the central region forms a long shaft, while a C-terminal globular region is implicated in binding to the bacterial lipopolysaccharide core. The distal half-fiber contains two molecules each of gp36 and gp37 and one molecule of gp35.\

    \ ' '4562' 'IPR005604' '\

    The bacteriophage T7 tail complex consists of a conical tail-tube surrounded by six kinked tail-fibers, which are oligomers of the viral protein gp17.

    \ ' '4563' 'IPR003133' '\

    The group of polyomaviruses is formed by the homonymous murine virus (Py) as well as other representative members such as the simian virus 40 (SV40) and the human BK and JC viruses PUBMED:8824775. Their large T antigen (T-ag) protein binds to and activates DNA replication from the origin of DNA replication (ori). Insofar as is known, the T-ag binds to the origin first as a monomer to its pentanucleotide recognition element. The monomers are then thought to assemble into hexamers and double hexamers, which constitute the form that is active in initiation of DNA replication. When bound to the ori, T-ag double hexamers encircle DNA PUBMED:17139255. T-ag is a multidomain protein that contains an N-terminal J domain, which mediates protein interactions (see , ), a central origin-binding domain (OBD), and a C-terminal superfamily 3 helicase domain (see , ) PUBMED:16611889.

    \

    This entry represents the central origin-binding domain (OBD). The overall fold of the ~130-residue T-ag OBD can be described as a central five-stranded antiparallel beta-sheet flanked by two alpha-helices on one side and one alpha-helix and one 3(10)-helix on the other. Both faces of the central beta-sheet are largely hydrophobic and are protected from solvent by the helices, thus forming two hydrophobic cores PUBMED:8946857. The T-ag OBD molecules are arranged as a spiral with a left-handed twist having six T-ag OBDs per turn. The spiral surrounds a central channel, the inner wall of which consists of alpha helices PUBMED:8946857.\

    \ ' '4564' 'IPR007707' '\ This family contains the proteins TACC 1, 2 and 3, found concentrated in the centrosomes of eukaryotes which may play a conserved role in organising centrosomal microtubules. The human TACC proteins have been linked to cancer and TACC2 has been identified as a possible tumour suppressor (AZU-1) PUBMED:11121038.\ ' '4565' 'IPR002040' '\ Tachykinins PUBMED:3284438, PUBMED:1969374, PUBMED:1324401 are a group of biologically active peptides which excite\ neurons, evoke behavioral responses, are potent vasodilators and contract\ (directly or indirectly) many smooth muscles. This family includes many other peptides.\ Tachykinins, like most other active peptides, are synthesized as larger\ protein precursors that are enzymatically converted to their mature forms.\ Tachykinins are from ten to twelve residues long.\ ' '4566' 'IPR007055' '\

    The BON domain is typically ~60 residues long and has an alpha/beta predicted fold. There is a\ conserved glycine residue and several hydrophobic regions. This pattern of conservation is more\ suggestive of a binding or structural function rather than a catalytic function. Most proteobacteria seem to possess one or two BON-containing proteins, typically of the\ OsmY-type proteins; outside of this group the distribution is more disparate.

    The OsmY protein is an Escherichia coli 20 kDa outer membrane or periplasmic protein that is expressed in response to a variety of stress conditions, in particular, helping to provide\ protection against osmotic shock. One hypothesis is that OsmY prevents shrinkage of\ the cytoplasmic compartment by contacting the phospholipid interfaces surrounding the periplasmic\ space. The domain architecture of two BON domains alone suggests\ that these domains contact the surfaces of phospholipids, with each domain contacting a membrane PUBMED:12878000.

    \ ' '4567' 'IPR004823' '\ The TATA box binding protein associated factor (TAF) is part of the transcription initiation factor TFIID multimeric protein complex. TFIID plays a central role in mediating promoter responses to various activators and repressors. It binds tightly to TAFII-250 and directly interacts with TAFII-40. TFIID is composed of TATA binding protein (TBP) and a number of TBP-associated factors (TAFs). TAF proteins adopt a histone-like fold.\ ' '4568' 'IPR006751' '\ The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. TAFII55 binds to TAFII250 and inhibits its acetyltransferase activity. The exact role of TAFII55 is currently unknown. The conserved region is situated towards the N-terminal of the protein PUBMED:11592977.\ ' '4569' 'IPR006522' '\

    These sequences describe protein S of phage P2, suggested experimentally to act in tail completion and stable head joining, and related proteins from a number of phages.

    \ ' '4570' 'IPR003122' '\

    Methyl-accepting chemotaxis proteins (MCPs) are a family of bacterial receptors that mediate chemotaxis to diverse signals, responding to changes in the concentration of attractants and repellents in the environment by altering swimming behaviour PUBMED:16359703. Environmental diversity gives rise to diversity in bacterial signalling receptors, and consequently there are many genes encoding MCPs PUBMED:17299051. For example, there are four well-characterised MCPs found in Escherichia coli: Tar (taxis towards aspartate and maltose, away from nickel and cobalt), Tsr (taxis towards serine, away from leucine, indole and weak acids), Trg (taxis towards galactose and ribose) and Tap (taxis towards dipeptides).

    \

    MCPs share similar topology and signalling mechanisms. MCPs either bind ligands directly or interact with ligand-binding proteins, transducing the signal to downstream signalling proteins in the cytoplasm. MCPs undergo two covalent modifications: deamidation and reversible methylation at a number of glutamate residues. Attractants increase the level of methylation, while repellents decrease it. The methyl groups are added by the methyl-transferase cheR and are removed by the methylesterase cheB. Most MCPs are homodimers that contain the following organisation: an N-terminal signal sequence that acts as a transmembrane domain in the mature protein; a poorly-conserved periplasmic receptor (ligand-binding) domain; a second transmembrane domain; and a highly-conserved C-terminal cytoplasmic domain that interacts with downstream signalling components. The C-terminal domain contains the methylated glutamate residues.

    \ \

    This entry represents the ligand-binding domain found in a number of methyl-accepting chemotaxis receptors.

    \ ' '4571' 'IPR001831' '\

    Like other lentiviruses, Human immunodeficiency virus 1 (HIV-1) encodes a trans-activating regulatory protein (Tat), which is essential for efficient transcription of the viral genome PUBMED:1883204, PUBMED:8058789. Tat acts by binding to an RNA stem-loop structure, the trans-activating response element (TAR), found at the 5\' ends of nascent HIV-1 transcripts. In binding to TAR, Tat alters the properties of the transcription complex, recruits a positive transcription elongation complex (P-TEFb) and hence increases the production of full-length viral RNA PUBMED:8058789. Tat protein also associates with RNA polymerase II complexes during early transcription elongation after\ the promoter clearance and before the synthesis of full-length TAR RNA transcript. This interaction of Tat with RNA polymerase II elongation\ complexes is P-TEFb-independent. There are two Tat binding sites on each transcription elongation complex; one is located on\ TAR RNA and the other one on RNA polymerase II near the exit site for nascent mRNA transcripts which suggests that two Tat molecules are\ involved in performing various functions during a single round of HIV-1 mRNA synthesis PUBMED:12126615.

    \

    The minimum Tat sequence that can mediate specific TAR binding in vitro has been mapped to a basic domain of 10 amino acids, comprising mostly Arg and Lys residues. Regulatory activity, however, also requires the 47 N-terminal residues, which interact with components of the transcription complex and function as a transcriptional activation domain PUBMED:8058789, PUBMED:2117500, PUBMED:8121496.

    \ ' '4572' 'IPR001130' '\ This family of proteins is related to a large superfamily of metalloenzymes PUBMED:9144792. TatD, a member of this family, has\ been shown experimentally to be a DNase enzyme PUBMED:10747959. Allantoinase , \ N-isopropylammelide isopropyl amidohydrolase and \ the SCN1 protein from fission yeast belong to this family.\ ' '4573' 'IPR005092' '\

    This family of trans-activating transcriptional regulators (TATR), also known as immediate early protein 1, is common to the Nucleopolyhedroviruses.

    \ ' '4574' 'IPR003819' '\ This family consists of TauD/TfdA taurine catabolism dioxygenases. The Escherichia coli tauD gene is required for the utilization of taurine (2-aminoethanesulphonic acid) as a sulphur source and is expressed only under conditions of sulphate starvation. TauD is an alpha-ketoglutarate-dependent dioxygenase catalyzing the oxygenolytic release of sulphite from taurine PUBMED:9287300. The 2,4-dichlorophenoxyacetic acid/alpha-ketoglutarate dioxygenase from Burkholderia sp. (strain RASC) also belongs to this family PUBMED:8779585. TfdA from Ralstonia eutropha (Alcaligenes eutrophus) is a 2,4-D monooxygenase PUBMED:3036764.\ ' '4575' 'IPR004370' '\

    4-Oxalocrotonate tautomerase (4-OT) catalyzes the isomerisation of beta,gamma-unsaturated enones to their alpha,beta-isomers. The enzyme is part of a plasmid-encoded\ pathway, which enables bacteria harbouring the plasmid to use various aromatic hydrocarbons as their sole sources of carbon and energy. The\ enzyme is a barrel-shaped hexamer, which can be viewed as a trimer of dimers. The hexamer contains a hydrophobic core formed by three beta-sheets and\ surrounded by three pairs of alpha-helices. Each 4-OT monomer of 62 amino acids has a relatively simple beta-alpha-beta fold as described by the structure of the enzyme from Pseudomonas putida PUBMED:12051677. The monomer begins\ with a conserved proline at the start of a beta-strand, followed by an alpha-helix and a 3(10)-helix preceding a second parallel beta-strand, and ends with\ a beta-hairpin near the C-terminus. The dimer results from antiparallel interactions between the beta-sheets and alpha-helices of the two monomers, forming a\ four-stranded beta-sheet with antiparallel alpha-helices on one side, creating two active sites, one at each end of the beta-sheet. Three dimers further\ associate to form a hexamer by the interactions of the strands of the C-terminal beta-hairpin loops with the edges of the four-stranded beta-sheets of neighbouring\ dimers, creating a series of cross-links that stabilise the hexamer.

    \

    Pro-1 of the mature protein functions as the general base while Arg-39 and an ordered water molecule each provide a hydrogen bond to the C-2 oxygen of substrate. Arg-39\ plays an additional role in the binding of the C-1 carboxylate group. Arg-11 participates both in substrate binding and in catalysis. It\ interacts with the C-6 carboxylate group, thereby holding the substrate in place and drawing electron density to the C-5 position. The hydrophobic nature of\ the active site, which lowers the pKa of Pro-1 and provides a favourable environment for catalysis, is largely maintained by Phe-50.

    \ \

    Because several Arg residues located near the active site are not conserved among all members of this family and because of the presence of fairly distantly related paralogs in Campylobacter jejuni, the family is regarded as not necessarily uniform in function.

    \ ' '4576' 'IPR004120' '\ Human T-lymphotropic virus 1 is the etiological agent for adult T-cell leukemia (ATL), as well as for\ tropical spastic paraparesis (TSP) and HTLV-I associated myelopathy (HAM). A biological understanding of the\ involvement of HTLV-I in ATL has focused significantly on the workings of the virally-encoded 40 kDa\ phospho-oncoprotein, Tax. Tax is a transcriptional activator. Its ability to modulate the expression and function of many\ cellular genes has been reasoned to be a major contributory mechanism explaining HTLV-I mediated transformation of\ cells. In activating cellular gene expression, Tax impinges upon several cellular signal-transduction pathways, including\ those for CREB/ATF and NF-kappaB PUBMED:11325603.\ ' '4577' 'IPR002212' '\

    The transforming growth factor beta (TGF-beta)-binding protein-like (TB) domain comes from human fibrillin-1 PUBMED:8364578. This domain is\ found in fibrillins and latent TGF-beta-binding proteins (LTBPs) which are localized to\ fibrillar structures in the extracellular matrix PUBMED:9362480.

    \ ' '4578' 'IPR000195' '\ Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are\ GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains\ are GTPase activators of Rab-like small GTPases PUBMED:11013213.\ ' '4579' 'IPR004226' '\

    The folding pathway of tubulins includes highly specific interactions with a series of cofactors (A, B, C, D and E) after they are released from the eukaryotic chaperonin CCT. Cofactors A and D capture and stabilise tubulin in a quasi-native conformation. Cofactor E binds to the cofactor D-tubulin complex, and interaction with cofactor C then causes the release of tubulin polypeptides in the native state. This family is the tubulin-specific chaperone A.

    \ \ ' '4580' 'IPR005332' '\

    Two small nested genes (p19 and p22) are located near the 3\' end of the genome of tomato bushy\ stunt virus (TBSV) - the p19 gene encodes a soluble protein, whereas the p22 gene specifies a membrane-associated protein. p22 is required for cell-to-cell movement in all plants tested PUBMED:7491767.

    \ ' '4581' 'IPR004832' '\ Two related oncogenes, TCL-1 and MTCP-1, are overexpressed in T cell prolymphocytic leukemias as a result of chromosomal rearrangements that involve the translocation of one T cell receptor gene to either chromosome 14q32 or Xq28 PUBMED:9520380.\ ' '4582' 'IPR005333' '\

    The TCP transcription factor family was named after: teosinte branched 1 (tb1, Zea mays (Maize)) PUBMED:17452340, cycloidea (cyc) (Antirrhinum majus) (Garden snapdragon) PUBMED:10363373 and PCF in rice (Oryza sativa) PUBMED:12000681, PUBMED:539398. The TCP genes code for structurally related proteins implicated in the evolution of key morphological traits PUBMED:10363373. However, the biochemical function of CYC and TB1 proteins remains to be demonstrated. One of the conserved regions is predicted to form a non-canonical basic-Helix-Loop-Helix (bHLH) structure. This domain is also found in two rice DNA-binding proteins, PCF1 and PCF2, where it has been shown to be involved in DNA-binding and dimerization.

    \

    This family of transcription factors is exclusive to higher plants. They can be divided into two groups, TCP-C and TCP-P, that appear to have separated following an early gene duplication event PUBMED:17568984. This duplication event may have led to functional divergence and it has been proposed that the TCP-P subfamily are transcriptional repressors, while the TCP-C subfamily are transcriptional activators PUBMED:16123132.

    \ ' '4583' 'IPR005334' '\

    Tctex-1 is a dynein light chain. Dynein translocates rhodopsin-bearing vesicles along microtubules and it has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. An efficient vectorial transport\ system must be required to deliver large numbers of newly synthesized rhodopsin molecules (~10^7 molecules per\ day per photoreceptor) to the base of the outer segment of the photoreceptor; Tctex-1 may well play a role in this process. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit the interaction between Tctex-1 and rhodopsin, which may be the molecular basis of\ retinitis pigmentosa.

    In the mouse, the\ chromosomal location and pattern of expression of Tctex-1 make it a candidate for involvement in male sterility PUBMED:2570638.

    \ ' '4584' 'IPR001983' '\

    Mammalian translationally controlled tumour protein (TCTP) (or P23) is a protein which has been found to be preferentially synthesised in cells during the early growth phase of some types of tumour PUBMED:2479380, PUBMED:3357792, but which is also expressed in normal cells. The physiological function of TCTP is still not known. It was first identified as a histamine-releasing factor, acting in IgE-dependent allergic reactions. In addition, TCTP has been shown to bind to tubulin in the cytoskeleton, has a high affinity for calcium, is the binding target for the antimalarial compound artemisinin, and is induced in vitamin D-dependent apoptosis. TCTP production is thought to be controlled at the translational as well as the transcriptional level PUBMED:10951206.

    \ \

    TCTP is a hydrophilic protein of 18 to 20 kDa. TCTPs do not share significant sequence similarity with any other class of proteins. Recently, the structure of TCTP was determined and exhibited significant structural similarity to the human protein Mss4, which is a guanine nucleotide-free chaperone of the Rab protein PUBMED:11473261. Close homologues have been found in plants PUBMED:1623194, earthworm PUBMED:9655922, Caenorhabditis elegans (F52H2.11), Hydra, Saccharomyces cerevisiae (YKL056c) PUBMED:8091862 and Schizosaccharomyces pombe (SpAC1F12.02c).

    \ ' '4585' 'IPR005015' '\

    Thermostable direct haemolysin (TDH) is considered an important virulence factor in Vibrio parahaemolyticus gastroenteritis and is a dimer composed of two identical subunit molecules of approximately 21 kDa. A number of biological properties have been attributed to TDH including haemolytic activity, enterotoxicity, cytotoxicity and cardiotoxicity PUBMED:11267763.

    \ ' '4586' 'IPR000818' '\ Transcriptional enhancer activators are nuclear proteins that contain a TEA/ATTS domain, a DNA-binding region of 66-68 amino acids. The TEA/ATTS domain is found in the N-termini of certain gene regulatory proteins, such as the Simian virus 40 (SV40) enhancer factor TEF-1, yeast trans-acting factor TEC-1 (which is required for TY1 enhancer activity), and the Aspergillus abaA regulatory gene product. SV40 and retroviral enhancers, and those to which TEF-1, TEC-1 and abaA proteins bind, contain GT-IIC sites: the TEA/ATTS domain may therefore recognise and bind such sites. Secondary structure predictions suggest the presence of 3 helices, but have not confirmed the presence of the helix-turn-helix motif characteristic of many DNA-binding proteins: DNA-binding may therefore be effected by a different mechanism PUBMED:2070413.\ ' '4587' 'IPR015985' '\

    Tellurite resistance protein TehB is part of a tellurite-reducing operon tehA and tehB. When present in high copy number, TehB is responsible for potassium tellurite resistance, probably by increasing the reduction rate of tellurite to metallic tellurium within the bacterium. TehB is a cytoplasmic protein which possesses three conserved motifs (I, II, and III) found in S-adenosyl-L-methionine (SAM)-dependent non-nucleic acid methyltransferases PUBMED:11053398. Conformational changes in TehB are observed upon binding of both tellurite and SAM, suggesting that TehB utilises a methyltransferase activity in the detoxification of tellurite.

    \ \

    This entry represents the core methyltransferase domain found in all TehB proteins.

    \ ' '4588' 'IPR011564' '\

    The telomere-binding protein forms a heterodimer in ciliates consisting of an alpha and a beta subunit. This complex may function as a protective cap for the single-stranded telomeric overhang. The alpha subunit consists of 3 structural domains, all with the same beta-barrel OB fold.

    \ ' '4591' 'IPR004305' '\

    Proteins containing this domain are found in all three major kingdoms of life: archaebacteria, eubacteria, and eukaryotes. In\ Bacillus subtilis, TENA is one of a number of proteins that enhance the expression of extracellular enzymes, such as\ alkaline protease, neutral protease and levansucrase PUBMED:1898926.

    \

    The THI-4 protein, which is involved in thiamine biosynthesis, also contains this domain. The C-terminal part of these proteins consistently shows significant sequence similarity to\ TENA proteins. This similarity was first noted with the Neurospora crassa THI-4 PUBMED:8662211. The exact molecular function of\ this domain is uncertain.

    \ ' '4592' 'IPR006960' '\ This entry contains the tenuivirus major non-capsid protein. Proteins accumulate in large amounts in tenuivirus infected cells. They are found in the inclusion bodies that are formed after infection PUBMED:8317091.\ ' '4593' 'IPR004980' '\

    This is a non-structural protein found in members of the Tenuivirus family.

    \ ' '4594' 'IPR007791' '\ This family contains the TerB tellurite resistance proteins from a number of bacteria.\ ' '4595' 'IPR005496' '\ This family contains a number of integral membrane proteins including the TerC protein. TerC has been implicated in resistance to tellurium, and may be involved in efflux of tellurium ions.\ The tellurite-resistant Escherichia coli strain KL53 was found during testing of a group of clinical isolates for antibiotic and heavy metal ion resistance PUBMED:10069007. The determinant of the strain\'s tellurite resistance was located on a large conjugative plasmid, and analyses showed the genes terB, terC, terD and terE were essential for conservation of this resistance.\ Members of this family contain a number of conserved aspartates which may be involved in metal ion binding.\ ' '4596' 'IPR003325' '\ This domain is found in tellurite resistance proteins, cAMP binding protein, and chemical-damaging agent resistance proteins and general stress proteins. \ Tellurium compounds are used in several industrial processes, although they are\ relatively rare in the environment. Genes associated with tellurite resistance (TeR) are found in many pathogenic bacteria PUBMED:10203839. \

    The cellular slime mould, Dictyostelium discoideum, contains a cAMP-binding protein, CABP1, which is composed of two subunits. The C-terminal half of these subunits contains this domain PUBMED:2176639.

    \ ' '4597' 'IPR005335' '\

    Packaging of double-stranded viral DNA concatemers requires interaction of the prohead with virus DNA. This process is mediated by a phage-encoded DNA recognition and terminase protein. The terminase enzymes described so far, which are hetero-oligomers composed of a small and a large subunit, do not have a significant level of sequence homology. The small terminase subunit is thought to form a nucleoprotein structure that helps to position the terminase large subunit at the packaging initiation site PUBMED:2679356.

    \ ' '4598' 'IPR005630' '\

    Sequences containing this domain belong to the terpene synthase family. It has been suggested that this gene family be designated tps (for terpene synthase). Sequence comparisons reveal similarities between the monoterpene (C10) synthases, sesquiterpene (C15) synthases and the diterpene (C20) synthases. It has been split into six subgroups on the basis of phylogeny, called Tpsa-Tpsf PUBMED:9268308.

    \ \ \ \

    In the fungus Phaeosphaeria sp. (strain L487) the synthesis of ent-kaurene from geranylgeranyl diphosphate is promoted by a single bifunctional protein PUBMED:9268298.

    \ ' '4599' 'IPR004111' '\

    The antibiotic tetracycline has a broad spectrum of activity, acting to inhibit bacterial protein synthesis by binding to the 30S ribosomal subunit, which prevents the association of the aminoacyl-tRNA to the ribosomal acceptor A site. Tetracycline binding is reversible, therefore diluting out the antibiotic can reverse its effects. Tetracycline resistance genes are often located on mobile elements, such as plasmids, transposons and/or conjugative transposons, which can sometimes be transferred between bacterial species. In certain cases, tetracycline can enhance the transfer of these elements, thereby promoting resistance amongst a bacterial colony. There are three types of tetracycline resistance: tetracycline efflux, ribosomal protection, and tetracycline modification PUBMED:16887689, PUBMED:15837373:

    \

    \

    \

    \

    The expression of several of these tet genes is controlled by a family of tetracycline transcriptional regulators known as TetR. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity PUBMED:15944459. The TetR proteins identified in over 115 genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response.

    \ \

    This entry represents the C-terminal domain found in the tetracycline transcriptional repressor TetR, which binds to the Tet(A) gene to repress its expression in the absence of tetracycline PUBMED:7707374. Tet(A) is a membrane-associated efflux protein that exports tetracycline from the cell before it can attach to ribosomes and inhibit polypeptide chain growth. TetR occurs as a homodimer and uses two helix-turn-helix (HTH) motifs to bind tandem DNA operators, thereby blocking the expression of the associated genes, TetA and TetR. The structure of the class D TetR repressor protein PUBMED:8153629 involves 10 alpha-helices, with connecting turns and loops. The three N-terminal helices constitute the DNA-binding HTH domain, which has an inverse orientation compared with HTH motifs in other DNA-binding proteins. The core of the protein, formed by helices 5-10, is responsible for dimerisation and contains, for each monomer, a binding pocket that accommodates tetracycline in the presence of a divalent cation.

    \ ' '4600' 'IPR013854' '\

    Activator protein-2 (AP-2) transcription factors constitute a family of\ closely related and evolutionarily conserved proteins that bind to the DNA \ consensus sequence GCCNNNGGC and stimulate target gene transcription\ PUBMED:2010091, PUBMED:1998122.\ Four different isoforms of AP-2 have been identified in mammals, termed AP-2\ alpha, beta, gamma and delta. Each family member shares a common structure, \ possessing a proline/glutamine-rich domain in the N-terminal region, which \ is responsible for transcriptional activation PUBMED:2010091, and a helix-span-helix\ domain in the C-terminal region, which mediates dimerisation and site-specific DNA binding PUBMED:1998122.\

    \

    The AP-2 family have been shown to be critical regulators of gene expression\ during embryogenesis. They regulate the development of facial prominence and\ limb buds, and are essential for cranial closure and development of the lens\ PUBMED:11137286; they have also been implicated in tumourigenesis. AP-2 protein \ expression levels have been found to affect cell transformation, tumour \ growth and metastasis, and may predict survival in some types of cancer\ PUBMED:9632718, PUBMED:10864206\

    \ \

    This entry represents the C-terminal region of these proteins, including the helix-span-helix domain.

    \ ' '4601' 'IPR003711' '\

    The bacterium Myxococcus xanthus responds to blue light by producing carotenoids. It also responds to starvation conditions by developing fruiting bodies, where the cells differentiate into myxospores. Each response entails the transcriptional activation of a separate set of genes. A single gene, carD, is required for the activation of both light- and starvation-inducible genes PUBMED:8692912.

    \ \

    The predicted protein contains four repeats of a DNA-binding domain present in mammalian high mobility group I(Y) proteins and other nuclear proteins from animals and plants. Other peptide stretches on CarD also resemble functional domains typical of eukaryotic transcription factors, including a very acidic region and a leucine zipper. High mobility group I(Y) proteins are known to bind the minor groove of A+T-rich DNA PUBMED:8692912.

    \ ' '4602' 'IPR013851' '\

    Otx proteins constitute a class of vertebrate homeodomain-containing\ transcription factors that have been shown to be essential for anterior\ head formation, including brain morphogenesis. They are orthologous to the\ product of the Drosophila head gap gene, orthodenticle (Otd), and appear to\ play similar roles in both, since the developmental abnormalities caused by\ disruption of these transcription factors in one, can be recovered by\ substitution of the factor(s) from the other. Such studies have provided\ strong evidence that there exists a conserved genetic programme for insect\ and mammalian brain development, which presumably arose in a more primitive\ common ancestor PUBMED:10199636, PUBMED:10440864.

    \ \

    Two vertebrate orthodenticle-related transcription factors have been\ identified, Otx1 and Otx2, which have sizes of 355 and 289 residues\ respectively. They contain a bicoid-like homeodomain, which features a\ conserved lysine residue at position 9 of the DNA recognition helix, which\ is thought to confer high-affinity binding to TAATCC/T elements on DNA PUBMED:10375352.\ Otd-like transcription factors have also been found in zebrafish and \ certain lamprey species.

    \ \

    This entry represents a conserved region found in the C-terminal region of these proteins.

    \ ' '4603' 'IPR004598' '\ Members of this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. The core-TFIIH basal transcription factor complex has six subunits; this is the p52 subunit.\ ' '4604' 'IPR004855' '\

    Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP PUBMED:12818428, and can dissociate HMGB1 already bound to TBP/TATA-box.

    \

    Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2 PUBMED:8610010.

    \ \

    This entry represents the precursor that yields both the alpha and beta subunits of TFIIA. The TFIIA heterotrimer is an essential general transcription initiation factor for the expression of genes transcribed by RNA polymerase II PUBMED:11089979.

    \ ' '4605' 'IPR015872' '\

    Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP PUBMED:12818428, and can dissociate HMGB1 already bound to TBP/TATA-box.

    \

    Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2 PUBMED:8610010.

    \ \

    This entry represents the alpha-helical domain found at the N-terminal of the gamma subunit of transcription factor TFIIA.

    \ ' '4606' 'IPR015871' '\

    Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP PUBMED:12818428, and can dissociate HMGB1 already bound to TBP/TATA-box.

    \

    Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2 PUBMED:8610010.

    \ \

    This entry represents the beta-barrel domain found at the C-terminal of the gamma subunit of transcription factor TFIIA.

    \ ' '4607' 'IPR003162' '\ Human transcription initiation factor TFIID is composed of the TATA-binding polypeptide (TBP) and at least 13 TBP-associated factors (TAFs) that collectively or individually are involved in activator-dependent transcription PUBMED:7667268. \

    TAFII-31 protein is a transcriptional coactivator of the p53 protein PUBMED:7761466.

    \ ' '4608' 'IPR003923' '\

    Transcription initiation factor TFIID is a multimeric protein complex that\ plays a central role in mediating promoter responses to various activators\ and repressors. The complex includes TATA binding protein (TBP) and various\ TBP-associated factors (TAFs). TFIID a bona fide RNA polymerase II-specific\ TATA-binding protein-associated factor (TAF) and is essential for viability PUBMED:8662725.

    \

    TFIID acts to nucleate the transcription complex, recruiting the rest of\ the factors through a direct interaction with TFIIB. The TBP subunit of TFIID is sufficient for TATA-element binding and TFIIB interaction, and can support basal transcription. The protein belongs to the TAF2H family.

    \ ' '4609' 'IPR002853' '\

    Initiation of eukaryotic mRNA transcription requires melting of promoter DNA with the help of the general transcription factors TFIIE and TFIIH. In higher eukaryotes, the general transcription factor TFIIE consists of two subunits: the large alpha subunit and the small beta subunit. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The approximately 120-residue central core domain of TFIIE beta plays a role in double-stranded DNA binding of TFIIE PUBMED:10716934.

    \ \

    The TFIIE beta central core DNA-binding domain consists of three helices with a beta hairpin at the C-terminus, resembling the winged helix proteins. It shows a novel double-stranded DNA-binding activity where the DNA-binding surface locates on the opposite side to the previously reported winged helix motif by forming a positively charged furrow PUBMED:10716934.

    \ \ \

    This entry represents the conserved amino terminal region of eukaryotic TFIIE-alpha and proteins from archaebacteria (TFE) that are also presumed to be TFIIE-alpha subunits PUBMED:9389475.

    \ ' '4610' 'IPR003166' '\

    Initiation of eukaryotic mRNA transcription requires melting of promoter DNA with the help of the general transcription factors TFIIE and TFIIH. In higher eukaryotes, the general transcription factor TFIIE consists of two subunits: the large alpha subunit and the small beta subunit. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The approximately 120-residue central core domain of TFIIE beta plays a role in double-stranded DNA binding of TFIIE PUBMED:10716934.

    \ \

    The TFIIE beta central core DNA-binding domain consists of three helices with a beta hairpin at the C-terminus, resembling the winged helix proteins. It shows a novel double-stranded DNA-binding activity where the DNA-binding surface locates on the opposite side to the previously reported winged helix motif by forming a positively charged furrow PUBMED:10716934.

    \ \ \

    This entry represents the beta subunit of the transcription factor TFIIE.

    \ ' '4611' 'IPR003196' '\ Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIF (TFIIF) is a tetramer of two beta subunits associated with two alpha subunits, which interacts directly with RNA polymerase II. The beta subunit of TFIIF is required for recruitment of RNA polymerase II onto the promoter. \ ' '4612' 'IPR001222' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a zinc finger motif found in transcription factor IIS (TFIIS). In eukaryotes the initiation of transcription of protein-encoding genes by polymerase II (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box). At least eight different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, -IIH and -IIS PUBMED:3346229. During mRNA elongation, Pol II can encounter DNA sequences that cause reverse movement of the enzyme. Such backtracking involves extrusion of the RNA 3\'-end into the pore, and can lead to transcriptional arrest. Escape from arrest requires cleavage of the extruded RNA with the help of TFIIS, which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites PUBMED:10723030. TFIIS extends from the polymerase surface via a pore to the internal active site. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

    \

    TFIIS is a protein of about 300 amino acids. It contains three regions: a variable N-terminal domain not required for TFIIS activity; a conserved central domain required for Pol II binding; and a conserved C-terminal C4-type zinc finger essential for RNA cleavage. The zinc finger folds in a conformation termed a zinc ribbon PUBMED:7626141 characterised by a three-stranded antiparallel beta-sheet and two beta-hairpins. A backbone model for the Pol II-TFIIS complex was obtained from X-ray analysis. It shows that a beta hairpin protrudes from the zinc finger and complements the Pol II active site PUBMED:12914699.

    \

    Some viral proteins also contain the TFIIS zinc ribbon C-terminal domain. The Vaccinia virus protein, unlike its eukaryotic homologue, is an integral RNA polymerase subunit rather than a readily separable transcription factor PUBMED:2398897.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '4613' 'IPR007077' '\

    This domain is found in a number of bacterial proteins including the TfoX gene product of Haemophilus influenzae. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes PUBMED:7724607. This family corresponds to the C-terminal presumed domain of TfoX. The domain is found in association with the N-terminal domain in some, but not all members of this group, suggesting this is an autonomous and functionally unrelated domain. For example it is found associated with in .

    \ ' '4614' 'IPR007076' '\

    This domain is found in a number of bacterial proteins including the TfoX gene product of Haemophilus influenzae. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes PUBMED:7724607. This family corresponds to the N-terminal presumed domain of TfoX. The domain is found in association with the C-terminal domain in some, but not all members of this group, suggesting this is an autonomous and functionally unrelated domain.

    \ ' '4615' 'IPR001839' '\

    Transforming growth factor-beta (TGF-beta) is a multifunctional peptide that controls proliferation, differentiation and other functions in many cell types. TGF-beta-1 is a peptide of 112 amino acid residues derived by proteolytic cleavage from the C-terminal of a precursor protein PUBMED:8679613.

    \

    A number of proteins are known to be related to TGF-beta-1 PUBMED:1575734, PUBMED:8199356. Proteins from the TGF-beta family are only active as homo- or heterodimers, the two chains being linked by a single disulphide bond. From X-ray studies of TGF-beta-2 PUBMED:1631557, it is known that all the other cysteines are involved in intrachain disulphide bonds. There are four disulphide bonds in the TGF-betas and in inhibin beta chains, while the other members of this family lack the first bond.

    \

    The regulatory cytokine TGFbeta exerts tumour-suppressive effects, but also modulates cell invasion and immune regulation PUBMED:18662538. Misregulation of the TGF-beta signalling pathway can result in tumour development.

    \ ' '4616' 'IPR001111' '\ The transforming growth factor beta, N-terminus (TGFb) domain is present in a\ variety of proteins which include the transforming growth factor beta,\ decapentaplegic proteins and bone morphogenetic proteins. Transforming growth\ factor beta is a multifunctional peptide that controls proliferation,\ differentiation and other functions in many cell types. The decapentaplegic\ protein acts as an extracellular morphogen responsible for the proper\ development of the embryonic dorsal hypoderm, for viability of larvae and\ for cell viability of the epithelial cells in the imaginal disks. Bone\ morphogenetic protein induces cartilage and bone formation and may be responsible\ for epithelial osteogenesis in some organisms.\ ' '4617' 'IPR002616' '\ This is a family of queuine, archaeosine and general tRNA-ribosyltransferases , also known as tRNA-guanine transglycosylase and guanine insertion enzyme. Queuine tRNA-ribosyltransferase modifies tRNAs for asparagine, aspartic acid, histidine and tyrosine with queuine at position 34 and with archaeosine at position 15 in archaeal tRNAs. In bacteria it catalyses the exchange of guanine-34 at the wobble position with 7-aminomethyl-7-deazaguanine, and the addition of a cyclopentenediol moiety to 7-aminomethyl-7-deazaguanine-34 tRNA, giving a hypermodified base queuine in the wobble position PUBMED:8654383, PUBMED:8323579. The aligned region contains a zinc binding motif C-x-C-x2-C-x29-H, and important tRNA and 7-aminomethyl-7-deazaguanine binding residues PUBMED:8654383.\ ' '4618' 'IPR006942' '\ TH1 is a highly conserved but uncharacterised metazoan protein. No homologue has been identified in Caenorhabditis elegans PUBMED:11030415. TH1 binds specifically to A-Raf kinase PUBMED:11952167.\ ' '4619' 'IPR001938' '\

    Thaumatin PUBMED:7049841 is an intensely sweet-tasting protein, 100 000 times sweeter than sucrose on a molar basis PUBMED:7049841, found in berries from Thaumatococcus daniellii, a tropical flowering plant known as Katemfe. It is induced by attack by viroids, which are single-stranded unencapsulated RNA molecules that do not code for protein.

    \

    Thaumatin consists of about 200 residues and contains 8 disulphide bonds. Like other PR proteins, thaumatin is predicted to have a mainly beta structure, with a high content of beta-turns and little helix PUBMED:7049841. Several stress-induced proteins of plants have been found to be related to thaumatins:

    \

    \

    This protein is also referred to as pathogenesis-related group 5 (PR5), as many thaumatin-like proteins accumulate in plants in response to infection by a pathogen and possess antifungal activity PUBMED:1463856. The proteins are involved in systemic acquired resistance and stress response in plants, although their precise role is unknown PUBMED:1463856.

    \ ' '4620' 'IPR002922' '\ This family includes a putative thiamine biosynthetic enzyme PUBMED:7961415. This enzyme is involved in the biosynthesis of the thiamine precursor thiazole, and is repressed by thiamine.\ ' '4621' 'IPR002817' '\ ThiC is found within the thiamin biosynthesis operon. ThiC is involved in\ thiamin biosynthesis PUBMED:10382260. The precise catalytic\ function of ThiC is still not known. ThiC participates in the formation of\ 4-Amino-5-hydroxymethyl-2-methylpyrimidine from AIR, an intermediate in\ de novo purine biosynthesis.\ \ ' '4622' 'IPR020536' '\ Thiamine pyrophosphate (TPP) is synthesized de novo in many bacteria and is a required cofactor for many enzymes in the cell. \ ThiI is required for thiazole synthesis in the thiamine biosynthesis pathway PUBMED:9209060. Almost all proteins containing this entry have an N-terminal THUMP domain (see ).\ ' '4623' 'IPR001031' '\ Thioesterase domains often occur integrated in or associated with peptide synthetases\ which are involved in the non-ribosomal synthesis of peptide antibiotics PUBMED:9560421.\ Thioesterases are required for the addition of the last amino acid to the peptide\ antibiotic, thereby forming a cyclic antibiotic. Next to the operons encoding these\ enzymes, in almost all cases, are genes that encode proteins that have similarity to\ the type II fatty acid thioesterases of vertebrates.\ ' '4624' 'IPR001869' '\ Thiol-activated cytolysins PUBMED:2254290, PUBMED: are toxins produced by a variety of Gram-positive bacteria and are characterised by their ability to lyse cholesterol-containing membranes, their reversible inactivation by oxidation and their capacity to bind to cholesterol. All these proteins contain a single cysteine residue, located in their C-terminal section, which has been shown PUBMED:2888650 to be essential for the binding to cholesterol.\ ' '4625' 'IPR002155' '\

    Two different types of thiolase PUBMED:1755959, PUBMED:2191949, PUBMED:1354266 are found both in eukaryotes and in prokaryotes: acetoacetyl-CoA thiolase () and 3-ketoacyl-CoA thiolase (). 3-ketoacyl-CoA thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and is involved in degradative pathways such as fatty acid beta-oxidation. Acetoacetyl-CoA thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and involved in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis.

    \ \

    In eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion and the other in peroxisomes.

    \ \

    There are two conserved cysteine residues important for thiolase activity. The first, located in the N-terminal section of the enzymes, is involved in the formation of an acyl-enzyme intermediate; the second, located at the C-terminal extremity, is the active site base involved in deprotonation in the condensation reaction.

    \ \

    Mammalian nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein which seems to exist in two different forms: a 14 Kd protein (SCP-2) and a larger 58 Kd protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to SCP-2 while the N-terminal portion is evolutionarily related to thiolases PUBMED:1755959.

    \ ' '4626' 'IPR002155' '\

    Two different types of thiolase PUBMED:1755959, PUBMED:2191949, PUBMED:1354266 are found both in eukaryotes and in prokaryotes: acetoacetyl-CoA thiolase () and 3-ketoacyl-CoA thiolase (). 3-ketoacyl-CoA thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and is involved in degradative pathways such as fatty acid beta-oxidation. Acetoacetyl-CoA thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and involved in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis.

    \ \

    In eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion and the other in peroxisomes.

    \ \

    There are two conserved cysteine residues important for thiolase activity. The first, located in the N-terminal section of the enzymes, is involved in the formation of an acyl-enzyme intermediate; the second, located at the C-terminal extremity, is the active site base involved in deprotonation in the condensation reaction.

    \ \

    Mammalian nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein which seems to exist in two different forms: a 14 Kd protein (SCP-2) and a larger 58 Kd protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to SCP-2 while the N-terminal portion is evolutionarily related to thiolases PUBMED:1755959.

    \ ' '4627' 'IPR013766' '\

    Thioredoxins PUBMED:3896121, PUBMED:2668278, PUBMED:7788289, PUBMED:7788290 are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein PUBMED:3896121.

    \ \

    Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding beta-strand 4, which makes contact with the active site cysteines, and is important for stability and function PUBMED:8590004. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase PUBMED:7788290. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.

    \ \

    A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin; most of them are protein disulphide isomerases (PDI). PDI () PUBMED:3371540, PUBMED:2537773, PUBMED:7940678 is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding PUBMED:7913469. All PDIs contain two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent PUBMED:7983029. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity PUBMED:7635143. The various forms of PDI which are currently known are:

    \ \

    \ \

    Bacterial thiol:disulphide interchange proteins, which allow disulphide bond formation in some periplasmic proteins, also contain a thioredoxin domain. These proteins are:

    \ \

    \ \

    This entry represents the thioredoxin domain.

    \ ' '4628' 'IPR001721' '\ Threonine dehydratases including Serine/threonine dehydratase (see ) contain a common C-terminal region that may have a regulatory role. Some members contain two copies of this region PUBMED:9562556.\ ' '4629' 'IPR007292' '\

    Nuclear fusion protein tht1 is an integral membrane protein that was shown PUBMED:9442101 by mutation studies to be required for the fusion of nuclear envelopes during karyogamy.

    \ ' '4630' 'IPR004114' '\

    The THUMP domain is shared by 4-thiouridine, pseudouridine synthases and RNA methylases PUBMED:11295541 and is probably an RNA-binding domain that adopts an\ alpha/beta fold similar to that found in the C-terminal domain of translation initiation factor 3 and ribosomal protein S8.\ The THUMP domain probably functions by delivering a variety of RNA modification enzymes to their targets PUBMED:11295541.

    \

    This domain is found in the thiamine biosynthesis proteins (ThiI) (see ).

    \ ' '4631' 'IPR003669' '\

    All cellular organisms need thymidylate (dTMP) for the replication of their chromosomes, as dTMP is required for the biosynthesis of dTTP, a building block of DNA. Cells can produce thymidylate either de novo from dUMP or incorporate thymidine using thymidine kinase. The de novo pathway of dTMP synthesis requires a specific enzyme, thymidylate synthase, which methylates dUMP at position 5 of the pyrimidine ring. There are two pathways for thymidylate synthesis, each utilising a different thymidylate synthase enzyme: ThyA () and ThyX () PUBMED:15046578. Both enzymes convert dUMP to dTMP, but there is no sequence identity between the two enzymes, and their mechanisms of action differ PUBMED:15123820. Only ThyX uses FAD as cofactor.

    \

    The well studied thyA proteins catalyse the reductive methylation reaction of dUMP, with methylenetetrahydrofolate (CH(2)H(4)folate) serving as one-carbon donor and as source of reductive power. On the other hand the thyX family of thymidylate synthases contains FAD that is tightly bound by a novel fold. FAD mediates hydride transfer from NADPH during catalysis. Consequently, in the reaction catalysed by thyX, CH(2)H(4)folate serves only as a carbon donor and tetrahydrofolate (and not dihydrofolate as in the case of thyA) is produced PUBMED:12029065, PUBMED:16707489.

    \

    The thyX domain consists of a central alpha/beta domain and two alpha-helices located away from the central domain. The central domain is made up of a five-stranded antiparallel beta-sheet, flanked by six alpha-helices on one side of the sheet PUBMED:16707489, PUBMED:12211025, PUBMED:12791256. Sequence alignments reveal a specific sequence motif R-H-R-X(7)-S (thyX motif) common to this family of proteins PUBMED:12029065.
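The motif notation above reads as Arg, His, Arg, any seven residues, then Ser. As a minimal sketch, assuming one-letter amino acid codes and a simple regular expression (the function name and the toy sequence are illustrative and do not come from this entry), a scan for the motif could look like this:

```python
import re

# PROSITE-style pattern R-H-R-X(7)-S: Arg, His, Arg, any seven residues, then Ser.
# Assumption: sequences use the one-letter amino acid alphabet.
THYX_MOTIF = re.compile("RHR[A-Z]{7}S")


def find_thyx_motif(sequence):
    """Return (start, matched_text) for each non-overlapping match, 0-based."""
    seq = sequence.upper()
    return [(m.start(), m.group(0)) for m in THYX_MOTIF.finditer(seq)]


# Made-up toy sequence, not a real ThyX protein.
toy = "MKL" + "RHR" + "A" * 7 + "S" + "GG"
print(find_thyx_motif(toy))
```

A hit only shows that the sequence pattern is present; it is not evidence of ThyX activity on its own.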

    \

    This entry represents the flavin-dependent enzyme ThyX, which is a homotetramer bound to four FAD molecules. Under oxygen-limiting conditions, thyX can complement a thyA mutation PUBMED:12791256.

    \ ' '4632' 'IPR000398' '\ Thymidylate synthase () PUBMED:6996564, PUBMED:2117882\ catalyzes the reductive methylation\ of dUMP to dTMP with concomitant conversion of 5,10-methylenetetrahydrofolate\ to dihydrofolate:\ \ This provides the sole de novo pathway for \ production of dTMP and is the only enzyme in folate metabolism in which the\ 5,10-methylenetetrahydrofolate is oxidised during one-carbon transfer PUBMED:3099389.\ The enzyme is essential for regulating the balanced supply of the 4 DNA\ precursors in normal DNA replication: defects in the enzyme activity\ affecting the regulation process cause various biological and genetic\ abnormalities, such as thymineless death PUBMED:2243092. The enzyme is an important target for certain chemotherapeutic drugs. \ Thymidylate synthase is an enzyme of about 30 to 35 Kd in most species except\ in protozoa and plants where it exists as a bifunctional enzyme that includes\ a dihydrofolate reductase domain PUBMED:3099389.\ A cysteine residue is involved in the catalytic mechanism (it covalently binds\ the 5,6-dihydro-dUMP intermediate). The sequence around the active site of\ this enzyme is conserved from phages to vertebrates.\ ' '4633' 'IPR000062' '\

    Thymidylate kinase (; dTMP kinase) catalyzes the phosphorylation of thymidine 5\'-monophosphate (dTMP) to form thymidine 5\'-diphosphate (dTDP) in the presence of ATP and magnesium:

    \ \

    Thymidylate kinase is a ubiquitous enzyme of about 25 Kd and is important in the dTTP synthesis pathway for DNA synthesis. Knowledge of the function of dTMP kinase in eukaryotes comes from the study of a cell cycle mutant, cdc8, in Saccharomyces cerevisiae. Structural and functional analyses suggest that the cDNA codes for authentic human dTMP kinase. The mRNA levels and enzyme activities corresponded to cell cycle progression and cell growth stages PUBMED:8024690.

    \ \

    This entry represents known and predicted kinases, and related enzymes such as UMP-CMP kinase.

    \ ' '4634' 'IPR001152' '\

    Thymosin beta-4 is a small polypeptide whose exact physiological role is not\ yet known PUBMED:4088087. It was first isolated as a thymic hormone that induces terminal deoxynucleotidyltransferase. It is found in high quantity in thymus and spleen but is widely distributed in many tissues. It has also been shown to bind to actin monomers and thus to inhibit actin polymerisation PUBMED:15336106.

    \ \

    A number of peptides closely related to thymosin beta-4 belong to this family. They include thymosin beta-9 (and beta-8) in Bos taurus (Bovine) and Sus scrofa (Pig), thymosin beta-10 in Homo sapiens (Human) and Rattus norvegicus (Rat), thymosin beta-11 and beta-12 in Oncorhynchus mykiss (Rainbow trout) and human Nb thymosin\ beta.

    \ ' '4635' 'IPR000716' '\

    Thyroglobulin (Tg) is a large glycoprotein specific to the thyroid gland and is the precursor of the iodinated thyroid hormones thyroxine (T4) and triiodothyronine (T3). The N-terminal section of Tg contains 10 repeats of a domain of about 65 amino acids which is known as the Tg type-1 repeat PUBMED:3595599, PUBMED:8797845. Such a domain has also been found as a single or repeated sequence in the HLA class II associated invariant chain PUBMED:3038530; human pancreatic carcinoma marker proteins GA733-1 and GA733-2 PUBMED:2333300; nidogen (entactin), a sulphated glycoprotein which is widely distributed in basement membranes and that is tightly associated with laminin; insulin-like growth factor binding proteins (IGFBP) PUBMED:1709161; saxiphilin, a transferrin-like protein from Rana catesbeiana (Bull frog) that binds specifically to the neurotoxin saxitoxin PUBMED:8146142; chum salmon egg cysteine proteinase inhibitor, and equistatin, a thiol-protease inhibitor from Actinia equina (sea anemone) PUBMED:9153250. The existence of Thyr-1 domains in such a wide variety of proteins raises questions about their activity and function, and their interactions with neighbouring domains. The Thyr-1 and related domains belong to MEROPS proteinase inhibitor family I31, clan IX.

    \ \

    Equistatin from A. equina is composed of three Thyr-1 domains; as with other proteins that contain Thyr-1 domains, the thyropins, they bind reversibly and tightly to cysteine proteases (inhibitor family C1). In equistatin, inhibition of papain is a function of domain-1. Unusually, domain-2 inhibits cathepsin D, an aspartic protease (inhibitor family A1), and has no activity against papain. Domain-3 does not inhibit either papain or cathepsin D, and its function or its target peptidase has yet to be determined PUBMED:9153250, PUBMED:12650938.

    \ \ ' '4636' 'IPR007378' '\

    Chloroplast function requires the import of nuclear encoded proteins from the cytoplasm across the chloroplast double membrane. This is accomplished by two protein complexes, the Toc complex located at the outer membrane and the Tic complex located at the inner membrane PUBMED:11315189. The Toc complex recognises specific proteins by a cleavable N-terminal sequence and is primarily responsible for translocation through the outer membrane, while the Tic complex translocates the protein through the inner membrane.

    \ \

    This entry represents Tic22, a core member of the Tic complex. It is believed to act as a link between both protein complexes, contacting the translocated protein in the intermembrane space after transport through the Toc complex, and directing it to the Tic complex PUBMED:9817756.

    \ \ ' '4637' 'IPR018453' '\

    This domain is found in proteinase inhibitors as well as in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9.
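The connectivity above is given in terms of cysteine ordinals (first cysteine, second cysteine, and so on), not residue numbers. As a minimal sketch, assuming a one-letter sequence that contains exactly ten cysteines (the function name and the toy sequence are illustrative and do not come from this entry), the ordinals can be mapped to residue positions like this:

```python
# Disulphide connectivity stated in the text, as 1-based cysteine ordinals.
PAIRING = [(1, 7), (2, 6), (3, 5), (4, 10), (8, 9)]


def bonded_positions(sequence, pairing=PAIRING):
    """Return 1-based residue positions of the two cysteines in each bond."""
    cys = [i + 1 for i, aa in enumerate(sequence.upper()) if aa == "C"]
    expected = 2 * len(pairing)
    if len(cys) != expected:
        raise ValueError("expected %d cysteines, found %d" % (expected, len(cys)))
    return [(cys[a - 1], cys[b - 1]) for a, b in pairing]


# Made-up toy sequence with a cysteine at every second position.
toy = "AC" * 10
print(bonded_positions(toy))
```

Each of the ten cysteines appears in exactly one pair, which is what five disulphide bonds require.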

    \ \

    This inhibitor domain belongs to MEROPS inhibitor family I8 (clan IA). Proteins containing this domain inhibit peptidases belonging to families S1 (), S8 (), and M4 () PUBMED:14705960 and are restricted to the chordata, nematoda, arthropoda and echinodermata. Examples of proteins containing this domain are:

    \ \ \ \ ' '4639' 'IPR000652' '\

    Triosephosphate isomerase () (TIM) PUBMED:2204417 is the glycolytic enzyme that catalyses the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. TIM plays an important role in several metabolic pathways and is essential for efficient energy production. It is a dimer of identical subunits, each of which is made up of about 250 amino-acid residues. A glutamic acid residue is involved in the catalytic mechanism PUBMED:2005961. The sequence around the active site residue is perfectly conserved in all known TIM\'s. Deficiencies in TIM are associated with haemolytic anaemia coupled with a progressive, severe neurological disorder PUBMED:12023819.

    \ ' '4640' 'IPR003397' '\

    The membrane-embedded multi-protein complexes of mitochondria mediate the transport of nuclear-encoded proteins across and into the outer or inner mitochondrial membranes PUBMED:15232570. The TOM (translocase of the outer mitochondrial membrane) complex consists of cytosol-exposed receptors and a pore-forming core, and mediates the transport of proteins from the cytosol across and into the outer mitochondrial membrane. A novel protein complex in the outer membrane of mitochondria, called the SAM complex (sorting and assembly machinery), is involved in the biogenesis of beta-barrel proteins of the outer membrane. Two translocases of the inner mitochondrial membrane (TIM complexes) mediate protein transport at the inner membrane.

    The TIM23 complex (a presequence translocase) mediates the transport of presequence-containing proteins across and into the inner membrane. TIM17 forms a part of this complex, although its role is not yet fully understood. The TIM22 complex (a twin-pore carrier translocase) catalyses the insertion of multi-spanning proteins that have internal targeting signals into the inner membrane, and it uses a membrane potential as an external driving force. The Tim22 subunit of the mitochondrial import inner membrane translocase is included in this family.

    \ ' '4641' 'IPR006906' '\

    The timeless gene in Drosophila melanogaster (Fruit fly) and its homologues in a number of other insects and mammals (including human) are involved in circadian rhythm control PUBMED:11710984. This family includes related proteins from a number of fungal species and from Arabidopsis thaliana.

    \ ' '4642' 'IPR007725' '\ The timeless (tim) gene is essential for circadian function in Drosophila. Putative homologues of Drosophila tim have been identified in both mice and humans (mTim and hTIM, respectively). Mammalian TIM is not the true orthologue of Drosophila TIM, but is the likely orthologue of a fly gene, timeout (also called tim-2) PUBMED:11237000. mTim has been shown to be essential for embryonic development, but does not have substantiated circadian function PUBMED:10903565. Some family members contain a SANT domain in this region.\ ' '4643' 'IPR001820' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    Tissue inhibitors of metalloproteinases (TIMPs, PUBMED:2793861, PUBMED:1850705, PUBMED:1512267) and their target matrix metalloproteinases (MMPs, MEROPS peptidase family M10A) are important in connective tissue re-modelling in diseases of the cardiovascular system and in the physiological degradation of connective tissue, as well as in pathological states such as tumour invasion and arthritis. TIMPs belong to MEROPS proteinase inhibitor family I35, clan IT.

    \ \

    TIMPs complex with extracellular matrix metalloproteinases (such as collagenases) and irreversibly inactivate them. Members of this family are common in extracellular regions of vertebrate species PUBMED:7918391. TIMPs are proteins of about 200 amino acid residues, 12 of which are cysteines involved in disulphide bonds PUBMED:2163605.\ The basic structure of such a type of inhibitor is shown in the following schematic representation:

    \
    \
              +-----------------------------+         +--------------+\
              |                             |         |              |\
            CxCxCxxxxxxxxxxxxxxxxxCxxxxxxxxxCxxxxxxxCxCxCxCxCxxxxxCxxCxxx\
            |   |                 |                 |   | | |     |\
            |   +-----------------|-----------------+   +-+ +-----+\
            +---------------------+\
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    
    \ \

    The crystal structure of the human proMMP-2/TIMP-2 complex reveals an interaction between the hemopexin domain of proMMP-2 and the C-terminal domain of TIMP-2, leaving the catalytic site of MMP-2 and the inhibitory site of TIMP-2 distant and spatially isolated. The interfacial contact of these two proteins is characterised by two distinct binding regions composed of alternating hydrophobic and hydrophilic interactions. This unique structure provides information for how specificity for non-inhibitory MMP/TIMP complex formation is achieved PUBMED:12032297.

    \ ' '4645' 'IPR003536' '\ Secretion of virulence factors in Gram-negative bacteria involves \ transportation of the protein across two membranes to reach the cell \ exterior. There have been four secretion systems described in \ animal enteropathogens, such as Salmonella and Yersinia, with further \ sequence similarities in plant pathogens like Ralstonia and Erwinia PUBMED:9618447.\ \

    The type III secretion system is of great interest, as it is used to \ transport virulence factors from the pathogen directly into the host cell \ and is only triggered when the bacterium comes into close contact with\ the host. The protein subunits of the system are very similar to those of \ bacterial flagellar biosynthesis. However, while the latter forms a\ ring structure to allow secretion of flagellin and is an integral part of\ the flagellum itself PUBMED:9618447, type III subunits in the outer membrane \ translocate secreted proteins through a channel-like structure.

    \ \

    Exotoxins secreted by the type III system do not possess a secretion signal,\ and are considered unique for this reason PUBMED:9618447. Enteropathogenic and entero-\ haemorrhagic Escherichia coli secrete the bacterial adhesion mediation\ molecule intimin PUBMED:10835344, which targets the translocated intimin receptor, Tir. Tir is secreted by the bacteria and is embedded in the target cell\'s plasma membrane PUBMED:10835344. This facilitates bacterial cell attachment to the host.

    \ ' '4646' 'IPR007635' '\ All proteins containing this domain also contain a tandem repeat of CCCH zinc fingers (). Tis11B, Tis11D and their homologues are thought to be regulatory proteins involved in the response to growth factors PUBMED:1695727. Tis11B () is thought to be involved in calcium signalling-induced apoptosis in B cells PUBMED:8898945. The function of this N-terminal domain is unknown.\ ' '4647' 'IPR001187' '\

    Tissue factor (TF, also known as thromboplastin) is an integral membrane glycoprotein that initiates blood coagulation by forming a complex with circulating factor VII (FVII) or VIIa (FVIIa), which it comes in contact with following damage to blood vessel walls. Calcium forms the bridge between TF and FVII, the resultant TF/FVII undergoing auto-cleavage to produce activated TF/FVIIa. This activation sets off an extracellular cascade involving sequential serine protease activations, where TF/FVIIa converts FIX to FIXa, followed by a series of reactions to finally produce fibrin, leading to fibrin deposition and the activation of platelets to form clots PUBMED:15569823, PUBMED:16261634.

    \

    Tissue factor plays many diverse roles: in addition to promoting blood coagulation, it is involved in inflammation, embryonic development, angiogenesis, tumour metastasis, cell adhesion/migration, and innate immunity PUBMED:16479459. For example, TF plays an important role in inflammation, since the extracellular blood coagulation signalling pathway can trigger an intracellular inflammation-signalling pathway PUBMED:16036212, PUBMED:14872439. TF activation leads to the production of activated factors FVIIa, FXa and FIIa, which in turn can activate protease-activated receptors (PARs), resulting in the expression of a variety of inflammatory molecules.

    \

    The extracellular domain of tissue factor, which accounts for over 80% of the protein, contains two domains with the same structural fold as fibronectin type III, consisting of an immunoglobulin-like beta-sandwich with a Greek key topology.

    \

    More information about these proteins can be found at Protein of the Month: Tissue Factor PUBMED:.

    \ \ ' '4648' 'IPR001267' '\

    Thymidine kinase (TK) () is a ubiquitous enzyme that catalyzes the ATP-dependent phosphorylation of thymidine. Two different families of TK have been identified PUBMED:3027984, PUBMED:2389555 and are included in this family; one family groups\ together TK from herpesviruses as well as cellular thymidylate kinases and the \ second family groups TK from various sources that include vertebrates, bacteria, the Bacteriophage T4, poxviruses, African swine fever virus (ASFV) and Fish lymphocystis disease virus (FLDV). The major capsid protein of insect iridescent viruses also belongs to this family. The Prosite pattern recognises only the cellular type of thymidine kinases.

    \ ' '4649' 'IPR001889' '\

    The thymidine kinase from Herpesviridae catalyses the reaction:

    \

    The enzyme is not subject to feedback inhibition by its product and the crystal structure of the enzyme from Human herpesvirus 1 (HHV-1) has been reported PUBMED:7552712.

    \ ' '4650' 'IPR004667' '\

    These proteins are members of the ATP:ADP Antiporter (AAA) family, which consists of nucleotide transporters that have 12 GES predicted transmembrane regions. One protein from Rickettsia prowazekii functions to take up ATP from the eukaryotic cell cytoplasm into the bacterium in exchange for ADP. Five AAA family paralogues are encoded within the genome of R. prowazekii. This organism transports UMP and GMP but not CMP, and it seems likely that one or more of the AAA family paralogues are responsible. The genome of Chlamydia trachomatis encodes two AAA family members, Npt1 and Npt2, which catalyse ATP/ADP exchange and GTP, CTP, ATP and UTP uptake probably employing a proton symport mechanism. Two homologous adenylate translocators of Arabidopsis thaliana are postulated to be localized to the intracellular plastid membrane where they function as ATP importers.

    \ ' '4651' 'IPR007713' '\ This short repeat consists of the motif WXXh where X can be any residue and h is a hydrophobic residue. The repeat is named TMP after its occurrence in the tape measure protein (TMP). Tape measure protein is a component of phage tail and probably forms a beta-helix. Truncated forms of TMP lead to shortened tail fibres PUBMED:11040123. This repeat is also found in non-phage proteins where it may play a structural role.\ ' '4652' 'IPR003733' '\

    Thiamine monophosphate synthase (TMP) () catalyzes the substitution of the pyrophosphate of 2-methyl-4-amino-5-hydroxymethylpyrimidine pyrophosphate by 4-methyl-5-(beta-hydroxyethyl)thiazole phosphate to yield thiamine phosphate in the thiamine biosynthesis pathway PUBMED:9139923.

    \ \

    TENI, a protein from Bacillus subtilis that regulates the production of several extracellular enzymes by reducing alkaline protease production, belongs to this group PUBMED:1898926.

    \ ' '4653' 'IPR001337' '\

    This family contains coat proteins from tobamoviruses, which are ssRNA positive-strand viruses with no DNA stage. Examples include Tobacco mosaic virus (TMV), Cucumber green mottle mosaic virus and Ribgrass mosaic virus (RMV).

    \

    In order to establish infections, viruses must be delivered to the cells of potential hosts and must then engage in activities that enable their genomes to be expressed and replicated. With most viruses, the events that precede the onset of production of progeny virus particles are referred to as the early events and, in the case of positive-strand RNA viruses, they include the initial interaction with and entry of host cells and the release (uncoating) of the genome from the virus particles. The uncoating process in TMV may involve the bidirectional release of coat protein subunits from the viral RNA which may be mediated by cotranslational and coreplicational disassembly mechanisms PUBMED:10212940.

    \

    The TMV particle is assembled from its constituent coat protein and RNA by a complex process. The protein forms an obligatory intermediate (a cylindrical disk composed of two layers of protein units), which recognises a specific RNA hairpin sequence. This mechanism simultaneously fulfils the physical requirement for nucleating the growth of the helical particle and the biological requirement for specific recognition of the viral RNA PUBMED:10212932.

    \ ' '4654' 'IPR006052' '\

    Cytokines can be grouped into a family on the basis of sequence, functional and structural similarities PUBMED:8095800, PUBMED:1377364, PUBMED:. Tumor necrosis factor (TNF) (also known as TNF-alpha or cachectin) is a monocyte-derived cytotoxin that has been implicated in tumour regression, septic shock and cachexia PUBMED:2989794, PUBMED:3349526. The protein is synthesised as a prohormone with an unusually long and atypical signal sequence, which is absent from the mature secreted cytokine PUBMED:2268312. A short hydrophobic stretch of amino acids serves to anchor the prohormone in lipid bilayers PUBMED:2777790. Both the mature protein and a partially-processed form of the hormone are secreted after cleavage of the propeptide PUBMED:2777790.

    There are a number of different families of TNF, but all these cytokines seem to form homotrimeric (or heterotrimeric in the case of LT-alpha/beta) complexes that are recognised by their specific receptors.

    \

    The following cytokines can be grouped into a family on the basis of sequence, functional, and structural similarities PUBMED:8095800, PUBMED:1377364, PUBMED::

    \ \

    All these cytokines seem to form homotrimeric (or heterotrimeric in the case of LT-alpha/beta) complexes that are recognised by their specific receptors. The PROSITE pattern for this family is located in a beta-strand in the central section of the protein which is conserved across all members.

    \ ' '4655' 'IPR001368' '\

    A number of proteins, some of which are known to be receptors for growth factors, have been found to contain a cysteine-rich domain at the N-terminal region that can be subdivided into four (or in some cases, three) repeats containing six conserved cysteines, all of which are involved in intrachain disulphide bonds PUBMED:8387891.

    \ \

    CD27 (also called S152 or T14) mediates a co-stimulatory signal for T and B cell activation and is involved in murine T cell development. Tyrosine-phosphorylation of ZAP-70 following CD27 ligation of T cells has been reported PUBMED:7989747, but not confirmed independently. CD30 was originally identified as Ki-1, an antigen expressed on Reed-Sternberg cells in Hodgkin\'s lymphomas and other non-Hodgkin\'s lymphomas, particularly diffuse large-cell lymphoma and immunoblastic lymphoma. CD30 has pleiotropic effects on CD30-positive lymphoma cell lines ranging from cell proliferation to cell death. It is thought to be involved in negative selection of T-cells in the thymus and is involved in TCR-mediated cell death. CD30, a member of the TNFR family of molecules, activates NFkB through interaction with TRAF2 and TRAF5. CD40 (Bp50) plays a central role in the regulation of cell-mediated immunity as well as antibody-mediated immunity. It is central to T cell dependent (TD)-responses and may influence survival of B cell lymphomas.

    \ \

    CD95 (also called APO-1, Fas antigen, tumour necrosis factor receptor superfamily member 6 (TNFRSF6) or apoptosis antigen 1 (APT1)) is expressed, typically at high levels, on activated T and B cells. It is involved in the mediation of apoptosis-inducing signals.

    \ \

    Other proteins known to belong to this family PUBMED:1653571, PUBMED:2174582, PUBMED:15335933, PUBMED: are: tumour necrosis factor type I and type II receptors (TNFR); Rabbit fibroma virus soluble TNF receptor (protein T2); lymphotoxin alpha/beta receptor; low-affinity nerve growth factor receptor (LA-NGFR) (p75); T-cell antigen OX40; Wsl-1, a receptor (for an as yet undefined ligand) that mediates apoptosis; and Vaccinia virus \ protein A53 (SalF19R).

    \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigen nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).

    \ ' '4656' 'IPR001022' '\ The movement protein of tobamoviruses is necessary for the initial cell-to-cell\ movement during the early stages of a viral infection. This movement is active,\ and involves the interaction of the movement protein with the plasmodesmata.\ The movement protein possesses the ability to bind to RNA to achieve its\ role PUBMED:1546450.\

    The N terminus contains two particularly well-conserved regions; substitutions\ in one of these result in temperature-sensitive cell-to-cell movement. The C terminus contains three sub-regions characterised by the distributions of charged\ amino acid residues PUBMED:3201760.

    \ ' '4657' 'IPR007195' '\

    TolB is a periplasmic protein from Escherichia coli that is part of the Tol-dependent translocation system, which group A and E colicins use to penetrate and kill cells PUBMED:10545334, PUBMED:10673426. TolB has two domains: an alpha-helical N-terminal domain that shares structural similarity with the C-terminal domain of transfer RNA ligases, and a beta-propeller C-terminal domain that shares structural similarity with numerous members of the prolyl oligopeptidase family and, to a lesser extent, with class B metallo-beta-lactamases PUBMED:10545334. The function of the N-terminal domain is uncertain.

    \ ' '4658' 'IPR005017' '\

    This family includes TodX from Pseudomonas putida (strain F1/ATCC 700007) and TbuX from Burkholderia pickettii (Ralstonia pickettii) (Pseudomonas pickettii) PKO1. These are membrane proteins of uncertain function that are involved in toluene catabolism. Related proteins involved in the degradation of similar aromatic hydrocarbons are also in this family, such as CymD.

    \ ' '4659' 'IPR005683' '\

    The mitochondrial protein translocase (MPT) family, which brings nuclear-encoded preproteins into mitochondria, is very complex, with 19 currently identified protein constituents. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. The inner membrane translocase is formed of a complex with a number of proteins, including the Tim17, Tim23 and Tim44 subunits. This family is specific for the Tom22 proteins.

    \ ' '4660' 'IPR004905' '\ This family represents the Tombusvirus P19 core protein.\ ' '4661' 'IPR003538' '\ Iron is essential for growth in both bacteria and mammals. Controlling the\ amount of free iron in solution is often used as a tactic by hosts to limit\ invasion of pathogenic microbes; binding iron tightly within protein\ molecules can accomplish this. Such iron-protein complexes include haem in\ blood, lactoferrin in tears/saliva and transferrin in blood plasma. Some\ bacteria express surface receptors to capture eukaryotic iron-binding\ compounds, while others have evolved siderophores to scavenge iron from\ iron-binding host proteins PUBMED:8057905.\ \

    The absence of free iron molecules in the surrounding environment triggers \ transcription of gene clusters that encode both siderophore-synthesis \ enzymes, and receptors that recognise iron-bound siderophores PUBMED:2521621. An \ example of the latter is Escherichia coli fepA, which resides in the outer \ envelope and captures iron-bound enterobactin PUBMED:9886293.

    \ \

    To complete transport of bound iron across the inner membrane, a second \ receptor complex is needed. The major component of this is tonB, a 27kDa\ protein that facilitates energy transfer from the proton motive force to\ outer receptors PUBMED:9643536. B-12 and colicin receptors also make use of the tonB\ system to drive active transport at the outer membrane.

    \ ' '4662' 'IPR000531' '\

    In Escherichia coli the TonB protein interacts with outer membrane receptor proteins that carry out high-affinity binding and energy-dependent uptake of specific substrates into the periplasmic space PUBMED:14499604. These substrates are either poorly permeable through the porin channels or are encountered at very low concentrations. In the absence of TonB, these receptors bind their substrates but do not carry out active transport. TonB-dependent regulatory systems consist of six components: a specialised outer membrane-localised TonB-dependent receptor (TonB-dependent transducer) that interacts with its energising TonB-ExbBD protein complex, a cytoplasmic membrane-localised anti-sigma factor and an extracytoplasmic function (ECF)-subfamily sigma factor PUBMED:15993072. The TonB complex senses signals from outside the bacterial cell and transmits them via two membranes into the cytoplasm, leading to transcriptional activation of target genes. The proteins that are currently known or presumed to interact with TonB include BtuB PUBMED:12652322, CirA, FatA, FcuT, FecA PUBMED:11872840, FhuA PUBMED:9865695, FhuE, FepA PUBMED:9886293, FptA, HemR, IrgA, IutA, PfeA, PupA and Tbp1. The TonB protein also interacts with some colicins. Most of these proteins contain a short conserved region at their N-terminus PUBMED:12957833.

    \

    This entry covers the conserved part of the beta-barrel structure at the C-terminus.

    \ ' '4663' 'IPR013497' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    \

    Type IA topoisomerases are composed of four domains that together form a toroidal structure with a central hole large enough to accommodate single- and double-stranded DNA: domain 1 is an N-terminal alpha/beta Toprim domain, domain 2 and the C-terminal domain 4 are winged-helix domains, and domain 3 is a beta-barrel. Domains 1 (Toprim) and 3 form the active site of the enzyme, while the winged-helix domains 2 and 4 form a single-strand DNA-binding groove PUBMED:14604525, PUBMED:10574789. This entry represents the central portion of the enzyme, which covers domains 2 and 3 in topoisomerase type IA enzymes.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '4664' 'IPR008336' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    \

    This entry represents the N-terminal DNA-binding domain found in eukaryotic topoisomerase I, which is a type IB enzyme. To cleave the DNA backbone, these enzymes must make a transient phosphotyrosine bond. The N-terminal domain of human topoisomerase I is thought to coordinate the restriction of free strand rotation during the topoisomerisation step of catalysis. A conserved tryptophan residue may be important for the DNA-interaction ability of the N-terminal domain PUBMED:14741206. Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes PUBMED:9488644.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '4665' 'IPR013500' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    \

    This entry represents the catalytic core of eukaryotic and viral topoisomerase I (type IB) enzymes, which occurs near the C-terminal region of the protein.

    \

    Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity PUBMED:1849260. The crystal structures of human topoisomerase I comprising the core and carboxyl-terminal domains in covalent and noncovalent complexes with 22-base pair DNA duplexes reveal an enzyme that "clamps" around essentially B-form DNA. The core domain and the first eight residues of the carboxyl-terminal domain of the enzyme, including the active-site nucleophile tyrosine-723, share significant structural similarity with the bacteriophage family of DNA integrases. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes PUBMED:9488644.

    \

    Vaccinia virus, a cytoplasmically-replicating poxvirus, encodes a type I DNA topoisomerase that is biochemically similar to eukaryotic-like DNA topoisomerases I, and which has been widely studied as a model topoisomerase. It is the smallest topoisomerase known and is unusual in that it is resistant to the potent chemotherapeutic agent camptothecin. The crystal structure of an amino-terminal fragment of vaccinia virus DNA topoisomerase I shows that the fragment forms a five-stranded, antiparallel beta-sheet with two short alpha-helices and connecting loops. Residues that are conserved between all eukaryotic-like type I topoisomerases are not clustered in particular regions of the structure PUBMED:7994576.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '4666' 'IPR006171' '\

    This is a conserved region from DNA primase. It corresponds to the Toprim (topoisomerase-primase) domain common to DnaG primases, topoisomerases, OLD family nucleases and RecR/M DNA repair proteins PUBMED:9121560. Both DnaG motifs IV and V are present in the alignment; the DxD (V) motif may be involved in Mg2+ binding, and mutations to the conserved glutamate (IV) completely abolish DnaG-type primase activity. DNA primase is a nucleotidyltransferase that synthesizes the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork; it can also prime the leading strand and has been implicated in cell division PUBMED:8294018. This family also includes the atypical archaeal A subunit from type II DNA topoisomerases PUBMED:9722641. Type II DNA topoisomerases catalyse the relaxation of DNA supercoiling by causing transient double-strand breaks.

    \ ' '4667' 'IPR002517' '\ The tospovirus genome consists of three linear ssRNA segments,\ denoted L, M and S complexed with the nucleocapsid protein.\ The S RNA encodes the nucleocapsid protein and another\ non-structural protein PUBMED:8429298.\ ' '4668' 'IPR003571' '\

    Snake toxins belong to a family of proteins PUBMED:6433031, PUBMED:, PUBMED: which groups short and\ long neurotoxins, cytotoxins and short toxins, as well as other miscellaneous\ venom peptides. Most of these toxins act by binding to the nicotinic\ acetylcholine receptors in the postsynaptic membrane of skeletal muscles and\ preventing the binding of acetylcholine, thereby blocking the excitation of\ muscles.

    \

    Snake toxins are proteins that consist of sixty to seventy-five amino acids.\ Among the invariant residues are eight cysteines, all involved in disulphide\ bonds. The structure is small, disulphide-rich and nearly all beta sheet.

    \ ' '4669' 'IPR001947' '\

    Scorpion venoms contain a variety of peptides toxic to mammals, insects and crustaceans. Among these peptides there is a family of short toxins (30 to 40 residues) PUBMED:7998956, PUBMED:7819188 including charybdotoxin, kaliotoxin PUBMED:1730708, noxiustoxin PUBMED: and iberiotoxin PUBMED:1694175, PUBMED:1381959. Charybdotoxin consists of a single polypeptide chain and is a potent, selective inhibitor of calcium-activated potassium channels in pituitary and aortic smooth muscle cells; the toxin reversibly blocks channel activity by interacting at the external pore of the channel protein PUBMED:2453055.

    \ \

    The tertiary structure of the toxins comprises a 3-stranded beta-sheet and a short helix, and is stabilised by a number of disulphide bridges PUBMED:1381959 as shown in the following schematic representation:\

    \
                                 +---------------------+\
                                 |                     |\
                                 |                     |\
                          xxxxxxxCxxxxxCxxxCxxxxxxxxxxxCxxxxCxCxxx\
                                       |   |                | |\
                                       |   +----------------+ |\
                                       +----------------------+\
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    
    \

    \ ' '4670' 'IPR002061' '\

    Scorpion toxins, which may be mammal or insect specific, bind to sodium\ channels, inhibiting the inactivation of activated channels and blocking\ neuronal transmission. The complete covalent structure of the toxins has\ been deduced: it comprises around 66 amino acid residues and is cross-linked\ by 4 disulphide bridges PUBMED:2311768, PUBMED:6845379. An anti-epilepsy peptide isolated\ from scorpion venom PUBMED:2930463 shows similarity to both scorpion neurotoxins and anti-insect toxins.

    \ \

    This family also contains a group of proteinase inhibitors from Arabidopsis thaliana and Brassica spp., which belong to MEROPS inhibitor family I18, clan I-. The Brassica napus (Oil seed rape) and Sinapis alba (White mustard) inhibitors PUBMED:8143882, PUBMED:1451776 inhibit the catalytic activity of bovine beta-trypsin and bovine alpha-chymotrypsin, which belong to MEROPS peptidase family S1 PUBMED:14705960.

    \ ' '4671' 'IPR000693' '\ Sea anemones produce many different neurotoxins with related structure and function. Proteins\ belonging to this family include the neurotoxins, of which there are several, including calitoxin and anthopleurin.\ The neurotoxins bind specifically to the sodium channel, thereby delaying its inactivation during\ signal transduction, resulting in strong stimulation of mammalian cardiac muscle contraction. Calitoxin\ 1 has been found in neuromuscular preparations of crustaceans, where it increases transmitter release,\ causing firing of the axons. Three disulphide bonds are present in this protein PUBMED:6108877, PUBMED:4019448, PUBMED:7582896.\ ' '4672' 'IPR001319' '\

    Nuclear transition protein 1 (TP1) is one of the spermatid-specific proteins \ PUBMED:2040274. TP1 is a basic protein well conserved in mammalian species. In mammals, the second stage of spermatogenesis is characterised by the conversion of nucleosomal chromatin to the compact, non-nucleosomal and transcriptionally inactive form found in the sperm nucleus. This condensation is \ associated with a double-protein transition. The first transition corresponds to the replacement of histones by several spermatid-specific proteins (also called transition proteins) which are themselves replaced by protamines during the second transition.

    \ ' '4673' 'IPR000678' '\ In mammals, the second stage of spermatogenesis is characterised by the conversion of nucleosomal\ chromatin to the compact, nonnucleosomal and transcriptionally inactive form found in the sperm nucleus.\ This condensation is associated with a double-protein transition. The first transition corresponds to the\ replacement of histones by several spermatid-specific proteins, also called transition proteins, which are\ themselves replaced by protamines during the second transition. Nuclear transition protein 2 (TP2) is one\ of those spermatid-specific proteins. TP2 is a basic, zinc-binding protein PUBMED:1930189 of 116 to 137\ amino-acid residues. \

    Structurally, TP2 consists of three distinct parts, a conserved serine-rich N-terminal\ domain of about 25 residues, a variable central domain of 20 to 50 residues which contains cysteine residues,\ and a conserved C-terminal domain of about 70 residues rich in lysines and arginines.

    \ ' '4674' 'IPR013049' '\

    This entry represents the N-terminal domain found in Spo11, a meiotic recombination protein found in eukaryotes, and in subunit A of topoisomerase VI, a type IIB topoisomerase found predominantly in archaea PUBMED:10545127, PUBMED:12618182. These two types of proteins share structural homology.

    \

    Spo11 is a meiosis-specific protein that is responsible for the initiation of recombination through the formation of DNA double-strand breaks by a type II DNA topoisomerase-like activity. Spo11 acts in conjunction with several other proteins, including Rec102 in yeast, to bring about meiotic recombination PUBMED:11805049.

    \

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. They can be divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227. Topoisomerase VI is a type IIB enzyme that assembles as a heterotetramer, consisting of two A subunits required for DNA cleavage and two B subunits required for ATP hydrolysis. The B subunit is structurally similar to the ATPase domain of type IIA topoisomerases, but the A subunit is distinct, and instead shares homology with the Spo11 protein.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '4675' 'IPR000878' '\

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway PUBMED:11215515. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including cobalamin (vitamin B12), haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin PUBMED:17227226.

    \

    This entry represents several tetrapyrrole methylases, which consist of two dissimilar domains. These enzymes catalyse the methylation of their substrates using S-adenosyl-L-methionine as a methyl source. Enzymes in this family include:

    \ \ ' '4676' 'IPR007327' '\ The hD52 gene was originally identified through its elevated expression level in human breast carcinoma. Cloning of D52 homologues from other species has indicated that D52 may play roles in calcium-mediated signal transduction and cell proliferation. Two human homologues of hD52, hD53 and hD54, have also been identified, demonstrating the existence of a novel gene/protein family PUBMED:9484778. These proteins have an N-terminal coiled-coil that allows members to form homo- and heterodimers with each other PUBMED:9484778.\ ' '4677' 'IPR012000' '\

    A number of enzymes require thiamine pyrophosphate (TPP) (vitamin B1) as a cofactor. It has been shown PUBMED:8604141 that some of these enzymes are structurally related. This central domain of TPP enzymes contains a 2-fold Rossmann fold.

    \ ' '4678' 'IPR011766' '\

    A number of enzymes require thiamine pyrophosphate (TPP) (vitamin B1) as a cofactor. It has been shown PUBMED:8604141 that some of these enzymes are structurally related. This represents the C-terminal TPP binding domain of TPP enzymes.

    \ ' '4679' 'IPR012001' '\

    A number of enzymes require thiamine pyrophosphate (TPP) (vitamin B1) as a cofactor. It has been shown PUBMED:8604141 that some of these enzymes are structurally related. This represents the N-terminal TPP binding domain of TPP enzymes.

    \ ' '4680' 'IPR002816' '\

    In prokaryotes, for example Enterococcus faecalis (Streptococcus faecalis), the conjugative transfer of certain plasmids is controlled by peptide pheromones PUBMED:15374642. Plasmid-free recipient cells secrete plasmid-specific oligopeptides, termed sex pheromones. They induce bacterial clumping and specifically activate the conjugative transfer of the corresponding plasmid. Once recipient cells acquire the plasmid they start to produce a pheromone inhibitor to block the activity of the pheromone and to prevent plasmid-containing cells from clumping; they also become donor cells able to transfer the plasmid to plasmid-free recipient cells. Examples of such plasmid-pheromone systems are bacteriocin plasmid pPD1 PUBMED:7559344, haemolysin/bacteriocin plasmid pAD1 PUBMED:1924555, tetracycline-resistance plasmid\ pCF10 PUBMED:8349565, and the haemolysin/bacteriocin plasmid pOB1 PUBMED:7772836.

    \ \

    TraB, in combination with another factor, contributes to pheromone shutdown in cells that have acquired a plasmid. Its exact function has not yet been determined PUBMED:2158976, PUBMED:10850999. This entry also contains plant and mammalian proteins, suggesting that these TraB-related proteins may have a somewhat wider or different function in eukaryotes.

    \ ' '4681' 'IPR003688' '\ The TRAG family are bacterial conjugation proteins. These proteins aid the transfer of DNA from the plasmid into\ the host bacterial chromosome although the exact mechanism of action is unknown.\ ' '4682' 'IPR001585' '\

    Transaldolase catalyses the reversible transfer of a three-carbon ketol unit from sedoheptulose 7-phosphate to glyceraldehyde 3-phosphate to form erythrose 4-phosphate and fructose 6-phosphate. This enzyme, together with transketolase, provides a link between the glycolytic and pentose-phosphate pathways. Transaldolase is an enzyme of about 34 kDa whose sequence has been well conserved throughout evolution. A lysine has been implicated PUBMED:8109173 in the catalytic mechanism of the enzyme; it acts as a nucleophilic group that attacks the carbonyl group of fructose-6-phosphate.

    \

    Transaldolase is evolutionarily related PUBMED:7773398 to a bacterial protein of about 20 kDa (known as talC in Escherichia coli), whose exact function is not yet known.

    \ ' '4683' 'IPR013150' '\

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles PUBMED:12910258, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    \

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi\'s sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus PUBMED:11056549.

    \ \

    In eukaryotes, transcription initiation of all protein-encoding genes involves the polymerase II system. This system is modulated by both general and specific \ transcription factors. The general factors (which include TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIG and TFIIH) operate through common promoter elements, such as the TATA box. Transcription factor IIB (TFIIB) is of central importance in transcription of class II genes. It associates with TFIID-TFIIA bound to DNA (the DA complex) to form a ternary TFIID-IIA-IIB (DAB) complex, which is recognised by \ RNA polymerase II PUBMED:1876184, PUBMED:1949150. TFIIB comprises ~315-340 residues and contains an imperfect C-terminal repeat of a 75-residue domain that may contribute to the symmetry of the folded protein. The basal archaeal transcription machinery resembles that of the eukaryotic polymerase II system and includes a homologue of TFIIB PUBMED:7597027.

    \ \

    This entry represents a cyclin-like domain which is found repeated in the C-terminal region of a variety of eukaryotic TFIIB\'s and their archaeal counterparts. These domains individually form the typical cyclin fold, and in the transcription complex they straddle the C-terminal region of the TATA-binding protein - an interaction essential for the formation of the transcription initiation complex PUBMED:9177165, PUBMED:10619841.

    \ ' '4684' 'IPR001156' '\

    Transferrins are eukaryotic iron-binding glycoproteins that control the\ level of free iron in biological fluids PUBMED:3032619. The proteins have arisen by duplication of a\ domain, each duplicated domain binding one iron atom. Members of the family include\ blood serotransferrin (siderophilin); milk lactotransferrin (lactoferrin); egg white\ ovotransferrin (conalbumin); and membrane-associated melanotransferrin.

    \ \

    Human lactoferrin is a serine peptidase belonging to MEROPS peptidase family S60, clan SR. It is found at high concentrations in all \ human secretions, where it plays a major role in mucosal defence. Lactoferrin cleaves IgA1 protease at an arginine-rich region defined by amino acids RRSRRSVR and digests Hap at a similar arginine-rich sequence (VRSRRAAR). Ser259 and Lys73 form a catalytic dyad, reminiscent of a number of bacterial serine proteases.

    \ ' '4685' 'IPR001102' '\ Synonym(s): Protein-glutamine gamma-glutamyltransferase, Fibrinoligase, TGase \

    Protein-glutamine gamma-glutamyltransferases () (TGase) are calcium-dependent enzymes that catalyse the cross-linking of proteins by promoting the formation of isopeptide bonds between the gamma-carboxyl group of a glutamine in one polypeptide chain and the epsilon-amino group of a lysine in a second polypeptide chain. TGases also catalyse the conjugation of polyamines to\ proteins PUBMED:1683845, PUBMED:1974250.

    \ \

    Transglutaminases are widely distributed in various organs, tissues and\ body fluids. The best known transglutaminase is blood coagulation factor XIII,\ a plasma tetrameric protein composed of two catalytic A subunits and two\ non-catalytic B subunits. Factor XIII is responsible for cross-linking fibrin chains, thus stabilising the fibrin clot.

    \ \

    There are commonly three domains: N-terminal, middle () and C-terminal (). This entry represents the N-terminal domain found in transglutaminases.

    \ ' '4686' 'IPR001264' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates () and related proteins into distinct sequence based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    Glycosyltransferase family 51 comprises enzymes with only one known activity; murein polymerases (). These enzymes utilise MurNAc-GlcNAc-P-P-lipid II as the sugar donor.

    \ \ \

    The family includes the bifunctional penicillin-binding proteins that have a \ transglycosylase (N-terminus) and transpeptidase (C-terminus) domain PUBMED:9244263 and \ the monofunctional biosynthetic peptidoglycan transglycosylases PUBMED:8830253.

    \ ' '4687' 'IPR005474' '\

    Transketolase (TK) catalyzes the reversible transfer of a\ two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as\ ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-\ phosphate. This enzyme, together with transaldolase, provides a link between\ the glycolytic and pentose-phosphate pathways.\ TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has\ been purified, it is a homodimer of approximately 70 kDa subunits. TK sequences\ from a variety of eukaryotic and prokaryotic sources PUBMED:1567394, PUBMED:1737042 show that the\ enzyme has been evolutionarily conserved.\ In the peroxisomes of methylotrophic yeast Pichia angusta (Yeast) (Hansenula polymorpha), there is a\ highly related enzyme, dihydroxy-acetone synthase (DHAS) (also\ known as formaldehyde transketolase), which exhibits a very unusual\ specificity by including formaldehyde amongst its substrates.

    \ 1-deoxyxylulose-5-phosphate synthase (DXP synthase) PUBMED:9371765 is an enzyme so far\ found in bacteria (gene dxs) and plants (gene CLA1) which catalyzes the\ thiamine pyrophosphate-dependent acyloin condensation reaction between carbon\ atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate to yield 1-deoxy-D-\ xylulose-5-phosphate (dxp), a precursor in the biosynthetic pathway to\ isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6). DXP synthase\ is evolutionarily related to TK. The N-terminal section contains a histidine residue which appears to function in\ proton transfer during catalysis PUBMED:1628611. In the central\ section there are conserved acidic residues that are part of the active cleft\ and may participate in substrate-binding PUBMED:1628611.\ This family includes transketolase enzymes \ and also partially matches 2-oxoisovalerate dehydrogenase\ beta subunit . Both these enzymes\ utilise thiamine pyrophosphate as a cofactor, suggesting\ there may be common aspects in their mechanism of catalysis.

    \ ' '4688' 'IPR002702' '\ The translational regulator protein regA is encoded by the Bacteriophage T4 and binds to a region of messenger RNA (mRNA) that includes the initiator codon. RegA is unusual in that it represses the translation of about 35 early T4 mRNAs but does not affect nearly 200 other mRNAs PUBMED:7761833.\ ' '4689' 'IPR002848' '\

    Translins are DNA-binding proteins that specifically recognise consensus sequences at the breakpoint junctions in chromosomal translocations, mostly involving immunoglobulin (Ig)/T-cell receptor gene segments. They seem to recognise single-stranded DNA ends generated by staggered breaks occurring at recombination hot spots PUBMED:9013868.

    \

    Translin folds into an alpha-alpha superhelix, consisting of two curved layers of alpha/alpha topology PUBMED:12079346, PUBMED:15039555.

    \ ' '4690' 'IPR001248' '\

    The Nucleobase Cation Symporter-1 (NCS1) family consists of bacterial and yeast transporters for nucleobases including purines and pyrimidines. Members of this family possess twelve putative transmembrane alpha-helical spanners (TMSs). At least some of them have been shown to function in uptake by a substrate:H+ symport mechanism.

    \ ' '4691' 'IPR007350' '\

    This domain corresponds to a C-terminal cysteine rich region that probably binds to a metal ion and could be DNA-binding. It is found in association with the DDE superfamily () and the Tc5 transposase family ().

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4692' 'IPR001207' '\

    Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting\ the mobile element. Transposases have been grouped into various families PUBMED:8041625, PUBMED:1310791, PUBMED:1718819. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti\ and others.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4693' 'IPR014760' '\ A number of serum transport proteins are known to be evolutionarily related, including albumin, alpha-fetoprotein, vitamin D-binding protein and afamin PUBMED:2481749, PUBMED:2423133, PUBMED:7517938. Albumin is the main protein of plasma; it binds water, cations (such as Ca2+, Na+ and K+), fatty acids, hormones, bilirubin and drugs - its main function is to regulate the colloidal osmotic pressure of blood. Alpha-fetoprotein (alpha-fetoglobulin) is a foetal plasma protein that binds various cations, fatty acids and bilirubin. Vitamin D-binding protein binds to vitamin D and its metabolites, as well as to fatty acids. The biological role of afamin (alpha-albumin) has not yet been characterised. The 3D structure of human serum albumin has been determined by X-ray crystallography to a resolution of 2.8 Å PUBMED:1630489. It comprises three homologous domains that assemble to form a heart-shaped molecule PUBMED:1630489. Each domain is a product of two subdomains that possess common structural motifs PUBMED:1630489. The principal regions of ligand binding to human serum albumin are located in hydrophobic cavities in subdomains IIA and IIIA, which exhibit similar chemistry. Structurally, the serum albumins are similar, each domain containing five or six internal disulphide bonds, as shown schematically below:\
    \
                        +---+          +----+                        +-----+\
                        |   |          |    |                        |     |\
     xxCxxxxxxxxxxxxxxxxCCxxCxxxxCxxxxxCCxxxCxxxxxxxxxCxxxxxxxxxxxxxxCCxxxxCxxxx\
       |                 |       |     |              |               |\
       +-----------------+       +-----+              +---------------+\
    
    \ ' '4694' 'IPR001888' '\

    Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting\ the mobile element. Transposases have been grouped into various families PUBMED:8041625, PUBMED:1310791, PUBMED:1718819. This family includes the mariner transposase PUBMED:8895590.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4695' 'IPR002560' '\

    Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting\ the mobile element. Transposases have been grouped into various families PUBMED:8041625, PUBMED:1310791, PUBMED:1718819. This family includes the IS204 PUBMED:8196545, IS1001 PUBMED:8093238, IS1096 PUBMED:1660454 and IS1165 PUBMED:1325060 transposases.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4696' 'IPR002622' '\

    Transposase proteins are necessary for efficient DNA transposition.\ This family includes insertion sequences from Synechocystis sp. (strain PCC 6803) three of which are characterised as homologous to bacterial IS5- and IS4- and to several members of the IS630-Tc1-mariner superfamily PUBMED:9305771.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4697' 'IPR002686' '\

    Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases for IS200 from Escherichia coli PUBMED:10471738.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4698' 'IPR001959' '\

    This entry represents a conserved region of a probable transposase family, which is found in a number of uncharacterised bacterial proteins. A novel insertion sequence (IS)-like element of the Bacillus PS3 (Thermophilic bacterium PS-3) that promotes expression of the alanine carrier protein-encoding gene PUBMED:7557457 belongs to this entry.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4699' 'IPR004242' '\

    This family includes an En/Spm-like transposable element, Tdc1 from carrot PUBMED:9180694. The function of these proteins is unknown.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4700' 'IPR004244' '\

    Many human L1 elements are capable of retrotransposition. Some of these have been shown to exhibit reverse transcriptase (RT) activity PUBMED:9140393, although the function of many is, as yet, unknown.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4701' 'IPR004264' '\

    Proteins in this group are TNP1/EN/SPM-like transposon proteins with no known function mostly from Arabidopsis thaliana PUBMED:16297077.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4702' 'IPR004252' '\

    Transposase proteins are necessary for efficient DNA transposition. This family includes various plant transposases from the Ptta and En/Spm families PUBMED:16297077.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4703' 'IPR004291' '\

    Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens PUBMED:6095299. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids PUBMED:6095299.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4704' 'IPR005063' '\

    Transposase proteins are necessary for efficient DNA transposition. This family represents bacterial IS1 transposases PUBMED:17106514.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4705' 'IPR007321' '\

    This domain is found in a family of plant gene products and is thought to be related to gypsy type transposons. There is a domain of unknown function, (), at the C terminus of the proteins.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4706' 'IPR006783' '\

    Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting\ the mobile element. Transposases have been grouped into various families PUBMED:8041625, PUBMED:1310791, PUBMED:1718819. This family includes the putative transposase ISC1217 from archaebacteria.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ \ ' '4707' 'IPR006829' '\

    This group of putative transposases includes mostly Bacillus members. However, we have also found a Bacillus subtilis bacteriophage SPbetac2 homologue (), possibly arising as a result of horizontal transfer.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4708' 'IPR006842' '\

    This family of putative transposases includes the YhgA sequence from Escherichia coli () and several prokaryotic homologues PUBMED:8837478.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4709' 'IPR007069' '\

    Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases IS1294 and IS801 PUBMED:9870703.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4710' 'IPR002492' '\

    Transposase proteins are necessary for efficient DNA transposition.\ This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of Caenorhabditis elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3 as described in PUBMED:9312061. Tc3 is a member of the Tc1/mariner family of transposable elements.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4711' 'IPR002514' '\

    Transposase proteins are necessary for efficient DNA transposition.\ This family consists of various Escherichia coli insertion elements and other bacterial transposases some of which are members of the IS3 family. This region includes a helix-turn-helix motif (HTH) at the N terminus followed by a leucine zipper (LZ) motif. The LZ motif has been shown\ to mediate oligomerisation of the transposase components in IS911 PUBMED:9761671.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4712' 'IPR004906' '\

    This group includes Caenorhabditis elegans vacuolar assembly protein and several uncharacterised proteins which may be putative transposases, including Tc5 PUBMED:8088523.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4713' 'IPR003201' '\

    Transposons are mobile DNA sequences capable of replication and insertion into the chromosome. Typically, transposons code for the transposase enzyme, which catalyses insertion; the gene is found between terminal inverted repeats. Tn5 has a unique method of self-regulation in which a truncated version of the transposase enzyme acts as an inhibitor PUBMED:10207011.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '4714' 'IPR000895' '\

    This family includes transthyretin, a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of purine catabolism that catalyse the conversion of 5-hydroxyisourate (HIU) to OHCU PUBMED:16098976, PUBMED:16462750. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage PUBMED:16952372, PUBMED:8428915. HIUases are distinguished in the alignment by the conserved C-terminal YRGS sequence.

    \ \

    Transthyretin (formerly prealbumin) is one of 3 thyroid hormone-binding proteins found in the blood of vertebrates PUBMED:1833190. It is produced in the liver and circulates in the bloodstream, where it binds retinol and thyroxine (T4) PUBMED:4054629. It differs from the other 2 hormone-binding proteins (T4-binding globulin and albumin) in 3 distinct ways: (1) the gene is expressed at a high rate in the brain choroid plexus; (2) it is enriched in cerebrospinal fluid; and (3) no genetically caused absence has been observed, suggesting an essential role in brain function, distinct from that played in the bloodstream PUBMED:1833190. The protein consists of around 130 amino acids, which assemble as a homotetramer that contains an internal channel in which T4 is bound. Within this complex, T4 appears to be transported across the blood-brain barrier, where, in the choroid plexus, the hormone stimulates further synthesis of transthyretin. The protein then diffuses back into the bloodstream, where it binds T4 for transport back to the brain PUBMED:1833190.

    \ ' '4715' 'IPR005595' '\

    The alpha-subunit of the TRAP complex (TRAP alpha) is a single-spanning membrane protein of the endoplasmic reticulum (ER) which is found in proximity of nascent polypeptide chains translocating across the membrane PUBMED:8050590.

    \ ' '4716' 'IPR007194' '\

    TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterised TRAPP proteins and has a dimeric structure PUBMED:15608655 with hydrophobic channels. The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localise TRAPP to the Golgi PUBMED:9564032.

    \ ' '4717' 'IPR007039' '\

    Conjugal transfer protein, TrbC, has been identified as a subunit of the pilus precursor in bacteria. The protein undergoes three processing steps before gaining its mature cyclic structure PUBMED:12160637.

    \ ' '4719' 'IPR005498' '\ There are two unlinked regions of octopine-type Ti plasmids that contain genes required for conjugal transfer. One gene cluster, tra, is probably required for conjugal DNA processing whilst the other gene cluster, trb probably directs the synthesis of a conjugal pilus and mating pore PUBMED:8763954.\ \ The trbI gene encodes a 128-amino-acid polypeptide that is an intrinsic inner-membrane protein. TrbI may influence the kinetics of pilus outgrowth and/or retraction PUBMED:1355084. Although not essential for conjugation, the TrbI protein greatly increases conjugational efficiency PUBMED:8763954. In Plasmid pTiC58, all of the trb genes except trbI and trbK are essential for conjugal transfer PUBMED:10438776.\ ' '4720' 'IPR007688' '\

    VirB proteins are suggested to act at the bacterial surface, where they play an important role in directing T-DNA transfer to plant cells. VirB6 from Agrobacterium tumefaciens is an essential component of the type IV secretion machinery for T pilus formation and genetic\ transformation of plants. Absence of VirB6 leads to\ reduced cellular levels of VirB5 and VirB3, which were proposed to assist T pilus formation as minor component(s) or assembly\ factor(s), respectively.\

    \ ' '4721' 'IPR005118' '\

    This domain is found in proteins necessary for strand-specific repair in DNA such as TRCF in Escherichia coli. A lesion in the template strand blocks the RNA polymerase complex (RNAP). The RNAP-DNA-RNA complex is specifically recognised by the transcription-repair-coupling factor (TRCF) which releases RNAP and the truncated transcript.

    \ ' '4722' 'IPR003993' '\

    Treacher Collins Syndrome (TCS) is an autosomal dominant disorder of\ craniofacial development, the features of which include conductive hearing \ loss and cleft palate PUBMED:9096354, PUBMED:9042910; it is the most common of the human mandibulo-facial dysostosis disorders PUBMED:9096354. The TCS locus has been mapped to human chromosome 5q31.3-32 and the mutated gene identified (TCOF1) PUBMED:9042910. To date, 35 mutations have been reported in TCOF1, all but one of which result in the introduction of a premature-termination codon into the predicted protein, Treacle. The observed mutational spectrum supports the hypothesis that TCS results from haploinsufficiency.

    \

    Treacle is a low complexity protein of 1,411 amino acids whose predicted\ protein structure contains a set of highly polar repeated motifs PUBMED:9096354. These motifs are common to nucleolar trafficking proteins in other species and are predicted to be phosphorylated by casein kinase. In concert with this observation, the full-length TCOF1 protein sequence also contains putative nuclear and nucleolar localisation signals PUBMED:9096354. Throughout the open\ reading frame are found mutations in TCS families and several polymorphisms. It has thus been suggested that TCS results from defects in a nucleolar trafficking protein that is critically required during human craniofacial development.

    \ ' '4723' 'IPR000519' '\ A cysteine-rich domain of approximately forty five amino-acid residues has been found in some extracellular eukaryotic proteins PUBMED:7820556, PUBMED:9187350, PUBMED:8518738, PUBMED:8267796. It is known as either the \'P\', \'trefoil\' or \'TFF\' domain, and contains six cysteines linked by three disulphide bonds with connectivity 1-5, 2-4, 3-6. The domain has been found in a variety of extracellular eukaryotic proteins PUBMED:7820556, PUBMED:8518738, PUBMED:8267796, including protein pS2 (TFF1), a protein secreted by the stomach mucosa; spasmolytic polypeptide (SP) (TFF2), a protein of about 115 residues that inhibits gastrointestinal motility and gastric acid secretion; intestinal trefoil factor (ITF) (TFF3); Xenopus laevis stomach proteins xP1 and xP4; Xenopus integumentary mucins A.1 (FIM-A.1 or preprospasmolysin) and C.1 (FIM-C.1), proteins which may be involved in defence against microbial infections by protecting the epithelia from the external environment; Xenopus skin protein xp2 (or APEG); Zona pellucida sperm-binding protein B (ZP-B); intestinal sucrase-isomaltase ( / ), a vertebrate membrane bound, multifunctional enzyme complex which hydrolyses sucrose, maltose and isomaltose; and lysosomal alpha-glucosidase ().\ ' '4724' 'IPR001661' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 37 \ comprises enzymes with only one known activity; trehalase ().

    \ \

    Trehalase is the enzyme responsible for the degradation of the disaccharide alpha,alpha-trehalose yielding two glucose subunits PUBMED:8444853. It is an enzyme found in a wide variety of organisms and whose sequence has been highly conserved throughout evolution.

    \ ' '4725' 'IPR003337' '\ Trehalose-phosphatases catalyse the de-phosphorylation of\ trehalose-6-phosphate to trehalose and orthophosphate. Trehalose is a common disaccharide of bacteria, fungi and invertebrates that appears to play a major role in desiccation tolerance. A pathway for trehalose biosynthesis may also exist in plants PUBMED:9681009. The trehalose-phosphatase signature is found in the C-terminus of\ trehalose-6-phosphate synthase adjacent to the trehalose-6-phosphate synthase domain (see ). It would appear that the two equivalent genes in the Escherichia coli otsBA operon: otsA, the\ trehalose-6-phosphate synthase and otsB, trehalose-phosphatase (this family) have undergone gene fusion in\ most eukaryotes PUBMED:8045430.\ ' '4726' 'IPR005657' '\

    This family contains saliva proteins from haematophagous insects that counteract vertebrate host haemostasis events such as coagulation,\ vasoconstriction and platelet aggregation PUBMED:12421416. These include:

    \ \

    All members of this family belong to MEROPS proteinase inhibitor family I59, clan IZ.

    \ ' '4727' 'IPR006037' '\

    The regulator of K+ conductance (RCK) domain is found in many ligand-gated K+ channels, most often attached to the intracellular carboxy terminus. The domain is prevalent among prokaryotic K+ channels, and also found in eukaryotic, high-conductance Ca2+-activated K+ channels (BK channels) PUBMED:11292341, PUBMED:11301020, PUBMED:12037559. Largely involved in redox-linked regulation of potassium channels, the N-terminal part of the RCK domain is predicted to be an active dehydrogenase at least in some cases PUBMED:11292341. Some have a conserved sequence motif (G-x-G-x-x-G-x(n)-[DE]) for NAD+ binding PUBMED:8412700, but others do not, reflecting the diversity of ligands for RCK domains. The C-terminal part is less conserved, being absent in some channels, such as the kefC antiporter from Escherichia coli. It is predicted to bind unidentified ligands and to regulate sulphate, sodium and other transporters.
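
The motif quoted above, G-x-G-x-x-G-x(n)-[DE], can be read as a pattern over a protein sequence: three glycines at fixed spacing, a spacer of n arbitrary residues, then an aspartate or glutamate. The following is a minimal Python sketch of such a scan. The function names and the spacer bounds (15 to 25 residues) are assumptions made for illustration; the entry does not give a value for n.

```python
import re


def nad_motif_regex(min_gap=15, max_gap=25):
    """Build a regex for G-x-G-x-x-G-x(n)-[DE].

    min_gap and max_gap bound the spacer length n. They are
    illustrative assumptions, not values taken from the entry.
    """
    return re.compile("G.G..G.{%d,%d}[DE]" % (min_gap, max_gap))


def find_nad_motifs(sequence, min_gap=15, max_gap=25):
    """Return (start, matched_text) for each non-overlapping match.

    The sequence is upper-cased first, so lower-case input is accepted.
    """
    pattern = nad_motif_regex(min_gap, max_gap)
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]
```

Because the spacer is matched greedily, a hit extends to the last acidic residue that falls within the allowed spacer range. A match only indicates that the sequence is compatible with the motif; it is not evidence of NAD+ binding.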

    \ \

    The X-ray structure of several RCK domains has been solved PUBMED:16227203, PUBMED:11301020, PUBMED:12037559. It reveals an alpha-beta fold similar to dehydrogenase enzymes. The domain forms a homodimer, producing a cleft between two lobes. It has a composite structure, with an N-terminal (RCK-N), and a C-terminal (RCK-C) subdomain. The RCK-N subdomain forms a Rossmann fold with two alpha helices on one side of a six stranded parallel beta sheet and three alpha helices on the other side. The RCK-C subdomain is an all-beta-strand fold. It forms an extension of the dimer interface and further stabilises the RCK homodimer PUBMED:16227203, PUBMED:11301020, PUBMED:12037559. Ca2+ is a ligand that opens the channel in a concentration-dependent manner. Two Ca2+ ions are located at the base of a cleft between two RCK domains, coordinated by the carboxylate groups of two glutamate residues, and by an aspartate residue PUBMED:16227203, PUBMED:11301020, PUBMED:12037559.

    \ \

    RCK domains occur in at least five different contexts:\

    \

    \ \

    This entry represents the C-terminal subdomain of RCK.

    \ ' '4728' 'IPR003148' '\

    The regulator of K+ conductance (RCK) domain is found in many ligand-gated K+ channels, most often attached to the intracellular carboxy terminus. The domain is prevalent among prokaryotic K+ channels, and also found in eukaryotic, high-conductance Ca2+-activated K+ channels (BK channels) PUBMED:11292341, PUBMED:11301020, PUBMED:12037559. Largely involved in redox-linked regulation of potassium channels, the N-terminal part of the RCK domain is predicted to be an active dehydrogenase at least in some cases PUBMED:11292341. Some have a conserved sequence motif (G-x-G-x-x-G-x(n)-[DE]) for NAD+ binding PUBMED:8412700, but others do not, reflecting the diversity of ligands for RCK domains. The C-terminal part is less conserved, being absent in some channels, such as the kefC antiporter from Escherichia coli. It is predicted to bind unidentified ligands and to regulate sulphate, sodium and other transporters.

    \ \

    The X-ray structure of several RCK domains has been solved PUBMED:16227203, PUBMED:11301020, PUBMED:12037559. It reveals an alpha-beta fold similar to dehydrogenase enzymes. The domain forms a homodimer, producing a cleft between two lobes. It has a composite structure, with an N-terminal (RCK-N), and a C-terminal (RCK-C) subdomain. The RCK-N subdomain forms a Rossmann fold with two alpha helices on one side of a six stranded parallel beta sheet and three alpha helices on the other side. The RCK-C subdomain is an all-beta-strand fold. It forms an extension of the dimer interface and further stabilises the RCK homodimer PUBMED:16227203, PUBMED:11301020, PUBMED:12037559. Ca2+ is a ligand that opens the channel in a concentration-dependent manner. Two Ca2+ ions are located at the base of a cleft between two RCK domains, coordinated by the carboxylate groups of two glutamate residues, and by an aspartate residue PUBMED:16227203, PUBMED:11301020, PUBMED:12037559.

    \ \

    RCK domains occur in at least five different contexts:\

    \

    \ \

    This entry represents the N-terminal subdomain of RCK.

    \ ' '4729' 'IPR003445' '\ This family consists of various potassium transport proteins (Trk) and V-type sodium ATP synthase subunit J or translocating ATPase J (). These proteins are involved in active sodium uptake, utilizing ATP in the process. TrkH from Escherichia coli is a hydrophobic membrane protein and determines the specificity and kinetics of cation transport by the Trk system in this organism PUBMED:7896723. This protein interacts with TrkA and requires TrkE for transport activity.\ ' '4730' 'IPR002905' '\ This enzyme uses S-adenosyl-L-methionine to methylate tRNA:\ \ The TRM1 gene of Saccharomyces cerevisiae is necessary for the N2,N2-dimethylguanosine modification of both mitochondrial and cytoplasmic tRNAs PUBMED:9685492. The enzyme is found in both eukaryotes and archaea PUBMED:3299379.\ ' '4731' 'IPR002300' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \ ' '4732' 'IPR020058' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    Glutamyl-tRNA synthetase () is a class Ic synthetase and shows several similarities with glutaminyl-tRNA synthetase concerning structure and catalytic properties. It is an alpha2 dimer. To date one crystal structure of a glutamyl-tRNA synthetase (Thermus thermophilus) has been solved. The molecule has the form of a bent cylinder and consists of four domains. The N-terminal half (domains 1 and 2) contains the \'Rossmann fold\' typical for class I synthetases and resembles the corresponding part of Escherichia coli GlnRS, whereas the C-terminal half exhibits a GluRS-specific structure PUBMED:9426192.\

    \ ' '4733' 'IPR020059' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    Glutamyl-tRNA synthetase () is a class Ic synthetase and shows several similarities with glutaminyl-tRNA synthetase concerning structure and catalytic properties. It is an alpha2 dimer. To date one crystal structure of a glutamyl-tRNA synthetase (Thermus thermophilus) has been solved. The molecule has the form of a bent cylinder and consists of four domains. The N-terminal half (domains 1 and 2) contains the \'Rossmann fold\' typical for class I synthetases and resembles the corresponding part of Escherichia coli GlnRS, whereas the C-terminal half exhibits a GluRS-specific structure PUBMED:9426192.\

    \ ' '4734' 'IPR015945' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    This entry represents the core region of arginyl-tRNA synthetase (). Yeast arginyl-tRNA synthetase has been crystallized in complex with yeast tRNAArg, and a preliminary X-ray crystallographic analysis of these complexes is available PUBMED:10739930.

    \ ' '4735' 'IPR015803' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    Cysteinyl-tRNA synthetase () is an alpha monomer and belongs to class Ia.

    \ ' '4736' 'IPR002904' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    Lysyl-tRNA synthetase () is an alpha2 homodimer with members in both class I and class II. In eubacteria and eukaryota, lysyl-tRNA synthetases belong to class II, in the same family as aspartyl-tRNA synthetase. The class Ic lysyl-tRNA synthetase family is present in archaea and in a number of bacterial groups that include the alphaproteobacteria and spirochaetes PUBMED:9353192. A refined crystal structure shows that the active site of lysU is shaped to position the substrates for the nucleophilic attack of the lysine carboxylate on the ATP alpha-phosphate. No residues are directly involved in catalysis, but a number of highly conserved amino acids and three metal ions coordinate the substrates and stabilise the pentavalent transition state. A loop close to the catalytic pocket, disordered in the lysine-bound structure, becomes ordered upon adenine binding PUBMED:10913247.

    \ ' '4737' 'IPR004364' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    This entry includes the asparagine, aspartic acid and lysine tRNA synthetases.

    \ ' '4738' 'IPR000824' '\

    The tryptophan RNA-binding attenuation protein (TRAP) regulates expression of the tryptophan biosynthetic genes in Bacillus sp. by binding to the leader region of the nascent trp operon mRNA PUBMED:11914485. The crystal structure of the Trp RNA-binding attenuation protein of Bacillus subtilis has been solved PUBMED:7715723. TRAP forms an oligomeric ring consisting of 11 single-domain subunits, where each subunit adopts a double-stranded beta-helix structure with the appearance of a beta-sandwich of distinct architecture and jelly-roll fold. The 11 subunits are stabilised by 11 inter-subunit strands, forming a beta-wheel with a large central hole. TRAP is activated by binding to tryptophan in clefts between adjacent beta-strands, which induces conformational changes in the protein. Activated TRAP binds an mRNA target sequence consisting of 11 (G/U)AG repeats, separated by 2-3 spacer nucleotides. The spacer nucleotides do not make direct contact with the TRAP protein, but they do influence the conformation of the RNA, which might influence the specificity of TRAP PUBMED:15050822.
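The target-site rule described above (11 (G/U)AG repeats separated by 2-3 spacer nucleotides) can be sketched as a pattern search. This is an illustrative Python sketch, not a validated binding-site predictor; the names TRAP_SITE and has_trap_site are hypothetical, and accepting DNA-style input by converting T to U is an added assumption.

```python
import re

# Ten (G/U)AG triplets each followed by a 2-3 nt spacer, then an 11th triplet.
TRAP_SITE = re.compile("(?:[GU]AG[ACGU]{2,3}){10}[GU]AG")

def has_trap_site(rna):
    """Return True if the sequence contains 11 suitably spaced (G/U)AG repeats."""
    # Assumption: DNA-style input is accepted by mapping T to U.
    normalised = rna.upper().replace("T", "U")
    return TRAP_SITE.search(normalised) is not None
```

The pattern checks spacing only; it does not model the effect of the spacer nucleotides on RNA conformation mentioned in the text.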

    \

    This entry represents the TRAP family of proteins.

    \ ' '4739' 'IPR002314' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    This domain includes the glycine, histidine, proline, threonine and serine tRNA synthetases.

    \ ' '4740' 'IPR018164' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    Alanyl-tRNA synthetase () is an alpha4 tetramer that belongs to class IIc.

    \ ' '4741' 'IPR006677' '\

    This entry represents a 3-layer alpha/beta/alpha domain found as the catalytic domain at the C-terminal in homotetrameric tRNA-intron endonucleases PUBMED:9535656, and as domains 2 and 4 (C-terminal) in the homodimeric enzymes PUBMED:16690865. tRNA-intron endonucleases () remove tRNA introns by cleaving pre-tRNA at the 5\'- and 3\'-splice sites to release the intron. The products are an intron and two tRNA half-molecules bearing 2\',3\' cyclic phosphate and 5\'-hydroxyl termini PUBMED:9200602. These enzymes recognise a pseudosymmetric substrate in which 2 bulged loops of 3 bases are separated by a stem of 4 bp PUBMED:14993668. Although homotetrameric enzymes contain four active sites, only two participate in the cleavage, so the tetramer should be considered a dimer of dimers.

    \ ' '4742' 'IPR006678' '\

    This entry represents a 2-layer alpha/beta domain found at the N-terminal in homotetrameric tRNA-intron endonucleases PUBMED:9535656, and as domains 1 (N-terminal) and 3 in the homodimeric enzymes PUBMED:16690865. tRNA-intron endonucleases () remove tRNA introns by cleaving pre-tRNA at the 5\'- and 3\'-splice sites to release the intron. The products are an intron and two tRNA half-molecules bearing 2\',3\' cyclic phosphate and 5\'-hydroxyl termini PUBMED:9200602. These enzymes recognise a pseudosymmetric substrate in which 2 bulged loops of 3 bases are separated by a stem of 4 bp PUBMED:14993668. Although homotetrameric enzymes contain four active sites, only two participate in the cleavage, so the tetramer should be considered a dimer of dimers.

    \ ' '4743' 'IPR016009' '\

    In transfer RNA many different modified nucleosides are found, especially in the anticodon region.\ tRNA (guanine-N1-)-methyltransferase is one of several nucleases operating together with the tRNA-modifying enzymes before the formation of the mature tRNA. It catalyses the reaction:\ \ \ methylating guanosine (G) to N1-methylguanine (1-methylguanosine (m1G)) at position 37 of tRNAs that read CUN (leucine), CCN (proline), and CGG (arginine) codons. The presence of m1G improves the cellular growth rate and the polypeptide step time, and also prevents the tRNA from shifting the reading frame PUBMED:2207153.

    The mechanism of the trmD3-induced frameshift involving mutant tRNA(Pro) and tRNA(Leu) species has been investigated PUBMED:7689113. It has been suggested that the conformation of the anticodon loop may be a major determining element for the formation of m1G37 in vivo PUBMED:9047363.

    \ ' '4744' 'IPR018318' '\ This entry is found in a number of enzymes, including GMP synthase and the tRNA-specific 2-thiouridylase catalysing the 2-thiolation of uridine at the wobble position (U34) of tRNA, leading to the formation of s(2)U34.\ ' '4745' 'IPR007639' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \ This is a region found N-terminal to the catalytic domain of glutaminyl-tRNA synthetase () in eukaryotes but not in Escherichia coli. This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function PUBMED:10347214.\ ' '4746' 'IPR007638' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \ This is a region found N-terminal to the catalytic domain of glutaminyl-tRNA synthetase () in eukaryotes but not in Escherichia coli. This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function PUBMED:10347214.\ ' '4747' 'IPR002310' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    In eubacteria, glycyl-tRNA synthetase () is an alpha2/beta2 tetramer composed of 2 different subunits PUBMED:6309809, PUBMED:7962006, PUBMED:7665503. In some eubacteria, in archaea and eukaryota, glycyl-tRNA synthetase is an alpha2 dimer (see ). It belongs to class IIc and is one of the most complex synthetases. What is most interesting is the lack of similarity between the two types: divergence at the sequence level is so great that it is impossible to infer descent from common genes. The alpha and beta subunits (see ) also lack significant sequence similarity.\ However, they are translated from a single mRNA PUBMED:6309809, and a single chain glycyl-tRNA synthetase from Chlamydia trachomatis has been found to have significant similarity with both domains, suggesting divergence from a single polypeptide chain PUBMED:7665503.

    \ ' '4748' 'IPR002311' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    This entry represents the N-terminal region of the beta subunit of glycyl-tRNA synthetases (class IIc).

    \ \

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold and are mostly monomeric, while class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet formation, flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \ \

    The 10 class I synthetases are considered to have in common the catalytic domain structure based on the Rossmann fold, which is totally different from the class II catalytic domain structure. The class I synthetases are further divided into three subclasses, a, b and c, according to sequence homology. No conserved structural features for tRNA recognition by class I synthetases have been established.

    \

    Class-II tRNA synthetases do not share a high degree of similarity; however, at least three conserved regions are present PUBMED:8274143, PUBMED:2053131, PUBMED:1852601.

    \

    In eubacteria, glycyl-tRNA synthetase () is an alpha2/beta2 tetramer composed of 2 different subunits PUBMED:6309809, PUBMED:7962006, PUBMED:7665503. In some eubacteria, in archaea and eukaryota, glycyl-tRNA synthetase is an alpha2 dimer (see ), this family. It belongs to class IIc and is one of the most complex synthetases. What is most interesting is the lack of similarity between the two types: divergence at the sequence level is so great that it is impossible to infer descent from common genes. The alpha (see ) and beta subunits also lack significant sequence similarity.\ However, they are translated from a single mRNA PUBMED:6309809, and a single chain glycyl-tRNA synthetase from Chlamydia trachomatis has been found to have significant similarity with both domains, suggesting divergence from a single polypeptide chain PUBMED:7665503.

    \ ' '4749' 'IPR000533' '\ Tropomyosins PUBMED:3606587 are a family of closely related proteins present in muscle and non-muscle cells. In striated muscle, tropomyosin mediates the interactions between the troponin complex and actin so as to regulate muscle contraction PUBMED:12690456. The role of tropomyosin in smooth muscle and non-muscle tissues is not clear. Tropomyosin is an alpha-helical protein that forms a coiled-coil structure of 2 parallel helices containing 2 sets of 7 alternating actin binding sites PUBMED:6993480. There are multiple cell-specific isoforms, created by differential splicing of the messenger RNA from one gene, but the proportions of the isoforms vary between different cell types. Muscle isoforms of tropomyosin are characterised by having 284 amino acid residues and a highly conserved N-terminal region, whereas non-muscle forms are generally smaller and are heterogeneous in their N-terminal region.\ \

    Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee, King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806 (1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space; and an Arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.

    \

    The allergens in this family include allergens with the following designations: Met e 1.

    \ ' '4750' 'IPR004981' '\

    This is a family of tryptophan 2,3-dioxygenase () enzymes involved in tryptophan metabolism, which catalyse the reaction:\

    \ ' '4751' 'IPR006905' '\ Tryptophan halogenase catalyses the chlorination of tryptophan to form 7-chlorotryptophan. This is the first step in the biosynthesis of pyrrolnitrin, an antibiotic with broad-spectrum anti-fungal activity. Tryptophan halogenase is NADH-dependent PUBMED:10547442.\ ' '4752' 'IPR000831' '\

    The Trp repressor (TrpR) binds to at least five operators in the Escherichia coli genome, repressing gene expression. The operators at which it binds vary considerably in DNA sequence and location within the promoter; when bound to the Trp operon it recognises the sequence 5\'-ACTAGT-3\' and acts to prevent the initiation of transcription. The TrpR controls the trpEDCBA (trpO) operon and the genes for trpR, aroH, mtr and aroL, which are involved in the biosynthesis and uptake of the amino acid tryptophan PUBMED:12475235. The repressor binds to the operators only in the presence of L-tryptophan, thereby controlling the intracellular level of its effector; the complex also regulates Trp repressor biosynthesis by binding to its own regulatory region. TrpR acts as a dimer that is composed of identical 6-helical subunits, where four of the helices form the core of the protein and intertwine with the corresponding helices from the other subunit.

    \ ' '4753' 'IPR002028' '\

    Tryptophan synthase () catalyzes the last step in the biosynthesis\ of tryptophan PUBMED:2679363, PUBMED:1366510:\ \ \ It has two functional domains, each found in bacteria and plants on a\ separate subunit. In Escherichia coli, the 2 subunits, A and B, are encoded by the trpA and trpB genes respectively. The alpha chain catalyzes the aldol cleavage of indoleglycerol phosphate to indole and glyceraldehyde 3-phosphate, and the beta chain catalyzes the synthesis of tryptophan from indole and serine. In fungi the two domains are fused together in a single multifunctional protein, in the order: (NH2-A-B-COOH) PUBMED:2521855, PUBMED:2734310. The two domains of the Neurospora crassa polypeptide are linked by a connector of 54 amino acid residues that has less than 25% identity to the 45-residue connector of the Saccharomyces cerevisiae (Baker\'s yeast) polypeptide. Two acidic residues are believed to serve as proton donors/acceptors in the enzyme\'s\ catalytic mechanism.

    \ ' '4754' 'IPR018227' '\ Amino acid permeases are integral membrane proteins involved in the transport\ of amino acids into the cell. A number of such proteins have been found to be\ evolutionarily related PUBMED:3146645, PUBMED:2687114, PUBMED:8382989. \

    Aromatic amino acids are concentrated in the cytoplasm of Escherichia coli by 4 \ distinct transport systems: a general aromatic amino acid permease, and a\ specific permease for each of the 3 types (Phe, Tyr and Trp) PUBMED:1987112. It has been shown PUBMED:2022620 that some permeases in E. coli and related bacteria are evolutionarily related.\ These permeases are proteins of about 400 to 420 amino acids that are located in the cytoplasmic membrane and, like bacterial sugar/cation transporters, are thought to contain 12 transmembrane (TM)\ regions PUBMED:1987112. Hydropathy analysis, however, is inconclusive, suggesting the\ possibility of 10 to 12 membrane-spanning domains PUBMED:2022620. The best conserved domain is a stretch of 20 residues which seems to be located in a cytoplasmic loop between the\ first and second transmembrane regions.

    \ ' '4755' 'IPR002501' '\

    TruB is responsible for the pseudouridine residue present in the T loops of virtually all tRNAs. TruB recognises the preformed 3-D structure of the T loop primarily through shape complementarity. It accesses its substrate uridyl residue by flipping out the nucleotide and disrupts the tertiary structure of tRNA PUBMED:11779468.

    \

    Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are four distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an alpha+beta structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are PUBMED:10529181:

    \ \

    \ \

    This entry represents pseudouridine synthase TruB, as well as Cbf5p that modifies rRNA PUBMED:9472021.

    \ ' '4756' 'IPR000458' '\ This family of trypanosomal proteins resembles vertebrate mucins.\ The protein consists of three regions. The N and C termini are\ conserved between all members of the family, whereas the central\ region is not well conserved and contains a large number of\ threonine residues which can be glycosylated PUBMED:7592617.\ Indirect evidence suggested that these genes might encode the core\ protein of parasite mucins, glycoproteins that were proposed to be\ involved in the interaction with, and invasion of, mammalian host\ cells.\ ' '4757' 'IPR001812' '\ The trypanosome parasite expresses these proteins to evade the immune response PUBMED:2231728. The variant surface glycoprotein (VSG) of Trypanosoma brucei forms a coat on the surface of the parasite; by the expression of a series of antigenically distinct VSGs in the surface coat the parasite escapes the host immune response. \

    The 2.9A resolution crystal structure of the N-terminal domain of one variant, MITat 1.2, has been determined PUBMED:2231728. The "top" of the protein, which in the surface coat may be exposed to the external environment, is formed from the ends of the two long helices, a short three-stranded beta-sheet, and a strand having irregular conformation that packs above these secondary structure elements. Two conserved disulphide bridges are in this part of the molecule. Several elements of the MITat 1.2 sequence, which contribute to the formation of the helix bundle structure, have been identified. These elements can be found in the sequences of several different VSGs, suggesting that to some extent the VSG structure is conserved in those variants PUBMED:9574925.

    \ ' '4758' 'IPR004933' '\ There are several antigenic variants in Rickettsia tsutsugamushi, and a type-specific antigen (TSA) of 56-kilodaltons located on the\ rickettsial surface is responsible for the variation PUBMED:2496028, PUBMED:1618776. TSA proteins are probably integral membrane proteins. \ \ ' '4759' 'IPR000580' '\

    Several eukaryotic proteins are evolutionarily related and are thought to be involved in transcriptional regulation. These proteins are highly similar in a region of about 50 residues that includes a conserved leucine-zipper domain\ most probably involved in homo- or hetero-dimerisation. Proteins containing this signature include:

    \ \

    \ ' '4760' 'IPR006761' '\

    Tsg was identified in Drosophila melanogaster as being required to specify the dorsal-most structures in the embryo, for example the amnioserosa. Biochemical experiments have revealed three key properties of Tsg:

  • it can synergistically inhibit Dpp/BMP action in both D. melanogaster and vertebrates by forming a tripartite complex between itself, SOG/chordin and a BMP ligand;
  • Tsg seems to enhance the Tld/BMP-1-mediated cleavage rate of SOG/chordin and may change the preference of site utilisation;
  • Tsg can promote the dissociation of chordin cysteine-rich-containing fragments from the ligand to inhibit BMP signalling PUBMED:7958834, PUBMED:11260716.
  • \ ' '4761' 'IPR006795' '\ This region is found in some members of the SpoU-type rRNA methylase family ().\ ' '4762' 'IPR000884' '\

    Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5), are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers PUBMED:11687483. EGF, TSP3 repeats and the C-terminal domain are thus the hallmark of a thrombospondin.

    \

    This repeat was first described in 1986 by Lawler and Hynes PUBMED:2430973. It was found in the thrombospondin protein, where it is repeated 3 times. A number of proteins involved in the complement pathway (properdin, C6, C7, C8A, C8B, C9) PUBMED:2459396, as well as extracellular matrix proteins such as mindin, F-spondin PUBMED:10409509 and SCO-spondin, and even the circumsporozoite surface protein 2 and TRAP proteins of Plasmodium PUBMED:10508153, PUBMED:1501644, are now known to contain one or more instances of this repeat.\ It has been implicated in cell-cell interaction, inhibition of angiogenesis PUBMED:10500044 and\ apoptosis PUBMED:9135017.

    \

    The intron-exon organisation of the properdin gene confirms the hypothesis \ that the repeat might have evolved by a process involving exon shuffling PUBMED:1417780.\ A study of properdin structure provides some information about the structure of\ the thrombospondin type I repeat PUBMED:1868073.

    \ ' '4763' 'IPR003367' '\

    Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5), are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers PUBMED:11687483. EGF, TSP3 repeats and the C-terminal domain are thus the hallmark of a thrombospondin.

    \

    This entry represents the type 3 thrombospondin repeat, and related repeats present in other types of protein.

    \ ' '4764' 'IPR004307' '\

    Members of this group are involved in transmembrane signalling. In both prokaryotes and mitochondria they are localized to the outer membrane, and have been shown to bind and transport dicarboxylic tetrapyrrole intermediates of the haem biosynthetic pathway PUBMED:1373486, PUBMED:7673149. They are associated with the major outer membrane porins (in prokaryotes) and with the voltage-dependent anion channel (in mitochondria) PUBMED:8114671.

    \ \

    Rhodobacter sphaeroides TspO (previously CrtK) is involved in signal transduction, functioning as a negative regulator of the expression of some photosynthesis genes (PpsR/AppA repressor/antirepressor regulon). This down-regulation is believed to be in response to oxygen levels. TspO works through (or modulates) the PpsR/AppA system and acts upstream of the site of action of these regulatory proteins PUBMED:11591680. It has been suggested that the TspO regulatory pathway works by regulating the efflux of certain tetrapyrrole intermediates of the haem/bacteriochlorophyll biosynthetic pathways in response to the availability of molecular oxygen, thereby causing the accumulation of a biosynthetic intermediate that serves as a corepressor for the regulated genes PUBMED:10409680. A homologue of the TspO protein in Rhizobium meliloti (Sinorhizobium meliloti) is involved in regulating expression of the ndi locus in response to stress conditions PUBMED:11097914. There is evidence that the S. meliloti TspO acts through, or in addition to, the FixL regulatory system.

    \ \

    In animals, the peripheral-type benzodiazepine receptor (PBR, MBR) is a mitochondrial protein (located in the outer mitochondrial membrane) characterised by its ability to bind with nanomolar affinity to a variety of benzodiazepine-like drugs, as well as to dicarboxylic tetrapyrrole intermediates of the haem biosynthetic pathway. Depending upon the tissue, it was shown to be involved in steroidogenesis, haem biosynthesis, apoptosis, cell growth and differentiation, mitochondrial respiratory control, and immune and stress response, but the precise function of the PBR remains unclear. The role of PBR in the regulation of cholesterol transport from the outer to the inner mitochondrial membrane, the rate-determining step in steroid biosynthesis, has been studied in detail. PBR is required for the binding, uptake and release, upon ligand activation, of the substrate cholesterol PUBMED:11806292. PBR forms a multimeric complex with the voltage-dependent anion channel (VDAC) PUBMED:8114671 and adenine nucleotide carrier PUBMED:1373486. Molecular modeling of PBR suggested that it might function as a channel for cholesterol. Indeed, cholesterol uptake and transport by bacterial cells was induced upon PBR expression. Mutagenesis studies identified a cholesterol recognition/interaction motif (CRAC) in the cytoplasmic C-terminus of PBR PUBMED:11158628, PUBMED:12589253.

    \ \

    In complementation experiments, rat PBR (pk18) functionally substitutes for its homologue TspO in R. sphaeroides, negatively affecting transcription of specific photosynthesis genes PUBMED:9144197. This suggests that PBR may function as an oxygen sensor, transducing an oxygen-triggered signal leading to an adaptive cellular response.

    \ \

    These observations suggest that fundamental aspects of this receptor and the downstream signal transduction pathway are conserved in bacteria and higher eukaryotic mitochondria. The alpha-3 subdivision of the purple bacteria is considered to be a likely source of the endosymbiont that ultimately gave rise to the mitochondrion. Therefore, it is possible that the mammalian PBR remains both evolutionarily and functionally related to the TspO of R. sphaeroides.

    \ ' '4765' 'IPR004219' '\ Torque teno virus, isolated initially from a Japanese patient with hepatitis of unknown aetiology, has since been found to infect both healthy and diseased individuals and numerous prevalence studies have raised questions about its role in unexplained hepatitis. ORF1 is a large 750 residue protein.\ ' '4766' 'IPR004118' '\

    A nonenveloped and single-stranded DNA virus designated Torque teno virus has been reported from Japan in association with hepatitis of unknown etiology PUBMED:10388667. ORF2 is a 150 residue protein of unknown function.

    \ ' '4767' 'IPR018515' '\

    Initiation of eukaryotic mRNA transcription requires melting of promoter DNA with the help of the general transcription factors TFIIE and TFIIH. In higher eukaryotes, the general transcription factor TFIIE consists of two subunits: the large alpha subunit () and the small beta (). TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The approximately 120-residue central core domain of TFIIE beta plays a role in double-stranded DNA binding of TFIIE PUBMED:10716934.

    \ \

    The TFIIE beta central core DNA-binding domain consists of three helices with a beta hairpin at the C terminus, resembling the winged helix proteins. It shows a novel double-stranded DNA-binding activity in which the DNA-binding surface, a positively charged furrow, is located on the opposite side to the previously reported winged helix motif PUBMED:10716934.

    \ \ \

    This domain is found in Tuberin proteins.

    \ ' '4768' 'IPR003008' '\

    This domain is found in all tubulin chains, as\ well as the bacterial FtsZ family of proteins. These proteins\ are involved in polymer formation. Tubulin is the major component\ of microtubules, while FtsZ is the polymer-forming protein\ of bacterial cell division; it is part of a ring in the middle of the\ dividing cell that is required for constriction of cell membrane and\ cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases, and this entry represents the GTPase domain.\ FtsZ can polymerise into tubes, sheets, and rings in vitro and is\ ubiquitous in bacteria and archaea.

    \ ' '4769' 'IPR001084' '\

    Microtubules consist of tubulins as well as a group of additional proteins\ collectively known as the Microtubule Associated Proteins (MAPs). MAPs have\ been classified into two classes: high molecular weight MAPs and Tau\ protein. The Tau proteins promote microtubule assembly and stabilise\ microtubules.

    \ \

    The C-terminal region of these proteins contains three or four tandem repeats\ of a conserved domain of about thirty amino acid residues which is implicated\ in tubulin-binding and which seems to have a stiffening effect on microtubules.

    \ ' '4770' 'IPR015820' '\

    Ty elements are yeast transposons. A 5.7kb transcript codes for p3, a fusion protein of TYA and TYB. The TYA protein is analogous to the gag protein of retroviruses. \ This signature defines TYA, found at the N terminus of the gag and gag-pol polyproteins of approximately 440 amino acids and 1755 amino acids in length respectively.

    \ \ \

    Yeast retrotransposon Ty1 produces its proteins as precursors that are subsequently cleaved by an aspartic protease encoded by the element. Cleavage of the gag and gag-Pol polyprotein precursors is a critical step in proliferation of retroviruses and retroelements. These cleavage events are essential for transposition as they release the active reverse transcriptase and integrase and they modify the structure of the virus-like particles in a way that is analogous to the morphological changes that occur during retrovirus core maturation PUBMED:9261411, PUBMED:8971723, PUBMED:8764068.

    \ ' '4771' 'IPR004935' '\ Tymoviruses are single-stranded RNA viruses. This family includes a protein of unknown function that has been named based on its\ molecular weight. Tymoviruses such as the ononis yellow mosaic tymovirus encode only three proteins. Of these, two overlap:\ this protein overlaps a larger ORF that is thought to be the polymerase PUBMED:2800337.\ ' '4772' 'IPR000574' '\ This signature is found in coat proteins from the related tymoviruses. The coat protein is also known as the virion\ protein. The virus coat is composed of 180 copies of the coat protein arranged in an icosahedral shell.\ ' '4773' 'IPR002227' '\ Tyrosinase () PUBMED:3130643 is a copper monooxygenase that catalyzes the\ hydroxylation of monophenols and the oxidation of o-diphenols to o-quinones.\ This enzyme, found in prokaryotes as well as in eukaryotes, is involved in the\ formation of pigments such as melanins and other polyphenolic compounds.\ Tyrosinase binds two copper ions (CuA and CuB). Each of the two copper ions has\ been shown PUBMED:1901488 to be bound by three conserved histidine residues. The regions\ around these copper-binding ligands are well conserved and also shared by some\ hemocyanins, which are copper-containing oxygen carriers from the hemolymph of\ many molluscs and arthropods PUBMED:2664531, PUBMED:1898774.\ At least two proteins related to tyrosinase are known to exist in mammals, and include TRP-1 (TYRP1) PUBMED:7813420, which is responsible for the conversion of 5,6-dihydroxyindole-2-carboxylic acid (DHICA) to indole-5,6-quinone-2-carboxylic acid; and TRP-2 (TYRP2) PUBMED:1537334, which is the melanogenic enzyme DOPAchrome tautomerase\ () that catalyzes the conversion of DOPAchrome to DHICA. 
TRP-2\ differs from tyrosinases and TRP-1 in that it binds two zinc ions instead\ of copper PUBMED:7980602.\ Other proteins that belong to this family are plant polyphenol oxidases (PPO) (), which catalyze the oxidation\ of mono- and o-diphenols to o-diquinones PUBMED:1391768; and \ Caenorhabditis elegans hypothetical protein C02C2.1.\ ' '4774' 'IPR004138' '\ This family represents herpes virus protein U79 and cytomegalovirus early phosphoprotein P34 (UL112).\ ' '4775' 'IPR000127' '\

    The post-translational attachment of ubiquitin () to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation PUBMED:15556404, PUBMED:15196553, PUBMED:15454246. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1, ), a ubiquitin-conjugating enzyme (E2, ), and a ubiquitin ligase (E3, , ), which work sequentially in a cascade PUBMED:14998368. The E1 enzyme is responsible for activating ubiquitin, the first step in ubiquitinylation. The E1 enzyme hydrolyses ATP and adenylates the C-terminal glycine residue of ubiquitin, and then links this residue to the active site cysteine of E1, yielding a ubiquitin-thioester and free AMP. To be fully active, E1 must non-covalently bind to and adenylate a second ubiquitin molecule. The E1 enzyme can then transfer the thioester-linked ubiquitin molecule to a cysteine residue on the ubiquitin-conjugating enzyme, E2, in an ATP-dependent reaction.

    \

    This domain is found 2 times in each member of the ubiquitin activating enzymes and is located downstream of the active site cysteine PUBMED:1634524.

    \ ' '4776' 'IPR002830' '\

    This family of proteins is found in bacteria, archaea and fungi, with two members in Archaeoglobus fulgidus. They are related to UbiD, a 3-octaprenyl-4-hydroxybenzoate carboxy-lyase from Escherichia coli that is involved in ubiquinone biosynthesis PUBMED:11029449. The member from Helicobacter pylori has a C-terminal extension of just over 100 residues that is shared, in part, by the Aquifex aeolicus homologue.

    \ ' '4777' 'IPR004033' '\ A number of methyltransferases have been shown to share regions of\ similarities PUBMED:9045837. Apart from the ubiquinone/menaquinone biosynthesis methyltransferases (for example, the C-methyltransferase from the ubiE gene of Escherichia coli), the ubiquinone biosynthesis methyltransferases (for example, the C-methyltransferase from the COQ5 gene of Saccharomyces cerevisiae) and the menaquinone biosynthesis methyltransferases (for example, the C-methyltransferase from the MENH gene of Bacillus subtilis), this family also includes methyltransferases involved in biotin and sterol biosynthesis and in phosphatidylethanolamine methylation.\ ' '4778' 'IPR007129' '\

    Saccharomyces cerevisiae ubiquinol-cytochrome C chaperone is required for assembly of coenzyme QH2-cytochrome C reductase. It appears to be found in a number of different organisms including Homo sapiens, Caenorhabditis elegans and Rhizobium meliloti.

    \ ' '4779' 'IPR003197' '\

    The cytochrome bd type terminal oxidases catalyse quinol-dependent, Na+-independent oxygen uptake PUBMED:8626304. Members of this family are integral membrane proteins and contain a protohaem IX centre B558.

    \

    Cytochrome bd may play an important role in microaerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae, where it is expressed under all conditions that permit diazotrophy PUBMED:9274021.

    \ \

    The 14 kDa (or VI) subunit of the complex is not directly involved in electron transfer, but has a role in assembly of the complex PUBMED:7770525.

    \ ' '4780' 'IPR003422' '\ The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex PUBMED:9651245. The bc1 complex contains 11 subunits: 3 respiratory subunits (cytochrome B, cytochrome C1, Rieske protein), 2 core proteins and 6 low molecular weight proteins. This family represents the \'hinge\' protein of the complex which is thought to mediate formation of the cytochrome c1 and cytochrome c complex.\ ' '4781' 'IPR004192' '\ The ubiquinol cytochrome c reductase (cytochrome bc1) complex is a respiratory chain complex that generates an electrochemical potential coupled to ATP synthesis. The bc1 complex contains 11 subunits: 3 respiratory subunits (cytochrome B, cytochrome C1, Rieske protein), 2 core proteins and 6 low molecular weight proteins. Each subunit of the cytochrome bc1 complex provides a single helix (this family) to make up the transmembrane region of the complex.\ ' '4782' 'IPR004205' '\

    The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multi-enzyme complex PUBMED:9651245, which recognises a mitochondrial targeting presequence. The bc1 complex contains 11 subunits: 3 respiratory subunits (cytochrome b, cytochrome c1 and Rieske protein), 2 core proteins and 6 low molecular weight proteins. This family represents the 9.5 kDa subunit of the complex. This subunit together with cytochrome B binds to ubiquinone.

    \ ' '4783' 'IPR014026' '\

    The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate PUBMED:2470755, PUBMED:9013585.

    \ \

    The enzymes have a wide range of functions. In plants UDP-glucose dehydrogenase, , is an important enzyme in the synthesis of hemicellulose and pectin PUBMED:12031484, which are the components of newly formed cell walls; while in zebrafish UDP-glucose dehydrogenase is required for cardiac valve formation PUBMED:11533493. In Xanthomonas campestris, a plant pathogen, UDP-glucose dehydrogenase is required for virulence PUBMED:11554764.

    \ \

    GDP-mannose dehydrogenase, , catalyses the formation of GDP-mannuronic acid, which is the monomeric unit from which the exopolysaccharide alginate is formed. Alginate is secreted by a number of bacteria, which include Pseudomonas aeruginosa and Azotobacter vinelandii. In P. aeruginosa, alginate is believed to play an important role in the bacteria\'s resistance to antibiotics and the host immune response PUBMED:12135385, while in A. vinelandii it is essential for the encystment process PUBMED:9864323.

    \ \

    This entry represents an alpha helical region that serves as the dimerisation interface for these enzymes PUBMED:10841783, PUBMED:12705829.

    \ ' '4784' 'IPR014027' '\

    The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate PUBMED:2470755, PUBMED:9013585.

    \ \

    The enzymes have a wide range of functions. In plants UDP-glucose dehydrogenase, , is an important enzyme in the synthesis of hemicellulose and pectin PUBMED:12031484, which are the components of newly formed cell walls; while in zebrafish UDP-glucose dehydrogenase is required for cardiac valve formation PUBMED:11533493. In Xanthomonas campestris, a plant pathogen, UDP-glucose dehydrogenase is required for virulence PUBMED:11554764.

    \ \

    GDP-mannose dehydrogenase, , catalyses the formation of GDP-mannuronic acid, which is the monomeric unit from which the exopolysaccharide alginate is formed. Alginate is secreted by a number of bacteria, which include Pseudomonas aeruginosa and Azotobacter vinelandii. In P. aeruginosa, alginate is believed to play an important role in the bacteria\'s resistance to antibiotics and the host immune response PUBMED:12135385, while in A. vinelandii it is essential for the encystment process PUBMED:9864323.

    \ \

    This entry represents the C-terminal substrate-binding domain of these enzymes. Structural studies indicate that this domain forms an incomplete dinucleotide binding fold PUBMED:10841783, PUBMED:12705829.

    \ ' '4785' 'IPR001732' '\

    The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate PUBMED:2470755, PUBMED:9013585.

    \ \

    The enzymes have a wide range of functions. In plants UDP-glucose dehydrogenase, , is an important enzyme in the synthesis of hemicellulose and pectin PUBMED:12031484, which are the components of newly formed cell walls; while in zebrafish UDP-glucose dehydrogenase is required for cardiac valve formation PUBMED:11533493. In Xanthomonas campestris, a plant pathogen, UDP-glucose dehydrogenase is required for virulence PUBMED:11554764.

    \ \

    GDP-mannose dehydrogenase, , catalyses the formation of GDP-mannuronic acid, which is the monomeric unit from which the exopolysaccharide alginate is formed. Alginate is secreted by a number of bacteria, which include Pseudomonas aeruginosa and Azotobacter vinelandii. In P. aeruginosa, alginate is believed to play an important role in the bacteria\'s resistance to antibiotics and the host immune response PUBMED:12135385, while in A. vinelandii it is essential for the encystment process PUBMED:9864323.

    \ \

    This entry represents the N-terminal NAD(+)-binding domain. Structural studies indicate that this domain forms an alpha-beta structure containing the six-stranded parallel beta sheet characteristic of the dinucleotide-binding Rossmann fold PUBMED:10841783, PUBMED:12705829.

    \ ' '4786' 'IPR002618' '\ This family consists of UTP--glucose-1-phosphate uridylyltransferases (). Also known as UDP-glucose pyrophosphorylase (UDPGP) and Glucose-1-phosphate uridylyltransferase. UTP--glucose-1-phosphate uridylyltransferase catalyses the interconversion of MgUTP + glucose-1-phosphate and UDP-glucose + MgPPi PUBMED:8631325. UDP-glucose is an important intermediate in mammalian carbohydrate interconversion involved in various metabolic roles depending on tissue type PUBMED:8631325. In Dictyostelium discoideum (Slime mold), mutants in this enzyme abort the development cycle PUBMED:3035502. Also within this family is UDP-N-acetylglucosamine pyrophosphorylase () PUBMED:9603950 and two hypothetical proteins from Borrelia burgdorferi, the Lyme disease spirochaete ( and ).\ ' '4787' 'IPR002213' '\

    UDP glycosyltransferases (UGT) are a superfamily of enzymes that catalyze the addition of the glycosyl group from a UDP-sugar to a small hydrophobic molecule. This family currently consists of:

    \ \

    These enzymes share a conserved domain of about 50 amino acid residues located in their C-terminal section.

    \ ' '4788' 'IPR004854' '\

    Post-translational ubiquitin-protein conjugates are recognised for degradation by the ubiquitin fusion degradation (UFD) pathway. Several proteins involved in this pathway have been identified PUBMED:7615550. This family includes UFD1, a 40kDa protein that is essential for vegetative cell viability PUBMED:7615550. The human UFD1 gene is expressed at high levels during embryogenesis, especially in the eyes and in the inner ear primordia, and is thought to be important in the determination of ectoderm-derived structures, including neural crest cells. In addition, this gene is deleted in the CATCH-22 (cardiac defects, abnormal facies, thymic hypoplasia, cleft palate and hypocalcaemia with deletions on chromosome 22) syndrome. This clinical syndrome is associated with a variety of developmental defects, all characterised by microdeletions on 22q11.2. Two such developmental defects are the DiGeorge syndrome OMIM:188400, and the velo-cardio-facial syndrome OMIM:145410. Several of the abnormalities associated with these conditions are thought to be due to defective neural crest cell differentiation PUBMED:9063746.

    \ ' '4789' 'IPR003670' '\ The UK protein is an African swine fever virus (ASFV) protein that is highly conserved amongst strains. Data indicates that the\ highly conserved UK gene of ASFV, while being nonessential for growth in\ macrophages in vitro, is an important viral virulence determinant for domestic pigs PUBMED:9444996.\ ' '4790' 'IPR004286' '\

    UL16 protein may play a role in capsid maturation including DNA packaging/cleavage PUBMED:9645194. In immunofluorescence studies PUBMED:8955043, UL16 was localised to the nucleus of infected cells in areas containing high concentrations of Human herpesvirus 2 (Herpes simplex virus 2) capsid proteins. These nuclear compartments have been described previously as viral assemblons PUBMED:8676489 and are distinct from compartments containing replicating DNA. Localization within assemblons argues for a role of UL16 encoded protein in capsid assembly or maturation PUBMED:8955043.

    \ ' '4791' 'IPR004936' '\ The UL21 protein appears to be a dispensable component in herpesviruses PUBMED:8151763.\ ' '4792' 'IPR002493' '\ The Herpesvirus UL25 gene product is a virion component involved in virus\ penetration PUBMED:8615003 and capsid assembly. The product of the UL25 gene is\ required for packaging but not cleavage of replicated viral DNA PUBMED:8615003.\ This family includes a number of Herpesvirus proteins: Epstein-Barr virus (strain B95-8) (HHV-4) (Human herpesvirus 4) BVRF1 , Human cytomegalovirus (strain AD169) (HHV-5) (Human herpesvirus 5) UL77 , Infectious laryngotracheitis virus (strain Thorne V882) (ILTV) ORF2 , and Varicella-zoster virus (strain Dumas) (HHV-3) (Human herpesvirus 3) gene 34 .\ ' '4793' 'IPR007584' '\ UL35 represents a true late gene which encodes a 12 kDa capsid protein PUBMED:1313892.\ ' '4794' 'IPR005655' '\

    UL37 interacts with UL36, which is thought to be an important early step in tegumentation during virion morphogenesis in the cytoplasm PUBMED:11861875.

    \ ' '4795' 'IPR003202' '\ The DNA polymerase processivity factor (UL42) of Human herpesvirus 1 (HHV-1) forms a heterodimer with UL30 to create the viral DNA polymerase complex. UL42 functions to increase the processivity of polymerisation and makes little contribution to the catalytic activity of the polymerase.\ ' '4796' 'IPR004339' '\

    UL49 proteins are present in the viral tegument at the surface of the nucleocapsid PUBMED:12134026. Many of the nonconserved tegument proteins of alpha-herpes viruses play important roles during different steps of the viral replication cycle, such as the shutoff of host cell functions by the vhs protein encoded by UL41 and the transcriptional activation of viral immediate-early genes by the UL48 gene product, VP16. UL49 of Human herpesvirus 1 (HHV-1) has been shown to directly interact with VP16. The UL49 gene products of HHV-1 and Bovine herpesvirus 1 exhibit virus-independent intercellular trafficking of unknown biological function but are dispensable for productive viral replication.

    \

    Envelope glycoprotein M (gM) and the complex formed by glycoproteins E (gE) and I (gI) are involved in the secondary envelopment of Suid herpesvirus 1 (Pseudorabies virus, PrV) particles in the cytoplasm of infected cells. In the absence of the gE-gI complex and gM, envelopment is blocked and capsids surrounded by tegument proteins accumulate in the cytoplasm. The cytoplasmic domains of gE and gM specifically interact with the C-terminal part of the UL49 gene product of PrV suggesting a role for the protein in secondary envelopment during herpesvirus virion maturation PUBMED:12134026.

    \ ' '4797' 'IPR004340' '\

    Human herpesvirus 1 (HHV-1) DNA replication in host cells is known to be mediated by seven viral-encoded proteins, three of which form a heterotrimeric DNA helicase-primase complex. This complex consists of UL5, UL8, and UL52 subunits. Heterodimers consisting of UL5 and UL52 have been shown to retain both helicase and primase activities. Nevertheless, UL8 is still essential for replication: though it lacks any DNA binding or catalytic activities, it is involved in the transport of UL5-UL52 and it also interacts with other replication proteins.

    \

    The molecular mechanisms of the UL5-UL52 catalytic activities are not known. While UL5 is associated with DNA helicase activity and UL52 with DNA primase activity, the helicase activity requires the interaction of UL5 and UL52 PUBMED:10501495, PUBMED:11278618. It is not known if the primase activity can be maintained by UL52 alone. The biological significance of UL52-UL8 interaction is not known. Yeast two-hybrid analysis together with immunoprecipitation experiments have shown that the HHV-1 UL52 region between residues 366-914 is essential for this interaction, while the first 349 N-terminal residues are dispensable PUBMED:10501495.

    \

    This family also includes protein UL70 from cytomegalovirus (CMV, a subgroup of the Herpesviridae) strains which, by analogy with UL52, is thought to have DNA primase activity. Indeed, CMV strains also possess a DNA helicase-primase complex, the other subunits being protein UL105 (with known similarity to HHV-1 UL5) and protein UL102.

    \ ' '4798' 'IPR004290' '\ Members of this family are functionally uncharacterised proteins from herpesviruses.\ ' '4799' 'IPR004285' '\ Members of this family are functionally uncharacterised.\ ' '4800' 'IPR004289' '\ Members of this family are functionally uncharacterised proteins from herpesviruses. The N terminus of these proteins contains 6 conserved cysteines and histidines that might form a zinc binding domain.\ ' '4801' 'IPR004280' '\ Members of this family are functionally uncharacterised proteins from herpesviruses.\ ' '4802' 'IPR006902' '\ The long distance movement protein of Umbraviruses mediates the movement of viral RNA through the phloem of infected plants PUBMED:11601910.\ ' '4803' 'IPR001526' '\

    CD59 (also called 1F-5Ag, H19, HRF20, MACIF, MIRL, P-18 or protectin) inhibits formation of the membrane attack complex (MAC), thus protecting cells from complement-mediated lysis. It has a signalling role, as a GPI-anchored molecule, in T cell activation and appears to have some role in cell adhesion through CD2 (controversial). CD59 associates with C9, inhibiting its incorporation into C5b-8 and preventing the terminal steps in polymerisation of the MAC in plasma membranes. Genetic defects in GPI-anchor attachment that cause a reduction or loss of both CD59 and CD55 on erythrocytes produce the symptoms of the disease paroxysmal nocturnal haemoglobinuria (PNH).

    \ \

    A variety of GPI-linked cell-surface glycoproteins are composed of one or more copies of a conserved domain of about 100 amino-acid residues PUBMED:1850423, PUBMED:8394346. Among these proteins, U-PAR contains three tandem copies of the domain, while all the others are made up of a single domain.

    \

    As shown in the following schematic, this conserved domain contains 10 cysteine residues involved in five disulphide bonds - in U-PAR, the first copy of the domain lacks the fourth disulphide bond.

    \
    \
         +------+     +------------------------+                    +---+\
         |      |     |                        |                    |   |\
     xCxxCxxxxxxCxxxxxCxxxxxCxxxxxxxxxxxxxxxxxxCxxxxCxxxxxxxxxxxxxxCCxxxCxxxxxxxx\
      |                     |                       |              |\
      +---------------------+                       +--------------+\
    \
    \'C\': conserved cysteine involved in a disulphide bond.\
    
    \ \

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).

    \ ' '4804' 'IPR013848' '\

    This entry represents an N-terminal domain found in a family of proteins defined by sequence similarity. Most of these proteins are not yet characterised, but those that are include\ \

    \ \ It is almost always found in conjunction with a radical SAM domain () and a TRAM domain ().

    \ \ ' '4805' 'IPR005226' '\

    This family has no known function. It includes potential membrane proteins.

    \ ' '4806' 'IPR001727' '\

    A number of uncharacterised proteins share regions of similarity. These include,\

    \

    These are hydrophobic proteins of 200 to 320 amino acids that seem to contain six or seven transmembrane domains.

    \ ' '4807' 'IPR000241' '\ This domain is probably a methylase. It is associated with the THUMP domain that also occurs with RNA modification domains PUBMED:11295541.\ ' '4808' 'IPR011063' '\

    This entry represents the PP-loop motif superfamily PUBMED:7731953,PUBMED:10506138. The PP-loop motif appears to be a modified version of the P-loop of nucleotide-binding domains that is involved in phosphate binding PUBMED:7731953. It is named the PP-motif since it appears to be part of a previously uncharacterised ATP pyrophosphatase domain. ATP sulfurylases, Escherichia coli NtrL, and Bacillus subtilis OutB consist of this domain alone. In other proteins, the pyrophosphatase domain is associated with amidotransferase domains (type I or type II), a putative citrulline-aspartate ligase domain or a nitrilase/amidase domain.

    \ ' '4809' 'IPR019783' '\

    This entry represents the N-terminal domain of proteins that are highly conserved in species ranging from archaea to vertebrates and plants PUBMED:12496757. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS, OMIM 260400) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. It is characterised by bone marrow failure and leukemia predisposition. Members of this family play a role in RNA metabolism PUBMED:15701631, PUBMED:15701634. In yeast, Sdo1 is involved in the biogenesis of the 60S ribosomal subunit and translational activation of ribosomes. Together with the EF-2-like GTPase RIA1 (Efl1), it triggers the GTP-dependent release of TIF6 from 60S pre-ribosomes in the cytoplasm, thereby activating ribosomes for translation competence by allowing 80S ribosome assembly and facilitating TIF6 recycling to the nucleus, where it is required for 60S rRNA processing and nuclear export. These data link defective late 60S subunit maturation to an inherited bone marrow failure syndrome associated with leukemia predisposition PUBMED:17353896.

    \ \ \

    A number of uncharacterised hydrophilic proteins of about 30 kDa share regions of similarity. These include,

    \ \ \ ' '4810' 'IPR001656' '\

    Pseudouridine synthase TruD modifies uracil-13 in tRNA PUBMED:12756329. TruD belongs to a recently identified and large family of pseudouridine synthases present in all kingdoms of life PUBMED:15135053. TruD folds into a V-shaped molecule with an RNA-binding cleft formed between its two domains: a catalytic domain and an insertion domain. The catalytic domain differs in sequence but is structurally very similar to the catalytic domain of other pseudouridine synthases. The insertion (or TRUD) domain displays a novel alpha/beta structure that forms a compact fold tilted away from the catalytic domain to form a deep cleft in TruD which is lined with basic residues from each domain. The insertion domain is characterised by two conserved sequence motifs that form a part of the hydrophobic core, as well as by large insertions at several specific sites that are seen in many archaeal and eukaryotic homologues. The insertion domain is likely to be involved in substrate recognition and may represent an RNA-binding module PUBMED:15208439.

    \ \

    Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are four distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an alpha+beta structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are PUBMED:10529181:

    \ \

    \ \ ' '4811' 'IPR001233' '\ A number of uncharacterised proteins including Escherichia coli rtcB, Mycobacterium \ tuberculosis MtCY441.01., Caenorhabditis elegans F16A11.2 and Methanocaldococcus jannaschii (Methanococcus jannaschii) \ MJ0682 belong to this family.\ ' '4812' 'IPR001498' '\

    This entry contains the protein Impact, which is a translational regulator that ensures constant high levels of translation under amino acid starvation. It acts by interacting with Gcn1/Gcn1L1, thereby preventing activation of Gcn2 protein kinases (EIF2AK1 to 4) and subsequent down-regulation of protein synthesis. It is evolutionarily conserved from eukaryotes to archaea PUBMED:11116084.

    \ ' '4813' 'IPR000631' '\

    This family is related to hydroxyethylthiazole kinase and PfkB carbohydrate kinase, implying that it is also a carbohydrate kinase.

    \ \

    Several uncharacterised proteins have been shown to share regions of similarity, including yeast chromosome XI hypothetical protein YKL151c; Caenorhabditis elegans hypothetical protein R107.2; Escherichia coli hypothetical protein yjeF;\ Bacillus subtilis hypothetical protein yxkO; Helicobacter pylori hypothetical protein HP1363; Mycobacterium tuberculosis hypothetical protein MtCY77.05c; Mycobacterium leprae hypothetical protein B229_C2_201; Synechocystis sp. (strain PCC 6803) hypothetical protein sll1433; and Methanocaldococcus jannaschii (Methanococcus jannaschii) hypothetical protein MJ1586. These are proteins of about 30 to 40 kDa whose central region is well conserved.

    \ ' '4814' 'IPR002033' '\

    Proteins encoded by the mttABC operon (formerly yigTUW) mediate a novel Sec-independent membrane targeting and translocation system in Escherichia coli that interacts with cofactor-containing redox proteins having a S/TRRXFLK "twin arginine" leader motif. This family contains the E. coli mttB gene (TATC) PUBMED:9546395.

    \ \

    A functional Tat system or Delta pH-dependent pathway requires three integral membrane proteins: TatA/Tha4, TatB/Hcf106 and TatC/cpTatC. The TatC protein is essential for the function of both pathways. It might be involved in twin-arginine signal peptide recognition, protein translocation and proton translocation. Sequence analysis predicts that TatC contains six transmembrane helices (TMHs), and experimental data confirmed that N and C termini of TatC or cpTatC are exposed to the cytoplasmic or stromal face of the membrane. The cytoplasmic N terminus and the first cytoplasmic loop region of the E. coli TatC protein are essential for protein export. At least two TatC molecules co-exist within each Tat translocon PUBMED:9649434, PUBMED:12163163.

    \ ' '4815' 'IPR001455' '\

    SirA functions as a response regulator as part of a two-component system, where BarA is the sensor kinase. This system increases the expression of virulence genes and decreases the expression of motility genes PUBMED:14645287. BarA phosphorylates SirA, thereby activating the protein. Phosphorylated SirA directly activates virulence expression by interacting with hilA and hilC promoters, while repressing the flagellar regulon indirectly by binding to the csrB promoter, which in turn affects flagellar gene expression. Orthologues of SirA from Salmonella spp. can be found throughout proteobacteria, such as GacA in Pseudomonas spp., VarA in Vibrio cholerae, ExpA in Erwinia carotovora, LetA in Legionella pneumophila, and UvrY in Escherichia coli PUBMED:11768529. A sensor kinase for SirA is present in each of these organisms as well; the sensor kinase is known as BarA in E. coli and Salmonella spp., but has different names in other genera. In different species, SirA/BarA orthologues are required for virulence gene expression, exoenzyme and antibiotic production, motility, and biofilm formation.

    \

    The structure of SirA consists of an alpha/beta sandwich with a beta-alpha-beta-alpha-beta(2) fold, comprising a mixed four-stranded beta-sheet stacked against two alpha-helices, both of which are nearly parallel to the strands of the beta-sheet PUBMED:11080457.

    \

    Several uncharacterised bacterial proteins (73 to 81 amino-acid residues in length) that contain a well-conserved region in their N-terminal region show structural similarity to the SirA protein, including the E. coli protein YedF, and other members of the UPF0033 family.

    \ ' '4816' 'IPR020603' '\

    This entry represents the 70 amino acid region found duplicated in the bacterial MraZ proteins. These proteins may be DNA-binding transcription factors; members are probably enzymes containing a conserved DXXXR motif that probably forms part of the active site.

    \ ' '4817' 'IPR005336' '\

    This is a family of proteins of unknown function.

    \ ' '4818' 'IPR005337' '\

    This is a family of putative P-loop ATPases PUBMED:9714532. Many of the proteins in this family are hypothetical and kinase activity has been proposed for some family members. This family contains an ATP-binding site and could be an ATPase.

    \ ' '4819' 'IPR001890' '\

    The CRM domain is an ~100-amino acid RNA-binding domain. The name chloroplast RNA splicing and ribosome maturation (CRM) has been suggested to reflect the functions established for the four characterised members of the family: Zea mays (Maize) CRS1 (), CAF1 () and CAF2 () proteins and the Escherichia coli protein YhbY (). The CRM domain is found in eubacteria, archaea, and plants. The CRM domain is represented as a stand-alone protein in archaea and bacteria, and in single- and multi-domain proteins in plants. It has been suggested that prokaryotic CRM proteins existed as ribosome-associated proteins prior to the divergence of archaea and bacteria, and that they were co-opted in the plant lineage as RNA binding modules by incorporation into diverse protein contexts. Plant CRM domains are predicted to reside not only in the chloroplast, but also in the mitochondrion and the nucleo/cytoplasmic compartment. The diversity of the CRM domain family in plants suggests a diverse set of RNA targets PUBMED:12881426, PUBMED:17105995.

    \ \

    The CRM domain is a compact alpha/beta domain consisting of a four-stranded beta sheet and three alpha helices with an alpha-beta-alpha-beta-alpha-beta-beta topology. The beta sheet face is basic, consistent with a role in RNA binding. Proximal to the basic beta sheet face is another moiety that could contribute to nucleic acid recognition. Connecting strand beta1 and helix alpha2 is a loop with a six amino acid motif, GxxG flanked by large aliphatic residues, within which one \'x\' is typically a basic residue PUBMED:12429100.

    \

    Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants PUBMED:17105995. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing PUBMED:18065687. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes PUBMED:17105995. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome PUBMED:11565746.

    \ ' '4820' 'IPR001602' '\ This family contains small uncharacterised proteins of 14 to 16 kDa mainly from bacteria although the signatures also occur in a hypothetical protein from archaea and from yeast.\ ' '4821' 'IPR000825' '\

    Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S] PUBMED:16221578. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.

    \

    The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins PUBMED:16211402, PUBMED:16843540. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly PUBMED:15937904.

    \

    The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA PUBMED:17350000. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA PUBMED:15278785, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.

    \

    In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins PUBMED:11498000. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen PUBMED:8875867.

    \ \

    This entry represents SufB and SufD proteins that form part of the SufBCD complex in the SUF system. No specific functions have been assigned to these proteins.

    \ ' '4822' 'IPR002882' '\

    This entry contains LPPG:Fo 2-phospho-L-lactate transferase (CofD) and related sequences of unknown function that belong to the uncharacterised protein family UPF0052. CofD catalyses the fourth step in the biosynthesis of coenzyme F420, which is the transfer of the 2-phospholactate moiety from lactyl (2) diphospho-(5\')guanosine (LPPG) to 7,8-didemethyl-8-hydroxy-5-deazariboflavin (FO) with the formation of the L-lactyl phosphodiester of 7,8-didemethyl-8-hydroxy-5-deazariboflavin (F420-0) and GMP. F420 is a flavin derivative found in methanogens, Mycobacteria, and several other lineages. This enzyme is characterised so far in Methanocaldococcus jannaschii (Methanococcus jannaschii) PUBMED:11888293 but appears restricted to F420-containing species and is predicted to carry out the same function in these other species.

    \ ' '4823' 'IPR002036' '\

    These, as yet, uncharacterised proteins are of 17 to 21 kDa. They contain a conserved region with three histidines at the C terminus. The protein family is represented by a single member sequence only in nearly every bacterium.

    \ \

    The crystal structure of the protein from the hyperthermophilic bacterium Aquifex aeolicus has been determined. The overall fold consists of one central alpha-helix surrounded by a four-stranded beta-sheet and four other alpha-helices. Structure-based homology analysis reveals a good resemblance to the metal-dependent proteinases such as collagenases and gelatinases. However, experimental tests for collagenase and gelatinase-type function show no detectable activity under standard assay conditions PUBMED:12832766.

    \ ' '4824' 'IPR000612' '\ Several proteins have been shown PUBMED:9588799 to be evolutionarily related. These are small proteins of 52 to\ 140 amino-acid residues that contain two transmembrane domains.\ ' '4825' 'IPR002753' '\ These archaebacterial proteins have no known function. Members of\ the family are about 90-105 amino acid residues long.\ ' '4826' 'IPR003844' '\

    This entry describes integral membrane proteins of unknown function.

    \ ' '4827' 'IPR003846' '\

    This entry describes proteins of unknown function.

    \ ' '4828' 'IPR005064' '\

    Bordetella pertussis, the causative agent of human whooping cough (pertussis), is an obligate human pathogen with diverse high-affinity transport systems for the assimilation of iron, a biometal that is essential for growth PUBMED:17724074. Periplasmic binding proteins of a new family, particularly well represented in this organism (and more generally in beta-proteobacteria), have been called Bug receptors PUBMED:16403514.

    \ \

    They adopt a characteristic Venus flytrap fold with two globular domains bisected by a ligand-binding cleft. The family is specific for carboxylated solutes, with a characteristic mode of binding involving two highly conserved beta strand-beta turn-alpha helix motifs originating from each domain. These two motifs form hydrogen bonds with a carboxylate group of the ligand, both directly and via conserved water molecules, and have thus been termed the carboxylate pincers. Domain 1 recognises the ligand and the carboxylate group serves as an initial anchoring point. Domain 2 discriminates between productively and non-productively bound ligands, as proper interactions with this domain are needed for the closed conformation PUBMED:17870093.

    \ \

    BugE has a bound glutamate ligand. No charged residues are involved in glutamate binding by BugE, unlike what has been described for all glutamate receptors reported so far. The Bug architecture is highly conserved despite limited sequence identity PUBMED:17057341.

    \ ' '4829' 'IPR001378' '\ This domain has been observed in a number of proteins of archaeal and bacterial origin. The function of this domain is unknown.\ ' '4830' 'IPR004254' '\ Members of this family are integral membrane proteins. This family includes proteins that are hemolysin-III homologs.\ ' '4831' 'IPR000944' '\ The following uncharacterised bacterial proteins have been shown to be evolutionarily related: Desulfovibrio vulgaris protein Rrf2; Escherichia coli hypothetical proteins yfhP and yjeB; Bacillus subtilis hypothetical proteins yhdE, yrzC and ywgB; Mycobacterium tuberculosis hypothetical protein Rv1287; and Synechocystis sp. (strain PCC 6803) hypothetical protein slr0846. These are small proteins of 12 to 18 kDa which seem to contain \ a signal sequence, and may represent a family of probable transcriptional regulators.\ ' '4832' 'IPR005338' '\

    Anhydro-N-acetylmuramic acid kinase catalyzes the specific phosphorylation of 1,6-anhydro-N-acetylmuramic acid (anhMurNAc) with the simultaneous cleavage of the 1,6-anhydro ring, generating MurNAc-6-P. It is also required for the utilisation of anhMurNAc, either imported from the medium, or derived from its own cell wall murein, and in so doing plays a role in cell wall recycling PUBMED:15901686, PUBMED:16452451.

    \ ' '4833' 'IPR003442' '\

    This group consists of bacterial proteins, which contain a P-loop. They are probably essential to bacteria as members are found in all genomes so far sequenced and no equivalent genes have been found in the archaea and eukaryotes, suggesting the protein may be involved in cell wall biosynthesis. The structure of YjeE, from Haemophilus influenzae, has been determined to 1.7-A resolution. The protein has a nucleotide-binding fold with a four-stranded parallel beta-sheet flanked by antiparallel beta-strands on each side. The topology of the beta-sheet is unique among P-loop proteins and has features of different families of enzymes. ADP has been shown to bind to the P-loop in the presence of Mg2+ and ATPase activity has been confirmed by kinetic measurements PUBMED:12112691.

    \ \ ' '4835' 'IPR005227' '\

    Holliday junction resolvases (HJRs) are key enzymes of DNA recombination. The principal HJRs are now known or confidently predicted for all bacteria and archaea whose genomes have been completely sequenced, with many species encoding multiple potential HJRs. Structural and evolutionary relationships of HJRs and related nucleases suggest that the HJR function has evolved independently from at least four distinct structural folds, namely RNase H, endonuclease, endonuclease VII-colicin E and RusA ():

    \ \

    Horizontal gene transfer, lineage-specific gene loss and gene family expansion, and non-orthologous gene displacement seem to have been major forces in the evolution of HJRs and related nucleases. A remarkable case of displacement is seen in the Lyme disease spirochete Borrelia burgdorferi, which does not possess any of the typical HJRs, but instead encodes, in its chromosome and each of the linear plasmids, members of the exonuclease family predicted to function as HJRs. The diversity of HJRs and related nucleases in bacteria and archaea contrasts with their near absence in eukaryotes. The few detected eukaryotic representatives of the endonuclease fold and the RNase H fold have probably been acquired from bacteria via horizontal gene transfer. The identity of the principal HJR(s) involved in recombination in eukaryotes remains uncertain; this function could be performed by topoisomerase IB or by a novel, so far undetected, class of enzymes. Likely HJRs and related nucleases were identified in the genomes of numerous bacterial and eukaryotic DNA viruses. Gene flow between viral and cellular genomes has probably played a major role in the evolution of this class of enzymes.

    \

    This family represents the YqgF family of putative Holliday junction resolvases. With the exception of the spirochetes, the YqgF family is represented in all bacterial lineages, including the mycoplasmas with their highly degenerate genomes.

    \

    The RuvC resolvases are conspicuously absent in the low-GC Gram-positive bacterial lineage, with the exception of Ureaplasma parvum (Ureaplasma urealyticum biotype 1) (, PUBMED:10982859). Furthermore, loss of function ruvC mutants of E. coli show a residual HJR activity that cannot be ascribed to the prophage-encoded RusA resolvase PUBMED:8648624. This suggests that the YqgF family proteins could be alternative HJRs whose function partially overlaps with that of RuvC PUBMED:10982859.

    \ ' '4836' 'IPR002730' '\

    This entry represents the p29 subunit (also known as Rpp29 or Pop4) of the related ribonucleoproteins ribonuclease (RNase) P and RNase MRP, which can be found in both eukaryotes and archaea PUBMED:10352175. The structure of the RNase P subunit, Rpp29, from Methanobacterium thermoautotrophicum has been determined. Mth Rpp29 is a member of the oligonucleotide/oligosaccharide binding fold family. It contains a structured beta-barrel core and unstructured N- and C-terminal extensions bearing several highly conserved amino acid residues that could be involved in RNA contacts in the protein-RNA complex PUBMED:14673079. Rpp29 () catalyses the endonucleolytic cleavage of RNA, removing 5\'-extranucleotides from tRNA precursor. It interacts with the Rpp25 and Pop5 subunits.

    \ \

    RNase P is a ubiquitous ribonucleoprotein enzyme primarily responsible for cleaving the 5\' leader sequence during maturation of tRNAs in all three domains of life. In eubacteria, this enzyme is made up of two subunits: a large RNA (approximately 120 kDa) responsible for mediating catalysis, and a small protein cofactor (approximately 15 kDa) that modulates substrate recognition and is required for efficient in vivo catalysis. In contrast, multiple proteins are associated with eukaryotic and archaeal RNase P, and these proteins exhibit no recognizable homology to the conserved bacterial protein subunit. In reconstitution experiments with recombinantly expressed and purified protein subunits, Mth Rpp29, a homologue of the Rpp29 protein subunit from eukaryotic RNase P, is an essential protein component of the archaeal holoenzyme PUBMED:14673079. In \ Saccharomyces cerevisiae (Baker\'s yeast), RNase P consists of 9 protein subunits (Pop1, Pop3-8, Rpr2 and Rpp1), while in humans there are 10 subunits (Rpp14, 20, 21, 25, 29, 30, 38, 40, hPop1, 5).

    \

    RNase MRP (mitochondrial RNA processing) is an rRNA processing enzyme that cleaves a specific site within precursor rRNA to generate the mature 5\'-end of 5.8S rRNA PUBMED:15916546. RNase MRP also cleaves primers for mitochondrial DNA replication and CLB2 mRNA. In yeast, RNase MRP possesses one putatively catalytic RNA and at least 9 protein subunits (Pop1, Pop3-Pop8, Rpp1, Snm1 and Rmp1), and is highly related to RNase P.

    \ ' '4837' 'IPR004255' '\ This family of uncharacterised proteins is greatly expanded in Mycobacterium tuberculosis.\ ' '4838' 'IPR005265' '\

    There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. However, they are predicted to be integral membrane proteins (with several transmembrane segments). There is some indirect indication of a link between this protein and the function or assembly of cytochromes: in Escherichia coli, strains overproducing this protein turn pink, perhaps because of an excess of accumulated haems PUBMED:8606169.

    \ ' '4839' 'IPR003507' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This signature is found in the Escherichia coli microcin C7 self-immunity protein mccF and in muramoyltetrapeptide carboxypeptidase (, LD-carboxypeptidase A). LD-carboxypeptidase A belongs to MEROPS peptidase family S66 (clan SS). The entry also contains uncharacterised proteins, including hypothetical proteins from various bacteria and archaea.

    \ \ \ ' '4840' 'IPR002833' '\ Peptidyl-tRNA hydrolases are enzymes that release tRNAs from peptidyl-tRNA during translation.\ ' '4841' 'IPR003509' '\ The function of this family is unknown. Members include several bacterial hypothetical proteins.\ ' '4842' 'IPR002737' '\

    This entry contains proteins from all branches of life. The molecular function of these proteins is unknown, but the human protein Memo (mediator of ErbB2-driven cell motility) is included in this family PUBMED:15156151. It has been suggested that Memo controls cell migration by relaying extracellular chemotactic signals to the microtubule cytoskeleton PUBMED:15156151.

    \ ' '4843' 'IPR005242' '\

    This entry contains proteins annotated as belonging to the \'Uncharacterised protein family UPF0104\' and those that are involved in the addition of lysine to phosphatidylglycerol, lysylphosphatidylglycerol synthetase (L-PG synthase). L-PG synthase produces lysyl-phosphatidylglycerol (L-PG), the major component of the bacterial membrane and responsible for its positive charge. L-PG synthesis contributes to bacterial virulence as it is involved in resistance to the cationic antimicrobial peptides (CAMPs) of the host\'s immune system (defensins, cathelicidins) and of competing microorganisms (bacteriocins).

    \ ' '4844' 'IPR005341' '\

    The Pam16 protein is the fifth essential subunit of the pre-sequence translocase-associated protein import motor (PAM) PUBMED:14981507. In Saccharomyces cerevisiae (Baker\'s yeast), Pam16 is required for preprotein translocation into the matrix, but not for protein insertion into the inner membrane PUBMED:14981507.

    \ ' '4845' 'IPR005155' '\

    This entry represents PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain containing proteins such as the ribosomal biogenesis factor NIP7 PUBMED:9271378, PUBMED:9891085. PUA domains are predicted to bind RNA molecules with complex folded structures PUBMED:10093218. NIP7 is required for efficient 60S ribosome subunit biogenesis and has been shown to interact with another essential nucleolar protein, Nop8p, and the exosome subunit Rrp43p. These three proteins are required for 60S subunit synthesis and may be part of a dynamic complex involved in this process.

    \ ' '4846' 'IPR005134' '\ This conserved hypothetical protein family with four predicted transmembrane regions is found in\ Escherichia coli, Haemophilus influenzae, and Helicobacter pylori 26695, among completed genomes.\ ' '4847' 'IPR002549' '\

    This is a family of hypothetical proteins. A number of the sequence records state they are transmembrane proteins or putative permeases. It is not clear what source suggested that these proteins might be permeases and this\ information should be treated with caution.

    \ \ ' '4848' 'IPR005343' '\

    This is a small family of mainly hypothetical proteins of unknown function.

    \ ' '4849' 'IPR005344' '\

    Uncharacterised integral membrane protein family.

    \ ' '4850' 'IPR007394' '\ Members of this family are predicted to contain a helix-turn-helix motif, for example residues 37-55 in Mycoplasma mycoides p13 (). Genes encoding family members are often part of operons that encode components of the SRP pathway, and this protein may regulate the expression of an operon related to the SRP pathway PUBMED:9070906.\ ' '4851' 'IPR005345' '\

    Phf5 is a member of a novel murine multigene family that is highly conserved during evolution and belongs to the superfamily of PHD-finger proteins. At least one example, from Mus musculus (Mouse), may act as a chromatin-associated protein PUBMED:14762117. The Schizosaccharomyces pombe (Fission yeast) ini1 gene is essential, required for splicing PUBMED:12054543. It is localised in the nucleus, but not detected in the nucleolus and can be complemented by human ini1 PUBMED:12054543. The proteins of this family contain five CXXC motifs.

    \ ' '4852' 'IPR005346' '\

    This is a small family of proteins of unknown function.

    \ ' '4853' 'IPR005115' '\

    This domain is duplicated in bacterial membrane proteins of unknown function and each domain contains three transmembrane helices. The conserved glycines are suggestive of an ion channel.

    \ ' '4854' 'IPR005266' '\

    The function of this family is unknown. These proteins are from 222 to 233 residues in length, lack hydrophobic stretches, and are found so far only in thermophiles.

    \ ' '4857' 'IPR005349' '\

    This family of short membrane proteins is as yet uncharacterised.

    \ ' '4858' 'IPR005350' '\

    This family of bacterial proteins includes a number of plasmid-encoded virulence proteins.

    \ ' '4859' 'IPR005351' '\

    This is a small family of proteins of unknown function which appear to be related to the hypothetical protein CG10674 from Drosophila melanogaster (Fruit fly)().

    \ ' '4860' 'IPR005353' '\

    The function of this family of proteins is unknown.

    \ ' '4861' 'IPR005354' '\

    This family of small proteins has no known function.

    \ ' '4863' 'IPR005356' '\

    The proteins in this family are about 190 amino acids long. The function of these proteins is unknown.

    \ ' '4864' 'IPR005357' '\

    This family of small proteins is uncharacterised. In this domain is found next to a DNA binding helix-turn-helix domain , which suggests that this is some kind of ligand binding domain.

    \ ' '4865' 'IPR005358' '\

    This family of proteins contains 8 conserved cysteines that may form a zinc binding site. The function of these proteins is unknown.

    \ ' '4866' 'IPR005359' '\

    This family contains a set of short bacterial proteins of unknown function.

    \ ' '4867' 'IPR005360' '\

    Proteins in this entry are almost always found adjacent to a plasmid stabilisation protein (), which is the killing partner of an addiction module for plasmid stabilisation. Proteins in this entry, therefore, are putative addiction module antidote proteins. Some are encoded on plasmids or in prophage regions, but others appear to be chromosomal.\ A genome may contain several identical copies, for example four are found in Magnetococcus sp. (strain MC-1). This entry is named after one member, Caulobacter crescentus CB15.

    \ ' '4868' 'IPR007344' '\ This family of uncharacterised proteins is also known as GrpB.\ ' '4869' 'IPR005361' '\

    This is a small family of hypothetical bacterial proteins of unknown function.

    \ ' '4870' 'IPR003226' '\ The function of this domain is not known, but it is found in several uncharacterised proteins and a probable metal dependent protein hydrolase.\ ' '4871' 'IPR005362' '\

    This family of uncharacterised proteins is found only in Treponema pallidum. They contain a putative signal peptide, so they may be secreted proteins.

    \ ' '4872' 'IPR005363' '\

    The proteins in this family are about 200 amino acids long and each contains 3 CXXC motifs.

    \ ' '4874' 'IPR005365' '\

    This is a small family of proteins of unknown function.

    \ ' '4875' 'IPR005366' '\

    This is a small family of proteins of unknown function.

    \ ' '4876' 'IPR003485' '\

    This is a family of unique short (US) region proteins from Herpesviridae strains. The US2 family has no known function.

    \ ' '4878' 'IPR005368' '\

    This family contains small proteins of unknown function.

    \ ' '4879' 'IPR005370' '\

    The members of this family are small uncharacterised proteins.

    \ ' '4880' 'IPR005371' '\

    This family contains small proteins of about 50 amino acids of unknown function. The family includes YoaH .

    \ ' '4881' 'IPR005374' '\

    This is a small family of mainly hypothetical proteins of unknown function.

    \ ' '4882' 'IPR007193' '\

    Transcripts harbouring premature signals for translation termination are recognised and rapidly degraded by eukaryotic cells through a pathway known as nonsense-mediated mRNA decay. In Saccharomyces cerevisiae, three trans-acting factors (Upf1 to Upf3) are required for nonsense-mediated mRNA decay PUBMED:11073994.

    \ ' '4883' 'IPR001441' '\

    Synonym(s): Di-trans-poly-cis-undecaprenyl-diphosphate synthase, Undecaprenyl pyrophosphate synthetase, Undecaprenyl pyrophosphate synthase, UPP synthetase

    \ \

    Di-trans-poly-cis-decaprenylcistransferase () (UPP synthetase) \ generates undecaprenyl pyrophosphate (UPP) from isopentenyl pyrophosphate\ (IPP) PUBMED:9882662. This bacterial enzyme is also found in archaebacteria and in a number of uncharacterised proteins including some from yeasts.

    \ \

    This entry also matches related enzymes that transfer alkyl groups, such as dehydrodolichyl diphosphate synthase.

    \ ' '4884' 'IPR011612' '\

    Urease (urea amidohydrolase, ) catalyses the hydrolysis of urea to form ammonia and carbamate. The subunit composition of urease from different sources varies PUBMED:7565414, but each holoenzyme consists of four structural domains PUBMED:7754395: three structural domains and a nickel-binding catalytic domain common to amidohydrolases PUBMED:9144792. Urease is unique among nickel metalloenzymes in that it catalyses a hydrolysis rather than a redox reaction. In Helicobacter pylori, the gamma and beta domains are fused and called the alpha subunit (). The catalytic subunit (called beta or B) has the same organization as the Klebsiella alpha subunit. Jack bean (Canavalia ensiformis) urease has a fused gamma-beta-alpha organization ().

    The N-terminal domain is a composite domain and plays a major trimer stabilising role by contacting the catalytic domain of the symmetry related alpha-subunit PUBMED:7754395.

    \ ' '4885' 'IPR002019' '\

    Urease is a nickel-binding enzyme that catalyzes the hydrolysis of urea to carbon dioxide\ and ammonia PUBMED:3402446:\ \ Historically, it was the first enzyme to be crystallized (in 1926). It is mainly\ found in plant seeds and microorganisms. In plants, urease is a hexamer of identical chains. In bacteria\ PUBMED:2651866, it consists of either two or three different subunits (alpha , beta, described in this entry, and gamma ). The structure of the\ urease complex is known PUBMED:7754395.

    \ This subunit does not appear to take part in the catalytic mechanism. \ This subunit is known (confusingly) as alpha in Helicobacter.\ ' '4886' 'IPR002026' '\

    Urease is a nickel-binding enzyme that catalyzes the hydrolysis of urea to carbon dioxide\ and ammonia PUBMED:3402446:\ \ Historically, it was the first enzyme to be crystallized (in 1926). It is mainly\ found in plant seeds and microorganisms. In plants, urease is a hexamer of identical chains. In bacteria\ PUBMED:2651866, it consists of either two or three different subunits (alpha , beta and gamma, described in this entry). The structure of the\ urease complex is known PUBMED:7754395.

    \ \ This subunit does not appear to take part in the catalytic mechanism.\ ' '4887' 'IPR002669' '\ UreD is a urease accessory protein. Urease hydrolyses urea into ammonia and carbamic acid PUBMED:8550495. UreD is involved in activation of the urease enzyme via the UreD-UreF-UreG-urease complex PUBMED:9209019 and is required for urease nickel metallocentre assembly PUBMED:7909161. See also UreF , UreG .\ ' '4888' 'IPR007864' '\

    Urease and other nickel metalloenzymes are synthesised as precursors devoid of the metalloenzyme active site. These precursors then undergo a complex post-translational maturation process that requires a number of accessory proteins.

    \ \

    Members of this group are nickel-binding proteins required for urease metallocentre assembly PUBMED:8318889. They are believed to function as metallochaperones to deliver nickel to urease apoprotein PUBMED:12072968, PUBMED:10753863. It has been shown by yeast two-hybrid analysis that UreE forms a dimeric complex with UreG in Helicobacter pylori PUBMED:12388207. The UreDFG-apoenzyme complex has also been shown to exist PUBMED:11157956, PUBMED:7721685 and is believed to be, with the addition of UreE, the assembly system for active urease PUBMED:7721685. The complexes, rather than the individual proteins, presumably bind to UreB via UreE/H recognition sites.

    \ \

    The structure of Klebsiella aerogenes UreE reveals a unique two-domain architecture. The N-terminal domain is structurally related to a heat shock protein, while the C-terminal domain shows homology to the Atx1 copper metallochaperone PUBMED:11591723, PUBMED:11602602. Significantly, the metal-binding sites in UreE and Atx1 are distinct in location and types of residues despite the relationship between these proteins, and the mechanism for UreE activation of urease is proposed to be different from the thiol ligand exchange mechanism used by the copper metallochaperones.

    \ \

    The C-terminal domain of this protein is the metal-binding region, which can bind up to six Ni molecules per dimer. Most members of this group contain a histidine-rich C-terminal motif that is involved in, but not solely responsible for, binding nickel ions in K. aerogenes UreE PUBMED:8808929. However, internal ligands, not the histidine residues at the C terminus, are necessary for UreE to assist in urease activation in K. aerogenes PUBMED:11591723, even though the truncated protein lacking the His-rich region binds two nickel ions instead of six. In H. pylori and some other organisms, the terminal histidine-rich binding sites are absent, but the internal histidine sites are present, and the latter probably function as nickel donors. Deletion analysis shows that this domain alone is sufficient for metal-binding and activation of urease PUBMED:15866948.

    \ ' '4889' 'IPR002639' '\ This family consists of the urease accessory protein, UreF. The urease enzyme (urea amidohydrolase) hydrolyses urea into ammonia and carbamic acid PUBMED:8550495. UreF is proposed to modulate the activation process of urease by eliminating the binding of nickel ions to noncarbamylated protein PUBMED:8808930.\ ' '4890' 'IPR007247' '\ Ureidoglycolate hydrolase () carries out the third step in the degradation of allantoin.\ ' '4891' 'IPR002042' '\ Uricase () (urate oxidase) PUBMED:3182808 is the peroxisomal enzyme responsible\ for the degradation of urate into allantoin:\ \ Some species, like primates and birds, have lost the gene for uricase and are therefore unable to degrade urate PUBMED:2594778. Uricase is a protein of 300 to 400 amino acids; its sequence is well conserved.\ It is mainly localised in the liver, where it forms a large electron-dense paracrystalline core in many peroxisomes PUBMED:2338140.\ The enzyme exists as a tetramer of identical subunits, \ each containing a possible type 2 copper-binding site PUBMED:2594778. In legumes, 2 forms of uricase are found: in the roots, the tetrameric form; and, in the uninfected cells of root nodules, a monomeric form, which plays an\ important role in nitrogen-fixation PUBMED:16593585.\ ' '4892' 'IPR000257' '\

    Uroporphyrinogen decarboxylase (URO-D), the fifth enzyme of the haem biosynthetic pathway, catalyses the sequential decarboxylation of the four acetyl side chains of uroporphyrinogen to yield coproporphyrinogen PUBMED:1576986. URO-D deficiency is responsible for the human genetic diseases familial porphyria cutanea tarda (fPCT) and hepatoerythropoietic porphyria (HEP). The sequence of URO-D has been well conserved throughout evolution. The best conserved region is located in the N-terminal section; it contains a perfectly conserved hexapeptide. There are two arginine residues in this hexapeptide which could be involved in the binding, via salt bridges, to the carboxyl groups of the propionate side chains of the substrate.

    \ \

    The crystal structure of human uroporphyrinogen decarboxylase shows it as comprised of a single domain containing a (beta/alpha)8-barrel with a deep active site cleft formed by loops at the C-terminal ends of the barrel strands. \ URO-D is a dimer in solution. Dimerisation juxtaposes the active site clefts of the monomers, suggesting a functionally important interaction between the catalytic centres PUBMED:9564029.

    \ ' '4893' 'IPR000193' '\ Urocanase PUBMED:7944380 (also known as imidazolonepropionate hydrolase or\ urocanate hydratase) is the enzyme that catalyzes the second step in the\ degradation of histidine, the hydration of urocanate into\ imidazolonepropionate.\ \ Urocanase is found in some bacteria (gene hutU), in the\ liver of many vertebrates and has also been found in the plant Trifolium repens (white clover).\ Urocanase is a protein of about 60 Kd; it binds tightly to NAD+ and uses it\ as an electrophilic cofactor. A conserved cysteine has been found to be\ important for the catalytic mechanism and could be involved in the binding of\ the NAD+.\ ' '4894' 'IPR001483' '\ Urotensin II, a small peptide that contains a\ disulphide bridge, was originally isolated from the caudal\ portion of the spinal cord of teleost and elasmobranch fish PUBMED:1620290. The peptide has also been found in the brain of frogs PUBMED:1445302. Urotensin II seems to be involved in smooth\ muscle stimulation.\ \ ' '4895' 'IPR003360' '\

    This entry contains US22 family members from the Cytomegalovirus, Muromegalovirus and the Roseolovirus taxonomic groups.

    \ \

    The namesake of this family, US22, is an early nuclear protein that is secreted from cells PUBMED:1321206. The US22 family may have a role in virus replication and pathogenesis PUBMED:10405367.

    \

    Herpesviruses are large and complex DNA viruses, widely found in nature. Human cytomegalovirus (HCMV), an important human pathogen, defines the betaherpesvirus family. Mouse cytomegalovirus (MCMV) and rat cytomegalovirus serve as biological model systems for HCMV. HCMV, MCMV, and rat CMV display the largest genomes among the herpesviruses and are essentially co-linear over the central 180 kb of the 230-kb genomes. Betaherpesviruses, which include the CMVs as well as human herpesviruses 6 and 7, differ from alpha- and gammaherpesviruses by the presence of additional gene families such as the US22 gene family, which are mainly clustered at the ends of the genome. The US22 family was first described in HCMV. This gene family comprises 12 members in both HCMV and MCMV and 11 in rat CMV PUBMED:12719548.

    \ \ \

    Members of the US22 gene family are characterised by stretches of hydrophobic and charged residues as well as up to four conserved sequence motifs which are specific for betaherpesviruses. Motif I differs between the HCMV US and UL family members PUBMED:8709220. Motifs I and II have consensus sequences, while motifs III\ and IV are less well defined but have stretches of non-polar residues PUBMED:10405367, PUBMED:8523552. Members of this gene family are widely divergent in function and their involvement in viral replication PUBMED:12719548.

    \ ' '4896' 'IPR000015' '\ In Gram-negative bacteria the biogenesis of fimbriae (or pili) requires a two-\ component assembly and transport system which is composed of a periplasmic\ chaperone (see ) and an outer membrane protein which has been\ termed a molecular \'usher\' PUBMED:7909802, PUBMED:7906265, PUBMED:7906046.

    The usher protein is rather large (from 86 to\ 100 Kd) and seems to be mainly composed of membrane-spanning beta-sheets, a\ structure reminiscent of porins. \ Although the degree of sequence similarity of these proteins is not very high\ they share a number of characteristics. One of these is the presence of two pairs\ of cysteines, the first one located in the N-terminal part and the second\ at the C-terminal extremity that are probably involved in disulphide bonds.\ The best conserved region is located in the central part of these proteins.

    \ ' '4897' 'IPR006955' '\ This domain identifies a group of proteins, which are described as: General vesicular transport factor, Transcytosis associated protein (TAP) and Vesicle docking protein. This myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerisation, and a short C-terminal acidic region PUBMED:11927603. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the Golgi stack. This domain is found in the acidic C-terminal region, which binds to the golgins giantin and GM130. p115 is thought to juxtapose two membranes by binding giantin with one acidic region, and GM130 with another PUBMED:12077354.\ ' '4898' 'IPR006953' '\ This domain identifies a group of proteins, which are described as: General vesicular transport factor, Transcytosis associated protein (TAP) or Vesicle docking protein. This myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerisation, and a short C-terminal acidic region PUBMED:11927603. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the Golgi stack. This domain is found in the head region. The head region is highly conserved, but its function is unknown. It does not seem to be essential for vesicle tethering PUBMED:11927603. The N-terminal part of the head region contains context-detected Armadillo/beta-catenin-like repeats.\ ' '4899' 'IPR006016' '\ The universal stress protein UspA PUBMED:8152377 is a small cytoplasmic\ bacterial protein whose expression\ is enhanced when the cell is exposed to\ stress agents. 
UspA enhances the rate of cell survival during\ prolonged exposure to such conditions, and may provide a general\ "stress endurance" activity.\ The crystal structure of Haemophilus influenzae UspA PUBMED:11738040 reveals\ an alpha/beta fold similar to that of the Methanocaldococcus jannaschii (Methanococcus jannaschii)\ MJ0577 protein, which binds ATP PUBMED:9860944, though UspA lacks ATP-binding\ activity.\ ' '4900' 'IPR004937' '\

    Proteins in this entry include low-affinity urea transporters found in the erythrocytes and kidneys of higher organisms. The erythrocyte proteins carry the clinically important Kidd (Jk) blood group antigens which help determine blood type. The two commonest forms are Jk(a) and Jk(b), which arise from a single residue variation at position 280; aspartate in Jk(a) and asparagine in Jk(b) PUBMED:9215669. A much rarer phenotype, Jk(null), arises when the protein is not expressed on the erythrocyte surface, and is linked to a urine-concentrating defect PUBMED:1498276. The Kidd blood group is clinically significant as Jk antibodies can cause acute transfusion reactions and haemolytic disease of the newborn (HDN), where the mother\'s body creates antibodies against the foetal blood cells. HDN associated with Jk antibodies is generally mild, but fatal cases can occur PUBMED:16479082.

    \ \

    The bacterial proteins in this entry also appear to be involved in urea transport, promoting its entry into the cell PUBMED:12180933. This uptake of urea can be advantageous for bacteria as its hydrolysis by urease generates ammonium which is an efficient source of nitrogen and, through its buffering capacity, can also provide resistance to acidic conditions.

    \ ' '4901' 'IPR006038' '\

    Uteroglobin (or blastokinin) is a mammalian steroid-inducible secreted protein originally isolated from the uterus of rabbits during early pregnancy. The mucosal epithelia of several organs that communicate with the external environment express uteroglobin. Its tissue-specific expression is regulated by steroid hormones, and is augmented in the uterus by non-steroidal prolactin. Uteroglobin may be a multi-functional protein with anti-inflammatory/immunomodulatory properties, acting to inhibit phospholipase A2 activity, and binding to (and possibly sequestering) several hydrophobic ligands such as progesterone, retinols, polychlorinated biphenyls, phospholipids and prostaglandins. In addition, uteroglobin has anti-chemotactic, anti-allergic, anti-tumourigenic and embryo growth-stimulatory properties. Uteroglobin may have a homeostatic role against oxidative damage, inflammation, autoimmunity and cancer PUBMED:17916741, PUBMED:17928103, PUBMED:11193760, PUBMED:7770456. Uteroglobin consists of a disulphide-linked dimer of two identical polypeptides, each polypeptide being composed of four helices. It is a member of the secretoglobin superfamily.

    \

    This entry represents uteroglobin proteins from several mammalian species, as well as other members of the secretoglobin superfamily, such as lipophilin B PUBMED:17163411, prostatic steroid-binding protein PUBMED:17641022, mammaglobin PUBMED:18021217, and the related allergen Fel d 1 (Felis domesticus allergen 1) PUBMED:17543334.

    \ ' '4902' 'IPR007144' '\

    A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome PUBMED:12068309, PUBMED:15590835. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties:

    \ \

    There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA and rRNA binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5\' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble PUBMED:15489292. On the basis of electron microscopy, the SSU processome may correspond to the terminal knobs visualized at the 5\' ends of nascent 18S rRNA.

    \

    This entry contains Utp11, a large ribonuclear protein that associates with snoRNA U3 PUBMED:12068309.

    \ ' '4903' 'IPR007148' '\

    A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome PUBMED:12068309, PUBMED:15590835. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties:

    \ \

    There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA and rRNA binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5\' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble PUBMED:15489292. On the basis of electron microscopy, the SSU processome may correspond to the terminal knobs visualized at the 5\' ends of nascent 18S rRNA.

    \

    This domain is found at the C terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein and the yeast protein is called Utp12 or DIP2 PUBMED:12068309. Utp12 specifically interacts with snoRNA U3 and with MPP10.

    \ ' '4904' 'IPR006709' '\

    A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome PUBMED:12068309, PUBMED:15590835. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties:

    \ \

    There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA and rRNA binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5\' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble PUBMED:15489292. From electron microscopy the SSU processome may correspond to the terminal knobs visualized at the 5\' ends of nascent 18S rRNA.

    \

    This entry contains Utp14, a large ribonuclear protein associated with snoRNA U3 PUBMED:12068309.

    \ ' '4905' 'IPR004601' '\

    Schizosaccharomyces pombe ultraviolet damage endonuclease (UVDE or Uve1p) performs the initial step in an alternative excision repair pathway for UV-induced DNA damage. This DNA repair pathway was originally thought to be specific for UV damage; however, Uve1p also recognises UV-induced bipyrimidine photoadducts and other non-UV-induced DNA adducts PUBMED:10801329.

    The Deinococcus radiodurans UVSE protein has also been shown to be a UV DNA damage endonuclease that catalyzes repair of UV-induced DNA damage by a similar mechanism PUBMED:11807060.

    \ ' '4906' 'IPR000212' '\

    Members of this family are helicases that catalyse ATP-dependent\ unwinding of double-stranded DNA to single-stranded DNA. The family\ includes both Rep and UvrD helicases.\ The Rep family helicases are composed of four structural domains PUBMED:9288744.\ The Rep proteins function as dimers.

    \ ' '4907' 'IPR003766' '\

    Uronate isomerase (also known as glucuronate isomerase) catalyses the conversion of D-glucuronate to D-fructuronate and also converts D-galacturonate to D-tagaturonate PUBMED:9882655.

    \ ' '4908' 'IPR004628' '\

    This Fe2+-requiring enzyme plays a role in D-glucuronate catabolism in Escherichia coli. Mannonate dehydratase converts D-mannonate to 2-dehydro-3-deoxy-D-gluconate. An apparent equivalog is found in a glucuronate utilization operon in Bacillus stearothermophilus T-6.

    \ ' '4909' 'IPR004907' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor-mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release PUBMED:15629643. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c\', c\'\', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins PUBMED:15907459.

    \ \

    This entry represents the C subunit that is part of the V1 complex, and is localised to the interface between the V1 and V0 complexes PUBMED:15951435. This subunit does not show any homology with F-ATPase subunits. The C subunit plays an essential role in controlling the assembly of V-ATPase, acting as a flexible stator that holds together the catalytic (V1) and membrane (V0) sectors of the enzyme PUBMED:15540116. The release of subunit C from the ATPase complex results in the dissociation of the V1 and V0 subcomplexes, which is an important mechanism in controlling V-ATPase activity in cells.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '4910' 'IPR005124' '\ This family represents the eukaryotic vacuolar (H+)-ATPase (V-ATPase) G subunit. V-ATPases generate an acidic environment in several intracellular compartments.\ Correspondingly, they are found as membrane-attached proteins in several organelles. They are also found in the plasma membranes of some specialised cells.\ V-ATPases consist of peripheral (V1) and membrane integral (V0) heteromultimeric complexes. The G subunit is part of the V1 subunit, but is also thought to be\ strongly attached to the V0 complex. It may be involved in the coupling of ATP degradation to H+ translocation.\ ' '4911' 'IPR004908' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor-mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release PUBMED:15629643. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c\', c\'\', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins PUBMED:15907459.

    \ \

    This entry represents subunit H (also known as Vma13p) found in the V1 complex of V-ATPases. This subunit has a regulatory function, being responsible for activating ATPase activity and coupling ATPase activity to proton flow PUBMED:14635776. The yeast enzyme contains five motifs similar to the HEAT or Armadillo repeats seen in the importins, and can be divided into two distinct domains: a large N-terminal domain consisting of stacked alpha helices, and a smaller C-terminal alpha-helical domain with a similar superhelical topology to an armadillo repeat PUBMED:11416198.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '4912' 'IPR004848' '\

    This entry represents a family of viral proteins of unknown function known as the 110 family PUBMED:2325202. They contain a central cysteine-rich region with eight conserved cysteines. Some proteins in this entry contain two copies of the cysteine-rich region.

    \ ' '4913' 'IPR004072' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The rhodopsin-like GPCRs themselves represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7\ transmembrane (TM) helices PUBMED:2111655, PUBMED:2830256, PUBMED:8386361.

    \

    Pheromones have evolved in all animal phyla, to signal sex and dominance\ status, and are responsible for stereotypical social and sexual behaviour among members of the same species. In mammals, these chemical signals are believed to be detected primarily by the vomeronasal organ (VNO), a chemosensory organ located at the base of the nasal septum PUBMED:11163270. The VNO is present in most amphibia, reptiles and non-primate mammals but is absent in birds, adult catarrhine monkeys and apes PUBMED:10531049. An active role for the human VNO in the detection of pheromones is disputed; the VNO is clearly present in the foetus but appears to be atrophied or absent in adults. Three distinct families of putative pheromone receptors have been identified in the vomeronasal organ (V1Rs, V2Rs and V3Rs). All are G protein-coupled receptors but are only distantly related to the receptors of the main olfactory system, highlighting their different role PUBMED:11163270.

    \

    The V1 receptors share between 50 and 90% sequence identity but have little\ similarity to other families of G protein-coupled receptors. They appear to\ be distantly related to the mammalian T2R bitter taste receptors and the\ rhodopsin-like GPCRs PUBMED:10548735. In rat, the family comprises 30-40 genes. These are expressed in the apical regions of the VNO, in neurons expressing Gi2. Coupling of the receptors to this protein mediates inositol trisphosphate signalling PUBMED:11163270. A number of human V1 receptor homologues have also been found. The majority of these human sequences are pseudogenes PUBMED:11116092 but an apparently functional receptor has been identified that is expressed in the human olfactory system PUBMED:10973240.

    \ ' '4914' 'IPR004096' '\ Central cellular functions such as metabolism, solute transport and signal transduction are regulated, in part, via binding of small molecules by specialised domains.\ The 4-vinyl reductase (4VR) domain is a predicted small-molecule-binding domain that may bind to hydrocarbons PUBMED:11292341. Proteins that contain this domain include a regulator of the phenol catabolic pathway and a protein involved in chlorophyll biosynthesis.\ ' '4915' 'IPR002490' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex that forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis PUBMED:11309608, PUBMED:15629643, PUBMED:15168615. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.

    \

    This entry represents the 116-kDa subunit (or subunit a) and subunit I found in the V0 or A0 complex of V- or A-ATPases, respectively. The 116-kDa subunit is a transmembrane glycoprotein required for the assembly and proton transport activity of the ATPase complex. Several isoforms of the 116-kDa subunit exist, providing a potential role in the differential targeting and regulation of the V-ATPase for specific organelles PUBMED:9891027.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '4916' 'IPR003436' '\ This is a family of viral fusion proteins from the Chordopoxvirinae. A 14-kDa Vaccinia virus protein has been demonstrated to function as a viral fusion protein mediating cell fusion at endosomal (low) pH PUBMED:2389560. The protein, found in the envelope fraction of the virions, is required for fusing the outermost of the two Golgi-derived membranes enveloping the virus with the plasma membrane, and its subsequent release extracellularly. The N-terminal proximal region is essential for its fusion ability.\ ' '4917' 'IPR003842' '\

    Helicobacter pylori is a micro-aerophilic bacterium with the extraordinary \ ability to establish infections in human stomachs that can last for years or \ decades, despite immune and inflammatory responses and normal turnover of \ the gastric epithelium and overlying mucin layer in which it resides. Most H. pylori strains secrete a toxin (VacA) that induces multiple \ structural and functional alterations in eukaryotic cells. The most \ prominent effect of VacA is its capacity to induce the formation of large \ cytoplasmic vacuoles in eukaryotic cells. In addition, VacA interferes with \ the process of antigen presentation, increases permeability of polarised \ epithelial cell monolayers, and forms anion-selective membrane channels. \ Formation of channels in endosomal membranes of cells may be an important \ feature of the mechanism by which VacA induces cell vacuolation. H. pylori \ vacA encodes a ~139kDa protoxin, which undergoes cleavage of a 33-residue \ N-terminal signal sequence and C-terminal proteolytic processing to \ yield a mature secreted toxin. Purified VacA degrades during prolonged \ storage into two fragments (of ~34 and 58kDa), which are derived from the\ N- and the C-terminus of the toxin respectively. The mass of the\ experimentally intact toxin (~88.2kDa) corresponds closely to the sum of \ the masses of the two proteolytic fragments PUBMED:11160018.\

    \ Secondary structure predictions suggest that a 35kDa portion of the VacA \ C-terminal domain is rich in amphipathic beta-sheets, and this region \ exhibits low-level similarity to members of the family of autotransporter \ proteins. In addition, at the C-terminus of VacA, there is a phenylalanine-\ containing motif that is commonly found in autotransporter proteins, as well\ as in numerous Gram-negative bacterial outer membrane proteins. An intact \ N-terminal portion of VacA is not required for proteolytic processing of the\ protoxin. However, the N-terminal 32 amino acids of the mature VacA are \ predicted to form the only contiguous hydrophobic region in the protein that\ is long enough to span the membrane. What is more, isogenic H. pylori mutant\ strains in which the C-terminal VacA domain is disrupted, fail to express or\ secrete any detectable VacA, which is probably attributable to the \ degradation of export-incompetent toxin precursors within the periplasm. It \ is speculated that the VacA protoxin may undergo proteolytic cleavage at\ multiple sites downstream from amino acid 854 of the protoxin, which would\ yield a 33kDa cell-associated domain, as well as a fragment of ~15kDa PUBMED:11160018.\ \

    \ ' '4918' 'IPR004311' '\

    Proteins containing this domain include a number of Helicobacter pylori outer membrane proteins with multiple copies of this small conserved region.

    \ ' '4919' 'IPR007428' '\

    \ Lipoproteins in Gram-negative microbes also act as structural stabilisers,\ forming non-covalent bonds with peptidoglycan on the outer membrane of the \ cell PUBMED:7542800. Following completion of the genomes of several Gram-negative \ prokaryotes, a putative lipoprotein, VacJ, has been discovered in the raw \ sequence open reading frames. Biochemical analysis of the Shigella flexneri VacJ protein revealed it to be essential for virulence, promoting \ spread of bacterial cells through the intercellular space of tissues PUBMED:8145644. \

    \

    \ Upon expression in the facultative intracellular microbe, host cells form \ membranous protrusions containing the pathogen, allowing it to move to the \ cytoplasm of the next target cell. As homologues of this lipoprotein \ have largely been found in obligate or facultative intracellular microbial \ genomes, it appears to be specific for that particular lifestyle PUBMED:8145644.\

    \ ' '4920' 'IPR007391' '\ Members of this family include vancomycin resistance protein W (VanW). Genes encoding members of this family have been found in vancomycin resistance gene clusters vanB PUBMED:11376048 and vanG PUBMED:11036060. The function of VanW is unknown.\ ' '4921' 'IPR003709' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    These proteins are metallopeptidases belonging to MEROPS peptidase family M15 (clan MD), subfamily M15B (vanY D-Ala-D-Ala carboxypeptidase) and M15C (Ply, L-alanyl-D-glutamate peptidase).

    \ \ \

    Acquired VanA- and VanB-type glycopeptide resistance in enterococci is due to synthesis of modified peptidoglycan precursors terminating in D-lactate. As opposed to VanA-type strains which are resistant to both vancomycin and teicoplanin, VanB-type strains remain teicoplanin susceptible PUBMED:8631706. \ The vanY gene was necessary for synthesis of the vancomycin-inducible D,D-carboxypeptidase activity previously proposed to be responsible for glycopeptide resistance. However, this activity was not required for peptidoglycan synthesis in the presence of glycopeptides PUBMED:1398115.

    \ \

    Bacteriophage lysins (Ply) or endolysins are phage-encoded cell wall lytic enzymes which are synthesised late during virus multiplication and mediate the release of progeny virions. Bacteriophages of the pathogen Listeria monocytogenes encode endolysin enzymes which specifically hydrolyse the cross-linking peptide bridges in Listeria peptidoglycan. Ply118 is a 30.8-kDa\ L-alanoyl-D-glutamate peptidase and Ply511 (36.5 kDa) acts as N-acetylmuramoyl-L-alanine amidase.

    \ ' '4922' 'IPR006976' '\

    This family contains several examples of the VanZ protein, but also contains examples of phosphotransbutyrylases. VanZ confers low-level resistance to the glycopeptide antibiotic teicoplanin (Te). Analysis of cytoplasmic peptidoglycan precursors, accumulated in the presence of ramoplanin, showed that VanZ-mediated Te resistance does not involve incorporation of a substituent of D-alanine into the peptidoglycan precursors PUBMED:7867956.

    \ ' '4924' 'IPR003633' '\ Variant-surface-glycoprotein phospholipase C, by hydrolysis of the attached glycolipid, releases soluble variant surface glycoprotein containing phosphoinositol from the cell wall after lysis. It catalyses the conversion of variant-surface-glycoprotein 1,2 didecanoyl-SN-phosphatidylinositol and water to 1,2-didecanoylglycerol and the soluble variant-surface-glycoprotein. It also cleaves similar membrane anchors on some mammalian proteins.\ ' '4925' 'IPR002843' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex that forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis PUBMED:11309608, PUBMED:15629643, PUBMED:15168615. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.

    \

    This entry represents subunit C from the A0 complex of A-ATPases, and subunits C and D from the V0 complex of V-ATPases, all of which are involved in the translocation of protons across a membrane. There is more than one type of D subunit in V-ATPases, where the D1 subunit is ubiquitous, while the D2 subunit has limited tissue expressivity, possibly to account for differential functions, targeting or regulation of V-ATPase activity PUBMED:15800125.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '4926' 'IPR002842' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex that forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis PUBMED:11309608, PUBMED:15629643, PUBMED:15168615. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.

    \

    This entry represents subunit E from the V1 and A1 complexes of V- and A-ATPases, respectively. Subunit E appears to form a tight interaction with subunit G in the F0 complex, which together may act as stators to prevent certain subunits from rotating with the central rotary element, much in the same way as the F0 complex subunit B does in F-ATPases PUBMED:15292229. In addition to its key role in stator structure, subunit E appears to have a role in mediating interactions with putative regulatory subunits PUBMED:15751969.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '4927' 'IPR002499' '\

    Vaults are the largest ribonucleoprotein particles known, having a mass of approximately 13 MDa. They are multi-subunit structures that may act as scaffolds for proteins involved in signal transduction and may also play a role in nucleo-cytoplasmic transport. Vaults are present in most normal tissues, but are more highly expressed in epithelial cells with secretory and excretory functions, as well as in cells chronically exposed to xenobiotics, such as bronchial cells and cells lining the intestine PUBMED:16918321. Overexpression of these proteins is linked with multidrug-resistance in cancer cells.

    \ \

    The mammalian vault structure is highly regular and consists of approximately 96 molecules of the 100 kDa major vault protein (MVP), 2 molecules of the 240 kDa minor vault protein TEP1, 8 molecules of the 193 kDa minor vault protein VPARP and at least 6 copies of a small untranslated RNA of 88-141 bases. The MVP molecules form the core of the complex, which is a barrel-like structure with an invaginated waist and two protruding caps. The complex can unfold into two symmetrical flower-like structures with 8 petals each supposedly consisting of 6 MVP molecules PUBMED:10196123.

    \ \

    The MVP protein is composed of two distinct domains PUBMED:16373071. The N-terminal domain contains ~8 copies of the vault repeat (or MVP repeat) in tandem. The MVP repeat is composed of ~53 amino acids and forms a structural part of the vault wall. The C-terminal part of MVP may be involved in oligomerization and be located in the vault cap, while the MVP repeats in the N-terminal part can be packed like staves in a barrel to form the vault wall. The 3D structure of the repeat forms a fold that consists of a three-stranded (B) antiparallel beta-sheet in a unique topology B2-B1-B3 and two loops. MVP repeats can be interaction-mediating modules, as MVP repeats 3 and 4 bind VPARP, which is one of the other vault proteins.

    \ ' '4928' 'IPR002714' '\

    This family of proteins is involved in the ubiquitylation and subsequent proteasomal degradation of proteins via the von Hippel-Lindau ubiquitylation complex. They appear to act as the target recruitment subunit in the E3 ubiquitin ligase complex and recruit hydroxylated hypoxia-inducible factor (HIF) under normoxic conditions. They are also involved in transcriptional repression through interaction with HIF1A, HIF1AN and histone deacetylases. Human VHL has been demonstrated to form a ternary complex with elonginB and elonginC proteins PUBMED:10205047. This complex binds Cul2, which then is involved in regulation of vascular endothelial growth factor mRNA.

    \ ' '4929' 'IPR003128' '\

    Villin is an F-actin bundling protein involved in the\ maintenance of the microvilli of the absorptive epithelia. The villin-type "headpiece" domain is a modular motif found at the extreme C-terminus of larger "core" domains in over 25 cytoskeletal proteins in plants and animals, often in association with the Gelsolin repeat. Although the headpiece is classified as an F-actin-binding domain, it has been shown that not all headpiece domains are intrinsically F-actin-binding motifs; surface charge distribution may be an important element for F-actin recognition PUBMED:11977079. An autonomously folding, 35 residue, thermostable subdomain (HP36) of the full-length 76 amino acid residue villin headpiece is the smallest known example of a cooperatively folded domain of a naturally occurring protein. The structure of HP36, as determined by NMR spectroscopy, consists of three short helices surrounding a tightly packed hydrophobic core PUBMED:12095260.

    \ ' '4930' 'IPR006792' '\

    This region is found in plant seed storage proteins, N-terminal to the Cupin domain. In Macadamia integrifolia (Macadamia nut), this region is processed into peptides of approximately 50 amino acids containing a C-X-X-X-C-(10-12)X-C-X-X-X-C motif. These peptides exhibit antimicrobial activity in vitro PUBMED:10571855.

    \ ' '4931' 'IPR000475' '\ The virion infectivity factor (vif) of Human immunodeficiency virus 1 (HIV-1) affects the infectivity of virus particles PUBMED:3497453 to T lymphocytes and macrophages (in some cases\ increasing the infectivity of HIV-1 particles by 100- to 1000-fold), but has no direct effect on transcription, translation or virus release. Vif antibodies are found in the sera of patients at all levels of HIV-1 infection, indicating that vif is expressed in natural infections in vivo. Other lentiviruses, including Simian immunodeficiency virus (SIV-cpz), Visna/Maedi virus, and Feline immunodeficiency virus (FIV), have vif open reading frames, suggesting\ vif plays an essential role during natural infections PUBMED:1357189.\ The expression of vif in BHK-21 cells has been shown to be linked to a\ modification of the C-terminus of gp41env, a modification that is inhibited by trans-epoxysuccinyl-L-leucylamido-(4-guanidino)butane (E64), a specific inhibitor of cysteine proteases PUBMED:1995946. Coupled with sequence analysis and the effects of point mutations in vif, it has been suggested that vif could be a cysteine protease. Virions produced in the absence of Vif have abnormal core morphology and those produced in primary T cells carry immature core proteins \ and low levels of mature capsid PUBMED:14618252.\ ' '4932' 'IPR006077' '\

    Vinculin is a eukaryotic protein that seems to be involved in the\ attachment of the actin-based microfilaments to the plasma membrane. Vinculin\ is located at the cytoplasmic side of focal contacts or adhesion plaques\ PUBMED:2112986. In addition to actin, vinculin interacts with other structural\ proteins such as talin and alpha-actinins.

    \

    Vinculin is a large protein of 116 kDa (about 1000 residues). Structurally the protein consists of an acidic N-terminal domain of about 90 kDa separated from a basic C-terminal domain of about 25 kDa by a proline-rich region of about 50 residues. The central part of the N-terminal domain consists of a variable number (3 in vertebrates, 2 in Caenorhabditis elegans) of repeats of a 110-amino-acid domain.

    \

    Alpha-catenins are evolutionarily related to vinculin PUBMED:1924379. Catenins are proteins that associate with the cytoplasmic domain of a variety of cadherins. The association of catenins with cadherins produces a complex which is linked to the actin filament network, and which seems to be of primary importance for the cell-adhesion properties of cadherins. Three different types of catenins seem to exist: alpha, beta, and gamma. Alpha-catenins are proteins of about 100 kDa. In terms of their structure, the most significant differences from vinculin are the absence, in alpha-catenin, of the repeated domain and of the proline-rich segment.

    \ \ ' '4933' 'IPR003176' '\ This domain represents the C-terminal domain of the viral DNA-binding protein, a multifunctional protein involved in DNA replication and transcription control.\ ' '4934' 'IPR005376' '\

    The adenovirus early E2A DNA-binding protein (Ad DBP) is a multifunctional protein required, amongst other things, for DNA\ replication and transcription control. It binds to single- and double-stranded DNA, as well as to RNA, in a sequence-independent\ manner. This signature represents the zinc binding domain of the viral DNA-binding protein, which is active in DNA replication. The zinc atoms appear to be required for the stability of the protein fold rather than being involved in\ direct contacts with the DNA; the protein contains two zinc atoms in\ different, novel coordinations. Two copies of this domain are found at the C-terminus of many members of the family PUBMED:8039495.

    \ ' '4935' 'IPR000416' '\

    The rotavirus outer capsid consists of the coat glycoprotein VP7 and the spike protein VP4. VP4 functions in cell attachment and membrane penetration. VP4 is cleaved by the host\'s trypsin-like proteases to produce two fragments, VP5 and VP8. VP8 functions as a viral haemagglutinin to bind sialic acid receptors on host cell membranes, while VP5 is a membrane penetration protein PUBMED:11462006. The haemagglutinin domain has a beta-sandwich fold similar to that of the sugar-binding galectin family of proteins PUBMED:8131735, PUBMED:11867517. The receptor-binding specificity conferred on rotaviruses by VP4 may be influenced by the VP7 protein.

    \

    More information about haemagglutinin proteins can be found at Protein of the Month: Bird Flu, Haemagglutinin.

    \ ' '4936' 'IPR004909' '\ This family includes the Beet yellows virus heat shock protein 90 homologue and other hypothetical proteins.\ ' '4937' 'IPR007617' '\ This is a family of ssRNA positive-strand viral proteins. The conserved region is found in the Beta C and Beta D transcripts.\ ' '4938' 'IPR000937' '\ The capsid proteins of plant icosahedral positive strand RNA viruses form 4 different domains: a \ positively charged, N-terminal \'R\' domain, which interacts with RNA (66 residues); a connecting arm, \ \'a\' (35 residues); a central, surface \'S\' domain, which forms the virion shell; and a projecting, \ C-terminal \'P\' domain PUBMED:7704529. Some of the viruses lack either the R or P domains. The S domain \ contains from 158 to 166 amino acids and comprises 8 anti-parallel beta-strands, which form a twisted \ sheet or jelly-roll fold. This structure is shared by a number of plant viral capsid proteins, which include:\ Carmovirus, Dianthovirus, Sobemovirus, Tombusvirus and an\ unidentified tobacco necrosis virus PUBMED:1856686.\ ' '4939' 'IPR000635' '\ Although the overall picture of Human cytomegalovirus (HHV-5) DNA synthesis appears typical of the Herpesviruses, some novel features are emerging. Six herpesvirus-group-common genes encode proteins that likely constitute the replication fork machinery, including a two-subunit DNA polymerase, a helicase-primase complex and a single-stranded DNA-binding protein PUBMED:9130047. \

    The Human herpesvirus 1 (HHV-1) single-strand DNA-binding protein ICP8 is a 128-kDa zinc metalloprotein. Photoaffinity labeling has shown that the region encompassing residues 368-902 contains the single-strand DNA-binding site of ICP8 PUBMED:10529391. The HHV-1 UL5, UL8, and UL52 genes encode an essential heterotrimeric DNA helicase-primase that is responsible for concomitant DNA unwinding and primer synthesis at the viral DNA\ replication fork. ICP8 may stimulate DNA unwinding and enable bypass of cisplatin-damaged DNA by recruiting the helicase-primase to the DNA PUBMED:9593724.

    \ ' '4940' 'IPR000606' '\ This family includes RNA helicases thought to be involved in duplex unwinding during viral RNA replication.\ Members of this family are found in positive-strand single stranded RNA viruses from superfamily 1. This helicase has multiple roles at different stages of viral RNA replication, as dissected by mutational analysis PUBMED:10217401.\ ' '4941' 'IPR007609' '\

    This family represents the 18kDa cysteine-rich protein from ssRNA positive strand viruses.

    \ ' '4942' 'IPR002166' '\ The RNA dependent RNA polymerase is also known as non-structural protein NS5B. NS5B is a 65 kDa protein that resembles other viral RNA polymerases. Hepatitis C virus (HCV) replication is thought to occur in membrane bound replication complexes. These complexes transcribe the positive strand and the resulting minus strand is used as a template for the synthesis of genomic RNA. There are two viral proteins involved in the reaction, NS3 and NS5B PUBMED:9343198, PUBMED:8598194, PUBMED:9514871.\ ' '4943' 'IPR003365' '\

    Proteins in this entry are essential for the replication of viral ssDNA. The closed circular ssDNA genome is first converted to a superhelical dsDNA. Rep and/or Rep\' binds a specific hairpin at the genome origin of replication, introducing an endonucleolytic nick within the conserved sequence 5\'-AGTATTAC-3\'.\ This initiates rolling circle replication (RCR). Following cleavage, the protein binds covalently to the 5\'-phosphate of DNA as a tyrosyl ester. The cleavage gives rise to a free 3\'-OH that serves as a primer for the cellular DNA polymerase. The polymerase synthesizes the (+) strand DNA by a rolling circle mechanism. After one round of replication, a Rep-catalyzed nucleotidyl transfer reaction releases a circular single-stranded virus genome, thereby terminating the replication.

    \ ' '4944' 'IPR007792' '\ This family includes the Type IV secretory pathway VirB3 protein, that is found associated with bacterial inner and outer membranes and assists T pilus formation as an assembly factor PUBMED:8405938.\ ' '4945' 'IPR007430' '\ VirB8 is a bacterial virulence protein with cytoplasmic, transmembrane, and periplasmic regions. It is thought that it is a primary constituent of a DNA transporter. The periplasmic region interacts with VirB9, VirB10, and itself PUBMED:11371528.\ ' '4947' 'IPR000052' '\

    Potexviruses and Carlaviruses are plant-infecting viruses whose genomes consist of a single-stranded RNA molecule encapsidated in a coat protein. The genome of many Potexviruses is known and their coat protein sequence has been shown to be rather well conserved PUBMED:2738582. The same observation applies to the coat protein of a variety of Carlaviruses whose sequences are related to those of Potexviruses PUBMED:2732711, PUBMED:1629709. The coat proteins of Potexviruses and of Carlaviruses\ contain from 190 to 300 amino acid residues. The best conserved region of these coat proteins is located in the central part.

    \ ' '4948' 'IPR001747' '\

    This entry represents a conserved region found in several lipid transport proteins, including vitellogenin, microsomal triglyceride transfer protein and apolipoprotein B-100 PUBMED:9687371.

    \

    Vitellinogen precursors provide the major egg yolk proteins that are a source of nutrients during early development of oviparous vertebrates and invertebrates. Vitellinogen precursors are multi-domain apolipoproteins that are cleaved into distinct yolk proteins. Different vitellinogen precursors exist, which are composed of variable combinations of yolk protein components; however, the cleavage sites are conserved. In vertebrates, a complete vitellinogen is composed of an N-terminal signal peptide for export, followed by four regions that can be cleaved into yolk proteins: lipovitellin-1, phosvitin, lipovitellin-2, and a von Willebrand factor type D domain (YGP40) PUBMED:17314313, PUBMED:12135361.

    \

    Microsomal triglyceride transfer protein (MTTP) is an endoplasmic reticulum lipid transfer protein involved in the biosynthesis and lipid loading of apolipoprotein B. MTTP is also involved in the late stage of CD1d trafficking in the lysosomal compartment, CD1d being the MHC I-like lipid antigen presenting molecule PUBMED:17403933.

    \

    Apolipoprotein B can exist in two forms: B-100 and B-48. Apolipoprotein B-100 is present on several lipoproteins, including very low-density lipoproteins (VLDL), intermediate-density lipoproteins (IDL) and low-density lipoproteins (LDL), and can assemble VLDL particles in the liver PUBMED:16238675. Apolipoprotein B-100 has been linked to the development of atherosclerosis.

    \ ' '4949' 'IPR007782' '\ Using reduced vitamin K, oxygen, and carbon dioxide, gamma-glutamyl carboxylase post-translationally modifies certain glutamates by adding carbon dioxide to the gamma position of those amino acids. In vertebrates, the modification of glutamate residues of target proteins is facilitated by an interaction between a propeptide present on target proteins and the gamma-glutamyl carboxylase PUBMED:10748045.\ ' '4950' 'IPR006743' '\ This repeat is found in the extracellular (C-terminal) region of the variant surface antigen A (VlpA) of Mycoplasma hyorhinis. Mutations that change the number of repeats in the protein are involved in antigenic variation and immune evasion of this swine pathogen PUBMED:10671459.\ ' '4951' 'IPR002588' '\ This RNA methyltransferase domain PUBMED:10364504 is found in a wide range of ssRNA viruses, which include: \ Hordeivirus, \ Tobravirus, \ Tobamovirus, \ Bromovirus, \ Closterovirus and \ Calicivirus. This methyltransferase is involved\ in mRNA capping. Capping of mRNA enhances its stability. This usually\ occurs in the nucleus. Therefore, many viruses that replicate\ in the cytoplasm encode their own PUBMED:10364504.\ ' '4952' 'IPR000349' '\ This family contains the major surface antigens of the hepatitis viruses (Hepadnaviridae). The protein is most likely required for an early step of the life cycle involving entry or uncoating of virus particles.\ ' '4953' 'IPR005515' '\

    VOMI binds tightly to ovomucin fibrils of the egg yolk membrane. The structure PUBMED:8131734 consists of three beta-sheets forming Greek key motifs, which are related by an internal pseudo three-fold symmetry. Furthermore, the structure of VOMI has strong similarity to the structure of the delta-endotoxin, as well as a carbohydrate-binding site in the top region of the common fold PUBMED:8848836.

    \ ' '4954' 'IPR001963' '\ Glycoprotein VP7, also known as outer shell glycoprotein, is a serotype-specific antigen, and is the major neutralisation antigen. It is found in the dsRNA rotaviruses.\ ' '4955' 'IPR000012' '\ Human immunodeficiency virus (HIV) is the human retrovirus associated with AIDS (acquired immune deficiency syndrome), and SIV its simian counterpart. Three main groups of primate lentivirus are known, designated Human immunodeficiency virus 1 (HIV-1), Human immunodeficiency virus 2 (HIV-2)/Simian immunodeficiency virus - mac (SIVMAC)/Simian immunodeficiency virus - sm (SIVSM) and Simian immunodeficiency virus - agm (SIVAGM). Simian immunodeficiency virus - mnd (SIVMND) has been suggested to represent a fourth distinct group PUBMED:2797181. These groups are believed to have diverged from a common ancestor long before the spread of AIDS in humans. Genetic variation in HIV-1 and HIV-2 has been studied extensively, and the nucleotide sequences reported for several strains PUBMED:2611042.

    ORF analysis has revealed two open reading frames, yielding the so-called R- and X-ORF proteins, whose functions are unknown, but which show a high degree of sequence similarity.

    \ ' '4956' 'IPR005377' '\

    The movement of lipid and protein components between intracellular organelles requires the regulated interactions of many molecules. Vacuolar protein sorting-associated protein (Vps)5 is a yeast protein that is a subunit of a large multimeric complex, termed the retromer complex, involved in retrograde transport of proteins from endosomes to the trans-Golgi network. Sorting nexin (SNX) 1 and SNX2 are its mammalian orthologs PUBMED:11102511.

    \ \

    To carry out its biological functions, Vps5 forms the retromer complex\ with at least four other proteins: Vps17, Vps26, Vps29, and Vps35 PUBMED:11102511. This family of Vps26-proteins also contains Down syndrome critical region 3/A.

    \ ' '4957' 'IPR007143' '\

    The Endosomal Sorting Complex Required for Transport (ESCRT) complexes form the machinery driving protein sorting from endosomes to lysosomes. ESCRT complexes are central to receptor down-regulation, lysosome biogenesis, and budding of HIV. Yeast ESCRT-I consists of three protein subunits, VPS23, VPS28, and VPS37. In humans, ESCRT-I comprises TSG101, VPS28, and one of four potential human VPS37 homologues. The main role of ESCRT-I is to recognise ubiquitinated cargo via the UEV domain of the VPS23/TSG101 subunit. The assembly of the ESCRT-I complex is directed by the C-terminal steadiness box (SB) of VPS23, the N-terminal half of VPS28, and the C-terminal half of VPS37. The structure is primarily composed of three long, parallel helical hairpins, each corresponding to a different subunit. The additional domains and motifs extending beyond the core serve as gripping tools for ESCRT-I critical functions PUBMED:16615893, PUBMED:16615894.

    \ ' '4958' 'IPR007262' '\ Vps55 is involved in the secretion of the Golgi form of the soluble vacuolar carboxypeptidase Y, but not the trafficking of the membrane-bound vacuolar alkaline phosphatase. Both Vps55 and obesity receptor gene-related protein are important for functioning membrane trafficking to the vacuole/lysosome of eukaryotic cells PUBMED:12006663.\ ' '4959' 'IPR008187' '\ The Human immunodeficiency virus 1 (HIV-1) Vpu protein acts in the degradation of CD4 in the endoplasmic reticulum and in the enhancement of virion release from the plasma membrane of infected cells PUBMED:7853484.\ ' '4960' 'IPR004603' '\

    This entry represents VSR (very short patch repair) endonucleases, which occur in a variety of bacteria. VSR recognises a TG mismatched base pair, generated after spontaneous deamination of methylated cytosines, and cleaves the phosphate backbone on the 5\' side of the thymine PUBMED:10612397. GT mismatches can lead to C-to-T transition mutations if not repaired. VSR repairs the mismatches in favour of the G-containing strand. In Escherichia coli, this endonuclease nicks double-stranded DNA within the sequence CT(AT)GN or NT(AT)GG next to the thymidine residue, which is mismatched to 2\'-deoxyguanosine PUBMED:12067333. The incision is mismatch-dependent and strand specific. The structure of VSR is similar to the core structure of restriction endonucleases, which have a 3-layer alpha/beta/alpha topology PUBMED:12626704.

    \ ' '4961' 'IPR001007' '\ The vWF domain is found in various plasma proteins:\ complement factors B, C2, CR3 and CR4; the integrins (I-domains); collagen \ types VI, VII, XII and XIV; and other extracellular proteins PUBMED:8412987, PUBMED:8145250, PUBMED:1864378. Although the majority of VWA-containing proteins are extracellular, the most ancient ones present in all eukaryotes are all intracellular proteins involved in functions such as transcription, DNA repair, ribosomal and membrane transport and the proteasome. A common feature appears to be involvement in multiprotein complexes. Proteins\ that incorporate vWF domains participate in numerous biological events\ (e.g. cell adhesion, migration, homing, pattern formation, and signal\ transduction), involving interaction with a large array of ligands PUBMED:8412987. A number of human diseases arise from mutations in VWA domains. Secondary structure prediction from 75 aligned vWF sequences has revealed a largely alternating sequence of alpha-helices and beta-strands PUBMED:8145250.\ The domain is named after the von Willebrand factor (VWF) type C repeat which is found in multidomain protein/multifunctional proteins involved in maintaining homeostasis PUBMED:3495268, PUBMED:1864378. For the von Willebrand factor the duplicated VWFC domain is thought to participate in oligomerization, but not in the initial dimerization step PUBMED:2007623. The presence of this region in a number of other complex-forming proteins points to the possible involvement of the VWFC domain in complex formation.\ ' '4962' 'IPR003307' '\

    This entry represents the W2 domain (two invariant tryptophans) and is a region of ~165 amino acids which is found in the C-terminus of the following eIFs PUBMED:8520487, PUBMED:14681227, PUBMED:16616930, PUBMED:16781736:\

    \

    \ \ \

    Translation initiation is a sophisticated, well-regulated and highly coordinated cellular process in eukaryotes, in which at least 11 eukaryotic initiation factors (eIFs) are involved PUBMED:8520487.

    \ \

    The W2 domain has a globular fold and is composed exclusively of alpha-helices PUBMED:14681227, PUBMED:16616930, PUBMED:16781736. The structure can be divided into a structural C-terminal core onto which the two N-terminal helices are attached. The core contains two aromatic/acidic residue-rich regions (AA boxes), which are important for mediating protein-protein interactions.

    \ \

    The entry covers the entire W2 domain.

    \ ' '4963' 'IPR005159' '\

    The WCCH motif is found in retrotransposons and Gemini viruses. A specific function has not been associated with this motif PUBMED:11600699.

    \ ' '4964' 'IPR000738' '\ A conserved domain of 46 amino acids, called WHEP-TRS, has been shown PUBMED:1756734 to exist in \ a number of higher eukaryote aminoacyl-transfer RNA synthetases. This domain is present one to six\ times in several enzymes. There are three copies in mammalian multifunctional aminoacyl-tRNA \ synthetase in a region that separates the N-terminal glutamyl-tRNA synthetase domain from the \ C-terminal prolyl-tRNA synthetase domain, and six copies in the intercatalytic region of the Drosophila enzyme. The domain is found at the N-terminal extremity of the mammalian tryptophanyl-\ tRNA synthetase and histidyl-tRNA synthetase, and the mammalian, insect, nematode and plant glycyl-\ tRNA synthetases PUBMED:8463296. This domain could contain a central alpha-helical region and \ may play a role in the association of tRNA-synthetases into multienzyme complexes.\ ' '4965' 'IPR005817' '\

    Wnt proteins constitute a large family of secreted molecules that are\ involved in intercellular signalling during development. The name derives\ from the first 2 members of the family to be discovered: int-1 (mouse) and\ wingless (Drosophila) PUBMED:9891778. It is now recognised that Wnt signalling controls many cell fate decisions in a variety of different organisms, including mammals PUBMED:10508601. Wnt signalling has been implicated in tumourigenesis, early mesodermal patterning of the embryo, morphogenesis of the brain and kidneys, regulation of mammary gland proliferation and Alzheimer\'s disease PUBMED:10967351, PUBMED:9192851.

    \ \

    Wnt-mediated signalling is believed to proceed initially through binding to\ cell surface receptors of the frizzled family; the signal is subsequently\ transduced through several cytoplasmic components to B-catenin, which enters\ the nucleus and activates the transcription of several genes important in\ development PUBMED:10733430. More recently, however, several non-canonical Wnt signalling pathways have been elucidated that act independently of B-catenin. Members of the Wnt gene family are defined by their sequence similarity to mouse Wnt-1 and Wingless in Drosophila. They encode proteins of ~350-400 residues in length, with orthologues identified in several,\ mostly vertebrate, species. Very little is known about the structure of \ Wnts as they are notoriously insoluble, but they share the following features characteristic of secretory proteins: a signal peptide, several potential N-glycosylation sites and 22 conserved cysteines PUBMED:9891778 that are probably involved in disulphide bonds. The Wnt proteins seem to adhere to the plasma membrane of the secreting cells and are therefore likely to signal over only a few cell diameters. Fifteen major Wnt gene families have been \ identified in vertebrates, with multiple subtypes within some classes.

    \ \

    This entry represents Wnt-1 (previously known as int-1), a proto-oncogene induced by the integration of the mouse mammary tumor virus. It is thought to play a role in intercellular communication and seems to be a signalling molecule important in the development of the central nervous system (CNS). The sequence of wnt-1 is highly conserved in mammals, fish, and amphibians. Wnt-1 is a member of a large family of related proteins that are all thought to be developmental regulators. These proteins are known as wnt-2 (also known as irp), wnt-3 up to wnt-15. At least four members of this family are present in Drosophila, one of which, wingless (wg), is implicated in segmentation polarity.

    \ ' '4966' 'IPR002889' '\ The WSC domain is a putative carbohydrate binding domain. The domain\ contains up to eight conserved cysteine residues that may be involved\ in disulphide bridges.\ The Trichoderma harzianum beta-1,3 exoglucanase contains two copies of the WSC domain, while the yeast SLG1 protein contains only one.\ ' '4967' 'IPR000976' '\ Wilms\' tumour (WT) is an embryonal malignancy of the kidney, affecting around 1 in 10,000 infants. It \ occurs in both sporadic and hereditary forms. Inactivation of WT1 is one of the causes of Wilms\' tumour. \ Defects in the WT1 gene are also associated with Denys-Drash Syndrome (DDS), which is characterised by \ typical nephropathy and genital abnormalities. The WT1 gene product shows similarity to the zinc fingers \ of the mammalian growth regulated EGR1 and EGR2 proteins PUBMED:8393820, PUBMED:1671709, PUBMED:2154702, PUBMED:1317572.\ ' '4968' 'IPR004982' '\

    This is a family of mainly hypothetical Schizosaccharomyces pombe proteins that are often encoded near long terminal repeats within the genome. Their function is unknown but they contain several predicted transmembrane regions and at least one protein is up-regulated during meiosis PUBMED:11376151. Upregulation is also observed in histone deacetylase mutants, indicating their transcription is normally inhibited by hypoacetylation PUBMED:12952871.

    \ ' '4969' 'IPR007016' '\

    This group of bacterial membrane proteins includes O-antigen ligases and putative hydrogen carbonate transporters PUBMED:9688546.

    \ ' '4970' 'IPR000236' '\

    The Hepatitis B virus (HBV) X gene shares sequences with both the polymerase and precore genes, carries several regulatory signals critical to the replicative cycle, and its product has a transactivating function PUBMED:7561749. The transactivating function is probably associated with a tumourigenic potential of HBx, since x gene sequences, encoding functional HBx, have been repeatedly found integrated into the genome of liver carcinoma cells PUBMED:8530810.

    \ ' '4971' 'IPR006043' '\

    This entry represents a subset of the wider APC (Amino acid-Polyamine-organoCation) superfamily of transporters PUBMED:10931886. Characterised proteins in this entry include:

    \ \

    These proteins generally contain 12 transmembrane regions. Many members of this family are uncharacterised and may transport other substrates, e.g. RutG is likely to transport pyrimidines into the cell PUBMED:16540542.

    \ \ \ \ ' '4972' 'IPR007005' '\ These proteins are found in a wide range of eukaryotes. Their function is uncertain though they are nuclear proteins, possibly with DNA-binding activity.\ ' '4973' 'IPR003777' '\

    This entry is often found in association with an NAD-binding region related to TrkA-N. XdhC is believed to be involved in the attachment of molybdenum to Xanthine Dehydrogenase PUBMED:10217763.

    \ ' '4974' 'IPR005593' '\

    Phosphoketolases (PK) are key enzymes of the pentose phosphate pathway of heterofermentative and facultative homofermentative lactic acid bacteria and of the D-fructose 6-phosphate shunt of bifidobacteria. PK activity has been sporadically reported in other microorganisms including eukaryotic yeasts. Xylulose-5-phosphate/fructose-6-phosphate phosphoketolase is a thiamine diphosphate (ThdP)-dependent enzyme found in bacteria such as Bifidobacterium sp PUBMED:11292814, PUBMED:15899413. This enzyme has dual-specificity with the following catalytic activities:

    \

    \

    This family is distantly related to transketolases.

    \ \ ' '4975' 'IPR005379' '\

    The XH (rice gene X Homology) domain is found in a family of plant proteins, including proteins from Oryza sativa (Rice). The molecular function of these proteins is unknown; however, these proteins usually contain an XS domain that is also found in the PTGS protein SGS3. As the XS and XH domains are fused in most of these proteins, these two\ domains may interact. The XH domain is between 124 and 145 residues in\ length and contains a conserved glutamate residue that may be functionally important PUBMED:12162795.

    \ ' '4976' 'IPR005380' '\

    The XS (rice gene X and SGS3) domain is found in a family of plant proteins including gene X and SGS3. SGS3 is thought to be involved in post-transcriptional gene silencing (PTGS). This domain contains a conserved aspartate residue that may be functionally important.

    The XS domain containing proteins contain coiled-coils, which suggests that they will\ oligomerise. Most coiled-coil proteins form either a dimeric or a trimeric structure. It is possible that different members\ of the XS domain family could oligomerise via their coiled-coils forming a variety of complexes PUBMED:12162795.

    \ ' '4977' 'IPR007636' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents type II restriction enzymes such as XhoI, which recognises the double-stranded sequence CTCGAG and cleaves after C-1 PUBMED:3001639.

    \ ' '4978' 'IPR000538' '\ The link domain PUBMED:8318021 is a hyaluronan (HA)-binding region found in proteins of vertebrates that are involved in the assembly of extracellular matrix, cell adhesion, and migration. The structure has been shown PUBMED:8797823 to consist of two alpha helices and two antiparallel beta sheets arranged around a large hydrophobic core similar to that of C-type\ lectin. This domain contains four conserved cysteines involved in two disulphide bonds. The link domain has also been termed HABM PUBMED:8318021 (HA binding module) and PTR PUBMED:8690089 (proteoglycan tandem repeat). Proteins with such a domain include the proteoglycans aggrecan, brevican, neurocan and versican, which are expressed in the CNS; the cartilage link protein (LP), a proteoglycan that together with HA and aggrecan forms multimolecular aggregates; Tumour necrosis factor-inducible protein TSG-6, which may be involved in cell-cell and cell-matrix interactions during inflammation and tumourigenesis; and CD44 antigen, the main cell surface receptor for HA.\ ' '4979' 'IPR006086' '\

    Xeroderma pigmentosum (XP) PUBMED:8160271 is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair PUBMED:8464724, PUBMED:8206890. XP-G can be corrected by a 133 kDa nuclear protein, XPGC PUBMED:8160271. XPGC is an acidic protein that confers normal UV resistance in expressing cells PUBMED:8206890. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms PUBMED:8206890, PUBMED:8090225. XPGC cleaves one strand of the duplex at the border with the single-stranded region PUBMED:8090225.

    \

    XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker\'s yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases PUBMED:8090225, PUBMED:8247134; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5\'-3\' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.

    \ ' '4980' 'IPR006085' '\

    Xeroderma pigmentosum (XP) PUBMED:8160271 is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair PUBMED:8464724, PUBMED:8206890. XP-G can be corrected by a 133 kDa nuclear protein, XPGC PUBMED:8160271. XPGC is an acidic protein that confers normal UV resistance in expressing cells PUBMED:8206890. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms PUBMED:8206890, PUBMED:8090225. XPGC cleaves one strand of the duplex at the border with the single-stranded region PUBMED:8090225.

    \

    XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker\'s yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases PUBMED:8090225, PUBMED:8247134; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5\'-3\' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.

    \

    This entry represents the N-terminal of XPG.

    \ ' '4981' 'IPR002706' '\

    DNA-repair protein Xrcc1 functions in the repair of single-strand DNA breaks in mammalian cells and forms a repair complex with beta-Pol, ligase III and PARP PUBMED:10467087. The NMR solution structure of the Xrcc1 N-terminal domain (Xrcc1 NTD) shows that the structural core is a beta-sandwich with beta-strands connected by loops, three helices and two short two-stranded beta-sheets at each connection side. The Xrcc1 NTD specifically binds single-strand break DNA (gapped and nicked) and a gapped DNA-beta-Pol complex PUBMED:10467102.

    \ ' '4983' 'IPR006031' '\

    This repeat is found in a wide variety of proteins and generally consists of the motif XYPPX, where X can be any amino acid. The family includes annexin VII ANX7_DICDI and the carboxy tail of certain rhodopsins OPSD_LOLSU. This family also includes plaque matrix proteins; however, in FP1_MYTED this motif is embedded in a ten-residue repeat. The molecular function of this repeat is unknown. It is also not clear whether all the members of this family share a common evolutionary ancestor, owing to the short length and biased amino acid composition of the repeat.

    \ ' '4984' 'IPR006780' '\

    YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs such as leaves and floral organs PUBMED:10679447, PUBMED:11858837.

    \ ' '4985' 'IPR005594' '\ This region represents the C-terminal 120 amino acids of a family of surface-exposed bacterial proteins. YadA, an adhesin from Yersinia, was the first member of this family to be characterised. UspA2 from Moraxella was second. The Eib immunoglobulin-binding proteins from E. coli were third, followed by the DsrA proteins of Haemophilus ducreyi, amongst others. These proteins are homologous at their C-terminal and have predicted signal sequences, but they diverge elsewhere. The C-terminal 9 amino acids, consisting of alternating hydrophobic amino acids ending in F or W, comprise a targeting motif for the outer membrane of the Gram negative cell envelope. This region is important for oligomerisation PUBMED:11705900.\ ' '4986' 'IPR003849' '\

    This entry describes proteins of unknown function.

    \ ' '4987' 'IPR007214' '\ This domain of unknown function is found in numerous prokaryote organisms. The structure of YbaK shows a novel fold. This domain also occurs in a number of prolyl-tRNA synthetases (proRS) from prokaryotes. Thus, the domain is thought to be involved in oligonucleotide binding, with possible roles in recognition/discrimination or editing of prolyl-tRNA PUBMED:10813833.\ ' '4988' 'IPR003359' '\

    Photosystem I (PSI) is a large protein complex embedded within the photosynthetic thylakoid membrane. It consists of 11 subunits, ~100 chlorophyll a molecules, 2 phylloquinones, and 3 Fe4S4-clusters. The three dimensional structure of the PSI complex has been resolved at 2.5 A PUBMED:11418848, which allows the precise localisation of each cofactor. PSI together with photosystem II (PSII) catalyses the light-induced steps in oxygenic photosynthesis - a process found in cyanobacteria, eukaryotic algae (e.g. red algae, green algae) and higher plants.

    \

    To date, three thylakoid proteins involved in the stable accumulation of PSI have been identified: BtpA () PUBMED:9045660, Ycf3 PUBMED:9321389, PUBMED:9314531, and Ycf4 PUBMED:9321389. Because translation of the psaA and psaB mRNAs, which encode the two reaction centre polypeptides of PSI, is not affected in mutant strains lacking functional ycf3 and ycf4, the products of these two genes appear to act at a post-translational step of PSI biosynthesis.\ These gene products are therefore involved either in the stabilisation or in the assembly of the PSI complex. However, their exact roles remain unknown. The BtpA protein appears to act at the level of PSI stabilisation PUBMED:10806238. It is an extrinsic membrane protein located on the cytoplasmic side of the thylakoid membrane PUBMED:10103064, PUBMED:10806238. Homologs of BtpA are found in the crenarchaeota and euryarchaeota, where their function remains unknown. The Ycf4 protein is firmly associated with the thylakoid membrane, presumably through a transmembrane domain PUBMED:9321389. Ycf4 co-fractionates with a protein complex larger than PSI upon sucrose density gradient centrifugation of solubilised thylakoids PUBMED:9321389. The Ycf3 protein is loosely associated with the thylakoid membrane and can be released from the membrane with sodium carbonate. This suggests that Ycf3 is not part of a stable complex and that it probably interacts transiently with its partners PUBMED:11752384. Ycf3 contains a number of tetratricopeptide repeats (TPR, ); TPR is a structural motif present in a wide range of proteins, which mediates protein-protein interactions.

    \ \ ' '4989' 'IPR002644' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents PsbZ (Ycf9), which is a core low molecular weight transmembrane protein of photosystem II in thylakoid-containing chloroplasts of cyanobacteria and plants. It is thought to be located at the interface of PSII and LHCII (light-harvesting complex II) complexes, the latter containing the light-harvesting antenna. PsbZ appears to act as a structural factor, or linker, that stabilises the PSII-LHCII supercomplexes, which fail to form in PsbZ-deficient mutants. This may in part be due to the marked decrease in two LHCII antenna proteins, CP26 and CP29, found in PsbZ-deficient mutants, which result in structural changes, as well as functional modifications in PSII PUBMED:11402165. PsbZ may also be involved in photo-protective processes under sub-optimal growth conditions.

    \ ' '4990' 'IPR005545' '\

    The majority of proteins in this group contain a single copy of this domain, though it is also found as a repeat (e.g. in ). A strongly conserved histidine and an aspartate suggest that the domain has an enzymatic function. This entry also covers what was previously known as the DGPF domain (COG3795). Although its function is unknown, it is found fused to a sigma-70 factor family domain in , suggesting that this domain may play a role in transcription initiation. This domain is named after the most conserved motif in the alignment.

    \ ' '4991' 'IPR003105' '\

    This domain has been termed SRA-YDG, for SET and Ring finger Associated, and because of the conserved YDG motif within the domain. Further characteristics of the domain are the conservation of up to 13 evenly spaced glycine residues and a VRV(I/V)RG motif. The domain is mainly found in plants and animals and in bacteria. In animals, this domain is associated with the Np95-like ring finger protein and the related gene product Np97, which contains PHD and RING FINGER domains and which is an important determinant in cell cycle progression. Np95 is a chromatin-associated ubiquitin ligase; its binding to histones is direct and shows a remarkable preference for histone H3 and its N-terminal tail. The SRA-YDG domain contained in Np95 is indispensable both for the interaction with histones and for chromatin binding in vivo PUBMED:9880673, PUBMED:14993289.\ In plants the SRA-YDG domain is associated with the SET domain, found in a family of histone methyl transferases, and in bacteria it is found in association with HNH, a non-specific nuclease motif PUBMED:14993289, PUBMED:11691919.

    \ \ ' '4992' 'IPR006879' '\ This is a family of YdjC-like proteins. It is possibly involved in the cleavage of cellobiose-phosphate PUBMED:8407820.\ ' '4993' 'IPR000420' '\ A number of yeast cell wall glycoproteins are characterised by the presence of\ tandem repeats of a region of 18 to 19 residues PUBMED:8322511, PUBMED:9301021.\ ' '4994' 'IPR005033' '\

    Named the YEATS family, after \'YNK7\', \'ENL\', \'AF-9\', and \'TFIIF small subunit\', this family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity.

    \ ' '4995' 'IPR003951' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases corresponds to MEROPS peptidase family C58 (clan CA). They are found in bacteria that include plant pathogens (Pseudomonas syringae), root nodule bacteria, and intracellular pathogens (e.g. Yersinia pestis, Haemophilus ducreyi, Pasteurella multocida, Chlamydia trachomatis) of animal hosts. The peptidase domain features a catalytic triad of Cys, His, and Asp. Sequences can be extremely divergent outside of a few well-conserved motifs. YopT, a virulence effector protein of Y. pestis, cleaves and releases host cell Rho GTPases from the membrane, thereby disrupting the actin cytoskeleton. Members of the family from pathogenic bacteria are likely to be pathogenesis factors PUBMED:12062101.

    \

    Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior. There have been four secretion systems described in animal enteropathogens such as Salmonella and Yersinia, with further sequence homologies in plant pathogens like Ralstonia and Erwinia PUBMED:9618447.

    \

    The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself, type III subunits in the outer membrane translocate secreted proteins through a channel-like structure PUBMED:9618447.

    \

    Exotoxins secreted by the type III system do not possess a secretion signal, and are considered unique because of this PUBMED:9618447. Y. pestis secretes such a protein, YopT PUBMED:9746557. YopT is injected into the host cell upon contact, and is therefore considered to be a virulence factor. Haemophilus spp. express a similar toxin on their surface, a 76kDa antigen PUBMED:9746557.

    \ ' '4996' 'IPR014773' '\ Secretion of virulence factors in Gram-negative bacteria involves \ transportation of the protein across two membranes to reach the cell \ exterior. There have been four secretion systems described in \ animal enteropathogens, such as Salmonella and Yersinia, with further \ sequence similarities in plant pathogens like Ralstonia and Erwinia PUBMED:9618447.\ \

    The type III secretion system is of great interest, as it is used to \ transport virulence factors from the pathogen directly into the host cell \ and is only triggered when the bacterium comes into close contact with\ the host. The protein subunits of the system are very similar to those of \ bacterial flagellar biosynthesis. However, while the latter forms a\ ring structure to allow secretion of flagellin and is an integral part of\ the flagellum itself PUBMED:9618447, type III subunits in the outer membrane \ translocate secreted proteins through a channel-like structure.

    \ \

    Exotoxins secreted by the type III system do not possess a secretion signal,\ and are considered unique for this reason PUBMED:9618447. Yersinia secrete a Rho GTPase-activating protein, YopE PUBMED:2307658, PUBMED:2191183, that disrupts the host cell actin cytoskeleton. YopE is regulated by another bacterial gene, SycE PUBMED:10419539, that enables the exotoxin to remain soluble in the bacterial cytoplasm. A similar protein, exoenzyme S from Pseudomonas aeruginosa, has both ADP-ribosylation and GTPase activity PUBMED:2191183, PUBMED:10419539.

    \ ' '4997' 'IPR005587' '\

    This presumed family is about 160 residues long. It is found in archaebacteria and eubacteria. In it is associated with a helix-turn-helix domain. This suggests that this may be a ligand-binding family.

    \ ' '4998' 'IPR003526' '\

    MECDP (2-C-methyl-D-erythritol 2,4-cyclodiphosphate) synthetase is an enzyme in the non-mevalonate pathway of isoprenoid synthesis; isoprenoids are essential in all organisms. Isoprenoids can also be synthesized through the mevalonate pathway. The non-mevalonate route is used by many bacteria and human pathogens, including Mycobacterium tuberculosis and Plasmodium falciparum. This route appears to involve seven enzymes. MECDP synthetase catalyses the intramolecular attack by a phosphate group on a diphosphate, with cytidine monophosphate (CMP) acting as the leaving group to give the cyclic diphosphate product MECDP. The enzyme is a trimer with three active sites shared between adjacent copies of the protein. The enzyme also has two metal binding sites, the metals playing key roles in catalysis PUBMED:12499535.

    \

    A number of proteins from eukaryotes and prokaryotes share this common N-terminal signature and appear to be involved in terpenoid biosynthesis. The ygbB protein is a putative enzyme of this type PUBMED:10694574.

    \ ' '4999' 'IPR003425' '\ This family consists of a repeat found in conserved hypothetical integral membrane proteins. The function of this region and the proteins which possess it is unknown.\ ' '5000' 'IPR007029' '\ This short presumed domain is about 50 amino acid residues long. It often contains two cysteines that may be functionally important. This domain is found in copper transporting ATPases, some phenol hydroxylases and in a set of uncharacterised membrane proteins including . This domain is named after three of the most conserved amino acids it contains. The domain may be metal binding, possibly copper ions. This domain is duplicated in some copper transporting ATPases.\ ' '5001' 'IPR013527' '\

    Proteins in this entry are homologues of YicC () from Escherichia coli. Although it is relatively poorly characterised, YicC has been shown to be important for cells in the stationary phase, and essential for growth at high temperatures PUBMED:1925027.

    \

    This domain is found at the N-terminal region of these proteins.

    \ ' '5002' 'IPR004443' '\

    The YjeF N-terminal domains occur either as single proteins or fusions with other domains and are commonly associated with enzymes. In bacteria and archaea, YjeF N-terminal domains are often fused to a YjeF C-terminal domain with high structural homology to the members of a ribokinase-like superfamily (see ) and/or belong to operons that encode enzymes of diverse functions: pyridoxal phosphate biosynthetic protein PdxJ; phosphopantetheine-protein transferase; ATP/GTP hydrolase; and pyruvate-formate lyase 1-activating enzyme. In plants, the YjeF N-terminal domain is fused to a C-terminal putative pyridoxamine 5\'-phosphate oxidase. In eukaryotes, proteins that consist of (Sm)-FDF-YjeF N-terminal domains may be involved in RNA processing PUBMED:15257761, PUBMED:18202122.

    \ \

    The YjeF N-terminal domains represent a novel version of the Rossmann fold, one of the most common protein folds in nature observed in numerous enzyme families, that has acquired a set of catalytic residues and structural features that distinguish them from the conventional dehydrogenases. The YjeF N-terminal domain is comprised of a three-layer alpha-beta-alpha sandwich with a central beta-sheet surrounded by helices. The conservation of the acidic residues in the predicted active site of the YjeF N-terminal domains is reminiscent of the presence of such residues in the active sites of diverse hydrolases PUBMED:15257761, PUBMED:18202122.

    \ ' '5003' 'IPR005495' '\ Members of this family are predicted integral membrane proteins of unknown function. They are about 350 amino acids long, contain about 6 transmembrane regions and may be permeases, although there is no verification of this.\ ' '5004' 'IPR003319' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) () are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit 8 (or ymf19) found in the F0 complex of mitochondrial F-ATPases from plants and algae. This subunit is sometimes found in association and N-terminal to , in higher plants. Subunit 8 differs in sequence between plants, Metazoa () and fungi () PUBMED:12681508, PUBMED:12671689.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '5005' 'IPR006182' '\ This domain is found in proteins that are related to the YscJ lipoprotein, where it covers most of the sequence, and the flagellar M-ring protein FliF, where it covers the N-terminal region. The members of the YscJ family are thought to be involved in secretion of several proteins. The FliF protein ring is thought to be part of the export apparatus for flagellar proteins, based on the similarity to YscJ proteins PUBMED:10049798.\ ' '5006' 'IPR005877' '\

    Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology.

    \ ' '5007' 'IPR000607' '\ Double-stranded RNA-specific adenosine deaminase () converts multiple adenosines to inosines\ and creates I/U mismatched base pairs in double-helical RNA substrates without apparent sequence\ specificity. DRADA has been found to modify adenosines in AU-rich regions more frequently, probably\ due to the relative ease of melting A/U base pairs compared to G/C base pairs. The protein functions to\ modify viral RNA genomes, and may be responsible for hypermutation of certain negative-stranded viruses.\ DRADA edits the mRNAs for the glutamate receptor subunits by site-selective adenosine deamination. The\ DRADA repeat is also found in viral E3 proteins, which contain a double-stranded RNA-binding domain.\ ' '5008' 'IPR002530' '\

    Alpha-prolamins are the major seed storage proteins of species of the grass tribe Andropogoneae. They are unusually rich in glutamine, proline, alanine, and leucine residues and their sequences show a series of tandem repeats presumed to be the result of multiple intragenic duplications PUBMED:8451243. In Zea mays (Maize), the 22 kDa and 19 kDa zeins are encoded by a large multigene family and are the major seed storage proteins, accounting for 70% of the total zein fraction. Structurally the 22 kDa and 19 kDa zeins are composed of nine adjacent, topologically antiparallel helices clustered within a distorted cylinder. The 22 kDa alpha-zeins are encoded by 23 genes PUBMED:11691845; twenty-two of the members are found in a roughly tandem array forming a dense gene cluster. The expressed genes in the cluster are interspersed with nonexpressed genes. Interestingly, some of the expressed genes differ in their transcriptional regulation. Gene amplification appears to be in blocks of genes, explaining the rapid and compact expansion of the cluster during the evolution of maize.

    \ ' '5009' 'IPR002653' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the zinc finger domain found in A20. A20 is an inhibitor of cell death that inhibits NF-kappaB activation via the tumour necrosis factor receptor associated factor pathway PUBMED:17449604. The zinc finger domains appear to mediate self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '5010' 'IPR000058' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the AN1-type zinc finger domain, which has a dimetal (zinc)-bound alpha/beta fold. This domain was first identified as a zinc finger at the C-terminus of AN1, a ubiquitin-like\ protein in Xenopus laevis PUBMED:8390387. The AN1-type zinc finger contains six conserved cysteines and two histidines that could potentially coordinate 2 zinc atoms.

    \ \

    Certain stress-associated proteins (SAP) contain an AN1 domain, often in combination with A20 zinc finger domains (SAP8) or C2H2 domains (SAP16) PUBMED:17033811. For example, the human protein Znf216 has an A20 zinc-finger at the N-terminus and an AN1 zinc-finger at the C-terminus, acting to negatively regulate the NF-kappaB activation pathway and to interact with components of the immune response like RIP, IKK-gamma and TRAF6. The interaction of Znf216 with IKK-gamma and RIP is mediated by the A20 zinc-finger domain, while its interaction with TRAF6 is mediated by the AN1 zinc-finger domain; therefore, both zinc-finger domains are involved in regulating the immune response PUBMED:14754897. The AN1 zinc finger domain is also found in proteins containing a ubiquitin-like domain, which are involved in the ubiquitination pathway PUBMED:8390387. Proteins containing an AN1-type zinc finger include:

    \

    \

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '5011' 'IPR007087' '\

    C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf\'s can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 PUBMED:11361095. C2H2 Znf\'s are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes PUBMED:10664601. Transcription factors usually contain several Znf\'s (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA PUBMED:10940247. C2H2 Znf\'s can also bind to RNA and protein targets PUBMED:18253864.

    \

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the classical C2H2 type zinc finger domain.
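
    As a rough illustration of the Cys(2)His(2) spacing described above, the sketch below scans a protein sequence with the PROSITE C2H2 signature PS00028, C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, rewritten as a Python regular expression. The function name and the toy sequence are hypothetical and are not taken from this entry; a signature match only suggests a candidate finger and is not a substitute for the profile-based methods behind the entry.

```python
import re

# PROSITE PS00028 (C2H2 zinc finger signature) as a regular expression:
#   C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
C2H2_PATTERN = re.compile("C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_c2h2_fingers(sequence):
    """Return (start, end, matched_text) for each non-overlapping signature match."""
    return [
        (m.start(), m.end(), m.group())
        for m in C2H2_PATTERN.finditer(sequence.upper())
    ]


# Hypothetical toy sequence: two finger-like segments joined by a short linker.
TOY_SEQUENCE = (
    "MA"
    "CPVESCDRRFSRSDELTRHIRIH"
    "TGQKP"
    "CRICMRNFSRSDHLTTHIRTH"
    "GS"
)

if __name__ == "__main__":
    for start, end, text in find_c2h2_fingers(TOY_SEQUENCE):
        print(start, end, text)
```

    Tandem fingers, as found in many transcription factors, appear as successive non-overlapping matches separated by short linkers.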

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5012' 'IPR001628' '\

    Steroid or nuclear hormone receptors constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. The receptors function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. Nuclear hormone receptors consist of a highly conserved DNA-binding domain that recognises specific sequences, connected via a linker region to a C-terminal ligand-binding domain (). In addition, certain nuclear hormone receptors have an N-terminal modulatory domain (). The DNA-binding domain can elicit either an activating or repressing effect by binding to specific regions of the DNA known as hormone-response elements PUBMED:15242341, PUBMED:15242339. These response elements position the receptors, and the complexes recruited by them, close to the genes whose transcription is affected. The DNA-binding domains of nuclear receptors consist of two zinc-nucleated modules and a C-terminal extension, where residues in the first zinc module determine the specificity of the DNA recognition and residues in the second zinc module are involved in dimerisation. The DNA-binding domain is furthermore involved in several other functions including nuclear localisation, and interaction with transcription factors and co-activators PUBMED:15242339.

    \

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the two C4-type zinc finger modules involved in DNA-binding.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5013' 'IPR013498' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (; topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
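
    The type I subdivision described above can be written down as a small lookup table. This is only a sketch of the classification as stated in this entry; the dictionary layout, key names and function name are illustrative assumptions.

```python
# Type I topoisomerase subtypes and members, as listed in the text above.
TYPE_I_SUBTYPES = {
    "IA": (
        "bacterial and archaeal topoisomerase I",
        "topoisomerase III",
        "reverse gyrase",
    ),
    "IB": (
        "eukaryotic topoisomerase I",
        "topoisomerase V",
    ),
}


def classify_type_i(enzyme):
    """Return the subtype and the two properties the text singles out, or None."""
    for subtype, members in TYPE_I_SUBTYPES.items():
        if enzyme in members:
            is_reverse_gyrase = enzyme == "reverse gyrase"
            return {
                "subtype": subtype,
                # Type I enzymes are ATP-independent, except for reverse gyrase.
                "atp_independent": not is_reverse_gyrase,
                # Only reverse gyrase is described as introducing positive supercoils.
                "introduces_positive_supercoils": is_reverse_gyrase,
            }
    return None


if __name__ == "__main__":
    print(classify_type_i("reverse gyrase"))
    print(classify_type_i("topoisomerase V"))
```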

    \

    This entry represents the zinc-finger domain found in type IA topoisomerases, including bacterial and archaeal topoisomerase I and III enzymes, and in eukaryotic topoisomerase III enzymes. Escherichia coli topoisomerase I proteins contain five copies of a zinc-ribbon-like domain at their C-terminus, two of which have lost their cysteine residues and are therefore probably not able to bind zinc PUBMED:10873443. This domain is still considered to be a member of the zinc-ribbon superfamily despite not being able to bind zinc.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '5014' 'IPR004198' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a predicted zinc finger with eight potential zinc ligand binding residues. This domain is found in Jumonji PUBMED:11165500, and may have a DNA binding function. The mouse jumonji protein is required for neural tube formation, and is essential for normal heart development. It also plays a role in the down-regulation of cell proliferation signalling.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5015' 'IPR002694' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents CysHisCysCys (CHC2) type zinc finger domains, which are found in bacteria and viruses.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5016' 'IPR007872' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a probable zinc binding motif that contains four cysteines and may chelate zinc, known as the DPH-type after the diphthamide (DPH) biosynthesis protein in which it was first characterised, including the proteins DPH3 and DPH4. This domain is also found associated with the N-terminal DnaJ domain of heat shock proteins.

    \ \

    Diphthamide is a unique post-translationally modified histidine residue found only in translation elongation factor 2 (eEF-2). It is conserved from archaea to humans and serves as the target for diphtheria toxin and Pseudomonas exotoxin A. These two toxins catalyse the transfer of ADP-ribose to diphthamide on eEF-2, thus inactivating eEF-2, halting cellular protein synthesis, and causing cell death PUBMED:11595641. The biosynthesis of diphthamide is dependent on at least five proteins, DPH1 to -5, and a still unidentified amidating enzyme. DPH3 and DPH4 share a conserved region, which encodes a putative zinc finger, the DPH-type or CSL-type (after the conserved motif of the final cysteine) zinc finger PUBMED:14527407, PUBMED:15485916. The function of this motif is unknown.

    \ \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5017' 'IPR007853' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the Zim17-type zinc finger motif thought to bind zinc. This domain is found in a number of eukaryotic proteins and is named after a short C-terminal motif of D(N/H)L. The domain is found in proteins having a novel zinc-finger essential for protein import into mitochondria PUBMED:15383543.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5018' 'IPR003851' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry consists of proteins containing a Dof domain, which is a zinc finger DNA-binding domain that shows resemblance to the Cys2 zinc finger, although it has a longer putative loop where an extra Cys residue is conserved PUBMED:9688549. AOBP, a DNA-binding protein in pumpkin (Cucurbita maxima), contains a 52 amino acid Dof domain, which is highly conserved in several DNA-binding proteins of higher plants.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5019' 'IPR006456' '\

    This group of sequences is described by a 54-residue domain found in the N-terminal region of plant proteins, the vast majority of which contain a ZF-HD class homeobox domain toward the C terminus. The region between the two domains is typically rich in low-complexity sequence. The companion ZF-HD homeobox domain is described in .

    \ ' '5020' 'IPR000967' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a domain presumed to be a zinc binding domain. The following pattern describes the zinc finger:

    \
    \
    C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C\
    
    \

    where X can be any amino acid, and numbers in brackets indicate the number of residues. The two positions marked (H/C) can be either His or Cys. This domain is found in the human transcriptional repressor NF-X1, a repressor of HLA-DRA transcription; the Drosophila shuttle craft protein, which plays an essential role during the late stages of embryonic neurogenesis; and a yeast hypothetical protein YNL023C.
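
    The pattern above translates directly into a regular expression. The sketch below is one possible reading, assuming that X3 means exactly three residues and that C(H/C) means a Cys immediately followed by either His or Cys. The function name and the toy sequences are hypothetical and serve only to exercise the pattern.

```python
import re

# C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C
# Each X(m-n) becomes .{m,n}; each (H/C) position becomes the class [HC].
FINGER_PATTERN = re.compile("C.{1,6}H.C.{3}C[HC].{3,4}[HC].{1,10}C")


def find_pattern_matches(sequence):
    """Return every non-overlapping stretch of the sequence that fits the pattern."""
    return [m.group() for m in FINGER_PATTERN.finditer(sequence.upper())]


# Hypothetical toy sequences, not taken from any real protein.
MATCHING = "MKCGSHKCPLTCHGEVCDPRCLE"
# Same sequence with the obligatory His replaced by Ala; it should not match.
NON_MATCHING = "MKCGSAKCPLTCHGEVCDPRCLE"

if __name__ == "__main__":
    print(find_pattern_matches(MATCHING))
    print(find_pattern_matches(NON_MATCHING))
```

    Because the X(1-6), X(3-4) and X(1-10) spacers are variable, a regular expression of this kind is permissive and will also report chance matches in cysteine-rich sequences.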

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5021' 'IPR001510' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents PARP (poly(ADP-ribose) polymerase) type zinc finger domains.

    \ \

    NAD(+) ADP-ribosyltransferase() PUBMED:3118181, PUBMED:8016868 is a eukaryotic enzyme that catalyses the covalent attachment of ADP-ribose units from NAD(+) to various nuclear acceptor proteins. This post-translational modification of nuclear proteins is dependent on DNA. It appears to be involved in the regulation of various important cellular processes such as differentiation, proliferation and tumour transformation as well as in the regulation of the molecular events involved in the recovery of the cell from DNA damage. Structurally, NAD(+) ADP-ribosyltransferase consists of three distinct domains: an N-terminal zinc-dependent DNA-binding domain, a central automodification domain and a C-terminal NAD-binding domain. The DNA-binding region contains a pair of PARP-type zinc finger domains which have been shown to bind DNA in a zinc-dependent manner. The PARP-type zinc finger domains seem to bind specifically to single-stranded DNA and to act as a DNA nick sensor. DNA ligase III PUBMED:7760816 contains, in its N-terminal section, a single copy of a zinc finger highly similar to those of PARP.

    \ \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5022' 'IPR001876' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents the zinc finger domain found in RanBP2 proteins. Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran binding protein 2 (RanBP2) is a 358-kDa nucleoporin located on the cytoplasmic side of the nuclear pore complex which plays a role in nuclear protein import PUBMED:12019565. RanBP2 contains multiple zinc fingers which mediate binding to RanGDP PUBMED:10318915.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5023' 'IPR000197' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    TAZ (Transcription Adaptor putative Zinc finger) domains are zinc-containing domains found in the homologous transcriptional co-activators CREB-binding protein (CBP) and P300. CBP and P300 are histone acetyltransferases () that catalyse the reversible acetylation of all four histones in nucleosomes, acting to regulate transcription via chromatin remodelling. These large nuclear proteins interact with numerous transcription factors and viral oncoproteins, including p53 tumour suppressor protein, E1A oncoprotein, MyoD, and GATA-1, and are involved in cell growth, differentiation and apoptosis PUBMED:8848831. Both CBP and P300 have two copies of the TAZ domain, one in the N-terminal region, the other in the C-terminal region. The TAZ1 domain of CBP and P300 forms a complex with CITED2 (CBP/P300-interacting transactivator with ED-rich tail), inhibiting the activity of the hypoxia inducible factor (HIF-1alpha) and thereby attenuating the cellular response to low tissue oxygen concentration PUBMED:12778114. Adaptation to hypoxia is mediated by transactivation of hypoxia-responsive genes by hypoxia-inducible factor-1 (HIF-1) in complex with the CBP and P300 transcriptional coactivators PUBMED:11959990.

    \

    The TAZ domain adopts an all-alpha fold with zinc-binding sites in the loops connecting the helices. The TAZ1 domain in P300 and the TAZ2 (CH3) domain in CBP have each been shown to have four amphipathic helices, organised by three zinc-binding clusters with HCCC-type coordination PUBMED:11023789, PUBMED:14594809, PUBMED:15641773.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5024' 'IPR004217' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a putative zinc binding domain with four conserved cysteine residues. Members of this family include subunits 8, 9, 10 and 13 of the mitochondrial inner membrane translocase complex, which are involved in mitochondrial protein import PUBMED:11101512, PUBMED:8663351.

    \ \

    Defects in TIM8 are the cause of 2 human syndromes:

    \
      \
    1. Mohr-Tranebjaerg syndrome (MTS) [MIM:304700]; also known as dystonia-deafness syndrome (DDS) or X-linked progressive deafness type 1 (DFN-1). It is a recessive neurodegenerative syndrome characterised by postlingual progressive sensorineural deafness as the first presenting symptom in early childhood, followed by progressive dystonia, spasticity, dysphagia, mental deterioration, paranoia and cortical blindness.
    2. Jensen syndrome [MIM:311150]; also known as opticoacoustic nerve atrophy with dementia. This X-linked disease is characterised by deafness, blindness and muscle weakness.
    \ \ \

    More information on zinc fingers can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ \ ' '5025' 'IPR001293' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents TRAF-type zinc finger domains. Some of the proteins that have this domain are mammalian signal transducers associated with the cytoplasmic domain of the 75 kDa tumour necrosis factor receptor PUBMED:11607847. A heterocomplex, homodimer or heterodimer of TRAF1 and TRAF2, binds to the N-terminal of the inhibitor of apoptosis proteins 1 and 2 (IAPS) and recruits them to the tumour necrosis factor receptor 2. Other proteins containing this domain include F45G2.6 protein from Caenorhabditis elegans and DG17 protein from Dictyostelium discoideum (Slime mold).

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '5026' 'IPR003126' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    The N-end rule-based degradation signal, which targets a protein for ubiquitin-dependent proteolysis, comprises a destabilising amino-terminal residue and a specific internal lysine residue. This entry describes a putative zinc finger in N-recognin, a recognition component of the N-end rule pathway PUBMED:9653112.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '5027' 'IPR005381' '\

    This domain is a putative nucleic acid binding zinc finger and is found at the N terminus of proteins that also contain an adjacent XS domain and a C-terminal XH domain.

    \ ' '5028' 'IPR000962' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents domains identified in zinc finger-containing members of the DksA/TraR family. DksA is a critical component of the rRNA transcription initiation machinery that potentiates the regulation of rRNA promoters by ppGpp and the initiating NTP. In delta-dksA mutants, rRNA promoters are unresponsive to changes in amino acid availability, growth rate, or growth phase. In vitro, DksA binds to RNAP, reduces open complex lifetime, inhibits rRNA promoter activity, and amplifies effects of ppGpp and the initiating NTP on rRNA transcription PUBMED:15294156, PUBMED:15294157. The dksA gene product suppresses the temperature-sensitive growth and filamentation of a dnaK deletion mutant of Escherichia coli. Gene knockout PUBMED:2180916 and deletion PUBMED:8063112 experiments have shown the gene to be non-essential, mutations causing a mild sensitivity to UV light, but not affecting DNA recombination PUBMED:8063112. In \ Pseudomonas aeruginosa, dksA is a novel regulator involved in the post-transcriptional control of extracellular virulence factor production PUBMED:12775693.

    \ \

    The proteins contain a C-terminal region thought to fold into a 4-cysteine zinc finger. Other proteins found to contain a similar zinc finger domain include:

    \ \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '5029' 'IPR007212' '\

    This is a probable metal-binding domain. It is found in a probable precorrin-3B C17-methyltransferase from Methanobacterium thermoautotrophicum, that catalyses the methylation of C-17 in precorrin-3B to form precorrin-4.

    \ ' '5030' 'IPR007449' '\ This entry represents the ZipA C-terminal domain. ZipA is an essential cell division protein involved in septum formation PUBMED:9008158, PUBMED:9864327. Its C-terminal domain binds FtsZ, a major component of the bacterial septal ring PUBMED:10209756. The structure of this domain is an alpha-beta fold with three alpha helices and a beta sheet of six antiparallel beta strands. The major loops protruding from the beta sheet surface are thought to form a binding site for FtsZ PUBMED:10924108.\ ' '5031' 'IPR001138' '\

    The N-terminal region of a number of fungal transcriptional regulatory\ proteins contains a Cys-rich motif that is involved in zinc-dependent\ binding of DNA. The region forms a binuclear Zn cluster, in which two Zn\ atoms are bound by six Cys residues PUBMED:2107541, PUBMED:1557122. A wide range of proteins are known to contain this domain. These include the proteins involved in arginine, proline, pyrimidine, quinate, maltose and galactose metabolism; amide and GABA catabolism; leucine biosynthesis, amongst others.

    \ \ ' '5032' 'IPR001531' '\

    Bacillus cereus contains a monomeric phospholipase C (PLC) of 245 amino-acid residues that binds three zinc ions PUBMED:2493587. Although PLC prefers to act on phosphatidylcholine, it also shows weak catalytic activity with sphingomyelin and phosphatidylinositol PUBMED:2841128. Sequence studies have shown the PLC protein to be similar to the following:

    \

    \ \

    Each of these proteins is a zinc-dependent enzyme, binding 3 zinc ions per molecule. The enzymes catalyse the conversion of phosphatidylcholine and water to 1,2-diacylglycerol and choline phosphate PUBMED:2841128, PUBMED:2536355. In B. cereus, there are nine residues known to be involved in binding the zinc ions: 5 His, 2 Asp, 1 Glu and 1 Trp. These residues are all conserved in the Clostridium alpha-toxin PUBMED:9699639.

    \ ' '5033' 'IPR007343' '\

    Members of this family of bacterial proteins are described as hypothetical proteins or zinc metallopeptidases. The majority have a HExxH zinc-binding motif characteristic of neutral zinc metallopeptidases; however, there is no evidence to support their function as metallopeptidases.
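
    The HExxH motif mentioned above can be located with a simple pattern scan. The following is a minimal sketch, not a method described in this entry: the function name find_hexxh and the toy sequences are hypothetical, and a hit only shows that the five-residue spacing is present. As the text notes, the motif alone is not evidence of metallopeptidase activity.

```python
import re

# HExxH: His, Glu, any two residues, His (one-letter amino acid codes).
# The lookahead lets overlapping occurrences be reported as well.
HEXXH = re.compile(r"(?=HE.{2}H)")


def find_hexxh(sequence):
    """Return 0-based start positions of every HExxH occurrence."""
    return [m.start() for m in HEXXH.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic toy sequence, not a real protein.
    print(find_hexxh("MKHEAAHGG"))
```

    Positions are 0-based offsets into the sequence string; add 1 to convert to conventional residue numbering.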

    \ ' '5034' 'IPR007395' '\

    Members of this family of bacterial proteins are described as hypothetical proteins or zinc-dependent proteases. The majority have a HExxH zinc-binding motif characteristic of neutral zinc metallopeptidases; however, there is no evidence to support their function as metallopeptidases.

    \ ' '5035' 'IPR003224' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    The RING-finger is a specialised type of Zn-finger of 40 to 60 residues that binds two atoms of zinc, and is probably involved in mediating protein-protein interactions PUBMED:8317827, PUBMED:8804826, PUBMED:8744354. There are two different variants, the C3HC4-type and a C3H2C3-type, which is clearly related despite the different cysteine/histidine pattern. The latter type is sometimes referred to as \'RING-H2 finger\'. The RING domain is a protein interaction domain which has been implicated in a range of diverse biological processes. Several 3D-structures for RING-fingers are known PUBMED:8804826, PUBMED:8744354. The 3D structure of the zinc ligation system is unique to the RING domain and is referred to as the \'cross-brace\' motif. The spacing of the cysteines in such a domain is:

    \
    \
    C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C\
    
    \

    Metal ligand pairs one and three co-ordinate to bind one zinc ion, whilst pairs two and four bind the second.
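
    The spacing pattern and the cross-brace ligand pairing described above can be sketched as a regular expression. This is an illustrative translation of the spacing given in this entry only: the names RING_SPACING and cross_brace_sites and the toy sequence are hypothetical, and a match indicates compatible Cys/His spacing, not a confirmed RING domain.

```python
import re

# C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C
# Each of the eight metal ligands is captured in its own group.
RING_SPACING = re.compile(
    r"(C).{2}(C).{9,39}(C).{1,3}(H).{2,3}(C).{2}(C).{4,48}(C).{2}(C)"
)


def cross_brace_sites(sequence):
    """Return, per match, the 0-based ligand positions of the two zinc sites.

    Ligand pairs one and three form the first site,
    pairs two and four form the second site.
    """
    results = []
    for m in RING_SPACING.finditer(sequence.upper()):
        ligands = [m.start(i) for i in range(1, 9)]
        site_one = ligands[0:2] + ligands[4:6]   # pairs one and three
        site_two = ligands[2:4] + ligands[6:8]   # pairs two and four
        results.append((site_one, site_two))
    return results


# Synthetic toy sequence (not a real protein) using the shortest allowed spacers.
toy = ("MA" + "C" + "AA" + "C" + "A" * 9 + "C" + "A" + "H" + "AA"
       + "C" + "AA" + "C" + "A" * 4 + "C" + "AA" + "C" + "GG")

if __name__ == "__main__":
    print(cross_brace_sites(toy))
```

    The interleaved grouping of ligand positions in the output reflects the cross-brace arrangement: the residues binding each zinc ion are not contiguous in the sequence.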

    \ \

    This entry represents RING finger protein Z, a small polypeptide found in some negative-strand RNA viruses including Lassa fever virus, which plays a crucial role in virion assembly and budding. RING finger Z has been shown to interact with several host proteins, including promyelocytic leukemia protein and the eukaryotic translation initiation factor 4E PUBMED:9420283, PUBMED:10708446. It is sufficient in the absence of any other viral proteins to release virus-like particles from the infected cell PUBMED:12970458. This protein is also responsible for arenavirus superinfection exclusion; expression of this protein in a host cell strongly and specifically inhibits arenavirus transcription and replication PUBMED:14990716.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '5036' 'IPR004457' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents ZPR1-type zinc finger domains. An orthologous protein found once in each of the completed archaeal genomes corresponds to a zinc finger-containing domain repeated as the N-terminal and C-terminal halves of the mouse protein ZPR1. ZPR1 is an experimentally proven zinc-binding protein that binds the tyrosine kinase domain of the epidermal growth factor receptor (EGFR); binding is inhibited by EGF stimulation and tyrosine phosphorylation, and activation by EGF is followed by some redistribution of ZPR1 to the nucleus. By analogy, other proteins with the ZPR1 zinc finger domain may be regulatory proteins that sense protein phosphorylation state and/or participate in signal transduction.

    \

    Deficiencies in ZPR1 may contribute to neurodegenerative disorders. ZPR1 appears to be down-regulated in patients with spinal muscular atrophy (SMA), a disease characterised by degeneration of the alpha-motor neurons in the spinal cord that can arise from mutations affecting the expression of Survival Motor Neurons (SMN) PUBMED:16648254. ZPR1 interacts with complexes formed by SMN PUBMED:17068332, and may act as a modifier that affects the severity of SMA.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '5037' 'IPR000906' '\ This is a domain of unknown function, present in ZO-1 and Unc5-like netrin receptors. It is also found in \ different variants of ankyrin, which are responsible for attaching integral membrane proteins to \ cytoskeletal elements.\ ' '5038' 'IPR000433' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents ZZ-type zinc finger domains, named because of their ability to bind two zinc ions PUBMED:8848831. These domains contain 4-6 Cys residues that participate in zinc binding (plus additional Ser/His residues), including a Cys-X2-Cys motif found in other zinc finger domains. These zinc fingers are thought to be involved in protein-protein interactions. The structure of the ZZ domain shows that it belongs to the family of cross-brace zinc finger motifs that include the PHD, RING, and FYVE domains PUBMED:15476823. ZZ-type zinc finger domains are found in:

    \

    \ \

    Single copies of the ZZ zinc finger occur in the transcriptional adaptor/coactivator proteins P300, in cAMP response element-binding protein (CREB)-binding protein (CBP) and ADA2. CBP provides several binding sites for transcriptional coactivators. The site of interaction with the tumour suppressor protein p53 and the oncoprotein E1A with CBP/P300 is a Cys-rich region that incorporates two zinc-binding motifs: ZZ-type and TAZ2-type. The ZZ-type zinc finger of CBP contains two twisted anti-parallel beta-sheets and a short alpha-helix, and binds two zinc ions PUBMED:15476823. One zinc ion is coordinated by four cysteine residues via 2 Cys-X2-Cys motifs, and the second zinc ion via a third Cys-X-Cys motif and a His-X-His motif. The first zinc cluster is strictly conserved, whereas the second zinc cluster displays variability in the position of the two His residues.

    \

    In Arabidopsis thaliana (Mouse-ear cress), the hypersensitive to red and blue 1 (Hrb1) protein, which regulates both red and blue light responses, contains a ZZ-type zinc finger domain PUBMED:15705950.

    \

    ZZ-type zinc finger domains have also been identified in the testis-specific E3 ubiquitin ligase MEX that promotes death receptor-induced apoptosis PUBMED:16522193. MEX has four putative zinc finger domains: one ZZ-type, one SWIM-type and two RING-type. The region containing the ZZ-type and RING-type zinc fingers is required for interaction with UbcH5a and MEX self-association, whereas the SWIM domain is critical for MEX ubiquitination.

    \

    In addition, the Cys-rich domains of dystrophin, utrophin and an 87kDa post-synaptic protein contain a ZZ-type zinc finger with high sequence identity to P300/CBP ZZ-type zinc fingers. In dystrophin and utrophin, the ZZ-type zinc finger lies between a WW domain (flanked by an EF hand) and the C-terminal coiled-coil domain. Dystrophin is thought to act as a link between the actin cytoskeleton and the extracellular matrix, and perturbations of the dystrophin-associated complex, for example, between dystrophin and the transmembrane glycoprotein beta-dystroglycan, may lead to muscular dystrophy. Dystrophin and its autosomal homologue utrophin interact with beta-dystroglycan via their C-terminal regions, which comprise a WW domain, an EF hand domain and a ZZ-type zinc finger domain PUBMED:17009962. The WW domain is the primary site of interaction between dystrophin or utrophin and dystroglycan, while the EF hand and ZZ-type zinc finger domains stabilise and strengthen this interaction.

    \ \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '5039' 'IPR007453' '\

    DsrC has been observed to co-purify with Desulphovibrio vulgaris dissimilatory sulphite reductase PUBMED:1555572. However, DsrC appears to be only loosely associated with the sulphite reductase, which suggests that it may not be an integral part of the dissimilatory sulphite reductase. Many proteins in this entry are found in organisms such as Escherichia coli and Haemophilus influenzae which do not contain dissimilatory sulphite reductases but can synthesise assimilatory sirohaem sulphite and nitrite reductases. It is speculated that DsrC may be involved in the assembly, folding or stabilisation of sirohaem proteins PUBMED:9493389. The strictly conserved cysteine in the C terminus suggests that DsrC may have a catalytic function in the metabolism of sulphur compounds PUBMED:9695921. Also included in this entry is TusE, a partner to TusBCD in a sulphur relay system for 2-thiouridine biosynthesis, a tRNA base modification process. Many proteins in this entry are annotated as the third (gamma) subunit of dissimilatory sulphite reductase.

    \ ' '5041' 'IPR007364' '\

    This family of proteins are predicted to be alpha/beta-knot SAM-dependent RNA methyltransferases PUBMED:15215454.

    \ ' '5042' 'IPR007454' '\ This family includes several proteins of uncharacterised function.\ ' '5043' 'IPR007329' '\ This conserved region includes the FMN-binding site of the NqrC protein PUBMED:11248234 as well as the NosR and NirI regulatory proteins.\ ' '5044' 'IPR007525' '\

    Coenzyme F420 hydrogenase reduces the low-potential two-electron acceptor coenzyme F420. This family contains the C termini of F420 hydrogenase and dehydrogenase beta subunits PUBMED:2207102, PUBMED:10751389. The C terminus of Methanobacterium formicicum formate dehydrogenase beta chain is also represented in this entry PUBMED:3531194. This region is often found in association with the 4Fe-4S binding domain, fer4, and the N terminus.

    \ ' '5045' 'IPR007687' '\

    Methyl-coenzyme M reductase (MCR) catalyses the reduction of methyl-coenzyme M (CH3-SCoM) and coenzyme B (HS-CoB) to methane and the corresponding heterosulphide CoM-S-S-CoB, the final step in methane biosynthesis. This reaction is carried out under anaerobic conditions by methanogenic Archaea PUBMED:16260307, and requires a nickel-porphinoid prosthetic group, coenzyme F430, which is in the EPR-detectable Ni(I) oxidation state in the active enzyme. Studies on a catalytically inactive enzyme aerobically co-crystallized with coenzyme M displayed a fully occupied coenzyme M-binding site with no alternate conformations. The binding of coenzyme M appears to induce specific conformational changes that suggest a molecular mechanism by which the enzyme ensures that methyl-coenzyme M enters the substrate channel prior to coenzyme B, as required by the active-site geometry PUBMED:11491299.

    \

    MCR is a hexamer composed of 2 alpha, 2 beta, and 2 gamma subunits with two identical nickel porphinoid active sites, which form two long active site channels with F430 embedded at the bottom PUBMED:9367957, PUBMED:16234924.

    \

    Genes encoding the beta (mcrB) and gamma (mcrG) subunits of MCR are separated by two open reading frames coding for two proteins C and D PUBMED:3170483, PUBMED:8863453. The function of proteins C and D is unknown. This entry represents protein C.

    \ ' '5046' 'IPR007685' '\ The functions of Escherichia coli RelA and SpoT differ somewhat. RelA produces pppGpp (or ppGpp) from ATP and GTP (or GDP). SpoT degrades ppGpp, but may also act as a secondary ppGpp synthetase. The two proteins are strongly similar. In many species, a single homologue to SpoT and RelA appears responsible for both ppGpp synthesis and ppGpp degradation. \ \

    (p)ppGpp is a regulatory metabolite of the stringent response, but appears also to be involved in antibiotic biosynthesis in some species.

    \ ' '5047' 'IPR007341' '\ This bacterial protein is predicted to be an integral membrane protein. Some family members have been annotated as transglycosylase-associated proteins, but no experimental evidence is provided. This family was annotated based on the information in .\ ' '5048' 'IPR007372' '\

    This entry represents the lipid-binding protein YceI from Escherichia coli PUBMED:12107143 and the polyisoprenoid-binding protein TTHA0802 from Thermus thermophilus PUBMED:15741337. Both these proteins share a common domain with an 8-stranded beta-barrel fold, which resembles the lipocalin fold, although no sequence homology exists with lipocalins. In TTHA0802, the protein binds the polyisoprenoid chain within the pore of the barrel via hydrophobic interactions PUBMED:15741337. Sequence homologues of this core structure are present in a wide range of bacteria and archaea. The crystal structures of YceI and TTHA0802 suggest that this family of proteins plays an important role in isoprenoid quinone metabolism and/or transport and/or storage PUBMED:15741337.

    \ ' '5049' 'IPR007524' '\ This region is found N-terminal to the pectate lyase domain in some plant pectate lyase enzymes.\ ' '5050' 'IPR007887' '\

    The multiple antibiotic resistance of methicillin-resistant\ strains of Staphylococcus aureus (MRSA) has become a\ major clinical problem worldwide. Methicillin resistance in MRSA strains is\ due to the acquisition, via horizontal transfer\ from an unidentified species, of the mecA gene, which encodes penicillin-binding protein 2a (PBP2a).

    \

    The structure of the N-terminal domain from MecA is known PUBMED:12389036 and is found to be similar to that of NTF2. The length of the PBP2a N-terminal domain\ (which positions the transpeptidase active site more than 100 Å from the\ expected C terminus of the transmembrane anchor) suggests a\ possible structural role and potentially gives the transpeptidase\ domain substantial reach from the cell membrane. This domain seems unlikely to have an enzymatic function.

    \ ' '5051' 'IPR007888' '\

    This family of proteins includes the DNA-binding meiosis-specific protein NDT80 PUBMED:12454476. It also describes PhoG\ and its homologues, proteins that have been found to increase acid phosphatase activity within certain fungi PUBMED:7916713. It\ is not clear that these proteins are actually acid phosphatases themselves.

    \ ' '5052' 'IPR007889' '\

    This DNA-binding motif is found in four copies in the pipsqueak protein of Drosophila melanogaster PUBMED:9774480. In pipsqueak this domain\ binds to the GAGA sequence PUBMED:9774480. Members of the pipsqueak family, which includes proteins from fungi, sea urchins,\ nematodes, insects, and vertebrates, appear to be essential for sequence-specific targeting of a polycomb group protein\ complex PUBMED:12167718.

    \ ' '5053' 'IPR007890' '\

    CHASE2 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are upstream of signal transduction pathways in bacteria. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognised by\ CHASE2 domains are not known at this time PUBMED:12486065.

    \ ' '5054' 'IPR007891' '\

    CHASE3 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are upstream of signal transduction pathways in bacteria. Specifically, CHASE3 domains are found in histidine kinases, adenylate cyclases, methyl-accepting chemotaxis proteins and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognised by CHASE3 domains are not known at this time PUBMED:12486065.

    \ ' '5055' 'IPR007892' '\

    CHASE4 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are upstream of signal transduction pathways in prokaryotes. Specifically, CHASE4 domains are found in histidine kinases in archaea and in predicted diguanylate cyclases/phosphodiesterases in bacteria. Environmental factors that are recognised by CHASE4 domains are not known at this time PUBMED:12486065.

    \ ' '5056' 'IPR007893' '\

    This domain is found in protein U, a spore coat protein produced at the late stage of development of Myxococcus xanthus. Protein U is produced as a secretory precursor, pro-protein U, which is then secreted across the membrane to assemble on the spore surface PUBMED:1904442. This domain is also found in a number of the genes within a conserved polycistronic operon that encodes a novel chaperone-usher pili assembly system. Examples are CsuA/B of Acinetobacter baumannii, and the CsuA, CsuB and CsuE of Vibrio parahaemolyticus and the related genes of Yersinia pestis.

    \ \

    In A. baumannii, csuC and csuE are required in the early steps of the process that leads to biofilm formation. The conservation of the genes and gene order among unrelated bacteria suggests that the csu operon is widespread and is involved in surface pilus formation, which allows the bacteria to form biofilms on abiotic surfaces, a property that may aid their survival in their natural environment PUBMED:14663080.

    \ \ ' '5057' 'IPR007894' '\

    This domain of unknown function is often found adjacent to the GGDEF domain in bacteria.

    \ ' '5058' 'IPR007895' '\

    This is a domain of unknown function found in proteins of unknown function.

    \ \ ' '5059' 'IPR007896' '\

    This domain represents a conserved pair of transmembrane helices. It appears to be found as two\ tandem repeats in a family of hypothetical proteins.

    \ ' '5060' 'IPR007897' '\

    This domain is found in proteins that are typically involved in regulating polymer accumulation in bacteria, for example the production of poly-beta-hydroxybutyrate (PHB), which is formed via the polymerisation of D(-)-3-hydroxybutyryl-CoA PUBMED:9922249. The function of\ this domain is unknown.

    \ ' '5061' 'IPR007898' '\

    The protein Rrn10 has been identified as a component of the Upstream Activating Factor\ (UAF), an RNA polymerase I (pol I) specific transcription stimulatory factor that recognises the upstream ribosomal RNA\ (rRNA) gene promoter in a sequence specific manner and which stimulates rRNA synthesis PUBMED:12490702.

    \ ' '5062' 'IPR007899' '\

    The CHAD domain is an alpha-helical domain functionally associated with some members of the adenylate cyclase family. It has conserved histidines that may chelate metals\ PUBMED:12456267.

    \ ' '5063' 'IPR007900' '\

    Accurate transcription initiation at protein-coding genes by RNA polymerase II requires the assembly of a multiprotein\ complex around the mRNA start site. Transcription factor TFIID is one of the general factors involved in this process. Yeast TFIID comprises the TATA binding protein and 14 TBP-associated factors (TAFIIs), nine of which contain\ histone-fold domains (HFDs). The C-terminal region of the TFIID-specific yeast TAF4 (yTAF4) containing the HFD shares\ strong sequence similarity with Drosophila (d)TAF4 and human TAF4. A structure/function\ analysis of yTAF4 demonstrates that the HFD, a short conserved C-terminal domain (CCTD), and the region separating them\ are all required for yTAF4 function. This region of similarity is found in Transcription initiation factor TFIID component TAF4\ PUBMED:12237303.

    \ ' '5064' 'IPR007901' '\

    This putative domain is found in the MoeZ protein and the MoeB protein. The domain has two\ CXXC motifs that are only partly conserved. MoeZ is necessary for the synthesis of pyridine-2,6-bis(thiocarboxylic acid), a small secreted metabolite that has a high affinity for transition\ metals, increases iron uptake efficiency by 20% in Pseudomonas stutzeri, has the ability to reduce both soluble and mineral forms of\ iron, and has antimicrobial activity towards several species of bacteria. MoeB is the molybdopterin synthase activating enzyme in the molybdopterin cofactor biosynthesis pathway.\ Both these enzymes are members of a superfamily consisting of related but structurally distinct proteins that are members of pathways involved in the\ transfer of sulphur-containing moieties to metabolites PUBMED:11972321 and both also contain the UBA/THIF-type NAD/FAD binding fold.

    \ ' '5065' 'IPR007902' '\

    This family includes CHL4 that is involved in chromosome segregation PUBMED:8243998. It is required for chromosome stability but is non-essential for growth. Chl4 is a component of the central kinetochore, which mediates the attachment of the centromere to the mitotic spindle by forming essential interactions between the microtubule-associated outer kinetochore proteins and the centromere-associated inner kinetochore proteins. It is required for the establishment of bipolar spindle-microtubule attachments and proper chromosome segregation. Sgo1, Chl4, and Iml3 are all important for retaining centromeric cohesion until the onset of anaphase II PUBMED:14752166.

    \ ' '5066' 'IPR007903' '\

    The PRC-barrel is an all-beta barrel domain found in photosynthetic reaction centre subunit H of the purple bacteria. PRC-barrels are approximately 80 residues long, and found widely represented in bacteria, archaea and plants. This domain is also present at the C terminus of the pan-bacterial protein RimM, which is involved in ribosomal maturation and processing of 16S rRNA. A family of small proteins conserved in all known euryarchaea is composed entirely of a single stand-alone copy of the domain PUBMED:12429060.

    \ ' '5067' 'IPR007904' '\

    This domain is found at the C terminus of the Apolipoprotein B mRNA editing enzyme. Apobec-1 catalyzes C to U editing of apolipoprotein B (apoB) mRNA in the mammalian intestine. C to U RNA editing of mammalian apolipoprotein B (apoB) RNA is a site-specific posttranscriptional modification in which a single cytidine is enzymatically\ deaminated to uridine, thereby generating a UAA stop codon in the edited mRNA. The function\ of this domain is currently unknown.

    \ ' '5068' 'IPR007905' '\

    Emopamil binding protein (EBP) is a nonglycosylated type I integral\ membrane protein of endoplasmic reticulum and shows high level expression in epithelial tissues. The\ EBP protein has emopamil binding domains, including the sterol acceptor site and the catalytic\ centre, which show Delta8-Delta7 sterol isomerase activity. Human sterol isomerase, a homologue\ of mouse EBP, is suggested not only to play a role in\ cholesterol biosynthesis, but also to affect lipoprotein internalisation. In humans, mutations of EBP\ are known to cause the genetic disorder of X-linked dominant chondrodysplasia punctata (CDPX2).\ This syndrome of humans is lethal in most males, and affected females display asymmetric\ hyperkeratotic skin and skeletal abnormalities PUBMED:11471053.

    \ ' '5069' 'IPR007906' '\

    This family consists of the lactophorin precursors proteose peptone component 3 (PP3) and\ glycosylation-dependent cell adhesion molecule 1 (GlyCAM-1). GlyCAM-1 functions as a ligand\ for L-selectin, a saccharide-binding protein on the surface of circulating leukocytes, and mediates\ the trafficking of blood-borne lymphocytes into secondary lymph nodes. In this context, sulphation\ of the carbohydrates of GlyCAM-1 has been shown to be a critical structural requirement for\ recognition by L-selectin. GlyCAM-1 is also expressed in pregnant and lactating mammary glands\ of mouse and in an unknown site in the lung, in the bovine uterus and rat\ cochlea PUBMED:12057858.

    \ ' '5071' 'IPR007908' '\

    This family consists of several outer membrane proteins (2a and 2b) from Brucella abortus.\ B. abortus is a Gram-negative, facultative intracellular bacterium that can infect many species of animals\ and Homo sapiens PUBMED:9884218.

    \ ' '5073' 'IPR007910' '\

    This family consists of several uncharacterised Borrelia burgdorferi proteins of unknown function.

    \ ' '5074' 'IPR007911' '\

    This family consists of several bacterial flagellar transcriptional activator (FlhD) proteins. FlhD combines with FlhC to form a regulatory complex in Escherichia coli. This complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator PUBMED:11287152.

    \ ' '5075' 'IPR007912' '\

    Adenoviruses have evolved multiple mechanisms to evade the host immune response. Several of the immunomodulatory adenoviral\ proteins are encoded in early transcription unit 3 (E3). The E3A/19K protein interferes with antigen presentation and T cell recognition PUBMED:9707602.

    \ ' '5077' 'IPR007914' '\

    This family of proteins is functionally uncharacterised.

    \ ' '5078' 'IPR007915' '\

    This family of proteins is functionally uncharacterised.

    \ ' '5080' 'IPR007917' '\

    This family of proteins is functionally uncharacterised.

    \ ' '5081' 'IPR007918' '\

    This is a family of small highly conserved proteins. In Saccharomyces cerevisiae (Baker\'s yeast) the gene YKL053C-A (MDM35) is one of the genes essential for maintenance of normal mitochondrial distribution and\ morphology (MDM) PUBMED:11907266, whereas in Homo sapiens (Human) p53CSV is a direct transcriptional target for p53 and appears to be a cell-survival mediator in response to genotoxic stress, including low levels of DNA damage. It is suggested that p53CSV modulates the apoptotic pathway through interaction with HSP70 and Apaf-1, thereby inhibiting activation of procaspase-3 and procaspase-9 PUBMED:15735003.

    \ \ ' '5082' 'IPR007919' '\

    This family of proteins is functionally uncharacterised.

    \ ' '5083' 'IPR007920' '\

    This family of proteins is functionally uncharacterised.

    \ ' '5084' 'IPR007921' '\

    The CHAP (cysteine, histidine-dependent amidohydrolases/peptidases) domain is\ a region of between 110 and 140 amino acids that is found in proteins from\ bacteria, bacteriophages, archaea and eukaryotes of the Trypanosomidae family.\ Many of these proteins are uncharacterised, but it has been proposed that they\ may function mainly in peptidoglycan hydrolysis. The CHAP domain is found in a\ wide range of protein architectures; it is commonly associated with bacterial\ type SH3 domains and with several families of amidase domains. It has been\ suggested that CHAP domain-containing proteins utilise a catalytic cysteine\ residue in a nucleophilic-attack mechanism PUBMED:12765833, PUBMED:12765834.

    \ \

    The CHAP domain contains two invariant residues, a cysteine and a histidine.\ These residues form part of the putative active site of CHAP domain containing\ proteins. Secondary structure predictions show that the CHAP domain belongs to\ the alpha + beta structural class, with the N-terminal half largely containing\ predicted alpha helices and the C-terminal half principally composed of\ predicted beta strands PUBMED:12765833, PUBMED:12765834.

    \ \

    Some proteins known to contain a CHAP domain are listed below:\

    \ ' '5085' 'IPR007922' '\

    This family contains several actinomycete proteins of unknown function, and related sequences from other species.

    \ ' '5086' 'IPR007923' '\

    This entry represents Herpesvirus glycoprotein L (gL), which is a virion associated envelope glycoprotein PUBMED:9526546. Heterodimer formation between gH and gL has been demonstrated in both virions and infected cells PUBMED:9267002. Heterodimer formation between gL and gH is important for the proper folding of gH and its insertion into the membrane because the anti-gH conformation-dependent monoclonal antibodies (mAbs) 53S and LP11 bind gH only when gL is present PUBMED:3016991, PUBMED:2552150.

    \

    Herpesviruses are enveloped by a lipid bilayer that contains at least a dozen glycoproteins. The virion surface glycoproteins mediate recognition of susceptible cells and promote fusion of the viral envelope with the cell membrane, leading to virus entry. No single glycoprotein associated with the virion membrane has been identified as the fusogen PUBMED:17299053.

    \ \

    Glycoprotein L (gL) forms a non-covalently linked heterodimer with glycoprotein H (gH). This heterodimer is essential for virus-cell and cell-cell fusion since the association of gH and gL is necessary for correct localisation of gH to the virion or cell surface. gH anchors the heterodimer to the plasma membrane through its transmembrane domain. gL lacks a transmembrane domain and is secreted from cells when expressed in the absence of gH PUBMED:7769724.

    \ ' '5088' 'IPR007925' '\

    The TraM protein is an essential part of the DNA transfer machinery of the conjugative\ resistance plasmid R1 (IncFII). On the basis of mutational analyses, it was shown that the essential\ transfer protein TraM has at least two functions. First, a functional TraM protein was found to be\ required for normal levels of transfer gene expression. Second, experimental evidence was obtained\ that TraM stimulates efficient site-specific single-stranded DNA cleavage at the oriT, in vivo.\ Furthermore, a specific interaction of the cytoplasmic TraM protein with the membrane protein TraD\ was demonstrated, suggesting that the TraM protein creates a physical link between the relaxosomal\ nucleoprotein complex and the membrane-bound DNA transfer apparatus PUBMED:11258958.

    \ ' '5089' 'IPR007926' '\

    This family consists of several Borrelia P83/P100 antigen proteins.

    \ ' '5090' 'IPR007927' '\

    This family contains several bacteriophage proteins of\ unknown function.

    \ ' '5091' 'IPR007928' '\

    Antifreeze proteins (AFPs) are a class of proteins that are able to bind to and inhibit the growth of macromolecular ice, thereby permitting an organism to survive subzero temperatures by decreasing the probability of ice nucleation in their bodies PUBMED:15291806. These proteins have been characterised from a variety of organisms, including fish, plants, bacteria, fungi and arthropods. This entry represents insect AFPs of the type found in spruce budworm, Choristoneura fumiferana.

    \

    The structure of these AFPs consists of a left-handed beta-helix with 15 residues per coil PUBMED:12015145. The beta-helices of insect AFPs present a highly rigid array of threonine residues and bound water molecules that can effectively mimic the ice lattice. As such, beta-helical AFPs provide a more effective coverage of the ice surface compared to the alpha-helical fish AFPs.

    \

    A second insect antifreeze from Tenebrio molitor also consists of beta-helices; however, in these proteins the helices form a right-handed twist. These proteins show no sequence homology to the current entry, but may act by a similar mechanism. The beta-helix motif may be used as an AFP structural motif in non-homologous proteins from other (non-fish) organisms as well.

    \ ' '5092' 'IPR007929' '\

    This family contains several uncharacterised proteins from Neisseria meningitidis. These proteins may have a role in DNA binding.

    \ ' '5093' 'IPR007930' '\

    This family contains several uncharacterised proteins found exclusively in Arabidopsis thaliana.

    \ ' '5094' 'IPR007931' '\

    This family contains several Drosophila proteins of unknown function.

    \ ' '5095' 'IPR007932' '\

    This family contains several Gp38 proteins from T-even-like phages. Gp38, together with a\ second phage protein, gp57, catalyses the organisation of gp37 but is absent from the phage\ particle. Gp37 is responsible for receptor recognition PUBMED:9680195.

    \ ' '5096' 'IPR007933' '\

    This family consists of several phage CII regulatory proteins. ATP-dependent proteases, like FtsH (HflB), recognise specific protein substrates. One of these is the lambda CII protein, which plays a key role in the phage lysis-lysogeny decision. The conserved C-terminal end of CII is necessary and sufficient for rapid proteolysis PUBMED:12397182. Deletion of the heat-shock protease gene ftsH can restore CII function following heat induction but not following SOS induction PUBMED:18298445. The CIII protein acts as an inhibitor of HflB (FtsH) PUBMED:17890311.

    \ ' '5097' 'IPR007934' '\

    This family consists of several fungal alpha-L-arabinofuranosidase B proteins. L-Arabinose is a\ constituent of plant cell wall polysaccharides. It is found in a polymeric form in L-arabinan, in which\ the backbone is formed by 1,5-alpha-linked L-arabinose residues that can be branched via 1,2-alpha- and\ 1,3-alpha-linked L-arabinofuranose side chains. AbfB hydrolyses 1,5-alpha, 1,3-alpha and 1,2-alpha linkages in both\ oligosaccharides and polysaccharides, which contain terminal non-reducing L-arabinofuranoses in\ side chains PUBMED:10217508.

    \ ' '5098' 'IPR007935' '\

    This family consists of several tobravirus 2B proteins. It is known that the 2B protein is required for transmission by both Paratrichodorus pachydermus and Paratrichodorus\ anemones nematodes PUBMED:11162804. Transmission of the tobravirus Tobacco rattle virus by trichodorid vector nematodes requires the viral coat protein (CP) and the 2B protein, a nonstructural protein encoded by RNA2, the smaller of the two viral genomic RNAs. It is hypothesized that the 2B protein functions by interacting with a small, flexible domain located at the C-terminus of the CP, forming a bridge between the virus particle and the internal surface of the vector nematode feeding apparatus PUBMED:12202212.

    \ ' '5099' 'IPR007936' '\

    This family contains several bacterial virulence-associated protein E like proteins.

    \ ' '5100' 'IPR007937' '\

    Vaccinia viral RNA synthesis is carried out by a virus-encoded, multi-subunit, eukaryotic-like RNA polymerase. RNA polymerase subunits are synthesized\ throughout infection and the assembled RNA polymerase is packaged into nascent virions late in infection. The RNA polymerase exists in two different forms, one\ specific for early genes and one specific for late genes. Both forms of the RNA polymerase have in common eight subunits, ranging in size from 147 to 7 kDa. This family consists of several poxvirus DNA-dependent RNA polymerase 22 kDa\ subunits.

    \ ' '5101' 'IPR007938' '\

    This family consists of several nucleopolyhedrovirus occlusion-derived virus envelope E25\ proteins. The N terminus of this protein is extremely hydrophobic; studies suggest that this defined hydrophobic domain is sufficient to direct the protein to\ induced membrane microvesicles within a baculovirus-infected cell nucleus and the viral envelope. In addition,\ movement of the protein into the nuclear envelope may initiate through cytoplasmic membranes, such as endoplasmic reticulum, and\ transport into the nucleus may be mediated through the outer and inner nuclear membrane PUBMED:9108103.

    \ ' '5102' 'IPR007939' '\

    This family consists of several bacterial copper resistance proteins. Copper is essential and\ serves as a cofactor for more than 30 enzymes, yet a surplus of copper is toxic and leads to free radical\ formation and oxidation of biomolecules. Therefore, copper homeostasis is a key requisite for every\ organism. CopB serves to extrude copper when it approaches toxic levels PUBMED:11696373 and has been\ shown to act as an ATPase.

    \ ' '5103' 'IPR007940' '\

    The SH3 domain-binding protein inhibits the auto- and transphosphorylation of BTK and acts as a negative regulator of BTK-related signalling in B cells.

    \ ' '5104' 'IPR007941' '\

    This family consists of several uncharacterised eukaryotic proteins.

    \ ' '5105' 'IPR007942' '\

    This family contains several phospholipase-like proteins from Arabidopsis thaliana which are homologous to PEARLI 4.

    \ ' '5106' 'IPR007943' '\

    This domain is found in members of the junctin, junctate and aspartyl beta-hydroxylase\ protein families. Junctate is an integral ER/SR membrane calcium binding protein, which comes from an\ alternatively spliced form of the same gene that generates aspartyl beta-hydroxylase and junctin\ PUBMED:11735129. Aspartyl beta-hydroxylase catalyses the post-translational hydroxylation of\ aspartic acid or asparagine residues contained within epidermal growth factor (EGF) domains of\ proteins PUBMED:11773073.

    \ ' '5107' 'IPR007944' '\

    This family consists of several bacterial flagellar transcriptional activator (FlhC) proteins. FlhC\ combines with FlhD to form a regulatory complex in Escherichia coli.\ This complex has been shown to be a global regulator involved in many cellular processes as well as\ a flagellar transcriptional activator PUBMED:11287152.

    \ ' '5108' 'IPR007945' '\

    Mature peptide hormones and neuropeptides are typically synthesised from much larger precursors and require several post-translational processing steps--including\ proteolytic cleavage--for the formation of the bioactive species. The subtilisin-related proteolytic enzymes that accomplish neuroendocrine-specific cleavages are\ known as prohormone convertases 1 and 2 (PC1 and PC2), which belong to MEROPS peptidase family S8B. The cell biology of these proteases within the regulated secretory pathway of neuroendocrine cells is\ complex, and they are themselves initially synthesised as inactive precursor molecules. ProPC1 propeptide cleavage occurs rapidly in the endoplasmic reticulum, yet its major site of action on prohormones takes place later in the secretory pathway. PC1 undergoes an interesting carboxyl terminal processing event whose function\ appears to be to activate the enzyme. ProPC2, on the other hand, exhibits comparatively long initial folding times and exits the endoplasmic reticulum without\ propeptide cleavage, in association with the neuroendocrine-specific protein 7B2. Once the proPC2/7B2 complex arrives at the trans-Golgi network, 7B2 is\ internally cleaved into two domains, the 21-kDa fragment and a carboxy-terminal 31 residue peptide. PC2 propeptide removal occurs in the maturing secretory granule, most likely through autocatalysis, and 7B2 association does not appear to be directly required for this cleavage event. However, if proPC2 has not encountered 7B2 intracellularly, it cannot generate a catalytically active mature species. The molecular mechanism behind the intriguing intracellular association of 7B2 and proPC2 is still unknown, but may involve conformational rearrangement or stabilisation of a proPC2 conformer mediated by a 36-residue internal segment of 21-kDa 7B2.

    \ \ \

    This family represents 7B2 (secretogranin V), which is the molecular escort protein for PC2. 7B2 is a bifunctional protein with an N-terminal activation domain and a C-terminal inhibitory domain (MEROPS inhibitor family I21, clan I-) separated by a furin cleavage site PUBMED:10506829. Although 7B2 represents a potent inhibitor of PC2, there is an absolute requirement of 7B2 for the activation of PC2, which is synthesised as a zymogen. Both the full-length 27 kDa protein and the C-terminal peptide (CT domain) derived from intramolecular cleavage of 7B2 are potent inhibitors of PC2. Studies have shown the active peptide in the CT domain to be LLRVHK, which is active in the nanomolar range not only against PC2 but also against PC1 PUBMED:9756897, PUBMED:10812060. Knockout studies have shown that the PC2 nulls are not phenotypically equivalent to the 7B2 nulls, which suggests that 7B2 may have other activities in addition to being the activator of PC2 PUBMED:12472887.

    \ \

    7B2 exhibits both structural and functional homology to proSAAS, which is the PC1 binding protein. The CT domain of proSAAS contains the same inhibitor hexapeptide as 7B2; consequently, 7B2 and proSAAS are both members of a homologous family of prohormone convertase inhibitor proteins.

    \ \ ' '5109' 'IPR007946' '\

    This family consists of several eukaryotic AAR2-like proteins. The Saccharomyces cerevisiae protein AAR2 is involved in splicing pre-mRNA\ of the a1 cistron and other genes that are important for cell growth PUBMED:1922071.

    \ ' '5110' 'IPR007947' '\

    CD164 is a mucin-like receptor, or sialomucin, with specificity in\ receptor/ligand\ interactions that depends on the structural characteristics of the\ mucin-like receptor. Its functions include mediating, or regulating,\ haematopoietic progenitor cell adhesion and the negative regulation of their\ growth and/or differentiation. It exists in the native state as a\ disulphide-linked\ homodimer of two 80-85 kDa subunits. It is usually expressed by CD34+\ and CD34lo/- haematopoietic stem cells and associated microenvironmental\ cells. It contains, in its extracellular region, two mucin domains (I and\ II)\ linked by a non-mucin domain, which has been predicted to contain intra-disulphide\ bridges. This receptor may play a key role in haematopoiesis\ by facilitating the adhesion of human CD34+ cells to bone marrow stroma and\ by negatively regulating CD34+ CD34lo/- haematopoietic progenitor cell\ proliferation. These effects involve the CD164 class I and/or II epitopes\ recognised by the monoclonal antibodies (mAbs) 105A5 and 103B2/9E10. These\ epitopes are carbohydrate-dependent and are located on the N-terminal\ mucin domain I PUBMED:10491205, PUBMED:11027692.

    \

    It has been found that murine MGC-24v and rat endolyn share significant\ sequence similarities with human CD164. However, CD164 lacks the consensus\ glycosaminoglycan (GAG)-attachment site found in MGC-24; it is possible\ that GAG-association is responsible for the high molecular weight of the\ epithelial-derived MGC-24 glycoprotein PUBMED:9763543.\

    \

    Genomic structure studies have placed CD164 within the mucin subgroup\ that\ comprises multiple exons, and demonstrate the diverse chromosomal\ distribution of this family of molecules. Molecules with such multiple\ exons may have sophisticated regulatory mechanisms that involve not only\ post-translational modifications of the oligosaccharide side chains, but\ also differential exon usage. Although differences in the intron and exon\ sizes are seen between the mouse and human genes, the predicted proteins\ are similar in size and structure, maintaining functionally important\ motifs that regulate cell proliferation or subcellular distribution\ PUBMED:11027692.\

    \

    CD164 is a gene whose expression depends on differential usage of polyadenylation\ sites within the 3\'-UTR. The conserved distribution of the\ 3.2- and 1.2-kb CD164 transcripts between mouse and human suggests that\ (i) a mechanism may exist to regulate tissue-specific polyadenylation, and\ (ii) differences in polyadenylation are important for the expression and\ function of CD164 in different tissues. Two other aspects of the structure\ of CD164 are of particular interest. First, it shares one of several\ conserved features of a cytokine-binding pocket - in this respect, it is\ notable that evidence exists for a class of cell-surface sialomucin\ modulators that directly interact with growth factor receptors to regulate\ their response to physiological ligands. Second, its cytoplasmic tail\ contains a C-terminal YHTL motif found in many endocytic membrane proteins\ or receptors. These Tyr-based motifs bind to adaptor proteins, which mediate\ the sorting of membrane proteins into transport vesicles from the plasma\ membrane to the endosomes, and between intracellular compartments.\

    \ \ ' '5111' 'IPR007948' '\

    This family consists of several uncharacterised bacterial proteins of unknown function.

    \ ' '5112' 'IPR007949' '\

    This domain consists of several SDA1 protein homologues. SDA1 is a Saccharomyces cerevisiae protein which is involved in the control of the\ actin cytoskeleton. The protein is essential for cell viability and is localised in the nucleus\ PUBMED:10704371.

    \ ' '5114' 'IPR007951' '\

    This family consists of several mouse anagen-specific\ proteins, mKAP13 (PMG1 and PMG2). PMG1 and PMG2 contain characteristic repeats reminiscent of\ the keratin-associated proteins (KAPs). Both genes are expressed in growing hair follicles in skin as\ well as in sebaceous and eccrine sweat glands. Interestingly, expression is also detected in the\ mammary epithelium where it is limited to the onset of the pubertal growth phase and is independent\ of ovarian hormones. Their broad, developmentally controlled expression pattern, together with their\ unique amino acid composition, demonstrates that pmg-1 and pmg-2 constitute a novel KAP gene\ family participating in the differentiation of all epithelial cells forming the epidermal appendages\ PUBMED:10446281.

    \ ' '5115' 'IPR007952' '\

    This family consists of several poxvirus A3L or A2_5L proteins. The entry of vaccinia virus (VV) into the host cell results in the delivery of the double-stranded DNA genome-containing core into the\ cytoplasm. The core is disassembled, releasing the viral DNA in order to initiate VV cytoplasmic transcription and DNA replication.\ A3L protein is a part of that core PUBMED:10729126. The A2.5L gene product is an\ all-alpha-helical protein with a conserved Cxx(x)C motif in the N-terminal alpha-helix. It appears to be an integral component of intracellular virions PUBMED:12350360.

    \ ' '5116' 'IPR007953' '\

    This entry represents the borrelial prophage-encoded protein BlyB. Originally BlyB and its partner, the membrane-bound protein BlyA, were thought to comprise a haemolysis system. It is now thought, however, that BlyA and BlyB function instead as a holin or holin-like system PUBMED:11073925.

    \ ' '5117' 'IPR007954' '\

    This entry contains the Baculovirus immediate-early protein IE-0.

    \ ' '5118' 'IPR007955' '\

    Trophinin and tastin form a cell adhesion molecule complex that potentially mediates an initial\ attachment of the blastocyst to uterine epithelial cells at the time of implantation. Trophinin and tastin\ bind to an intermediary cytoplasmic protein called bystin. Bystin may be involved in implantation and\ trophoblast invasion because bystin is found with trophinin and tastin in the cells at human implantation sites and also in the intermediate trophoblasts at the\ invasion front in the placenta from early pregnancy PUBMED:9560222. This family also includes the\ Saccharomyces cerevisiae protein ENP1. ENP1 is an essential\ protein in S. cerevisiae and is localised in the nucleus\ PUBMED:9034325. It is thought that ENP1 plays a direct role in the early steps of rRNA processing\ as enp1-defective S. cerevisiae cannot synthesise 20S\ pre-rRNA and hence 18S rRNA, which leads to reduced formation of 40S ribosomal subunits\ PUBMED:12527778.

    \ ' '5119' 'IPR007956' '\

    This family consists of several eukaryotic malonyl-CoA decarboxylase (MLYCD) proteins.\ Malonyl-CoA, in addition to being an intermediate in the de novo synthesis of fatty acids, is\ an inhibitor of carnitine palmitoyltransferase I, the enzyme that regulates the transfer of long-chain\ fatty acyl-CoA into mitochondria, where they are oxidised. After exercise, malonyl-CoA\ decarboxylase participates with acetyl-CoA carboxylase in regulating the concentration of\ malonyl-CoA in liver and adipose tissue, as well as in muscle. Malonyl-CoA decarboxylase is\ regulated by AMP-activated protein kinase (AMPK) PUBMED:12065578.

    \ ' '5120' 'IPR007957' '\

    L11L is an integral membrane protein of the African swine fever\ virus, which is expressed late in the virus replication cycle. The protein is thought to be\ non-essential for growth in vitro and for virus virulence in domestic pigs PUBMED:9603334.

    \ ' '5121' 'IPR007958' '\

    This family contains various secreted scorpion short toxins which seem to be unrelated to those described in\ .

    \ ' '5122' 'IPR007959' '\

    Proteins in this entry belong to a family of dinoflagellate luciferase and luciferin binding proteins. Luciferase catalyses the light-emitting reaction in bioluminescence, and luciferin binding protein (LBP) is known to bind to luciferin (the substrate for luciferase) to stop it reacting with the enzyme, thereby switching off the bioluminescence function. The expression of these two proteins is controlled by a circadian clock at the translational level, with synthesis and degradation occurring on a daily basis PUBMED:11747464.

    \ \

    This entry consists of a presumed N-terminal domain that is conserved between dinoflagellate luciferase and luciferin binding proteins. This domain is not, however, the catalytic part of the protein. It has been suggested that this region may mediate an interaction between LBP and Luciferase or their association with the vacuolar membrane PUBMED:11747464.

    \ \

    More information about these proteins can be found at Protein of the Month: Luciferase.

    \ ' '5123' 'IPR007960' '\

    This family consists of several forms of mammalian taste receptor proteins (TAS2Rs). TAS2Rs\ are G protein-coupled receptors expressed in subsets of taste receptor cells of the tongue and palate\ epithelia and are organised in the genome in clusters. The proteins are genetically linked to loci that\ influence bitter perception in mice and humans\ PUBMED:10761934.

    \ ' '5124' 'IPR007961' '\

    This family consists of several latent membrane protein 1 or LMP1s mostly from Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4). LMP1 of HHV-4 is a 62-65 kDa plasma membrane protein possessing six membrane spanning regions, a short cytoplasmic N terminus and a long cytoplasmic carboxy tail of 200 amino acids. HHV-4 virus latent membrane protein 1 (LMP1) is essential for HHV-4 mediated transformation and has been associated with several cases of malignancies. HHV-4-like viruses in Macaca fascicularis (Cynomolgus monkeys) have been associated with high lymphoma rates in immunosuppressed monkeys PUBMED:12457963.

    \ ' '5125' 'IPR007962' '\

    This family consists of Bombinin and Maximin proteins from Bombina maxima (Giant fire-bellied toad). Two groups of antimicrobial peptides have been isolated from skin secretions of B. maxima. Peptides in the first group, named maximins 1,\ 2, 3, 4 and 5, are structurally related to bombinin-like peptides (BLPs). Unlike BLPs, sequence\ variations in maximins occur throughout the molecules. In addition to the potent antimicrobial\ activity, cytotoxicity against tumour cells and spermicidal action of maximins, maximin 3 possesses a\ significant anti-human immunodeficiency virus (HIV) activity.\ Maximins 1 and 3 have been found to be toxic to mice.\ Peptides in the second group, termed maximins H1, H2, H3 and H4, are homologous with bombinin\ H peptides PUBMED:11835991.

    \ ' '5126' 'IPR007963' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M61 (glycyl aminopeptidase family, clan MA(E)). The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH. The type example is glycyl aminopeptidase from Sphingomonas capsulata.

    \ ' '5127' 'IPR007964' '\

    This family consists of several uncharacterised mammalian proteins of unknown function.

    \ ' '5128' 'IPR007965' '\

    This family consists of several uncharacterised eukaryotic proteins of unknown function.

    \ ' '5129' 'IPR007966' '\

    This family consists of several uncharacterised Chlamydia proteins of unknown function.

    \ ' '5130' 'IPR007967' '\

    This family consists of several uncharacterised eukaryotic proteins of unknown function.

    \ ' '5131' 'IPR007968' '\

    This family consists of several uncharacterised tobravirus proteins of unknown function.

    \ ' '5132' 'IPR007969' '\

    This family consists of several uncharacterised Mycobacterium tuberculosis proteins of unknown function.

    \ ' '5133' 'IPR007970' '\

    This family consists of several uncharacterised Drosophila melanogaster proteins of unknown function.

    \ ' '5134' 'IPR007971' '\

    This family consists of several bundlin proteins from Escherichia coli. Bundlin is a type IV pilin protein that is the only known structural component of\ enteropathogenic E. coli bundle-forming pili (BFP). BFP\ play a role in virulence, antigenicity, autoaggregation, and localised adherence to epithelial cells\ PUBMED:11083828.

    \ ' '5135' 'IPR007972' '\

    This family consists of several uncharacterised eukaryotic proteins of unknown function.

    \ ' '5136' 'IPR007973' '\

    This family consists of several bacterial sex pilus assembly and synthesis proteins (TraE).\ Conjugal transfer of plasmids from donor to recipient cells is a complex process in which\ cell-to-cell contact plays a key role. Many genes encoded by self-transmissible plasmids are\ required for various processes of conjugation, including pilus formation, stabilisation of mating pairs,\ conjugative DNA metabolism, surface exclusion and regulation of transfer gene expression\ PUBMED:10760136. The exact function of the TraE protein is unknown.

    \ ' '5137' 'IPR007974' '\

    This family consists of tenuivirus NS-3 (PV3 or GV3) proteins. The function of this protein is\ unknown although it is thought to be a replication protein.

    \ ' '5138' 'IPR007975' '\

    Autographa californica nuclear polyhedrosis virus (AcMNPV) p31 is a\ nuclear phosphoprotein that accumulates in the virogenic stroma, which is the viral replication centre\ in the infected-cell nucleus. The protein binds to DNA, and serves as a late expression factor\ PUBMED:8794314.

    \ ' '5140' 'IPR007977' '\

    The p21 membrane protein of vaccinia virus, encoded by the A17L (or A18L) gene, has been\ reported to localise on the inner of the two membranes of the intracellular mature virus (IMV). It has\ also been shown that p21 acts as a membrane anchor for the externally located fusion protein P14\ (A27L gene) PUBMED:11882999.

    \ ' '5141' 'IPR007978' '\

    This family consists of several baculovirus occlusion-derived virus envelope proteins (EC27 or\ E27) which appear to act as multifunctional cyclins during the host cell cycle. The ODV-E27 protein has distinct functional characteristics compared to cellular and viral\ cyclins. When associated with cdc2, it\ exhibits cyclin B-like activity; when associated with cdk6, the complex possesses cyclin D-like activity and binds PCNA (proliferating cell nuclear antigen) PUBMED:9736714.

    \ ' '5142' 'IPR007979' '\

    This family consists of several ICEA proteins from Helicobacter pylori. Infection with H. pylori causes gastritis and\ peptic ulcer disease, and the bacterium is classified as a definite carcinogen for gastric cancer. ICEA1 is speculated\ to be associated with peptic ulcer disease and may have endonuclease activity PUBMED:11843964.

    \ ' '5143' 'IPR007980' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family consists of the VAR1 mitochondrial ribosomal proteins found in yeast. Mitochondria possess their own ribosomes responsible for\ the synthesis of a small number of proteins encoded by the mitochondrial genome. VAR1 is the only protein in the yeast mitochondrial ribosome to be encoded in the mitochondria - the remaining approximately 80 ribosomal\ proteins are encoded in the nucleus PUBMED:8988258. VAR1 along with 15S rRNA are necessary\ for the formation of mature 37S subunits PUBMED:7770043.

    \ ' '5144' 'IPR007981' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This group of aspartic peptidases belongs to the MEROPS peptidase family A5 (thermopsin family, clan A-). Currently the protein fold and active site residues are not known for any members of this family. The type example is thermopsin from Sulfolobus acidocaldarius.\ Thermopsin is a thermostable acid protease which is capable of hydrolysing the following bonds: Leu-Val, Leu-Tyr, Phe-Phe, Phe-Tyr, and Tyr-Thr. The specificity of thermopsin is therefore similar to that of pepsin, that is, it prefers large hydrophobic residues at both sides of the scissile bond PUBMED:2104844.

    \ ' '5145' 'IPR007982' '\

    This family consists of several Tombusvirus movement\ proteins. These proteins allow the virus to move from cell-to-cell and allow host-specific systemic\ spread PUBMED:11483749.

    \ ' '5147' 'IPR007984' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \

    The poxvirus DNA-directed RNA polymerase catalyses the transcription of DNA into RNA. It consists of at least eight subunits; this entry represents the 19 kDa subunit.

    \ ' '5148' 'IPR007985' '\

    This family consists of haemolysin expression modulating protein (Hha) from Escherichia coli and its enterobacterial homologues, such as YmoA from Yersinia enterocolitica, and RmoA encoded on the R100 plasmid. These proteins act as modulators of bacterial gene expression. Members of the Hha/YmoA/RmoA family act in conjunction with members of the H-NS family, participating in the thermoregulation of different virulence factors and in plasmid transfer PUBMED:11890540. Hha, along with the chromatin-associated protein H-NS, is involved in the regulation of expression of the toxin alpha-haemolysin in response to osmolarity and temperature PUBMED:11790731. YmoA modulates the expression of various virulence factors, such as Yop proteins and YadA adhesin, in response to temperature. RmoA is a plasmid R100 modulator involved in plasmid transfer PUBMED:9851035. The Hha family of proteins displays striking similarity to the oligomerisation domain of the H-NS proteins.

    \ ' '5149' 'IPR007986' '\

    This family consists of NINE proteins from several bacteriophage and from Escherichia coli.

    \ ' '5150' 'IPR007987' '\

    This family consists of several poxvirus A21 proteins.

    \ ' '5151' 'IPR007988' '\

    This family consists of several variants of the human and chimpanzee\ (Pan troglodytes) sperm antigen proteins (HE2 and EP2\ respectively). The EP2 gene codes for a family of androgen-dependent, epididymis-specific\ secretory proteins. The EP2 gene uses alternative promoters and differential splicing to produce a\ family of variant messages. The translated putative protein variants differ significantly from each\ other. Some of these putative proteins have similarity to beta-defensins, a family of antimicrobial\ peptides PUBMED:10819450.

    \ ' '5152' 'IPR007989' '\

    This family consists of several uncharacterised Arabidopsis thaliana proteins of unknown function.

    \ ' '5153' 'IPR007990' '\

    This family consists of seminal vesicle autoantigen and prolactin-inducible (PIP) proteins.\ Seminal vesicle autoantigen (SVA) is specifically present in the seminal plasma of mice. This 19 kDa secretory glycoprotein suppresses the motility of\ spermatozoa by interacting with phospholipid. PIP has several known functions. In saliva, this\ protein plays a role in host defence by binding to microorganisms such as Streptococcus. PIP is an\ aspartyl proteinase and it acts as a factor capable of suppressing T-cell apoptosis through its\ interaction with CD4 PUBMED:11178965.

    \ ' '5154' 'IPR007991' '\

    This family consists of several eukaryotic proteins which are homologous to the Saccharomyces cerevisiae RRN3 protein. RRN3 is one of the RRN genes specifically required for the transcription of rDNA by RNA polymerase I (Pol I) in S. cerevisiae PUBMED:8670901.\ In mammalian cells, the phosphorylation state of Rrn3 regulates rDNA transcription by determining the steady-state\ concentration of the Rrn3-RNA polymerase I complex within the nucleolus PUBMED:12015311.

    \ ' '5155' 'IPR007992' '\

    This family consists of several eukaryotic succinate dehydrogenase [ubiquinone] cytochrome B small subunit, mitochondrial precursor (CybS) proteins. SDHD encodes the small subunit (cybS) of cytochrome b in succinate-ubiquinone oxidoreductase (mitochondrial complex II). Mitochondrial complex II is involved in the Krebs cycle and in the aerobic electron transport chain. It contains four\ proteins. The catalytic core consists of a flavoprotein and an iron-sulphur protein; these proteins are anchored to the mitochondrial inner membrane by the large subunit of cytochrome b (cybL) and cybS, which together comprise the haem-protein cytochrome b. Mutations in the SDHD gene can lead to hereditary paraganglioma, characterised by the development of benign, vascularised tumours\ in the head and neck PUBMED:10657297.

    \ ' '5158' 'IPR007995' '\

    This family consists of several uncharacterised Streptomyces proteins as well as one from\ Mycobacterium tuberculosis. The function of these proteins is\ unknown.

    \ ' '5159' 'IPR007996' '\

    This family consists of several uncharacterised Calicivirus proteins of unknown function.

    \ ' '5161' 'IPR007998' '\

    This family consists of several eukaryotic proteins of unknown function.

    \ ' '5162' 'IPR007999' '\

    This family consists of several uncharacterised Drosophila melanogaster proteins of unknown function.

    \ ' '5163' 'IPR008000' '\

    Mutarotases are enzymes which interconvert the alpha and beta stereoisomers of monosaccharides, enhancing the rate of their metabolism. Proteins in this entry are homologues of the rhamnose mutarotase YiiL from Escherichia coli, and are often encoded in rhamnose utilisation operons. YiiL is an enzyme which interconverts the alpha and beta stereoisomers of the pyranose form of L-rhamnose PUBMED:15060078. It is not required for growth on rhamnose, but allows cells to utilise this carbon source more efficiently PUBMED:15876375. The structure of YiiL is distinct from other mutarotases, forming an asymmetric dimer stabilised by an intermolecular beta-sheet, hydrophobic interactions and a salt bridge PUBMED:15876375.

    \ ' '5164' 'IPR008001' '\

    Colony stimulating factor 1 (CSF-1) is a homodimeric polypeptide growth factor whose\ primary function is to regulate the survival, proliferation, differentiation, and function of cells of the\ mononuclear phagocytic lineage. This lineage includes mononuclear phagocytic precursors, blood\ monocytes, tissue macrophages, osteoclasts, and microglia of the brain, all of which possess cell\ surface receptors for CSF-1. The protein has also been linked with male fertility PUBMED:11897698\ and mutations in the Csf-1 gene have been found to cause osteopetrosis and failure of tooth eruption\ PUBMED:12379742.

    \ ' '5165' 'IPR008002' '\

    This family consists of several herpesvirus proteins of unknown function.

    \ ' '5166' 'IPR008003' '\

    This family contains several bacteriophage proteins. Three of the proteins in this\ family have been labelled putative cro repressor proteins.

    \ ' '5167' 'IPR008004' '\

    This family consists of several uncharacterised plant proteins of unknown function.

    \ ' '5168' 'IPR008005' '\

    This family consists of several uncharacterised nucleopolyhedrovirus proteins of unknown\ function.

    \ ' '5169' 'IPR008006' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases corresponds to MEROPS peptidase family M26 (clan MA(E)). The active site residues for members of this family and family M4 occur in the motif HEXXH. The type example is IgA1-specific metalloendopeptidase from Streptococcus sanguis.

    \ ' '5170' 'IPR008007' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M42 (glutamyl aminopeptidase family, clan MH). For members of this family and family M28 the predicted metal ligands occur in the same order in the sequence: H, D, E, D/E, H; and the active site residues occur in the motifs HXD and EE.

    \ ' '5171' 'IPR008008' '\

    This is a short conserved region found in some transposons.

    \ ' '5172' 'IPR008009' '\

    This alignment represents the conserved core region of a ~90 residue repeat found in several\ haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin and the PKD domain suggest an Ig-like fold, so this family may be similar in function to the () and () protein families.

    \ ' '5173' 'IPR008010' '\

    This is a family of eukaryotic membrane proteins. It was previously annotated as including a putative receptor for human cytomegalovirus gH PUBMED:17151244, but this has since been disputed PUBMED:10640539. Analysis of the mouse Tapt1 protein (transmembrane anterior posterior transformation 1) has shown it to be involved in patterning of the vertebrate axial skeleton.

    \ ' '5174' 'IPR008011' '\

    This family of short proteins includes proteins from the NADH-ubiquinone oxidoreductase\ complex I. The family includes the B14 subunit of bovine NADH-ubiquinone oxidoreductase, and the B22 subunit from the human enzyme. The family has been named LYR after a highly conserved tripeptide motif\ close to the N terminus of these proteins.

    \ \

    Members of this family are also found in yeast that do not contain this complex. In these organisms they are believed to be required for iron-sulphur cluster biogenesis.

    \ ' '5175' 'IPR008012' '\

    UMP1 is a short-lived chaperone present in the precursor form of the 20S proteasome and\ absent in the mature complex. UMP1 is required for the correct assembly and enzymatic activation\ of the proteasome. UMP1 seems to be degraded by the proteasome upon its formation.

    \ ' '5176' 'IPR008013' '\

    GATA transcription factors mediate cell differentiation in a diverse range of tissues. Mutations\ are often associated with certain congenital human\ disorders. The six classical vertebrate GATA proteins, GATA-1 to GATA-6, are highly\ homologous and have two tandem zinc fingers. The classical GATA transcription factors function as\ transcription activators. In lower metazoans GATA proteins carry a single canonical zinc finger. This\ entry represents the N-terminal domain of the family of GATA transcription activators.

    \ ' '5177' 'IPR008014' '\

    Glycogen synthase kinase-3 (GSK-3) sequentially phosphorylates four serine residues on\ glycogen synthase (GS), in the sequence SxxxSxxxSxxx-SxxxS(p), by recognising and\ phosphorylating the first serine in the sequence motif SxxxS(p) (where S(p) represents a\ phosphoserine). Interaction of GSK-3 with a peptide derived from GSK-3 binding protein\ prevents GSK-3 interaction with Axin. This interaction thereby inhibits the Axin-dependent\ phosphorylation of beta-catenin by GSK-3 PUBMED:11738041.

    \ ' '5178' 'IPR008015' '\

    GMP-PDE delta subunit was originally identified as a fourth subunit of rod-specific cGMP phosphodiesterase (PDE). The precise function of PDE delta subunit in the rod-specific GMP-PDE complex is unclear. In addition, PDE delta subunit is not confined to photoreceptor cells but is widely distributed in different tissues. PDE delta subunit is\ thought to be a specific soluble transport factor for certain prenylated proteins, with Arl2-GTP acting as a regulator of PDE-mediated transport PUBMED:11980706.

    \ ' '5179' 'IPR008016' '\

    This entry represents the upper collar protein (also known as head-tail connector protein or late protein gp10) from various bacteriophage. The upper collar protein of Bacteriophage phi-29 is composed of twelve 36 kDa subunits with 12-fold symmetry. It consists of two domains: an alpha-helical bundle domain and a beta-barrel domain. This protein is located between the head and the tail of the bacteriophage and acts as the central component of a rotary motor that packages the genomic dsDNA into pre-formed proheads. This motor consists of the upper collar protein, surrounded by a phi-29-encoded, 174-base RNA and a viral ATPase protein PUBMED:9891587.

    \ ' '5180' 'IPR008017' '\

    Delta atracotoxin produces potentially fatal neurotoxic symptoms in primates by slowing the\ inactivation of voltage-gated sodium channels PUBMED:9384567. The structure of atracotoxin\ comprises a core beta region containing a triple-stranded beta-sheet, a thumb-like extension protruding from\ the beta region and a C-terminal helix. The beta region contains a cystine knot motif, a feature seen in other neurotoxic\ polypeptides PUBMED:9384567.

    \ ' '5181' 'IPR008018' '\

    The phage head-tail attachment protein is required for the joining of phage heads and tails at the\ last step of morphogenesis PUBMED:12083526.

    \ ' '5182' 'IPR008019' '\

    Apolipoprotein CII (apoC-II) is a surface constituent of plasma lipoproteins and the activator for lipoprotein lipase (LPL). It is therefore central for lipid transport in blood. Lipoprotein lipase is a key enzyme in the regulation of triglyceride levels in human serum PUBMED:10903476. It is the C-terminal helix of apoC-II that is responsible for the activation of LPL PUBMED:12590574. The active peptide of apoC-II occurs at residues 44-79 and has been shown to reverse the symptoms of genetic apoC-II deficiency in a human subject PUBMED:10903476.

    \ \

    Micellar SDS, a commonly used mimetic of the lipoprotein surface, inhibits the aggregation of apoC-II and induces a stable structure containing approximately 60% alpha-helix. The first 12 residues of apoC-II are structurally heterogeneous but the rest of the protein forms a predominantly helical structure PUBMED:11331005.

    \ ' '5183' 'IPR008020' '\

    The major coat protein in the capsid of filamentous bacteriophage forms a helical assembly of about\ 7000 identical protomers, with each protomer comprised of 46 amino acids, after the cleavage of the\ signal peptide. Each protomer forms a slightly curved helix that combines to form a tubular structure\ that encapsulates the viral DNA PUBMED:10666593.

    \ ' '5184' 'IPR008021' '\

    The g3p protein (also known as attachment protein or coat protein A) of filamentous phage such as M13, phage fd and phage f1, is an essential coat protein for the infection of Escherichia coli. The g3p protein consists of three domains: two N-terminal domains (N1 and N2) with a similar beta-barrel fold, and a C-terminal domain PUBMED:9461080. The N-terminal domains protrude from the phage surface, while the C-terminal domain acts as an anchor embedded in the phage coat, together forming a horseshoe-like structure PUBMED:12767837. The g3p protein exists as 3-5 copies at the tip of the phage particle.

    \

    Infection by filamentous phage occurs in two steps, both of which are mediated by the g3p protein: phage attachment to the F-pilus of the host cell as the primary receptor, followed by attachment to the C-terminal domain of the periplasmic protein TolA as a co-receptor.

    \

    This entry represents the two N-terminal domains, N1 and N2, of the filamentous phage coat protein g3p.

    \ ' '5185' 'IPR008022' '\

    DicB is part of the dic operon, which resides on cryptic prophage Kim. Under normal\ conditions, expression of dicB is actively repressed. When expression is induced, however, cell\ division rapidly ceases, and this division block is dependent on MinC with which it interacts\ PUBMED:12003935.

    \ ' '5186' 'IPR008023' '\

    This is a family of proteins of unknown function.

    \ ' '5187' 'IPR008024' '\

    This domain consists of two transmembrane helices and a conserved linking section.

    \ ' '5188' 'IPR008025' '\

    Contractility of vascular smooth muscle depends on phosphorylation of myosin light chains, and\ is modulated by hormonal control of myosin phosphatase activity. Signaling pathways activate\ kinases such as PKC or Rho-dependent kinases that phosphorylate the myosin phosphatase inhibitor\ protein called CPI-17. Phosphorylation of CPI-17 at Thr-38 enhances its inhibitory potency\ 1000-fold, creating a molecular switch for regulating contraction PUBMED:11734001.

    \ ' '5189' 'IPR008269' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This signature defines the C-terminal proteolytic domain of the archaeal, bacterial and eukaryotic lon proteases, which are ATP-dependent serine peptidases belonging to the MEROPS peptidase family S16 (lon protease family, clan SF). In the eukaryotes the majority of the proteins are located in the mitochondrial matrix PUBMED:8248235, PUBMED:9620272. In yeast, Pim1 is located in the mitochondrial matrix and is required for mitochondrial function. It is constitutively expressed, but its expression increases after thermal stress, suggesting that Pim1 may play a role in the heat shock response PUBMED:8276800.

    \ ' '5190' 'IPR008026' '\

    ICP47 (US12) is a key factor in the evasion of cellular immune response against Human herpesvirus 1 (HHV-1) (Human herpes simplex virus 1)-infected cells. Specific inhibition of the transporter associated with antigen processing (TAP) by ICP47 prevents peptide transport into the endoplasmic reticulum and subsequent loading of major histocompatibility complex (MHC) class I molecules PUBMED:10521276. ICP47 is comprised of three helices and is associated with cellular membranes PUBMED:10521276.

    \ ' '5191' 'IPR016018' '\

    The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell PUBMED:9618447 and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. secrete an effector protein called SopE that is responsible for stimulating the reorganisation of the host cell actin cytoskeleton, and ruffling of the cellular membrane PUBMED:9482928. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium to revert the cell back to its "normal" state as quickly as possible, another effector, the tyrosine phosphatase SptP, reverses the actions brought about by SopE PUBMED:11316807.

    \ \

    Recently, it has been found that SopE and its protein homologue SopE2 can activate different sets of Rho-GTPases in the host cell PUBMED:11316807. Far from being a redundant set of two similar type III effectors, they both act in unison to specifically activate different Rho-GTPase signalling cascades in the host cell during infection.\

    \ \

    This entry represents the N-terminal domain of SopE. The function of this domain is unknown.

    \ ' '5192' 'IPR008027' '\

    The UQCRX/QCR9 protein is the 9/10 subunit of complex III, and is a protein of about 7 kDa. Deletion of QCR9 results in the inability of Saccharomyces cerevisiae to grow on a non-fermentable carbon source PUBMED:8382892. The protein is part of the mitochondrial respiratory chain.

    \ ' '5193' 'IPR008028' '\

    Sarcolipin is a 31 amino acid integral membrane protein that regulates Ca-ATPase activity in skeletal muscle PUBMED:11781085.

    \ ' '5194' 'IPR008029' '\

    Endonuclease I is a junction-resolving enzyme encoded by bacteriophage T7 that selectively binds and cleaves four-way Holliday DNA junctions PUBMED:12093751. The structure of the enzyme shows that it forms a symmetric homodimer arranged in two well-separated domains. Each domain, however, is composed of elements from both subunits, and amino acid side chains from both protomers contribute to the active site PUBMED:11135673.

    \ ' '5195' 'IPR008030' '\

    NmrA is a negative transcriptional regulator involved in the post-translational modification of the transcription factor AreA. NmrA is part of a system controlling nitrogen metabolite repression in fungi PUBMED:11726498. This family only contains a few sequences as iteration results in significant matches to other Rossmann fold families.

    \ ' '5196' 'IPR008031' '\

    Monomethylamine methyltransferase of the archaebacterium Methanosarcina barkeri contains a novel amino acid, pyrrolysine, encoded by the termination codon UAG PUBMED:12121639. The structure of the enzyme reveals a homohexamer comprised of individual subunits with a TIM barrel fold. MtmB initiates the metabolism of monomethylamine by catalysing the transfer of the methyl group from monomethylamine to the corrinoid cofactor of MtmC.

    \ ' '5197' 'IPR008032' '\

    This is a family of unknown function found in archaebacterial proteins. The structure has been solved via structural genomics techniques and comprises segregated helical and anti-parallel beta sheet regions. It is a putative metal-binding protein.

    \ ' '5198' 'IPR008033' '\

    The filamentous bacteriophages are flexible rods about 1 to 2 microns long and 6 nm in diameter, with a helical shell of protein subunits surrounding a DNA core. The approximately 50-residue coat protein subunit is largely alpha-helix and the axis of the alpha-helix makes a small angle with the axis of the virion. The protein shell can be considered in three sections: the outer surface, occupied by the N-terminal region of the subunit, rich in acidic residues that interact with the surrounding solvent and give the virion a low isoelectric point; the interior of the shell, including a 19-residue stretch of apolar side-chains, where protein subunits interact mainly with each other; and the inner surface, occupied by the C-terminal region of the subunit, rich in basic residues that interact with the DNA core.

    \

    This is a family of class I phage major coat protein Gp8 or B which is a baseplate structural protein. The coat protein is largely alpha-helix with a slight curve PUBMED:8289247.

    \ ' '5199' 'IPR008034' '\

    Delta-lysin is a 26 amino acid, hemolytic peptide toxin secreted by Staphylococcus aureus. It is thought that delta-toxin forms an amphipathic helix upon binding to lipid bilayers PUBMED:12206677. The precise mode of action of delta-lysin is unclear.

    \ ' '5200' 'IPR008035' '\

    Iron (II)/2-oxoglutarate (2-OG)-dependent oxygenases catalyse oxidative reactions in a range of metabolic processes. Proline 3-hydroxylase hydroxylates proline at position 3, and is the first example of a 2-OG oxygenase catalysing oxidation of a free alpha-amino acid. The structure contains conserved motifs present in other 2-OG oxygenases including a jelly roll strand core and residues binding iron and 2-oxoglutarate, consistent with divergent evolution within the extended family. The structure differs significantly from many other 2-OG oxygenases in possessing a discrete C-terminal helical domain.

    \ ' '5201' 'IPR008036' '\

    This entry represents Mu-type conotoxins. Cone snail toxins, conotoxins, are small peptides with disulphide connectivity, that target ion-channels or G-protein coupled receptors. Based on the number and pattern of disulphide bonds and biological activities, conotoxins can be classified into several families PUBMED:11478951. Omega, delta and kappa families of conotoxins have a knottin or inhibitor cystine knot scaffold. The knottin scaffold is a very special disulphide through disulphide knot, in which the III-VI disulphide bond crosses the macrocycle formed by two other disulphide bonds (I-IV and II-V) and the interconnecting backbone segments, where I-VI indicates the six cysteine residues starting from the N-terminus.

    \

    The disulphide bonding network as well as specific amino acids in inter-cysteine loops provide specificity of conotoxin PUBMED:10988292. The cysteine arrangement is the same for omega, delta and kappa families, but omega conotoxins are calcium channel blockers, whereas delta conotoxins delay the inactivation of sodium channels and kappa conotoxins are potassium channel blockers PUBMED:11478951. Mu conotoxins have two types of cysteine arrangement, but the knottin scaffold is not observed. Conotoxin gm9a, a putative 27-residue polypeptide encoded by Conus gloriamaris, has been shown to adopt an inhibitory cystine knot motif constrained by three disulphide bonds PUBMED:12006587, PUBMED:12193600. Mu conotoxins target the voltage-gated sodium channels, preferentially those of skeletal muscle PUBMED:11478951, and are useful probes for investigating voltage-dependent sodium channels of excitable tissues PUBMED:2410412. Alpha conotoxins have two types of cysteine arrangement PUBMED:1390774 and are competitive nicotinic acetylcholine receptor antagonists.

    \ ' '5202' 'IPR008037' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    This family of serine protease inhibitors belongs to MEROPS inhibitor family I19, clan IW. They inhibit chymotrypsin, a peptidase belonging to the S1 family PUBMED:14705960.

    \ \

    They were first isolated from Locusta migratoria migratoria (migratory locust). These were HI, LMCI-1 (PMP-D2) and LMCI-2 (PMP-C) PUBMED:1472051, PUBMED:1740125, PUBMED:10696590; five additional members SGPI-1 to 5 were identified in Schistocerca gregaria (desert locust) PUBMED:9475173, PUBMED:11856311, and a heterodimeric serine protease inhibitor (pacifastin) was isolated from the hemolymph of Pacifastacus leniusculus (Signal crayfish) PUBMED:9192625.

    \ \

    Pacifastin is a 155-kDa protein composed of two covalently linked subunits, which are separately encoded. The heavy chain of pacifastin (105 kDa) is related to transferrins, containing three transferrin lobes, two of which seem to be active for iron binding PUBMED:9192625. A number of the members of the transferrin family are also serine peptidases belonging to MEROPS peptidase family S60. The light chain of pacifastin (44 kDa) is the proteinase inhibitory subunit, and has nine cysteine-rich inhibitory domains that are homologous to each other. The locust inhibitors share a conserved array of six cysteine residues with the pacifastin light chain. The structure of members of this family reveals that they are comprised of a triple-stranded antiparallel beta-sheet connected by three disulphide bridges PUBMED:9192625.

    \ \

    The biological functions of the locust inhibitors are not fully understood. LMCI-1 and LMCI-2 were shown to inhibit the endogenous proteolytic activating cascade of prophenoloxidase PUBMED:11997226. Expression analysis shows that the genes encoding the SGPI precursors are differentially expressed in a time-, stage- and hormone-dependent manner.

    \ ' '5204' 'IPR008039' '\

    Although archaeal flagella appear superficially similar to those of bacteria, they are quite distinct PUBMED:11250034. In several archaea, the flagellin genes are followed immediately by the flagellar accessory genes flaCDEFGHIJ. The gene products may have a role in translocation, secretion, or assembly of the flagellum. FlaC is a protein whose exact role is unknown but it has been shown to be membrane-associated (by immuno-blotting fractionated cells) PUBMED:11717274.

    \ ' '5206' 'IPR008040' '\

    This domain is found at the N terminus of the hydantoinase/oxoprolinase family.

    \ ' '5207' 'IPR008041' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to the MEROPS peptidase family C23 (clan CA). The type example is Carlavirus (apple stem pitting virus) endopeptidase, which is thought to play a role in the post-translational cleavage of the high molecular weight primary translation products of the virus.

    \ ' '5208' 'IPR008042' '\

    This signature identifies members of the Pao retrotransposon family.

    \ ' '5209' 'IPR008043' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This entry is found in cysteine peptidases belonging to the MEROPS peptidase family C21 (tymovirus endopeptidase family, clan CA). The type example is tymovirus endopeptidase (turnip yellow mosaic virus). The noncapsid protein expressed from ORF-206 of turnip yellow mosaic virus (TYMV) is autocatalytically processed by a papain-like protease, producing N-terminal 150-kDa and C-terminal 70-kDa proteins.

    \ ' '5210' 'IPR008044' '\

    At least one of the members of this domain, the Pal protein from the pneumococcal bacteriophage Dp-1, has been shown to be an N-acetylmuramoyl-L-alanine amidase PUBMED:6146601. According to the known modular structure of this and other peptidoglycan hydrolases from the pneumococcal system, the active site should reside in the N-terminal domain whereas the C-terminal domain binds to the choline residues of the cell wall teichoic acids PUBMED:9379901, PUBMED:3422470.

    \ ' '5211' 'IPR006630' '\

    Human Ro ribonucleoproteins (RNPs) are composed of one of the four small Y RNAs and at least two proteins, Ro60 and La. The La protein is a 47 kDa polypeptide that frequently acts as an autoantigen in systemic lupus erythematosus and Sjogren\'s syndrome PUBMED:15016896. In the nucleus, La acts as an RNA polymerase III (RNAP III) transcription factor, while in the cytoplasm, La acts as a translation factor PUBMED:14636586. In the nucleus, La binds to the 3\'UTR of nascent RNAP III transcripts to assist in folding and maturation PUBMED:15004549. In the cytoplasm, La recognises specific classes of mRNAs that contain a 5\'-terminal oligopyrimidine (5\'TOP) motif known to control protein synthesis PUBMED:14690589. The specific recognition is mediated by the N-terminal domain of La, which comprises a La motif and an RNA recognition motif (RRM). The La motif adopts an alpha/beta fold that comprises a winged-helix motif PUBMED:15048103.

    \

    Homologous La domain-containing proteins have been identified in a wide range of organisms except Archaea, bacteria and viruses PUBMED:7799435.

    \ ' '5212' 'IPR008595' '\ This is a group of Bacillus DegS proteins. The DegS-DegU two-component regulatory system of Bacillus subtilis controls various processes that characterise the transition from the exponential to the stationary growth phase, including the induction of extracellular degradative enzymes, expression of late competence genes and down-regulation of the sigma D regulon PUBMED:12471443. The entry also contains one sequence from Thermoanaerobacter tengcongensis which is described as a sensory transduction histidine kinase.\ ' '5213' 'IPR008680' '\ This family consists of Homo sapiens and simian mastadenovirus early E4 13 kDa proteins. Human adenovirus 9 (HAdV-9) is unique in eliciting exclusively estrogen-dependent mammary tumours in Rattus spp. and in not requiring viral E1 region transforming genes for tumorigenicity. E4 codes for an oncoprotein essential for tumourigenesis by Ad9 PUBMED:11134268.\ ' '5214' 'IPR008850' '\

    Telomerase protein component 1 (TP1/TLP1) or TEP1 is a protein component of two ribonucleoprotein (RNP) complexes: vaults and telomerase. Vaults are large RNP particles with a barrel-like structure. The telomerase RNP replenishes chromosome termini left incomplete by DNA replication. Mammalian TEP1 is an RNA-binding protein and is required for the association of vault RNA with the vault particle PUBMED:11149928, PUBMED:15701761. The N-terminal part of TEP1 contains 4 copies of the TEP1 N-terminal repeat in tandem. The repeat is composed of 30 amino acids and occurs in combination with the TROVE and NACHT domains and with WD-40 repeats in the C-terminal part.

    \ \ ' '5215' 'IPR008449' '\ This family consists of several Drosophila chorion proteins S36 and S38. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary PUBMED:1908228.\ ' '5216' 'IPR008442' '\

    This signature is found at the N terminus of carboxypeptidase Y, which belongs to MEROPS peptidase family S10. This region contains the signal peptide and pro-peptide regions PUBMED:8789258, PUBMED:10077185.

    \ ' '5217' 'IPR008681' '\ This family contains several bacterial MecA proteins. The development of competence in Bacillus subtilis is regulated by growth conditions and several regulatory genes. In complex media competence development is poor, and there is little or no expression of late competence genes. Mec mutations permit competence development and late competence gene expression in complex media, bypassing the requirements for many of the competence regulatory genes. The mecA gene product acts negatively in the development of competence. Null mutations in mecA allow expression of a late competence gene comG, under conditions where it is not normally expressed, including in complex media and in cells mutant for several competence regulatory genes. Overexpression of MecA inhibits comG transcription PUBMED:11004200, PUBMED:12028382, PUBMED:8412687.\ ' '5218' 'IPR008659' '\ This family contains several KRE9 and KNH1 proteins which are involved in encoding cell surface O glycoproteins, which are required for beta-1,6-glucan synthesis in Saccharomyces cerevisiae PUBMED:9748432.\ ' '5219' 'IPR008669' '\ This short motif is found at the C terminus of Prp24 proteins and probably interacts with the Lsm proteins to promote U4/U6 formation PUBMED:12458792.\ ' '5220' 'IPR008433' '\

    Cytochrome oxidase subunit VIIB is one of the nuclear-coded polypeptide chains of cytochrome c oxidase, the terminal oxidase in mitochondrial electron transport. The X-ray structure of azide-bound fully oxidized cytochrome c oxidase from bovine heart at 2.9 A resolution has been determined PUBMED:10771420.

    \ ' '5221' 'IPR008652' '\ This family consists of several early glycoproteins from Homo sapiens adenoviruses.\ ' '5222' 'IPR008798' '\

    This family consists of the avirulence B and C proteins from Pseudomonas syringae PUBMED:3049552 and related proteins from Xanthomonas campestris PUBMED:12024217. avrB and avrC encode these proteins, which are 36 and 39 kilodaltons in size respectively PUBMED:3049552. Pathogenic bacterial effectors suppress pathogen-associated molecular pattern (PAMP)-triggered host immunity, thereby promoting parasitism. AvrB suppresses PAMP-triggered immunity (PTI) through RAR1, which is required for effector-triggered immunity (ETI) PUBMED:17148606.

    \ ' '5223' 'IPR008466' '\ This family consists of several mammalian protein phosphatase inhibitor 1 (IPP-1) and dopamine- and cAMP-regulated neuronal phosphoprotein (DARPP-32) proteins. Protein phosphatase inhibitor-1 is involved in signal transduction and is an endogenous inhibitor of protein phosphatase-1 PUBMED:10960791. It has been demonstrated that DARPP-32, if phosphorylated, can inhibit protein-phosphatase-1 PUBMED:12543476. DARPP-32 has a key role in many neurotransmitter pathways throughout the brain and has been shown to be involved in controlling receptors, ion channels and other physiological factors including the brain\'s response to drugs of abuse, such as cocaine, opiates and nicotine. DARPP-32 is reciprocally regulated by the two neurotransmitters that are most often implicated in schizophrenia - dopamine and glutamate. Dopamine activates DARPP-32 through the D1 receptor pathway and disables DARPP-32 through the D2 receptor. Glutamate, acting through the N-methyl-d-aspartate receptor, renders DARPP-32 inactive. A mutant form of DARPP-32 has been linked with gastric cancers PUBMED:12124342.\ ' '5224' 'IPR008768' '\

    This family contains the capsid assembly protein (scaffolding protein) of bacteriophage T7.

    \ ' '5225' 'IPR008626' '\

    This family represents subunit 15 of the Mediator complex in fungi. It contains Saccharomyces cerevisiae GAL11 (Med15) protein. Gal11 (Med15) and Sin4 (Med16) proteins are S. cerevisiae global transcription factors that regulate transcription of a variety of genes, both positively and negatively. Gal11 functions mainly in the activation of transcription, whereas Sin4 has an opposite role PUBMED:11536332.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '5226' 'IPR008800' '\ This family consists of bacterial PufQ proteins. PufQ is required for bacteriochlorophyll biosynthesis serving a regulatory function in the formation of photosynthetic complexes PUBMED:10196154.\ ' '5227' 'IPR008608' '\ This family contains several mammalian ecotropic viral integration site 2A (EVI2A) proteins. The function of this protein is unknown although it is thought to be a membrane protein and may function as an oncogene in retrovirus induced myeloid tumours PUBMED:2167436, PUBMED:2117566.\ ' '5228' 'IPR008622' '\ This family contains several bacterial flagellar FliT proteins. The flagellar proteins FlgN and FliT have been proposed to act as substrate specific export chaperones, facilitating incorporation of the enterobacterial hook-associated axial proteins (HAPs) FlgK/FlgL and FliD into the growing flagellum. In Salmonella typhimurium flgN and fliT mutants, the export of target HAPs is reduced, concomitant with loss of unincorporated flagellin into the surrounding medium PUBMED:11169117.\ ' '5229' 'IPR008715' '\ This family consists of nodulation S (NodS) proteins. The products of the rhizobial nodulation genes are involved in the biosynthesis of lipochitin oligosaccharides (LCOs), which are host-specific signal molecules required for nodule formation. NodS is an S-adenosyl-L-methionine (SAM)-dependent methyltransferase involved in N methylation of LCOs. NodS uses N-deacetylated chitooligosaccharides, the products of the NodBC proteins, as its methyl acceptors PUBMED:11344149.\ ' '5230' 'IPR008792' '\ This family contains several bacterial coenzyme PQQ synthesis protein D (PqqD) sequences. 
This protein is required for coenzyme pyrrolo-quinoline-quinone (PQQ) biosynthesis.\ ' '5231' 'IPR008779' '\ This family consists of several histidine-rich protein II and III sequences from Plasmodium falciparum PUBMED:8432609, PUBMED:3016741.\ ' '5232' 'IPR008855' '\ This family consists of several eukaryotic translocon-associated protein, delta subunit precursors (TRAP-delta or SSR-delta). The exact function of this protein is unknown PUBMED:7492314.\ ' '5233' 'IPR008688' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring of C subunits forms the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit B from the F0 complex in F-ATPases found in mitochondria of eukaryotes (metazoa, viridiplantae (plants and green algae), jakobidae and the malawimonadidae). The B subunits are part of the peripheral stalk that links the F1 and F0 complexes together, and which acts as a stator to prevent certain subunits from rotating with the central rotary element. The peripheral stalk differs in subunit composition between mitochondrial, chloroplast and bacterial F-ATPases. In mitochondria, the peripheral stalk is composed of one copy each of subunits OSCP (oligomycin sensitivity conferral protein), F6, B and D PUBMED:16045926.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '5234' 'IPR008893' '\ This domain is named after the most conserved central motif of the domain. It is found in a variety of polyA polymerases as well as the Escherichia coli molybdate metabolism regulator and other proteins of unknown function. The domain is found in isolation in some proteins and is between 70 and 80 residues in length. \ ' '5235' 'IPR008738' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to the MEROPS peptidase family C27 (clan CA). The type example is the rubella virus endopeptidase (Rubella virus), which is required for processing of the rubella virus replication protein.

    \ ' '5236' 'IPR008739' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to MEROPS peptidase family C28 (clan CA). The protein fold of the peptidase unit for members of this family resembles that of papain.

    The leader peptidase of Foot-and-mouth disease virus cleaves itself from the growing polyprotein and also cleaves the host translation initiation factor 4GI (eIF4G), thus inhibiting 5\'-cap dependent translation PUBMED:12297280.

    \ ' '5237' 'IPR008740' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \ This group of cysteine peptidases corresponds to MEROPS peptidase family C30 (clan PA(C)). These peptidases are related to serine endopeptidases of family S1 and are restricted to RNA viruses, where they are involved in viral polyprotein processing during replication PUBMED:12093723, PUBMED:10725411, PUBMED:11842254.\ ' '5238' 'IPR008741' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases corresponds to MEROPS peptidase family C31 (clan CA). The type example is porcine reproductive and respiratory syndrome arterivirus-type cysteine proteinase alpha (lactate-dehydrogenase-elevating virus), which is involved in viral polyprotein processing PUBMED:10725411.

    \ ' '5239' 'IPR008742' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases corresponds to MEROPS peptidase family C32 (clan CA). The type example is equine arteritis virus-type cysteine proteinase (porcine reproductive and respiratory syndrome virus), which is involved in viral polyprotein processing PUBMED:10725411.

    \ ' '5240' 'IPR008743' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \ This group of cysteine peptidases corresponds to MEROPS peptidase family C33 (clan CA). The type example is equine arteritis virus Nsp2-type cysteine proteinase, which is involved in viral polyprotein processing PUBMED:10725411.\ ' '5241' 'IPR008744' '\

    RNA-directed RNA polymerase (RdRp) is an essential protein encoded in the genomes of all RNA-containing viruses with no DNA stage PUBMED:2759231, PUBMED:8709232. It catalyses synthesis of the RNA strand complementary to a given RNA template, but the precise molecular mechanism remains unclear. The postulated RNA replication process is a two-step mechanism. First, RNA synthesis is initiated at or near the 3\' end of the RNA template by a primer-independent (de novo) mechanism. De novo initiation consists of the addition of a nucleoside triphosphate (NTP) to the 3\'-OH of the first initiating NTP. During the subsequent elongation phase, this nucleotidyl transfer reaction is repeated with further NTPs to generate the complementary RNA product PUBMED:11531403.

    \

    All the RNA-directed RNA polymerases, and many DNA-directed polymerases, employ a fold whose organisation has been likened to the shape of a right hand with three subdomains termed fingers, palm and thumb PUBMED:9309225. Only the palm subdomain, composed of a four-stranded antiparallel beta-sheet with two alpha-helices, is well conserved among all of these enzymes. In RdRp, the palm subdomain comprises three well conserved motifs (A, B and C). Motif A (D-x(4,5)-D) and motif C (GDD) are spatially juxtaposed; the Asp residues of these motifs are implicated in the binding of Mg2+ and/or Mn2+. The Asn residue of motif B is involved in selection of ribonucleoside triphosphates over dNTPs and thus determines whether RNA is synthesised rather than DNA PUBMED:10827187. The domain organisation PUBMED:9878607 and the 3D structure of the catalytic centre of a wide range of RdRps, even those with low overall sequence homology, are conserved. The catalytic centre is formed by several motifs containing a number of conserved amino acid residues.

    \

    There are four superfamilies of viruses that cover all RNA-containing viruses with no DNA stage:

    \ The RNA-directed RNA polymerases in the first of the above superfamilies can be divided into the following three subgroups:\

    \ \

    This signature is found in the RNA-directed RNA polymerase of apple chlorotic leaf spot virus and cherry mottle virus.

    \ ' '5242' 'IPR008745' '\ The domain is found towards the N terminus of the polyprotein of Apple stem grooving virus (strain P-209) (ASGV), Citrus tatter leaf virus and from Apple stem grooving virus (strain Korea) (ASGV) (Pear black necrotic leaf spot virus). Its function is unknown PUBMED:1413530, PUBMED:8277280.\ ' '5243' 'IPR008746' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases corresponds to MEROPS peptidase family C36 (clan CA). The type example is beet necrotic yellow vein furovirus-type papain-like endopeptidase (beet necrotic yellow vein virus), which is involved in processing the viral polyprotein.

    \ ' '5244' 'IPR001665' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the caliciviruses, which cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity PUBMED:8642693. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses PUBMED:1551442. ORF2 encodes a structural capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.

    \ ' '5245' 'IPR008748' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \ This group of cysteine peptidases corresponds to MEROPS peptidase family C41 (clan C-). The type example is cysteine proteinase (hepatitis E virus), which is a papain-like protease that cleaves the viral polyprotein encoded by ORF1 of the hepatitis E virus (Porcine hemagglutinating encephalomyelitis virus) PUBMED:10963340, PUBMED:1518855, PUBMED:8219799.\ ' '5246' 'IPR008404' '\ This family consists of several avian apovitellenin I sequences. As part of the avian reproductive effort, large quantities of triglyceride-rich very-low-density lipoprotein (VLDL) particles are transported by receptor-mediated endocytosis into the female germ cells. Although the oocytes are surrounded by a layer of granulosa cells harbouring high levels of active lipoprotein lipase, non-lipolysed VLDL is transported into the yolk. This is because VLDL particles from laying chickens (Gallus gallus) are protected from lipolysis by apolipoprotein (apo)-VLDL-II, a potent dimeric lipoprotein lipase inhibitor PUBMED:8713091. Apo-VLDL-II is produced in the liver and secreted into the blood stream when induced by estrogen production in female birds.\ ' '5247' 'IPR008629' '\ In Arabidopsis, GUN4 is required for the functioning of the plastid-mediated repression of nuclear transcription that is involved in controlling the levels of magnesium-protoporphyrin IX. GUN4 binds the product and substrate of Mg-chelatase, an enzyme that produces Mg-Proto, and activates Mg-chelatase. GUN4 is thought to participate in plastid-to-nucleus signalling by regulating magnesium-protoporphyrin IX synthesis or trafficking.\ ' '5248' 'IPR008410' '\ This entry contains the C-terminal regions of several bacterial cellulose synthase operon C (BCSC) proteins. BCSC is involved in cellulose synthesis, although the exact function of this protein is unknown PUBMED:11260463.\ ' '5249' 'IPR008470' '\ This family contains several plant, cyanobacterial and algal proteins of unknown function. 
The family is exclusively found in phototrophic organisms and may therefore play a role in photosynthesis.\ ' '5250' 'IPR008828' '\ This family consists of several stress-activated map kinase interacting protein 1 (MAPKAP1 OR SIN1) sequences. The Schizosaccharomyces pombe Sty1/Spc1 mitogen-activated protein (MAP) kinase is a member of the eukaryotic stress-activated MAP kinase (SAPK) family. Sin1 interacts with Sty1/Spc1. Cells lacking Sin1 display many, but not all, of the phenotypes of cells lacking the Sty1/Spc1 MAP kinase including sterility, multiple stress sensitivity and a cell-cycle delay. Sin1 is phosphorylated after stress but this is not Sty1/Spc1-dependent PUBMED:10428959.\ ' '5251' 'IPR008693' '\ This family contains several membrane proteins from Mycobacterium species PUBMED:11891304.\ ' '5252' 'IPR008602' '\ This family contains several Plasmodium Duffy binding proteins. Plasmodium vivax and Plasmodium knowlesi merozoites invade Homo sapiens erythrocytes that express Duffy blood group surface determinants. The Duffy receptor family is localised in micronemes, an organelle found in all organisms of the phylum Apicomplexa PUBMED:2170017.\ ' '5253' 'IPR008457' '\ Copper sequestering activity displayed by some bacteria is determined by copper-binding protein products of the copper resistance operon (cop). CopD, together with CopC, perform copper uptake into the cytoplasm PUBMED:7917425.\ ' '5254' 'IPR008397' '\ This family contains several bacterial alginate lyase proteins. Alginate is a family of 1-4-linked copolymers of beta -D-mannuronic acid (M) and alpha -L-guluronic acid (G). It is produced by brown algae and by some bacteria belonging to the genera Azotobacter and Pseudomonas. 
Alginate lyases catalyse the depolymerisation of alginates by beta -elimination, generating a molecule containing 4-deoxy-L-erythro-hex-4-enepyranosyluronate at the nonreducing end PUBMED:9683471.\ ' '5255' 'IPR008614' '\ Acidic fibroblast growth factor (aFGF) intracellular binding protein (FIBP) is a protein found mainly in the nucleus that is thought to be involved in the intracellular function of aFGF PUBMED:11104667.\ ' '5256' 'IPR008435' '\ This family consists of several eukaryotic corticotropin-releasing factor binding proteins (CRF-BP or CRH-BP). Corticotropin-releasing hormone (CRH) plays multiple roles in vertebrate species. In mammals, it is the major hypothalamic releasing factor for pituitary adrenocorticotropin secretion, and is a neurotransmitter or neuromodulator at other sites in the central nervous system. In non-mammalian vertebrates, CRH not only acts as a neurotransmitter and hypophysiotropin, it also acts as a potent thyrotropin-releasing factor, allowing CRH to regulate both the adrenal and thyroid axes, especially in development. CRH-BP is thought to play an inhibitory role in which it binds CRH and other CRH-like ligands and prevents the activation of CRH receptors. There is however evidence that CRH-BP may also exhibit diverse extra and intracellular roles in a cell specific fashion and at specific times in development PUBMED:12379493.\ ' '5258' 'IPR008471' '\ This entry contains several uncharacterised bacterial proteins with no known function.\ ' '5259' 'IPR008872' '\ This domain consists of Bacillus insecticidal crystal toxins. Strains of Bacillus that have this insecticidal activity use a binary toxin comprised of two proteins, P51 and P42 (this entry). 
Members of this are highly conserved between strains of different serotypes and phage groups PUBMED:9500937.\ ' '5260' 'IPR008412' '\ Bone sialoprotein (BSP) is a major structural protein of the bone matrix that is specifically expressed by fully-differentiated osteoblasts PUBMED:8061918. The expression of bone sialoprotein (BSP) is normally restricted to mineralised connective tissues of bones and teeth where it has been associated with mineral crystal formation. However, it has been found that ectopic expression of BSP occurs in various lesions, including oral and extraoral carcinomas, in which it has been associated with the formation of microcrystalline deposits and the metastasis of cancer cells to bone PUBMED:10785518.\ ' '5261' 'IPR008816' '\ This family consists of several Rickettsia genus specific 17 kDa surface antigen proteins.\ ' '5262' 'IPR008853' '\ This family contains several eukaryotic transmembrane proteins which are homologous to Homo sapiens transmembrane protein 9 . The TMEM9 gene encodes a 183 amino-acid protein that contains an N-terminal signal peptide, a single transmembrane region, three potential N-glycosylation sites and three conserved cys-rich domains in the N terminus, but no known functional domains. The protein is highly conserved between species from Caenorhabditis elegans to H. sapiens and belongs to a novel family of transmembrane proteins. The exact function of TMEM9 is unknown although it has been found to be widely expressed and localised to the late endosomes and lysosomes PUBMED:12359240. Members of this family contain CXCXC repeats in their N-terminal region.\ ' '5263' 'IPR008770' '\ This family consists of DNA terminal protein GP3 sequences from Phi-29 like bacteriophage. DNA terminal protein GP3 is linked to the 5\' ends of both strands of the genome through a phosphodiester bond between the beta-hydroxyl group of a serine residue and the 5\'-phosphate of the terminal deoxyadenylate. 
This protein is essential for DNA replication and is involved in the priming of DNA elongation PUBMED:6779279.\ ' '5264' 'IPR008675' '\ This entry contains the N-terminal regions of the Saccharomyces mating factor alpha precursor protein. All proteins in this family contain one or more copies of further toward their C terminus.\ ' '5265' 'IPR008407' '\

    This family consists of a number of bacterial and archaeal branched-chain amino acid transport proteins. AzlD, a member of this group, has been shown by mutational analysis to be involved in branched-chain amino acid transport, and to be involved in conferring resistance to 4-azaleucine PUBMED:9287000. However, its exact role in these processes is not yet clear PUBMED:9287000. Based on its hydropathy profile, it has been suggested to be a membrane protein PUBMED:9287000.

    \ \ \ ' '5266' 'IPR008857' '\ This family consists of several thyrotropin-releasing hormone (TRH) proteins. Thyrotropin-Releasing Hormone (TRH; pyroGlu-His-Pro-NH2), originally isolated as a hypothalamic neuropeptide hormone, most likely acts also as a neuromodulator and/or neurotransmitter in the central nervous system (CNS). This interpretation is supported by the identification of a peptidase localised on the surface of neuronal cells which has been termed TRH-degrading ectoenzyme (TRH-DE) since it selectively inactivates TRH. TRH has been used clinically for the treatment of spinocerebellar degeneration and disturbance of consciousness in humans PUBMED:12467901.\ ' '5267' 'IPR008657' '\ This family contains several jumping translocation breakpoint proteins or JTBs. Jumping translocation (JT) is an unbalanced translocation that comprises amplified chromosomal segments jumping to various telomeres. JTB, located at 1q21, has been found to fuse with the telomeric repeats of acceptor telomeres in a case of JT. hJTB (Homo sapiens JTB) encodes a transmembrane protein that is highly conserved among divergent eukaryotic species. JT results in a hJTB truncation, which potentially produces an hJTB product devoid of the transmembrane domain. hJTB is located in a gene-rich region at 1q21, called EDC (Epidermal Differentiation Complex) PUBMED:10321732. JTB has also been implicated in prostatic carcinomas PUBMED:10762645.\ ' '5268' 'IPR008683' '\ This family contains several microvirus A* proteins. The A* protein binds to double stranded DNA and prevents their hydrolysis by nucleases PUBMED:158588.\ ' '5269' 'IPR008690' '\ The N5-methyltetrahydromethanopterin: coenzyme M () of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump PUBMED:9559648.\ ' '5270' 'IPR008684' '\

    Microviral A protein is a specific endonuclease that cleaves the viral strand of supertwisted, closed circular DNA at a unique site in the A gene. The A protein also causes relaxation of supertwisted DNA and forms a complex with viral DNA that has a discontinuity in gene A of the viral strand PUBMED:158588. The C-terminal region of the sequence contains the cleavage site for A/A* protein PUBMED:6283158.

    \ ' '5271' 'IPR008807' '\ This family consists of several ROS/MUCR transcriptional regulator proteins. The ros chromosomal gene is present in octopine and nopaline strains of Agrobacterium tumefaciens as well as in Rhizobium meliloti (Sinorhizobium meliloti). This gene encodes a 15.5 kDa protein that specifically represses the virC and virD operons in the virulence region of the Ti plasmid PUBMED:2013576 and is necessary for succinoglycan production PUBMED:7756693. S. meliloti can produce two types of acidic exopolysaccharides, succinoglycan and galactoglucan, that are interchangeable for infection of Medicago sativa (Alfalfa) nodules. MucR from S. meliloti acts as a transcriptional repressor that blocks the expression of the exp genes responsible for galactoglucan production therefore allowing the exclusive production of succinoglycan PUBMED:10656595.\ ' '5272' 'IPR008472' '\ This entry contains sequences which are repeated in several uncharacterised proteins from Drosophila melanogaster.\ ' '5273' 'IPR008790' '\

    This family of proteins contains poxvirus serine/threonine protein kinases, which are essential for phosphorylation of virion proteins during virion assembly.

    \ ' '5276' 'IPR008391' '\ This family consists of several bacterial acetyl xylan esterase proteins. Acetyl xylan esterases are enzymes that hydrolyse the ester linkages of the acetyl groups in position 2 and/or 3 of the xylose moieties of natural acetylated xylan from hardwood. These enzymes are one of the accessory enzymes which are part of the xylanolytic system, together with xylanases, beta-xylosidases, alpha-arabinofuranosidases and methylglucuronidases; these are all required for the complete hydrolysis of xylan PUBMED:10878123.\ ' '5277' 'IPR008473' '\ This family appears to be found in a group of prophage proteins.\ ' '5278' 'IPR008710' '\ Nicastrin and presenilin are two major components of the gamma-secretase complex, which executes the intramembrane proteolysis of type I integral membrane proteins such as the amyloid precursor protein (APP) and Notch. Nicastrin is synthesised in fibroblasts and neurons as an endoglycosidase-H-sensitive glycosylated precursor protein (immature nicastrin) and is then modified by complex glycosylation in the Golgi apparatus and by sialylation in the trans-Golgi network (mature nicastrin) PUBMED:12584255.\ ' '5279' 'IPR008777' '\

    This family consists of Phytoreovirus nonstructural proteins Pns10 and Pns11. Genome segment S11 of Rice gall dwarf virus (RGDV), a Phytoreovirus, encodes a putative protein of 40 kDa that exhibits approximately 37% homology at the amino acid level to the nonstructural proteins Pns10 of rice dwarf and wound tumour viruses, which are other members of this genus PUBMED:10949951.

    \ ' '5280' 'IPR008453' '\ This family consists of clavanin proteins from the haemocytes of the invertebrate Styela clava (Sea squirt), a solitary tunicate. The family is made up of four alpha-helical antimicrobial peptides, clavanins A, B, C and D. The tunicate peptides resemble magainins in size, primary sequence and antibacterial activity. Synthetic clavanin A displays comparable antimicrobial activity to magainins and cecropins. The presence of alpha-helical antimicrobial peptides in the haemocytes of a urochordate suggests that such peptides are primeval effectors of innate immunity in the vertebrate lineage PUBMED:9001389.\ ' '5281' 'IPR008911' '\ This family consists of several scorpion toxins which act by blocking small-conductance calcium-activated potassium ion channels in their victims.\ ' '5282' 'IPR008465' '\ Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in Homo sapiens. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton: [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. 
There is also evidence which suggests the function of dystroglycan as a part of the signal transduction pathway because it is shown that Grb2, a mediator of the Ras-related signal pathway, can interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in Mus musculus brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear PUBMED:8872465.\ ' '5283' 'IPR008633' '\ This family consists of archaeal GvpH proteins which are thought to be involved in gas vesicle synthesis PUBMED:9211710.\ ' '5284' 'IPR008606' '\ This family consists of several eukaryotic translation initiation factor 4E binding proteins (EIF4EBP1, -2 and -3). Translation initiation in eukaryotes is mediated by the cap structure (m7GpppN, where N is any nucleotide) present at the 5\' end of all cellular mRNAs, except organellar. The cap is recognised by eukaryotic initiation factor 4F (eIF4F), which consists of three polypeptides, including eIF4E, the cap-binding protein subunit. The interaction of the cap with eIF4E facilitates the binding of the ribosome to the mRNA. eIF4E activity is regulated in part by translational repressors, 4E-BP1, 4E-BP2 and 4E-BP3 which bind to it and prevent its assembly into eIF4F PUBMED:9593750.\ ' '5286' 'IPR008842' '\ Siva binds to the CD27 cytoplasmic tail. It has a DD homology region, a box-B-like ring finger, and a zinc finger-like domain. 
Overexpression of Siva in various cell lines induces apoptosis, suggesting an important role for Siva in the CD27-transduced apoptotic pathway PUBMED:9177220. Siva-1 binds to and inhibits BCL-X(L)-mediated protection against UV radiation-induced apoptosis. Indeed, the unique amphipathic helical region (SAH) present in Siva-1 is required for its binding to BCL-X(L) and sensitising cells to UV radiation. Natural complexes of Siva-1/BCL-X(L) are detected in HUT78 and murine thymocyte, suggesting a potential role for Siva-1 in regulating T cell homeostasis PUBMED:12011449. This family contains both Siva-1 and the shorter Siva-2 lacking the sequence coded by exon 2. It has been suggested that Siva-2 could regulate the function of Siva-1 PUBMED:10597319.\ ' '5287' 'IPR008648' '\ This family includes UL69 and IE63 that are transcriptional regulator proteins.\ ' '5288' 'IPR008721' '\

    Origin recognition complex subunit 6 is a component of the origin recognition complex (ORC), which binds origins of replication. ORC is composed of six subunits. Subunit 6 interacts with DBF4 PUBMED:12614612. ORC has a role in both chromosomal replication and mating type transcriptional silencing, and binds to the ARS consensus sequence (ACS) of origins of replication in an ATP-dependent manner PUBMED:10535928, PUBMED:11850415. The function of ORC is reviewed in PUBMED:11914271.

    \ ' '5289' 'IPR008405' '\ Apo L belongs to the high density lipoprotein family that plays a central role in cholesterol transport. The cholesterol content of membranes is important in cellular processes such as modulating gene transcription and signal transduction both in the adult brain and during neurodevelopment. There are six apo L genes located in close proximity to each other on chromosome 22q12 in humans. 22q12 is a confirmed high-susceptibility locus for schizophrenia and close to the region associated with velocardiofacial syndrome that includes symptoms of schizophrenia PUBMED:11930015.\ ' '5290' 'IPR000848' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    It has been suggested that the cAMP receptors coordinate aggregation of individual cells into a multicellular organism, and regulate the expression of a large number of developmentally-regulated genes PUBMED:3047871, PUBMED:8436297, PUBMED:8382181. The amino acid sequences of the receptors contain high proportions of hydrophobic residues grouped into 7 domains, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins. However, while a similar 3D framework has been proposed to account for this, there is no significant sequence similarity between these families: the cAMP receptors thus bear their own unique \'7TM\' signature.

    \ ' '5291' 'IPR008835' '\ This family contains several mammalian sclerostin (SOST) proteins. SOST is thought to suppress bone formation. Mutations of the SOST gene lead to sclerosteosis, a progressive sclerosing bone dysplasia with an autosomal recessive mode of inheritance. Radiologically, it is characterised by a generalised hyperostosis and sclerosis leading to a markedly thickened and sclerotic skull, with mandible, ribs, clavicles and all long bones also being affected. Due to narrowing of the foramina of the cranial nerves, facial nerve palsy, hearing loss and atrophy of the optic nerves can occur. Sclerosteosis is clinically and radiologically very similar to van Buchem disease, mainly differentiated by hand malformations and a large stature in sclerosteosis patients PUBMED:11181578.\ ' '5292' 'IPR008771' '\ This family consists of phi-29-like late genes activator (or early protein GP4). This protein is thought to be a positive regulator of late transcription and may function as a sigma-like component of the host RNA polymerase PUBMED:10438592.\ ' '5293' 'IPR008639' '\ This family consists of Halobacterium gas vesicle protein C sequences which are thought to confer stability to the gas vesicle membranes PUBMED:1404376,PUBMED:8763925.\ ' '5294' 'IPR008408' '\ This family consists of several brain acid soluble protein 1 (BASP1) or neuronal axonal membrane protein NAP-22. The BASP1 is a neuron enriched Ca2+-dependent calmodulin-binding protein of unknown function PUBMED:10965107,PUBMED:9310187.\ ' '5295' 'IPR008645' '\

    The function of the U47 herpesvirus proteins is unknown PUBMED:10482554.

    \ ' '5298' 'IPR008905' '\ The largest of the mammalian translation initiation factors, eIF3, consists of at least eight subunits ranging in mass from 35 to 170 kDa. eIF3 binds to the 40 S ribosome in an early step of translation initiation and promotes the binding of methionyl-tRNAi and mRNA PUBMED:8995409.\ ' '5300' 'IPR008865' '\ This entry contains several bacterial DNA replication terminus site-binding proteins (also known as Ter proteins). They are required for the termination of DNA replication and function by binding to DNA replication terminator sequences, thus preventing the passage of replication forks PUBMED:2687269. The termination efficiency is affected by the affinity of a particular protein for the terminator sequence.\ ' '5301' 'IPR008646' '\ This family consists of several UL45 proteins and homologues found in the herpes simplex virus family. The herpes simplex virus UL45 gene encodes an 18 kDa virion envelope protein whose function remains unknown. It has been suggested that the 18 kDa UL45 gene product is required for efficient growth in the central nervous system at low doses and may play an important role under the conditions of a naturally acquired infection PUBMED:11958453.\ ' '5302' 'IPR008836' '\ This family consists of several mammalian semenogelin (I and II) proteins. Freshly ejaculated human semen has the appearance of a loose gel in which the predominant structural protein components are the seminal vesicle secreted semenogelins (Sg) PUBMED:1584792.\ ' '5303' 'IPR008444' '\ This family consists of Chlamydia virulence proteins which are thought to be required for the growth of these bacteria in mammalian cells PUBMED:2845228. Humans mount a humoral and mucosal immune response to plasmid protein pgp3 from Chlamydia trachomatis infections PUBMED:7960130, PUBMED:12559784. 
This immune response to pgp3 in humans and animal models of chlamydia-induced disease has potential uses in diagnostic assays and protective immunisation strategies PUBMED:12780690.\ ' '5304' 'IPR008732' '\ The nuclear PET122 gene of Saccharomyces cerevisiae encodes a mitochondrial-localised protein that activates initiation of translation of the mitochondrial mRNA from the COX3 gene, which encodes subunit III of cytochrome c oxidase PUBMED:10410243.\ ' '5305' 'IPR008833' '\ Surfeit locus protein 2 is part of a group of at least six sequence unrelated genes (Surf-1 to Surf-6). The six Surfeit genes have been classified as housekeeping genes, being expressed in all tissue types tested and not containing a TATA box in their promoter region. The exact function of SURF2 is unknown PUBMED:9414319.\ ' '5306' 'IPR008795' '\ The prominins are an emerging family of proteins that, among the multispan membrane proteins, display a novel topology. Mouse and Homo sapiens prominin and (Mus musculus) prominin-like 1 (PROML1) are predicted to contain five membrane spanning domains, with an N-terminal domain exposed to the extracellular space followed by four, alternating small cytoplasmic and large extracellular, loops and a cytoplasmic C-terminal domain PUBMED:11467842. The exact function of prominin is unknown although in humans defects in PROM1, the gene coding for prominin, cause retinal degeneration PUBMED:10587575.\ ' '5307' 'IPR008796' '\ This family contains several Photosystem I reaction centre subunit N (PSI-N) proteins. The protein has no known function although it is localised in the thylakoid lumen PUBMED:10230065. PSI-N is a small extrinsic subunit at the lumen side and is very likely involved in the docking of plastocyanin.\ ' '5308' 'IPR008846' '\

    This family consists of several different short Staphylococcal proteins; it contains SLUSH A, B and C proteins as well as haemolysin and gonococcal growth inhibitor. Some strains of the coagulase-negative Staphylococcus lugdunensis produce a synergistic hemolytic activity (SLUSH), phenotypically similar to the delta-hemolysin of S. aureus PUBMED:8975897. Gonococcal growth inhibitor from Staphylococcus acts on the cytoplasmic membrane of the gonococcal cell causing cytoplasmic leakage and, eventually, death PUBMED:3134553.

    \ ' '5309' 'IPR008691' '\ Most of the antigens of Mycobacterium leprae and Mycobacterium tuberculosis that have been identified are members of stress protein families, which are highly conserved throughout many diverse species. Of the M. leprae and M. tuberculosis antigens identified by monoclonal antibodies, all except the 18 kDa M. leprae antigen and the 19 kDa M. tuberculosis antigen are strongly cross-reactive between these two species and are coded within very similar genes PUBMED:8454357, PUBMED:2230723.\ ' '5310' 'IPR008837' '\ The Drosophila serendipity alpha (sry alpha) gene is specifically transcribed at the blastoderm stage, from nuclear cycle 11 to the onset of gastrulation, in all somatic nuclei PUBMED:2166703. SRY-A is required for the cellularisation of the embryo and is involved in the localisation of the actin filaments just prior to and during plasma membrane invagination PUBMED:8287797.\ ' '5311' 'IPR008827' '\ Synaptonemal complex protein 1 (SCP-1) is the major component of the transverse filaments of the synaptonemal complex. Synaptonemal complexes are structures that are formed between homologous chromosomes during meiotic prophase PUBMED:1464329.\ ' '5312' 'IPR008665' '\

    This iron sulphur cluster is found at the N terminus of some proteins containing leucine-repeat variant (LRV) repeats (). These proteins have a two-domain structure, composed of a small N-terminal domain containing a cluster of four Cys residues that houses the 4Fe:4S cluster, and a larger C-terminal domain containing the LRV repeats PUBMED:8946850. Biochemical studies revealed that the 4Fe:4S cluster is sensitive to oxygen, but does not appear to have reversible redox activity.

    \ ' '5313' 'IPR006612' '\

    C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf\'s can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 PUBMED:11361095. C2H2 Znf\'s are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes PUBMED:10664601. Transcription factors usually contain several Znf\'s (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA PUBMED:10940247. C2H2 Znf\'s can also bind to RNA and protein targets PUBMED:18253864.

    \

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    The THAP domain is an ~90-residue domain restricted to animals, which is shared between the THAP family of cellular DNA-binding proteins, and transposases from mobile genomic parasites. The defined THAP domain includes: a C2CH signature (consensus: C-x(2,4)-C-x(35,50)-C-x(2)-H); three additional key residues that are strictly conserved in all THAP domains that have been found to date (THAP1 amino acids P26, W36, F58); a C-terminal AVPTIF box; and several other conserved amino acid positions with distinct physicochemical properties (e.g. hydrophobic and polar). The THAP domain can be found in one or more copies and can be associated with other domains, such as the C2H2-type zinc finger. The THAP domain is thought to be a DNA-binding domain (DBD) PUBMED:12575992, PUBMED:12717420.
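
The C2CH consensus quoted above is written in PROSITE-style notation, which can be turned into a regular expression for a quick sequence scan. The following is a minimal sketch under stated assumptions: the helper name prosite_to_regex is hypothetical, it only handles the elements that occur in this pattern (single residues, x, bracketed sets and repeat counts), and the toy sequence is synthetic rather than a real THAP protein.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern such as C-x(2,4)-C into a regular expression."""
    parts = []
    for element in pattern.split("-"):
        # Split off an optional repeat count, e.g. "x(2,4)" -> residue "x", count "2,4".
        residue, _, count = element.partition("(")
        count = count.rstrip(")")
        if residue == "x":
            token = "."                       # x stands for any residue
        elif residue.startswith("["):
            token = residue.replace(",", "")  # [V,F] -> [VF]
        else:
            token = residue                   # literal one-letter residue code
        if count:
            token += "{" + count + "}"
        parts.append(token)
    return "".join(parts)

THAP_C2CH = "C-x(2,4)-C-x(35,50)-C-x(2)-H"
regex = re.compile(prosite_to_regex(THAP_C2CH))

# Synthetic toy sequence (an assumption for illustration, not a real THAP protein),
# built only so that the spacing rules of the consensus are satisfied.
toy = "MA" + "C" + "AA" + "C" + "G" * 40 + "C" + "AA" + "H" + "KK"
match = regex.search(toy)
print(regex.pattern)
print(match.span())
```

A match of this expression only shows that the four ligand positions are spaced as the consensus requires; the other conserved residues and the AVPTIF box described above are not checked.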

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5314' 'IPR008832' '\

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5\' and 3\' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    \

    This entry represents the 9 kDa SRP9 component. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP PUBMED:7730321.

    \ ' '5316' 'IPR008727' '\

    The PAAR motif is usually found in pairs in a family of bacterial membrane proteins. It is also found as a triplet of tandem repeats comprising the entire length in another family of hypothetical proteins.

    \ ' '5317' 'IPR008861' '\ This family is found in a family of phage tail proteins. Sequence analysis suggests that they are related to which suggests a general peptidoglycan binding function.\ ' '5319' 'IPR008823' '\ The RuvB protein makes up part of the RuvABC revolvasome which catalyses the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalysed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein PUBMED:12423347. This group of sequences contain this signature which is located in the C-terminal region of the proteins; it is thought to be a helicase DNA-binding domain.\ ' '5321' 'IPR008389' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) () are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release PUBMED:15629643. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c\', c\'\', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins PUBMED:15907459.

    \ \

    This entry represents subunit E (or subunit M9.2) found in the V0 complex of certain V-ATPases. The V0 complex contains subunit C (proton-conducting pore), as well as accessory subunits that function in assembly, targeting or regulation of the V-ATPase complex. Subunit E is an extremely hydrophobic protein of approximately 9 kDa, which may be required for assembly of vacuolar ATPases PUBMED:9556572. The amino terminal domain of subunit E interacts with the H subunit and is required for V-ATPase function PUBMED:12163484. Different isoforms of this subunit exist, sometimes annotated as E1 and E2; a neuron-specific isoform, NM9.2, has also been identified PUBMED:12544825.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '5322' 'IPR008869' '\ Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation PUBMED:9020089. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters PUBMED:9658016.\ ' '5323' 'IPR008913' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    Pirh2 is an eukaryotic ubiquitin protein ligase, which has been shown to promote p53 degradation in mammals. Pirh2 physically interacts with p53 and promotes ubiquitination of p53 independently of MDM2. Like MDM2, Pirh2 is thought to participate in an autoregulatory feedback loop that controls p53 function. Pirh2 proteins contain three distinct zinc fingers, the CHY-type, the CTCHY-type which is C-terminal to the CHY-type zinc finger and a RING finger. The CHY-type zinc finger has no currently known function PUBMED:12654245.

    \ \

    As well as Pirh2, the CHY-type zinc finger is also found in the following proteins:

    \ \

    \ \ \

    The solution structure of this zinc finger has been solved; the domain binds 3 zinc atoms, as shown in the following schematic representation:\

    \ \
    \
                          ++---------+-----+\
                          ||         |     |\
             CXHYxxxxxxxxxCCxxxxxCxxCHxxxxxHxxxxxxxxxxxCxxCxxxxxxxxxCxxC\
             | |                 |  |                  |  |         |  |\
             +-+-----------------+--+                  +--+---------+--+\
    \
    \'C\': conserved cysteine involved in the binding of one zinc atom.\
    \'H\': conserved histidine involved in the binding of one zinc atom.\
    
    \ \ More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:\ ' '5324' 'IPR008824' '\ The RuvB protein makes up part of the RuvABC revolvasome which catalyses the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalysed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein PUBMED:12423347. This group of sequences contain this signature which is located in the N-terminal region of the proteins.\ ' '5325' 'IPR008597' '\ Destabilase is an endo-epsilon(gamma-Glu)-Lys isopeptidase, which cleaves isopeptide bonds formed by transglutaminase (Factor XIIIa) between glutamine gamma-carboxamide and the epsilon-amino group of lysine PUBMED:9003282.\ ' '5326' 'IPR008801' '\ RALF, a 5 kDa ubiquitous polypeptide in plants, arrests root growth and development.\ ' '5327' 'IPR008468' '\ DNA methylation can contribute to transcriptional silencing through several transcriptionally repressive complexes, which include methyl-CpG binding domain proteins (MBDs) and histone deacetylases (HDACs). The chief enzyme that maintains mammalian DNA methylation, DNMT1, can also establish a repressive transcription complex. The non-catalytic N terminus of DNMT1 binds to HDAC2 and DMAP1 (for DNMT1 associated protein), and can mediate transcriptional repression. DMAP1 has intrinsic transcription repressive activity, and binds to the transcriptional co-repressor TSG101. DMAP1 is targeted to replication foci through interaction with the far N terminus of DNMT1 throughout S phase, whereas HDAC2 joins DNMT1 and DMAP1 only during late S phase, providing a platform for how histones may become deacetylated in heterochromatin following replication PUBMED:10888872.\ ' '5329' 'IPR008474' '\ This family is predominated by ORFs from Circoviridae. 
The function of this family remains to be determined.\ ' '5330' 'IPR008603' '\ Dynactin is a multi-subunit complex and a required cofactor for most, or all, of the cellular processes powered by the microtubule-based motor cytoplasmic dynein. p62 binds directly to the Arp1 subunit of dynactin PUBMED:10671518,\ PUBMED:10607597.\ ' '5331' 'IPR008787' '\

    This family of proteins, which includes vaccinia virus G7L and fowlpox virus FPV120, is associated with the intracellular mature virus particle. The function of this family of proteins is not known.

    \ ' '5332' 'IPR008844' '\ The GerAA, -AB, and -AC proteins of the Bacillus subtilis spore are required for the germination response to L-alanine as the sole germinant. Members of GerAC family are thought to be located in the inner spore membrane. Although the function of this family is unclear, they are likely to encode the components of the germination apparatus that respond directly to this germinant, mediating the spore\'s response PUBMED:11418573.\ ' '5333' 'IPR008609' '\ This family consists of Ebola virus sp., Lake Victoria marburgvirus nucleoproteins. These proteins are responsible for encapsidation of genomic RNA. It has been found that nucleoprotein DNA vaccines can offer protection from the virus PUBMED:9657001.\ ' '5334' 'IPR008475' '\ This domain is found, normally as a tandem repeat, at the C terminus of bacterial phospholipase C proteins.\ ' '5335' 'IPR008673' '\ This family consists of several mammalian microfibril-associated glycoprotein (MAGP) 1 and 2 proteins. MAGP1 and 2 are components of elastic fibres. MAGP-1 has been proposed to bind a C-terminal region of tropoelastin, the soluble precursor of elastin. MAGP-2 was found to interact with fibrillin-1 and -2, as well as fibulin-1, another component of elastic fibres. This suggests that MAGP-2 may be important in the assembly of microfibrils PUBMED:12122015.\ ' '5336' 'IPR008812' '\ The small Ras-like GTPase Ran plays an essential role in the transport of macromolecules in and out of the nucleus and has been implicated in spindle and nuclear envelope formation during mitosis in higher eukaryotes. The Saccharomyces cerevisiae ORF YGL164c encoding a novel RanGTP-binding protein, termed Yrb30p was identified. The protein competes with S. cerevisiae RanBP1 (Yrb1p) for binding to the GTP-bound form of S. 
cerevisiae Ran (Gsp1p) and is, like Yrb1p, able to form trimeric complexes with RanGTP and some of the karyopherins PUBMED:12578832.\ ' '5337' 'IPR008876' '\ This family consists of several enterobacterial TraY proteins. TraY is involved in bacterial conjugation where it is required for efficient nick formation in the F plasmid PUBMED:12003924.\ ' '5338' 'IPR008908' '\ Sarcoglycans are a subcomplex of transmembrane proteins which are part of the dystrophin-glycoprotein complex. They are expressed in the skeletal, cardiac and smooth muscle. Although numerous studies have been conducted on the sarcoglycan subcomplex in skeletal and cardiac muscle, the manner of the distribution and localisation of these proteins along the nonjunctional sarcolemma is not clear PUBMED:12566627. This family contains alpha and epsilon members.\ ' '5339' 'IPR008387' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) () are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit F6 (or coupling factor 6) found in the F0 complex of F-ATPases in mitochondria. The F6 subunit is part of the peripheral stalk that links the F1 and F0 complexes together, and which acts as a stator to prevent certain subunits from rotating with the central rotary element. The peripheral stalk differs in subunit composition between mitochondrial, chloroplast and bacterial F-ATPases. In mitochondria, the peripheral stalk is composed of one copy each of subunits OSCP (oligomycin sensitivity conferral protein), F6, B and D PUBMED:16045926. There is no homologue of subunit F6 in bacterial or chloroplast F-ATPase, whose peripheral stalks are composed of one copy of the delta subunit (homologous to OSCP), and two copies of subunit B in bacteria, or one copy each of subunits B and B\' in chloroplasts and photosynthetic bacteria.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '5340' 'IPR008390' '\ Members of this family are 19 kDa membrane proteins. The levels of the plant protein AWPM-19 increase dramatically when there is an increased level of abscisic acid. The increased presence of this protein leads to greater tolerance of freezing PUBMED:9249988.\ ' '5341' 'IPR008873' '\ Conjugative transfer of a bacteriocin plasmid, pPD1, of Enterococcus faecalis is induced in response to a peptide sex pheromone, cPD1, secreted from plasmid-free recipient cells. cPD1 is taken up by a pPD1 donor cell and binds to an intracellular receptor, TraA. Once a recipient cell acquires pPD1, it starts to produce an inhibitor of cPD1, termed iPD1, which functions as a TraA antagonist and blocks self-induction in donor cells. TraA transduces the signal of cPD1 to the mating response PUBMED:12399504.\ ' '5342' 'IPR008637' '\ This is a family of plant proteins that are associated with the hypersensitive response (HR) pathway of defence against plant pathogens.\ ' '5343' 'IPR008891' '\ This family is common to ssRNA positive-strand viruses and its members are commonly described as nucleic acid binding proteins (NABP).\ ' '5345' 'IPR008907' '\ This family encodes a 25 kDa protein that is phosphorylated by a Ser/Thr-Pro kinase PUBMED:1909972. It has been described as a brain specific protein, but it is found in Tetrahymena thermophila.\ ' '5346' 'IPR008871' '\

    This family of proteins contains the coat proteins of the Totiviruses.

    \ ' '5348' 'IPR008452' '\

    This family contains the P18 proteins of citrus tristeza virus (CTV). CTV is a member of the closterovirus group and is one of the more complex single-stranded RNA viruses. Assembly of the viral genome into virions is a critical process of the virus life cycle often defining the ability of the virus to move within the plant and to be transmitted horizontally to other plants. Closteroviridae virions are polar helical rods assembled primarily by a major coat protein, but with a related minor coat protein at one end. It is the only virus family that encodes a protein with similarity to cellular chaperones, a 70-kDa heat-shock protein homologue (HSP70h). Deletion mutagenesis reveals that p33, p6, p18, p13, p20, and p23 genes are not needed for virion formation. Their function is unknown PUBMED:11112500.

    \ ' '5349' 'IPR008767' '\

    This family describes proteins found in bacteriophage and in bacterial prophage\ regions. The function of these proteins is not\ known.

    \ ' '5350' 'IPR017980' '\

    Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium, nickel, etc. They have a high content of cysteine residues that bind the metal ions through clusters of thiolate bonds PUBMED:1779825, PUBMED:2959513. An empirical classification into three classes has been proposed by Fowler and coworkers PUBMED:2959504 and Kojima PUBMED:1779826. Members of class I are defined to include polypeptides related in the positions of their cysteines to equine MT-1B, and include mammalian MTs as well as from crustaceans and molluscs. Class II groups MTs from a variety of species, including sea urchins,\ fungi, insects and cyanobacteria. Class III MTs are atypical polypeptides composed of gamma-glutamylcysteinyl units PUBMED:2959504.

    \

    This original classification system has been found to be limited, in the sense that it does not allow clear differentiation of patterns of structural similarities, either between or within classes. Consequently, all class I and class II MTs (the proteinaceous sequences) have now been grouped into families of phylogenetically-related and thus alignable sequences. This system subdivides the MT superfamily into families, subfamilies, subgroups, and isolated isoforms and alleles.

    \

    The metallothionein superfamily comprises all polypeptides that resemble equine renal metallothionein in several respects PUBMED:2959504: e.g., low molecular weight; high metal content; amino acid composition with high Cys and low aromatic residue content; unique sequence with characteristic distribution of cysteines, and spectroscopic manifestations indicative of metal thiolate clusters. A MT family subsumes MTs that share particular sequence-specific features and are thought to be evolutionarily related. The inclusion of a MT within a family presupposes that its amino acid sequence is alignable with that of all members. Fifteen MT families have been characterised, each family being identified by its number and its taxonomic range: e.g., Family 1: vertebrate MTs [see http://www.bioc.unizh.ch/mtpage/protali.html].

    \

    Echinoidea (sea urchin, family 4) MTs are 64-67 residue proteins. Members of this family are recognised by the sequence pattern P-D-x-K-C-[V,F]-C-C-x(5)-C-x-C-x(4)-C-C-x(4)-C-C-x(4,6)-C-C located near the N terminus. \ The taxonomic range of the members extends to sea urchins (Echinoidea). \ The protein sequence is divided into two structural domains, containing 9 and 11 Cys residues and binding 3 and 4 bivalent metal ions, respectively.\ Family 4 includes two subfamilies, e1 and e2, which are separate phylogenetic groups.
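
The family 4 sequence pattern quoted above can be checked against a candidate sequence once it is written as a regular expression. The sketch below is a hand translation made for illustration (x becomes ".", x(n) becomes ".{n}", and [V,F] becomes "[VF]"); the names MT_FAMILY4 and toy_sequence are hypothetical, and the test sequences are synthetic rather than real metallothioneins.

```python
import re

# Hand translation of the family 4 pattern into regular-expression syntax.
MT_FAMILY4 = re.compile("PD.KC[VF]CC.{5}C.C.{4}CC.{4}CC.{4,6}CC")

def toy_sequence(gap: int) -> str:
    """Build a synthetic motif (not a real metallothionein) with a variable final spacer."""
    return "PDAKCVCC" + "A" * 5 + "CAC" + "A" * 4 + "CC" + "A" * 4 + "CC" + "A" * gap + "CC"

# Only spacers of 4 to 6 residues before the last Cys pair satisfy x(4,6).
for gap in (3, 4, 5, 6, 7):
    seq = toy_sequence(gap)
    hit = MT_FAMILY4.fullmatch(seq)
    print(gap, len(seq), hit is not None)
```

Because the only variable element is x(4,6), a stretch matching the pattern is 34 to 36 residues long and fixes 11 cysteine positions.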

    \ \

    This entry includes the sea urchin proteins, and related sequences from worms.

    \ ' '5351' 'IPR008894' '\

    This group contains proteins which have a wide range of UniProtKB/Swiss-Prot annotations, from \'hypothetical protein\', \'lipopolysaccharide biosynthesis protein\' to \'bifunctional acetyl transferase/isomerase\'.

    \ ' '5352' 'IPR008731' '\

    This sequence identifies proteins which are a component of the phosphoenolpyruvate:sugar phosphotransferase system (PTS), a major carbohydrate active transport system. The PTS system is found throughout the bacterial kingdom, and is responsible for the coupled phosphorylation and translocation of numerous sugars across the cytoplasmic membrane PUBMED:8246840.\ This entry represents the N-terminal domain of enzyme I (EIN) which transfers the phosphoryl group from phosphoenolpyruvate (PEP) to the phosphoryl carrier protein (HPr) which in turn phosphorylates a group of membrane-associated proteins, known as enzyme II.\ \ The N-terminal domain of EI (EIN) extends from residues 1 to 259 and can be phosphorylated in a fully reversible manner by phosphorylated HPr. EIN, however, cannot be autophosphorylated by PEP PUBMED:1655788, PUBMED:8031118.

    \ ' '5353' 'IPR004685' '\ Characterised members of the branched chain Amino Acid:Cation Symporter (LIVCS) family transport all three of the branched chain aliphatic\ amino acids (leucine (L), isoleucine (I) and valine (V)). They function by a Na+ or H+\ symport mechanism and display 12 putative \ transmembrane helical spanners.\ ' '5354' 'IPR008810' '\ This family consists of several virulence-associated proteins from Corynebacterium equii (Rhodococcus equi). R. equi is an important pulmonary pathogen of foals and is increasingly isolated from pneumonic infections and other infections in human immunodeficiency virus-infected patients. Isolates from foals possess a large virulence plasmid, varying in size from 80 to 90 kb. Isolates lacking the plasmid are avirulent to foals. Little is known about the function of the plasmid apart from its encoding a virulence associated surface protein PUBMED:11083803.\ ' '5355' 'IPR008477' '\ This is a family of eukaryotic proteins with unknown function, which are induced by tumour necrosis factor.\ ' '5356' 'IPR008458' '\ Infectious bronchitis virus, a member of Coronaviridae family, has a single-stranded positive-sense RNA genome, which is 27 kb in length. Gene 5 contains two (5a and 5b) open reading frames. The function of the 5a and 5b proteins is unknown PUBMED:9168126.\ ' '5357' 'IPR008417' '\ Bap31 is a polytopic integral protein of the endoplasmic reticulum membrane and a substrate of caspase-8. Bap31 is cleaved within its cytosolic domain, generating pro-apoptotic p20 Bap31 PUBMED:11917123.\ ' '5359' 'IPR008702' '\ This family consists of several nucleopolyhedrovirus P10 proteins which are thought to be involved in the morphogenesis of the polyhedra PUBMED:9634101.\ ' '5360' 'IPR008462' '\ CsbD is a bacterial general stress response protein. Its expression is mediated by sigma-B, an alternative sigma factor PUBMED:11988534. The role of CsbD in stress response is unclear.\ ' '5361' 'IPR008749' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases corresponds to MEROPS peptidase family C42. The type example is beet yellows virus-type papain-like endopeptidase (beet yellows virus) PUBMED:11711606.

    \ ' '5362' 'IPR008651' '\ This family consists of several bacterial HicB related proteins. The function of HicB is unknown although it is thought to be involved in pilus formation. It has been speculated that HicB performs a function antagonistic to that of pili and yet is necessary for invasion of certain niches PUBMED:9721313.\ ' '5363' 'IPR008451' '\ This family consists of several ALT protein homologues found in nematodes. Lymphatic filariasis is a major tropical disease caused by the mosquito-borne nematodes Brugia and Wuchereria. About 120 million people are infected and at risk of lymphatic pathology such as acute lymphangitis and elephantiasis. Expression of alt-1 and alt-2 is initiated midway through development in the mosquito, peaking in the infective larva and declining sharply following entry into the host. ALT-1 and the closely related ALT-2 have been found to be strong candidates for a future vaccine against human filariasis PUBMED:10858234.\ ' '5364' 'IPR008709' '\ This family contains several eukaryotic neurochondrin proteins. Neurochondrin induces hydroxyapatite resorptive activity in bone marrow cells resistant to bafilomycin A1, an inhibitor of macrophage- and osteoclast-mediated resorption. Expression of the gene is localised to chondrocyte, osteoblast, and osteocyte in the bone and to the hippocampus and Purkinje cell layer of cerebellum in the brain PUBMED:10231559.\ ' '5365' 'IPR008478' '\ This entry consists of several uncharacterised proteins from Borrelia species including Borrelia burgdorferi and Borrelia garinii.\ ' '5366' 'IPR008439' '\ This family consists of Campylobacter major outer membrane proteins. The major outer membrane protein (MOMP), a putative porin and a multifunction surface protein of Campylobacter jejuni, may play an important role in the adaptation of the organism to various host environments PUBMED:10992471.\ ' '5367' 'IPR008781' '\

    This family of proteins contains the major surface glycoprotein of turkey rhinotracheitis virus (TRTV), avian pneumovirus (APV), the aetiological agent of turkey rhinotracheitis (TRT), and other Metapneumoviruses. The major surface glycoprotein is the attachment (G) protein, which, by analogy with other respiratory\ syncytial viruses (RSV), has been proposed to be responsible for virus binding to its cell receptor. The APV G gene and its predicted protein have several features in common with their RSV counterparts. Both G proteins are type II glycoproteins and both the RSV G and APV G proteins are heavily O-glycosylated. In both RSV and APV, the G protein is the most variable protein and is a major target for neutralizing antibodies PUBMED:11038385.

    \ ' '5368' 'IPR008838' '\ This family consists of several variable surface proteins from Treponema hyodysenteriae (Serpulina hyodysenteriae).\ ' '5369' 'IPR008843' '\ Entomopoxviruses (EPVs) are large (300-400 nm) oval-shaped viruses replicating in the cytoplasm of their insect host cells. At the end of their replicative cycle EPVs virions are occluded in a highly expressed protein called spheroidin. This protein forms large (5-20 mm long) oval-shaped occlusion bodies (OBs) called spherules. The infectious cycle of EPVs begins with the ingestion by the insect host of the spherules, their dissolution by the alkaline reducing conditions of the midgut fluid and the release of virions in the midgut lumen. The infective particles first replicate in midgut epithelial cells, then pass the gut barrier to colonise the internal tissues, mainly the fat body cells. Whilst spheroidin has been demonstrated to be non-essential for viral replication, it plays an essential role in the natural biological cycle of the virus in protecting virions from adverse environmental conditions (e.g. UV degradation) and thus improving transmission efficacy. In this respect, spheroidins are functionally similar to polyhedrins of baculoviruses or cypoviruses PUBMED:10867199.\ ' '5370' 'IPR008479' '\ This family contains several uncharacterised plant proteins.\ ' '5371' 'IPR008750' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This group of cysteine peptidases belongs to the peptidase family C47 (staphopain family, clan CA). \ \ The type examples are the staphopains, which are one of four major families of proteinases secreted by the Gram-positive bacterium Staphylococcus aureus. These staphylococcal cysteine proteases are secreted as preproenzymes that are proteolytically cleaved to generate the mature enzyme PUBMED:12437090, PUBMED:11447146, PUBMED:11767947.

    \ ' '5372' 'IPR008794' '\ This family consists of proline racemase () proteins which catalyse the interconversion of L- and D-proline in bacteria PUBMED:3755058. This family also contains several similar eukaryotic proteins including a sequence with B-cell mitogenic properties which has been characterised as a co-factor-independent proline racemase PUBMED:10932226.\ ' '5373' 'IPR008621' '\ This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon PUBMED:11717256.\ ' '5374' 'IPR008839' '\

    Members of this family are mitochondrial inner membrane proteins with a role in inner mitochondrial membrane organisation and biogenesis PUBMED:11907266. The yeast Mdm33 protein assembles into an oligomeric complex in the inner membrane where it performs homotypic protein-protein interactions. It has been suggested that Mdm33 plays a distinct role, possibly involved in fission of the mitochondrial inner membrane PUBMED:12591915.

    \ ' '5375' 'IPR008757' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M6 (immune inhibitor A family, clan MA(M)). The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH.

    \ \

    InhA of Bacillus thuringiensis (an entomopathogenic bacterium) specifically cleaves antibacterial peptides produced by insect hosts PUBMED:2089225. B. thuringiensis is highly resistant to the insect immune system due to its production of two factors, inhibitor A (InhA or InA) and inhibitor B (InhB or InB), which selectively block the humoral defence system developed by insects against Escherichia coli and Bacillus cereus PUBMED:992874. B. thuringiensis is especially resistant to cecropins and attacins, which are the main classes of inducible antibacterial peptides in various lepidopterans and dipterans PUBMED:7140755, PUBMED:3318666. InhA has been shown to specifically hydrolyze cecropins and attacins in the immune hemolymph of Hyalophora cecropia (Cecropia moth) in vitro PUBMED:6421577. However, it has been suggested that the role of InhA in resistance to the humoral defence system is not consistent with the time course of InhA production PUBMED:12029046.

    B. thuringiensis has two proteins belonging to this group (InhA and InhA2), and it has been shown that InhA2 has a vital role in virulence when the host is infected via the oral route PUBMED:12029046. The B. cereus member has been found as an exosporium component from endospores PUBMED:10475957. B. thuringiensis InhA is induced at the onset of sporulation and is regulated by Spo0A and AbrB PUBMED:11429458. Vibrio cholerae PrtV is thought to be encoded in the pathogenicity island PUBMED:9371455. However, PrtV mutants did not exhibit a reduced virulence phenotype, and thus PrtV is not an indispensable virulence factor PUBMED:9371455.

    Annotation note: due to the presence of PKD repeats in some of the members of this group (e.g., V. cholerae VCA0223), spurious similarity hits may appear (involving unrelated proteins), which may lead to the erroneous transfer of functional annotations and protein names. Also, please note that related Bacillus subtilis Bacillopeptidase F (Bpr or Bpf) contains two different protease domains: N-terminal (peptidase S8, subtilase, a subtilisin-like serine protease) and this C-terminal domain (peptidase M6), which may also complicate annotation.

    \ ' '5376' 'IPR008752' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M11 (gametolysin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH PUBMED:7674922.

    The type example is gametolysin from the unicellular biflagellated alga, Chlamydomonas reinhardtii. Gametolysin is a zinc-containing metallo-protease, which is responsible for the degradation of the cell wall. Homologues of gametolysin have also been reported in the simple multicellular organism, Volvox PUBMED:11489172, PUBMED:11680823.\ ' '5377' 'IPR008398' '\

    This family of sequences contains the 40 kDa polypeptides from garlic viruses (Allexiviruses), which do not resemble any other plant virus gene products reported so far PUBMED:8376963.

    \ \ Rod-shaped flexuous viruses have been isolated from garlic plants, Allium sativum. Infection by this virus creates typical mosaic symptoms. The core-like sequence of a zinc finger protein preceded by a cluster of basic amino acid residues shows similarities to the corresponding 12K proteins of the potexviruses and carlaviruses PUBMED:8376963. Viral epidemics by allexiviruses are also known to be caused by aphids and eriophyid mites (Aceria tulipae) carrying Potyviruses, Carlaviruses, and Allexiviruses PUBMED:18092468.\ ' '5378' 'IPR008751' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \ This group of cysteine peptidases belongs to MEROPS peptidase family C53 (clan C-). The active site residues occur in the order E, H, C in the sequence, which is unlike that in any other family. These peptidases are unique to pestiviruses. The N-terminal cysteine peptidase (Npro) encoded by the bovine viral diarrhoea virus genome is responsible for the self-cleavage that releases the N terminus of the core protein. This unique protease is dispensable for viral replication, and its coding region can be replaced by a ubiquitin gene directly fused in frame to the core PUBMED:11711606, PUBMED:10864644, PUBMED:9499122, PUBMED:8972567.\ ' '5379' 'IPR008704' '\

    This entry consists of several putative homing endonuclease proteins of around 245 residues in length which appear to be found exclusively in Naegleria species. The function of these proteins is unknown.

    \ ' '5380' 'IPR008910' '\ This alignment represents a conserved transmembrane helix as well as some flanking sequence. It is often found in association with a Mechanosensitive (MS) channel.\ ' '5381' 'IPR008480' '\ This family consists of several plant proteins of unknown function. Three of the sequences from Gossypium hirsutum (Upland cotton) in this family are described as G. hirsutum fibre expressed proteins PUBMED:9750105. The remaining sequences, found in Arabidopsis thaliana, are uncharacterised.\ ' '5382' 'IPR008720' '\ This family consists of several viral hemorrhagic septicemia virus non-virion (NV) proteins. The NV protein is a nonstructural protein absent from mature virions although it is present in infected cells. The function of this protein is unknown PUBMED:7571446.\ ' '5383' 'IPR008481' '\ This family consists of several uncharacterised proteins from the bacterium Coxiella burnetii. C. burnetii is the causative agent of Q fever.\ ' '5384' 'IPR008438' '\

    This family consists of several mammalian calcineurin-binding proteins. Calcineurin is a Ca/calmodulin-dependent serine-threonine phosphatase and has been implicated in the transduction of signals that control the hypertrophy of cardiac muscle and slow fibre gene expression in striated muscle. \ A novel family of striated muscle-specific calcineurin-interacting proteins, called calsarcins or myozenins, has been identified; its members interact and co-localize with the Z-disc protein alpha-actinin, thereby coupling muscle activity to calcineurin activation PUBMED:11114196.

    \ \

    Because calcineurin responds to sustained, low amplitude calcium signals, calsarcins may serve to localize calcineurin in the vicinity of a unique intracellular pool, where it can interact with specific upstream activators or downstream substrates. Therefore, calsarcins may play an important role in modulating the function and substrate specificity of calcineurin in striated muscle cells.

    \ \

    Three isoforms of calsarcin have been identified in human, rat and mouse.\

    \

    \ \

    Calsarcin-1 (CALS-1) is expressed throughout development in all striated muscle tissues. However, CALS-1 expression is localized to slow-twitch fibers. Calsarcin-2 (CALS-2), which shares approximately 30% identity with CALS-1, is a globular protein with a central glycine-rich domain flanked by alpha-helical regions. CALS-2 is expressed transiently in heart during early embryogenesis and later becomes restricted to skeletal muscle, with weaker signals in adult prostate, placenta and pancreas. In contrast to CALS-1, the expression of calsarcin-2 is restricted to fast-twitch skeletal fibers. Calsarcin-3 is expressed specifically in skeletal muscle and is enriched in fast-twitch muscle fibers. Like calsarcin-1 and calsarcin-2, calsarcin-3 interacts with calcineurin and with the Z-disc proteins alpha-actinin, gamma-filamin and telethonin PUBMED:11842093.

    \ \ \ ' '5385' 'IPR008672' '\ This family consists of several eukaryotic mitotic checkpoint (Mitotic arrest deficient or MAD) proteins. The mitotic spindle checkpoint monitors proper attachment of the bipolar spindle to the kinetochores of aligned sister chromatids and causes a cell cycle arrest in prometaphase when failures occur. Multiple components of the mitotic spindle checkpoint have been identified in Saccharomyces cerevisiae and higher eukaryotes. In Saccharomyces cerevisiae, the existence of a Mad1-dependent complex containing Mad2, Mad3, Bub3 and Cdc20 has been demonstrated PUBMED:12574116.\ ' '5386' 'IPR008469' '\ This family contains several plant plasma membrane proteins termed DREPPs as they are developmentally regulated plasma membrane polypeptides PUBMED:9415814.\ ' '5387' 'IPR008482' '\ This family consists of several uncharacterised bacterial and archaeal proteins of unknown function.\ ' '5388' 'IPR008423' '\ This family contains several Bacillus thuringiensis P21 proteins. These proteins are thought to be molecular chaperones and have mosquitocidal properties PUBMED:9023925,PUBMED:2644205.\ ' '5389' 'IPR008483' '\ This entry consists of several uncharacterised proteins from Borrelia species including Borrelia burgdorferi and Borrelia garinii.\ ' '5390' 'IPR008892' '\ This family consists of several WCOR413-like plant cold acclimation proteins.\ ' '5391' 'IPR008834' '\ This family consists of several SpvD plasmid virulence proteins from different Salmonella species.\ ' '5392' 'IPR008406' '\ This family contains several plant dormancy-associated and auxin-repressed proteins the function of which is poorly understood PUBMED:9684359.\ ' '5393' 'IPR008840' '\ This family contains both viral and bacterial proteins which are related to the Gp157 protein of the Streptococcus thermophilus SFi bacteriophage. 
It is thought that bacteria possessing the gene coding for this protein have an increased resistance to the bacteriophage PUBMED:9792848.\ ' '5394' 'IPR008791' '\ Interleukin-18 (IL-18) is a proinflammatory cytokine that plays a key role in the activation of natural killer and T helper 1 cell responses principally by inducing interferon-gamma (IFN-gamma). Several poxvirus genes encode proteins with sequence similarity to IL-18BPs. It has been shown that vaccinia, ectromelia and cowpox viruses secrete from infected cells a soluble IL-18BP (vIL-18BP) that may modulate the host antiviral response. The expression of vIL-18BPs by distinct poxvirus genera that cause local or general viral dissemination, or persistent or acute infections in the host, emphasises the importance of IL-18 in response to viral infections PUBMED:10769064.\ ' '5395' 'IPR008707' '\

    This domain consists of several PilC protein sequences from Neisseria gonorrhoeae and Neisseria meningitidis. PilC is a phase-variable protein associated with pilus-mediated adherence of pathogenic Neisseria to target cells PUBMED:9467907.

    \ ' '5396' 'IPR008385' '\ This family consists of several African swine fever virus (ASFV) j13L proteins PUBMED:9603333, PUBMED:9603332, PUBMED:7730797.\ ' '5397' 'IPR008756' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to MEROPS peptidase family M56 (clan M-). The predicted active site residues for members of this family occur in the motif HEXXH. The type example is BlaR1 peptidase from Bacillus licheniformis.

    \ \

    Production of beta-lactamase and penicillin-binding protein 2a (which mediate staphylococcal resistance to beta-lactam antibiotics) is regulated by a signal-transducing integral membrane protein and a transcriptional repressor. The signal transducer is a fusion protein with penicillin-binding and zinc metalloprotease domains. The signal for protein expression is transmitted by site-specific proteolytic cleavage of both the transducer, which auto-activates, and the repressor, which is inactivated, unblocking gene transcription.

    \ ' '5398' 'IPR008484' '\ This family consists of several short (27-30aa) porcine and bovine circovirus ORF6 proteins of unknown function.\ ' '5399' 'IPR008485' '\ This family consists of several eukaryotic proteins of unknown function.\ ' '5400' 'IPR008754' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M43 (cytophagalysin family, clan MA(M)), subfamily M43B. The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH.

    \ \

    The type example of this family is the pregnancy-associated plasma protein A (PAPP-A), which cleaves insulin-like growth factor (IGF) binding protein-4 (IGFBP-4), causing a dramatic reduction in its affinity for IGF-I and -II. Through this mechanism, PAPP-A is a regulator of IGF bioactivity in several systems, including the human ovary and the cardiovascular system PUBMED:10913121, PUBMED:11713222, PUBMED:11897673.

    \ ' '5401' 'IPR008719' '\ NosL is one of the accessory proteins of the nos (nitrous oxide reductase) gene cluster. NosL is a monomeric protein of 18,540 MW that specifically and stoichiometrically binds Cu(I). The copper ion in NosL is ligated by a Cys residue, and one Met and one His are thought to serve as the other ligands. It is possible that NosL is a copper chaperone involved in metallocentre assembly PUBMED:11293413.\ ' '5403' 'IPR008890' '\ This family consists of several RfbT proteins from Vibrio cholerae. It has been found that genetic alteration of the rfbT gene is responsible for serotype conversion of V. cholerae O1 PUBMED:7688846 and determines the difference between the Ogawa and Inaba serotypes, in that the presence of rfbT is sufficient for Inaba-to-Ogawa serotype conversion PUBMED:11035750.\ ' '5404' 'IPR008761' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \ This group of serine peptidases belongs to MEROPS peptidase family S37 (clan SC). The members of this group of secreted peptidases are restricted to bacteria. In Streptomyces lividans the peptidase removes tripeptides from the N terminus of extracellular proteins (tripeptidyl aminopeptidase, Tap) PUBMED:8920189, PUBMED:7487044.\ ' '5405' 'IPR008758' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S28 (clan SC). The predicted active site residues for members of this family and family S10 occur in the same order in the sequence: S, D, H.

    These serine proteases include several eukaryotic enzymes such as lysosomal Pro-X carboxypeptidase, dipeptidyl-peptidase II, and thymus-specific serine peptidase PUBMED:10527559, PUBMED:11003393, PUBMED:11139392, PUBMED:11173530.

    \ ' '5406' 'IPR000280' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)).

    The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is a Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein PUBMED:7845208.\ The p80 endopeptidase resides towards the middle of the polyprotein and is\ responsible for processing all non-structural pestivirus proteins PUBMED:7845208, PUBMED:1651596.\ The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted\ to have a fold similar to that of chymotrypsin PUBMED:7845208, PUBMED:2548336. An HDS catalytic triad\ has been identified PUBMED:2548336.

    \ ' '5407' 'IPR008760' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins PUBMED:10725411.

    \ ' '5408' 'IPR008763' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)).

    The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis PUBMED:11418578, PUBMED:11741860.

    \ ' '5409' 'IPR008762' '\

    Methyl-accepting chemotaxis proteins (MCPs) are a family of bacterial receptors that mediate chemotaxis to diverse signals, responding to changes in the concentration of attractants and repellents in the environment by altering swimming behaviour PUBMED:16359703. Environmental diversity gives rise to diversity in bacterial signalling receptors, and consequently there are many genes encoding MCPs PUBMED:17299051. For example, there are four well-characterised MCPs found in Escherichia coli: Tar (taxis towards aspartate and maltose, away from nickel and cobalt), Tsr (taxis towards serine, away from leucine, indole and weak acids), Trg (taxis towards galactose and ribose) and Tap (taxis towards dipeptides).

    \

    MCPs share similar topology and signalling mechanisms. MCPs either bind ligands directly or interact with ligand-binding proteins, transducing the signal to downstream signalling proteins in the cytoplasm. MCPs undergo two covalent modifications: deamidation and reversible methylation at a number of glutamate residues. Attractants increase the level of methylation, while repellents decrease it. The methyl groups are added by the methyltransferase CheR and are removed by the methylesterase CheB. Most MCPs are homodimers that contain the following organisation: an N-terminal signal sequence that acts as a transmembrane domain in the mature protein; a poorly-conserved periplasmic receptor (ligand-binding) domain; a second transmembrane domain; and a highly-conserved C-terminal cytoplasmic domain that interacts with downstream signalling components. The C-terminal domain contains the methylated glutamate residues.

    \ \

    This entry represents the N-terminal domain found in chemotaxis methyl-accepting proteins primarily from Vibrio species and Campylobacter species.

    \ ' '5410' 'IPR008764' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The peptidase families associated with clan U- have an unknown catalytic mechanism as the protein fold of the active site domain and the active site residues have not been reported.

    \

    This group of peptidases belongs to MEROPS peptidase family U57 (clan U-). The type example is the YabG protein of Bacillus subtilis. This is a protease involved in the synthesis and maturation of the spore coat proteins SpoIVA and YrbA of B. subtilis PUBMED:11040425.

    \ ' '5412' 'IPR008848' '\ This family consists of several plasmid regulatory proteins from the extreme thermophilic and acidophilic archaea Sulfolobus.\ ' '5413' 'IPR008737' '\

    This family of proteins is of unknown function and is found exclusively in nematodes PUBMED:7525414.

    \ ' '5414' 'IPR008399' '\

    Anthrax is an acute disease in humans and animals caused by the bacterium Bacillus anthracis, which can be lethal. There are effective vaccines against anthrax, and some forms of the disease respond well to antibiotic treatment. The anthrax bacillus is one of only a few that can form long-lived spores.

    \ \

    The anthrax toxin consists of the proteins protective antigen (PA), lethal factor (LF) and oedema factor (EF). The first step of toxin entry into host cells is the recognition by PA of a receptor on the surface of the target cell. The subsequent cleavage of receptor-bound PA enables EF and LF to bind and form a heptameric PA63 pre-pore, which triggers endocytosis. PA has been shown to bind to two cellular receptors: anthrax toxin receptor/tumour endothelial marker 8 and capillary morphogenesis protein 2 (CMG2), which are closely related host cell receptors. Both bind to PA with high affinity and are capable of mediating toxicity PUBMED:15243628, PUBMED:15079089, and both are type 1 membrane proteins that include an approximately 200-aa extracellular von Willebrand factor A (VWA) domain with a metal ion-dependent adhesion site (MIDAS) motif PUBMED:15079089.

    \ \ \

    This region is found in the putatively cytoplasmic C terminus of the anthrax receptor.

    \ ' '5415' 'IPR008400' '\

    Anthrax is an acute disease in humans and animals caused by the bacterium Bacillus anthracis, which can be lethal. There are effective vaccines against anthrax, and some forms of the disease respond well to antibiotic treatment. The anthrax bacillus is one of only a few that can form long-lived spores.

    \ \

    The anthrax toxin consists of the proteins protective antigen (PA), lethal factor (LF) and oedema factor (EF). The first step of toxin entry into host cells is the recognition by PA of a receptor on the surface of the target cell. The subsequent cleavage of receptor-bound PA enables EF and LF to bind and form a heptameric PA63 pre-pore, which triggers endocytosis. PA has been shown to bind to two cellular receptors: anthrax toxin receptor/tumour endothelial marker 8 and capillary morphogenesis protein 2 (CMG2), which are closely related host cell receptors. Both bind to PA with high affinity and are capable of mediating toxicity PUBMED:15243628, PUBMED:15079089, and both are type 1 membrane proteins that include an approximately 200-aa extracellular von Willebrand factor A (VWA) domain with a metal ion-dependent adhesion site (MIDAS) motif PUBMED:15079089.

    \ \ \

    This region is found in the putatively extracellular N-terminal half of the anthrax receptor. It is probably part of the Ig superfamily and most closely related to .

    \ ' '5416' 'IPR008903' '\ This family consists of several Clostridium botulinum haemagglutinin (HA) subcomponents. C. botulinum type D strain 4947 produces two different sizes of progenitor toxins (M and L) as intact forms without proteolytic processing. The M toxin is composed of neurotoxin (NT) and nontoxic-nonhaemagglutinin (NTNHA), whereas the L toxin is composed of the M toxin and haemagglutinin (HA) subcomponents (HA-70, HA-17, and HA-33) PUBMED:8631890.\ ' '5417' 'IPR008486' '\ This family consists of several uncharacterised hypothetical proteins from Rhizobium loti (Mesorhizobium loti).\ ' '5418' 'IPR008487' '\ This family consists of several uncharacterised hypothetical proteins of unknown function from Xylella fastidiosa, the organism that causes Pierce\'s disease in plants.\ ' '5419' 'IPR008312' '\

    There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. However, these proteins are encoded in type VI secretion loci (including the SCI genomic island in Salmonella enterica and the imp locus in Rhizobium leguminosarum) implicated in pathogenicity and protein secretion PUBMED:12437215, PUBMED:12580282, PUBMED:16763151.

    \ ' '5420' 'IPR008902' '\ This entry consists of bacterial rhamnosidase A and B enzymes. L-Rhamnose is abundant in biomass as a common constituent of glycolipids and glycosides, such as plant pigments, pectic polysaccharides, gums or biosurfactants. Some rhamnosides are important bioactive compounds. For example, terpenyl glycosides, the glycosidic precursor of aromatic terpenoids, act as important flavouring substances in grapes. Other rhamnosides act as cytotoxic rhamnosylated terpenoids, as signal substances in plants or play a role in the antigenicity of pathogenic bacteria PUBMED:10632887.\ ' '5421' 'IPR006530' '\

    These sequences contain two tandem copies of a 21-residue extracellular repeat that is found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin PUBMED:10341219, PUBMED:7934896, PUBMED:2403547.

    \ ' '5422' 'IPR008619' '\

    This highly divergent repeat occurs in a number of filamentous haemagglutinin (FHA) proteins implicated in cell aggregation PUBMED:2539596. Bordetella pertussis, the infectious agent in childhood whooping cough, expresses filamentous haemagglutinin as a surface-exposed and secreted protein that acts as a major virulence attachment factor, functioning as both a primary adhesin and an immunomodulator to bind the bacterium to cells of the respiratory epithelium PUBMED:16339899. The FHA molecule has a globular head that consists of two domains, a shaft, and a flexible tail. Its sequence contains two regions of tandem 19-residue repeats, where the repeat motif consists of short beta-helical strands separated by beta-turns PUBMED:7519681.

    \ ' '5423' 'IPR008489' '\ This is a family of uncharacterised ORFs found in Bacteriophage and Lactococcus lactis.\ ' '5424' 'IPR008860' '\ This family consists of several antigen proteins from Taenia and Echinococcus (tapeworm) species.\ ' '5425' 'IPR008769' '\

    Polyhydroxyalkanoates (PHAs) are storage polyesters synthesised by various bacteria as intracellular carbon and energy reserve material. PHAs are accumulated as water-insoluble inclusions within the cells. This family consists of the phasins PhaF and PhaI, which act as transcriptional regulators of PHA biosynthesis genes. PhaF has been proposed to repress expression of the phaC1 gene and the phaIF operon.

    \ ' '5426' 'IPR008490' '\ This family consists of several proteins from Sulfolobus solfataricus described as first ORF in transposon ISC1212.\ ' '5427' 'IPR008596' '\ This family consists of Rex/Tax proteins from Homo sapiens and simian T-cell leukaemia viruses. The exact function of these proteins is unknown.\ ' '5428' 'IPR008491' '\ This family contains several eukaryotic sequences which are thought to be CDK5 activator-binding proteins, however, the function of this family is unknown.\ ' '5430' 'IPR008429' '\ This family consists of several eukaryotic cleft lip and palate transmembrane protein 1 sequences. Cleft lip with or without cleft palate is a common birth defect that is genetically complex. The nonsyndromic forms have been studied genetically using linkage and candidate-gene association studies with only partial success in defining the loci responsible for orofacial clefting. CLPTM1 encodes a transmembrane protein and has strong homology to two Caenorhabditis elegans genes, suggesting that CLPTM1 may belong to a new gene family PUBMED:9828125. This family also contains the Homo sapiens cisplatin resistance related protein CRR9p which is associated with CDDP-induced apoptosis PUBMED:11162647.\ ' '5431' 'IPR008493' '\ This family consists of several eukaryotic proteins of unknown function.\ ' '5432' 'IPR008494' '\ This family consists of several highly related Mus musculus and Homo sapiens proteins of unknown function.\ ' '5433' 'IPR008598' '\ This family consists of several drought induced 19 (Di19) like proteins. Di19 has been found to be strongly expressed in both the roots and leaves of Arabidopsis thaliana during progressive drought PUBMED:7823904. 
The precise function of Di19 is unknown.\ ' '5434' 'IPR008495' '\ This family consists of several hypothetical proteins of unknown function, found in Borrelia burgdorferi and Borrelia garinii.\ ' '5436' 'IPR008496' '\ This family consists of several eukaryotic proteins of unknown function.\ ' '5437' 'IPR008662' '\

    This family contains Rattus norvegicus LAP1C proteins and several uncharacterised highly related sequences from both Mus sp. and humans. LAP1s (lamina-associated polypeptide 1s) are type 2 integral membrane proteins with a single membrane-spanning region of the inner nuclear membrane PUBMED:12061773. LAP1s bind to both A- and B-type lamins and have a putative role in the membrane attachment and assembly of the nuclear lamina PUBMED:7721789.

    \ ' '5438' 'IPR008497' '\ This family consists of several bacterial proteins of unknown function.\ ' '5439' 'IPR008498' '\ This family consists of several short proteins of unknown function found in Caenorhabditis species.\ ' '5440' 'IPR008499' '\ This family consists of uncharacterised proteins found in Mus musculus (mouse), man, zebra fish and other eukaryotes.\ ' '5441' 'IPR008644' '\

    U15 is an ORF present in human herpesvirus 6 (HHV-6), a virus that was initially isolated from patients with AIDS and lymphoproliferative disorders, but was subsequently shown to be responsible for the common childhood disease exanthema subitum (roseola). Several gene fragments of HHV-6 have been shown to activate the human immunodeficiency virus (HIV) type 1 long terminal repeat (LTR) PUBMED:11069999. The ORF U15 encodes a protein of 110 amino acids, whose function is unknown.

    \ ' '5442' 'IPR008500' '\ This family consists of porcine and bovine circovirus ORF3 proteins of unknown function.\ ' '5443' 'IPR008501' '\

    The Tho complex (THOC) is involved in transcription elongation and mRNA export from the nucleus PUBMED:15133499. This entry represents the subunit THOC7, which is found in higher eukaryotes, and the non-homologous subunit Mft1p found in yeast. The functions of these subunits are unknown, and it is not known if these subunits are functionally equivalent.

    \ ' '5444' 'IPR008708' '\ This family consists of several Neisseria meningitidis TspB virulence factor proteins.\ ' '5445' 'IPR008502' '\ This family consists of several proteins of unknown function found exclusively in Arabidopsis thaliana.\ ' '5446' 'IPR008503' '\ This family consists of several hypothetical proteins from different archaeal and bacterial species.\ ' '5447' 'IPR008505' '\

    This entry consists of several hypothetical proteins of unknown function from Borrelia species. They may be proteinases as the majority contain a propeptide proteinase inhibitor domain which is associated with both serine and metallopeptidases.

    \ ' '5448' 'IPR008506' '\ This family consists of several eukaryotic proteins of unknown function.\ ' '5449' 'IPR008868' '\ This family consists of several bacterial TniB NTP-binding proteins. TniB is a probable ATP-binding protein PUBMED:8195081 which is involved in Tn5053 mercury resistance transposition PUBMED:8594337.\ ' '5450' 'IPR008636' '\ This family consists of several HOOK1, 2 and 3 proteins from different eukaryotic organisms. The different members of the Homo sapiens gene family are HOOK1, HOOK2 and HOOK3. Different domains have been identified in the three Homo sapiens HOOK proteins, and it was demonstrated that the highly conserved NH2-domain mediates attachment to microtubules, whereas the central coiled-coil motif mediates homodimerisation and the more divergent C-terminal domains are involved in binding to specific organelles (organelle-binding domains). It has been demonstrated that endogenous HOOK3 binds to Golgi membranes PUBMED:11238449, whereas both HOOK1 and HOOK2 are localised to discrete but unidentified cellular structures. In mice the Hook1 gene is predominantly expressed in the testis. Hook1 function is necessary for the correct positioning of microtubular structures within the haploid germ cell. Disruption of Hook1 function in mice causes abnormal sperm head shape and fragile attachment of the flagellum to the sperm head PUBMED:12075009.\ ' '5451' 'IPR008507' '\ This family consists of several plant proteins of unknown function.\ ' '5452' 'IPR008664' '\ This domain consists of mammalian LISCH7 protein homologues. LISCH7 is a liver-specific BHLH-ZIP transcription factor.\ ' '5453' 'IPR008728' '\

    The RNA polymerase II elongator complex is a major histone acetyltransferase component of the RNA polymerase II (RNAPII) holoenzyme and is involved in transcriptional elongation PUBMED:11689709, PUBMED:11904415. It may also play some role in wobble uridine tRNA modification PUBMED:15769872. This entry represents the ELP4 subunit. ELP4 is not required for the association of the complex with nascent RNA transcript, but is required for complex integrity and histone acetyltransferase activity. It is also required for an early step in synthesis of 5-methoxycarbonylmethyl (mcm5) and 5-carbamoylmethyl (ncm5) groups present on uridines at the wobble position in tRNA in yeast species.

    \ ' '5454' 'IPR008508' '\ This family consists of several hypothetical bacterial and archaeal proteins whose functions have not been experimentally verified. Computational analysis of sequence, predicted structure and genomic context suggests that these proteins may be endonucleases involved in restriction-modification and/or DNA excision repair PUBMED:15972856.\ ' '5455' 'IPR008700' '\

    This entry contains , which has been characterised as being involved in RPM1-mediated resistance in Arabidopsis thaliana (Mouse-ear cress) PUBMED:11955429, PUBMED:15845764. Rin4 is an essential regulator of plant defence that plays a central role in resistance in case of infection by a pathogen. It is a common target for both type III avirulence proteins from Pseudomonas syringae (AvrB, AvrRpm1 and AvrRpt2) and for the plant Resistance (R) proteins RPM1 and RPS2. In strains carrying the appropriate R gene for avirulence proteins of the pathogen, its association with avirulence proteins triggers a defence system including the hypersensitive response, which limits the spread of disease. In contrast, in plants lacking appropriate R genes, its association with avirulence proteins of the pathogen impairs the defence system and leads to pathogen multiplication.

    \ \

    Rin4 interacts with the unrelated avirulence proteins AvrB, AvrRpm1 and AvrRpt2 from P. syringae. Its association with AvrB and AvrRpm1 results in its phosphorylation, which is in turn recognised by the resistance RPM1 protein, leading to the activation of RPM1-dependent disease resistance responses; while its interaction with AvrRpt2 results in its destruction, which activates RPS2-dependent disease resistance responses.

    \ \

    similarly appears to be involved in the rice defence response to Magnaporthe grisea (Rice blast fungus) (Pyricularia grisea) PUBMED:17073304.

    \ ' '5456' 'IPR008420' '\ This family consists of P13 proteins from Borrelia species. P13 is a 13 kDa integral membrane protein which is post-translationally processed at both ends and modified by an unknown mechanism PUBMED:11292755.\ ' '5457' 'IPR008706' '\ This family consists of a group of 17.4 kDa nanovirus proteins which are highly related to the Faba bean necrotic yellows virus component 8 protein whose function is unknown PUBMED:9880028.\ ' '5458' 'IPR008701' '\ This family consists of several NPP1 like necrosis inducing proteins from oomycetes, fungi and bacteria. Infiltration of NPP1 into leaves of Arabidopsis thaliana plants results in transcript accumulation of pathogenesis-related (PR) genes, production of ROS and ethylene, callose apposition, and HR-like cell death PUBMED:12410815.\ ' '5459' 'IPR008509' '\ This family consists of several eukaryotic proteins of unknown function.\ ' '5460' 'IPR008510' '\ This entry consists of several hypothetical proteins found in Borrelia species.\ ' '5461' 'IPR008511' '\ This family consists of several plant proteins of unknown function.\ ' '5462' 'IPR008512' '\ This family consists of several plant proteins of unknown function.\ ' '5463' 'IPR008815' '\

    This family consists of bacterial proteins encoded within an intervening sequence present within some 23S rRNA genes PUBMED:8341711. The function of these proteins is not known, but a structural study indicates that each monomer folds into an antiparallel four-helix bundle, while the overall protein is a homopentamer with a toroid-shaped structure containing a tapered central channel PUBMED:16948161.

    \ ' '5464' 'IPR008513' '\ This family consists of several bacterial proteins of unknown function.\ ' '5465' 'IPR008630' '\ This family contains a number of glycosyltransferase enzymes that contain a DXD motif. This family includes a number of Caenorhabditis elegans homologues where the DXD is replaced by DXH. Some members of this family are included in glycosyltransferase family 34.\ ' '5466' 'IPR008514' '\

    This entry includes the virulence protein Hcp1 () from Pseudomonas aeruginosa, pathogenic bacteria that can cause chronic lung infections in cystic fibrosis patients. Hcp1 is a hexameric protein that forms rings with a 40-angstrom internal diameter PUBMED:16763151. Hcp1 functions during chronic P. aeruginosa infections, and can be detected in secretions from infected cystic fibrosis patients. Hcp1 appears to be part of a protein secretion apparatus that is required for virulence. Several bacterial pathogens mediate interactions with their hosts through protein secretion, often involving Hcp1-like virulence loci, which are widely distributed among pathogenic bacteria.

    \ \ ' '5467' 'IPR008515' '\ This family consists of several short bacterial proteins of unknown function.\ ' '5468' 'IPR008516' '\

    NKAIN (Na,K-Atpase INteracting) proteins are a family of evolutionarily conserved transmembrane proteins that localise to neurons, are critical for neuronal function, and interact with the beta subunits of Na,K-ATPase (beta1 in vertebrates and beta in Drosophila). NKAINs have highly conserved transmembrane domains but no other characterised domains. NKAINs may function as subunits of pore or channel structures in neurons or they may affect the function of other membrane proteins. They are likely to function within the membrane bilayer PUBMED:17606467.

    \ ' '5469' 'IPR008395' '\ This domain is related to the TUDOR domain PUBMED:12575993. The function of the agenet domain is unknown. This signature matches one of the two Agenet domains in the FMR proteins PUBMED:12575993.\ ' '5470' 'IPR008845' '\ This family consists of several Theileria P67 surface antigens. A stage specific surface antigen of Theileria parva, p67, is the basis for the development of an anti-sporozoite vaccine for the control of East Coast fever (ECF) in Bos taurus. The antigen has been shown to contain five distinct linear peptide sequences recognised by sporozoite-neutralising murine monoclonal antibodies PUBMED:10024569.\ ' '5471' 'IPR008517' '\ This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative lipoproteins.\ ' '5472' 'IPR008518' '\ This family consists of several eukaryotic proteins of unknown function.\ ' '5473' 'IPR008806' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \ This entry describes the C-terminal region of several DNA-directed RNA polymerase III polypeptides which are related to the Saccharomyces cerevisiae RPC82 protein. RNA polymerase C (III) promotes the transcription of tRNA and 5S RNA genes. In Saccharomyces cerevisiae, the enzyme is composed of 15 subunits, ranging from 160 to about 10 kDa PUBMED:1406632.\ ' '5476' 'IPR008733' '\

    This family consists of several peroxisomal biogenesis factor 11 (PEX11) proteins from several eukaryotic species. The PEX11 peroxisomal membrane proteins promote peroxisome division in multiple eukaryotes PUBMED:12417726. PEX11 genes in rice have diversification not only in sequences but also in expression patterns under normal and various stress conditions PUBMED:18291602.

    \ ' '5477' 'IPR008753' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of metallopeptidases belongs to the MEROPS peptidase family M13 (neprilysin family, clan MA(E)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH PUBMED:7674922.

    \ \

    M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk PUBMED:11223883, PUBMED:7674922. The family includes eukaryotic and prokaryotic oligopeptidases, as well as some of the proteins responsible for the molecular basis of the blood group antigens e.g. Kell PUBMED:7674922.

    \ \

    Neprilysin () is another member of this group; it is variously known as common acute lymphoblastic leukemia antigen (CALLA), enkephalinase (gp100) and neutral endopeptidase metalloendopeptidase (NEP). It is a plasma membrane-bound mammalian enzyme that is able to digest biologically-active peptides, including enkephalins PUBMED:7674922. The zinc ligands of neprilysin are known and are analogous to those in thermolysin, a related peptidase PUBMED:7674922, PUBMED:8099556. Neprilysins, like thermolysin, are inhibited by phosphoramidon, which appears to selectively inhibit this family in mammals. The enzymes are all oligopeptidases, digesting oligo- and polypeptides, but not proteins PUBMED:7674922. Neprilysin consists of a short cytoplasmic domain, a membrane-spanning region and a large extracellular domain. The cytoplasmic domain contains a conformationally-restrained octapeptide, which is thought to act as a stop transfer sequence that prevents proteolysis and secretion PUBMED:7674922, PUBMED:3555489.

    \ \ \ ' '5478' 'IPR008520' '\ This region is found as two or more repeats in a small number of hypothetical proteins.\ ' '5479' 'IPR008599' '\ This region is found in several proteins characterised as carbohydrate diacid regulators (e.g. ). An HTH DNA-binding motif is found at the C terminus of these proteins suggesting that this region includes the sugar recognition region.\ ' '5480' 'IPR008594' '\

    This entry represents scavenger mRNA decapping enzymes, such as Dcp2 and DcpS. DcpS is a scavenger pyrophosphatase that hydrolyses the residual cap structure following 3\' to 5\' mRNA degradation. DcpS uses cap dinucleotides or capped oligonucleotides as substrate to release m(7)GMP (N7-methyl GMP), while Dcp2 uses capped mRNA as substrate in order to hydrolyse the cap to release m(7)GDP (N7-methyl GDP) PUBMED:16246173. The association of DcpS with 3\' to 5\' exonuclease exosome components suggests that these two activities are linked and there is a coupled exonucleolytic decay-dependent decapping pathway. The family contains a histidine triad (HIT) sequence with three histidines separated by hydrophobic residues PUBMED:16001405. The central histidine within the DcpS HIT motif is critical for decapping activity and defines the HIT motif as a new mRNA decapping domain, making DcpS the first member of the HIT family of proteins with a defined biological function. This family is related to

    \

    More information about these proteins can be found at Protein of the Month: RNA Exosomes.

    \ ' '5481' 'IPR008521' '\ This family consists of several eukaryotic proteins of unknown function.\ ' '5483' 'IPR008799' '\ This family consists of several avirulence D (AvrD) proteins primarily found in Pseudomonas syringae PUBMED:10485919.\ ' '5484' 'IPR008523' '\ This family consists of several bacterial proteins of unknown function.\ ' '5485' 'IPR008524' '\ This family consists of several Siphovirus and Lactococcus proteins of unknown function. The viral sequences are thought to be tail component proteins.\ ' '5486' 'IPR008640' '\

    Hep/Hag is a seven-residue repeat that makes up the majority of the sequence of a family of bacterial haemagglutinins and invasins. As many as ten copies of the Hep/Hag motif can be present in these proteins. A family of immunodominant antigens identified in Burkholderia mallei (Pseudomonas mallei) and Burkholderia pseudomallei (Pseudomonas pseudomallei), known as Hep_Hag autotransporter (BuHA) proteins, has been found to share protein domain architectures with haemagglutinins and invasins PUBMED:17362501.

    \ ' '5487' 'IPR008808' '\ This family consists of several broad-spectrum mildew resistance proteins from Arabidopsis thaliana and other dicots. Plant disease resistance (R) genes control the recognition of specific pathogens and activate subsequent defence responses. The A. thaliana locus Resistance to powdery mildew 8 (RPW8) contains two naturally polymorphic, dominant R genes, RPW8.1 and RPW8.2, which individually control resistance to a broad range of powdery mildew pathogens. They induce localised, salicylic acid-dependent defences similar to those induced by R genes that control specific resistance. Apparently, broad-spectrum resistance mediated by RPW8 uses the same mechanisms as specific resistance PUBMED:11141561, PUBMED:12509520.\ ' '5488' 'IPR008525' '\ This family consists of several proteins of unknown function from Coxiella burnetii (the causative agent of a zoonotic disease called Q fever).\ ' '5489' 'IPR008526' '\ This family consists of several bacterial proteins of unknown function.\ ' '5490' 'IPR008635' '\ This short motif is found in invasins and haemagglutinins, normally associated with the Hep_Hag repeat ().\ ' '5491' 'IPR008527' '\ This family consists of several proteins of unknown function from Raphanus sativus (Radish) and Brassica napus (Rape).\ ' '5492' 'IPR008528' '\ This family consists of several plant proteins of unknown function.\ ' '5493' 'IPR008529' '\

    These are hypothetical proteins from the proteobacteria.

    \ ' '5494' 'IPR008617' '\

    The function of these proteins is unknown.

    \ ' '5495' 'IPR008530' '\ This family consists of several eukaryotic proteins of unknown function.\ ' '5497' 'IPR008831' '\

    This entry represents subunit Med31 of the Mediator complex. It contains the Saccharomyces cerevisiae SOH1 homologues. SOH1 is responsible for the repression of temperature sensitive growth of the HPR1 mutant PUBMED:7982575 and has been found to be a component of the RNA polymerase II transcription complex. SOH1 interacts with factors involved not only in DNA repair but also in transcription. Thus, the SOH1 protein may serve to couple these two processes PUBMED:8849885.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '5498' 'IPR008532' '\ This domain occurs in proteins that have been annotated as Fibronectin/fibrinogen binding protein by similarity. This annotation comes from where the N-terminal region is involved in this activity PUBMED:8063411. Hence the activity of this C-terminal domain is unknown. This domain contains a conserved motif D/E-X-W/Y-X-H that may be functionally important.\ ' '5499' 'IPR008627' '\ This pentapeptide repeat is found mainly in Caenorhabditis elegans. The most conserved amino acid at each position leads to its name GETHR. The family also includes a divergent repeat in a microneme protein . The function of this repeat is unknown.\ ' '5500' 'IPR008604' '\ The organisation of microtubules varies with the cell type and is presumably controlled by tissue-specific microtubule-associated proteins (MAPs). The 115 kDa epithelial MAP (E-MAP-115) has been identified as a microtubule-stabilising protein predominantly expressed in cell lines of epithelial origin PUBMED:9745708. The binding of this microtubule associated protein is nucleotide independent PUBMED:8408219.\ ' '5501' 'IPR008533' '\ This domain consists of several bacterial proteins of unknown function.\ ' '5502' 'IPR008534' '\

    This family includes proteins that are about 200 amino acids in length. The proteins are all from baculoviruses. This family includes ORF107 from Orgyia pseudotsugata multicapsid polyhedrosis virus (OpMNPV) and a variety of other numbered ORF proteins, such as ORF52 , ORF140 from other baculoviruses. The function of these proteins is unknown.

    \ ' '5503' 'IPR008535' '\ This family consists of several bacterial proteins of unknown function.\ ' '5504' 'IPR008698' '\ This family consists of several NADH-ubiquinone oxidoreductase B18 subunit proteins from different eukaryotic organisms. Oxidative phosphorylation is the well-characterised process in which ATP, the principal carrier of chemical energy of individual cells, is produced due to a mitochondrial proton gradient formed by the transfer of electrons from NADH and FADH2 to molecular oxygen. The oxidative phosphorylation (OXPHOS) system is located in the mitochondrial inner membrane and consists of five multi-subunit enzyme complexes and two small electron carriers: coenzyme Q10 and cytochrome C. At least 70 structural proteins involved in the formation of the whole OXPHOS system are encoded by nuclear genes, whereas 13 structural proteins are encoded by the mitochondrial genome. Deficiency of NADH ubiquinone oxidoreductase, the first enzyme complex of the mitochondrial respiratory chain, is one of the most frequent causes of Homo sapiens mitochondrial encephalomyopathies PUBMED:10830904.\

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49 kDa subunits of complex I PUBMED:18982432.

    \ \ ' '5505' 'IPR008536' '\ This family consists of several Chlamydia and Parachlamydia proteins, the functions of which are unknown.\ ' '5506' 'IPR008889' '\ This short motif is found in a variety of plant proteins. These proteins vary greatly in length and are mostly composed of low complexity regions. They all conserve a short motif FXhVQChTG, where X is any amino acid and h is a hydrophobic amino acid. The function of this motif is uncertain; however, one protein in this family has been found to bind the SigA sigma factor . It would seem plausible that this motif is needed for this activity and that this whole family might be involved in modulating plastid sigma factors.\ ' '5507' 'IPR008428' '\

    This family represents Chondroitin N-acetylgalactosaminyltransferase. Proteins have a type II transmembrane topology.\ \ The enzyme is involved in the biosynthetic initiation and elongation of chondroitin sulphate and is the key enzyme responsible for the selective chain assembly of chondroitin/dermatan sulphate on the linkage region tetrasaccharide common to various proteoglycans containing chondroitin/dermatan sulphate or heparin/heparan sulphate chains.\

    \ ' '5508' 'IPR008386' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) () are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit E found in the F0 complex of F-ATPases. Mitochondrial F-ATPases can associate together to form dimeric or oligomeric complexes, such interactions involving the physical association of membrane-embedded F0 complexes. In yeast, the F0 complex E subunit appears to play an important role in supporting F-ATPase dimerisation. This subunit is anchored to the inner mitochondrial membrane via its N-terminal region, which is involved in stabilising subunits G and K of the F0 complex. The C-terminal region of subunit E is hydrophilic, protruding into the intermembrane space where it can also help stabilise the F-ATPase dimer complex PUBMED:15701797.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '5509' 'IPR004646' '\

    A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase (see ). Proteins in this group represent a subset of closely related proteins or modules, including the Escherichia coli tartrate dehydratase alpha chain and the N-terminal region of the class I fumarase (where the C-terminal region is homologous to the tartrate dehydratase beta chain). The activity of archaeal proteins in this group is unknown.

    \ ' '5511' 'IPR004647' '\

    A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase (see ). Proteins in this group represent a subset of closely related proteins or modules, including the Escherichia coli tartrate dehydratase beta chain and the C-terminal region of the class I fumarase (where the N-terminal region is homologous to the tartrate dehydratase alpha chain). The activity of the archaeal proteins in this group is unknown.

    \ \ ' '5512' 'IPR008537' '\ This family contains proteins of unknown function from archaeal, bacterial and plant species.\ ' '5513' 'IPR008538' '\ This entry consists of a number of hypothetical proteins from the Anabaena and Synechocystis cyanobacterial species.\ \

    The function of these proteins is unknown. A small number of them also contain Clp_N domains () that are involved in protein interactions.

    \ ' '5514' 'IPR008539' '\

    This family consists of a group of proteins that may be involved in lipopolysaccharide modification PUBMED:10482503. Members are functionally uncharacterised.

    \ ' '5515' 'IPR008540' '\ This group of proteins contains members of the BZR1/LAT61 family of plant transcriptional repressors involved in controlling the response to Brassinosteroids (BRs). BRs are plant hormones that play essential roles in growth and development. BZR1 binds directly to DNA, repressing the expression of genes involved in BR synthesis. Phosphorylation of BZR1 by BIN2 targets BZR1 to the 20S proteasome, while dephosphorylation leads to nuclear accumulation of BZR1 PUBMED:15681342.\ ' '5516' 'IPR008542' '\ This family consists of a series of repeated sequences (of around 180 residues) which are found in Salmonella typhimurium, Salmonella typhi and Escherichia coli. These repeats are almost always found with this entry. The repeats are associated with RatA and RatB, the coding sequences of which are found in the pathogenicity island of Salmonella. The sequences may be determinants of pathogenicity PUBMED:12540539, PUBMED:15347755.\ ' '5517' 'IPR008541' '\ This family consists of a series of repeated sequences (of around 180 residues) which are found in Salmonella typhimurium, Salmonella typhi and Escherichia coli. These repeats are almost always found with . The repeats are associated with RatA and RatB, the coding sequences of which are found in the pathogenicity island of Salmonella. The sequences may be determinants of pathogenicity PUBMED:12540539, PUBMED:15347755.\ ' '5518' 'IPR008867' '\ This family consists of several bacterial thiazole biosynthesis protein G sequences. ThiG, together with ThiF and ThiH, is proposed to be involved in the synthesis of 4-methyl-5-(b-hydroxyethyl)thiazole (THZ) which is an intermediate in the thiazole production pathway PUBMED:9371431.\ ' '5519' 'IPR008811' '\ This family consists of several raffinose synthase proteins, also known as seed imbibition (Sip1) proteins. 
Raffinose (O-alpha- D-galactopyranosyl- (1-->6)- O-alpha- D-glucopyranosyl-(1-->2)- O-beta- D-fructofuranoside) is a widespread oligosaccharide in plant seeds and other tissues. Raffinose synthase () is the key enzyme that channels sucrose into the raffinose oligosaccharide pathway PUBMED:12244450.\ ' '5520' 'IPR008692' '\

    This entry represents a domain found in haemagglutinin membrane proteins from several pathogenic Mycoplasma species PUBMED:12593736. These haemagglutinins are immunogenic, variably expressed surface proteins, whose antigenic variation may contribute to bacterial immune evasion. Haemagglutinins are thought to be responsible for the agglutination and haemolysis of erythrocytes after Mycoplasma infection, which causes severe anaemia PUBMED:15609286. In the avian parasite Mycoplasma synoviae, the two major membrane antigens, MspA (haemagglutinin) and MspB (lipoprotein), are encoded by a single gene, vlhA (variable lipoprotein and haemagglutinin) PUBMED:15758238.

    \ ' '5521' 'IPR008631' '\ This family consists of the eukaryotic glycogen synthase proteins GYS1, GYS2 and GYS3. Glycogen synthase (GS) is the enzyme responsible for the synthesis of alpha-1,4-linked glucose chains in glycogen. It is the rate-limiting enzyme in the synthesis of the polysaccharide, and its activity is highly regulated through phosphorylation at multiple sites and also by allosteric effectors, mainly glucose 6-phosphate (G6P) PUBMED:11415431.\ ' '5522' 'IPR008826' '\ This family consists of several eukaryotic selenium binding proteins as well as three sequences from archaea. The exact function of this protein is unknown although it is thought that SBP56 participates in late stages of intra-Golgi protein transport PUBMED:10799528. The Lotus japonicus homologue of SBP56, LjSBP, is thought to have more than one physiological role and can be implicated in controlling the oxidation/reduction status of target proteins in vesicular Golgi transport PUBMED:12026169.\ ' '5523' 'IPR008543' '\ This family consists of chloroplast encoded Ycf2, which is around 2000 residues in length. The function of Ycf2 is unknown, though it may be an ATPase. Its retention in reduced chloroplast genomes of non-photosynthetic plants, e.g. Epifagus virginiana (Beechdrops), and transformation experiments in tobacco indicate that it has an essential function which is probably not related to photosynthesis PUBMED:10792825.\ ' '5524' 'IPR008544' '\ This family consists of several enterobacterial and siphoviral sequences of unknown function.\ ' '5525' 'IPR008881' '\ In the Escherichia coli cytosol, a fraction of the newly synthesised proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. 
Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues, renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains PUBMED:12603737. This group of sequences contains the ribosomal subunit association domain.\ ' '5526' 'IPR008880' '\

    In the Escherichia coli cytosol, a fraction of the newly synthesised proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains PUBMED:12603737.

    \

    This entry represents the C-terminal domain of bacterial trigger factor proteins, which has a multi-helical structure consisting of an irregular array of long and short helices. This domain is structurally similar to the peptide-binding domain of the bacterial porin chaperone SurA.

    \ ' '5527' 'IPR008906' '\ This dimerisation domain is found at the C terminus of the transposases of elements belonging to the Activator superfamily (hAT element superfamily). The isolated dimerisation domain forms extremely stable dimers in vitro PUBMED:10662858.\ ' '5528' 'IPR008409' '\ This family consists of several eukaryotic sequences of unknown function. The mammalian members of this family are annotated as breast carcinoma amplified sequence 2 (BCAS2) proteins PUBMED:12169396. BCAS2 is a putative spliceosome associated protein PUBMED:9731529.\ ' '5529' 'IPR008545' '\ This family consists of several plant proteins of unknown function. Several sequences in this family are described as being myosin heavy chain-like.\ ' '5530' 'IPR008647' '\ UL49.5 protein consists of 98 amino acids with a calculated molecular mass of 10,155 Da. It contains putative signal peptide and transmembrane domains but lacks a consensus sequence for N glycosylation. UL49.5 protein is an O-glycosylated structural component of the viral envelope PUBMED:8551587.\ ' '5531' 'IPR008546' '\ This domain consists of several plant proteins of unknown function.\ ' '5532' 'IPR008441' '\ This family consists of several capsular polysaccharide proteins. Capsular polysaccharide (CPS) is a major virulence factor in Streptococcus pneumoniae PUBMED:11179285.\ ' '5533' 'IPR008547' '\ This signature identifies Transmembrane protein 53 family members, which have no known function but are predicted to be integral membrane proteins.\ \ \ ' '5534' 'IPR008425' '\ This family consists of cyclin-dependent kinase inhibitor 3 or kinase associated phosphatase proteins from several mammalian species. 
The cyclin-dependent kinase (Cdk)-associated protein phosphatase (KAP) is a Homo sapiens dual specificity protein phosphatase that dephosphorylates Cdk2 on threonine 160 in a cyclin-dependent manner PUBMED:10987270, PUBMED:8127873.\ ' '5535' 'IPR008900' '\ This family consists of bacterial and viral proteins which are very similar to the Zonular occludens toxin (Zot). Zot is elaborated by bacteriophage present in toxigenic strains of Vibrio cholerae. Zot is a single polypeptide chain of 44.8 kDa, with the ability to reversibly alter intestinal epithelial tight junctions, allowing the passage of macromolecules through mucosal barriers.\ ' '5536' 'IPR008548' '\ This family contains the G6 protein from Vaccinia virus (strain Copenhagen) and related proteins from other orthopoxviruses. The proteins are uncharacterised.\ ' '5537' 'IPR008841' '\ This family consists of several Siphovirus tail component proteins as well as some bacterial proteins of unknown function.\ ' '5538' 'IPR008455' '\ This region is found in a group of Dictyostelium discoideum (Slime mould) proteins. It is likely to form a coiled-coil. Some of the proteins are regulated by cyclic AMP and are expressed late in development PUBMED:2157129.\ ' '5539' 'IPR008884' '\ This family consists of bacterial macrocin O-methyltransferase (TylF) proteins. TylF is responsible for the methylation of macrocin to produce tylosin. Tylosin is a macrolide antibiotic used in veterinary medicine to treat infections caused by Gram-positive bacteria and as an animal growth promoter in the Sus scrofa (Pig) industry. It is produced by several Streptomyces species. 
As with other macrolides, the antibiotic activity of tylosin is due to the inhibition of protein biosynthesis by a mechanism that involves the binding of tylosin to the ribosome, preventing the formation of the mRNA-aminoacyl-tRNA-ribosome complex PUBMED:10220165.\ ' '5540' 'IPR008676' '\ This family consists of three different eukaryotic proteins (mortality factor 4 (MORF4/MRG15), male-specific lethal 3(MSL-3) and ESA1-associated factor 3(EAF3)). It is thought that the MRG family is involved in transcriptional regulation via histone acetylation PUBMED:11290425, PUBMED:11036083.\ ' '5541' 'IPR008687' '\ This family consists of several bacterial MobC-like, mobilisation proteins. MobC proteins belong to the group of relaxases. Together with MobA and MobB they bind to a single cis-active site of a mobilising plasmid, the origin of transfer (oriT) region PUBMED:11976306. The absence of MobC has several different effects on oriT DNA. Site- and strand-specific nicking by MobA protein is severely reduced, accounting for the lower frequency of mobilisation. The localised DNA strand separation required for this nicking is less affected, but becomes more sensitive to the level of active DNA gyrase in the cell. In addition, strand separation is not efficiently extended through the region containing the nick site. These effects suggest a model in which MobC acts as a molecular wedge for the relaxosome-induced melting of oriT DNA. The effect of MobC on strand separation may be partially complemented by the helical distortion induced by supercoiling. However, MobC extends the melted region through the nick site, thus providing the single-stranded substrate required for cleavage by MobA PUBMED:9302013.\ ' '5542' 'IPR008421' '\ This family consists of several virulent strain associated lipoproteins from Borrelia burgdorferi.\ ' '5543' 'IPR008899' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This (predicted) zinc finger is found in the Bassoon and Piccolo proteins, both of which are components of the presynaptic cytoskeletal matrix (PCM) assembled at the active zone of neurotransmitter release, where Piccolo plays a role in the trafficking of synaptic vesicles (SVs) PUBMED:10707984, PUBMED:9679147, PUBMED:14734538. The Piccolo zinc fingers were found to interact with the dual prenylated rab3A and VAMP2/Synaptobrevin II receptor PRA1. There are eight conserved cysteines in Piccolo-type zinc fingers, suggesting that they coordinate two zinc ions.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5544' 'IPR018292' '\ This family consists of several mammalian protein kinase A anchoring protein 3 (PRKA3) or A-kinase anchor protein 110 kDa (AKAP 110) sequences. Agents that increase intracellular cAMP are potent stimulators of sperm motility. Anchoring inhibitor peptides, designed to disrupt the interaction of the cAMP-dependent protein kinase A (PKA) with A kinase-anchoring proteins (AKAPs), are potent inhibitors of sperm motility. PKA anchoring is a key biochemical mechanism controlling motility. AKAP110 shares compartments with both RI and RII isoforms of PKA and may function as a regulator of both motility- and head-associated functions such as capacitation and the acrosome reaction PUBMED:10319321.\

    This entry represents a subgroup of the A-kinase anchor 110 kDa protein.

    \ ' '5545' 'IPR008878' '\

    These proteins are found in insertion sequences related to IS66. The function of these proteins is uncertain, but they are probably essential for transposition PUBMED:11418571.

    \

    More information about these proteins can be found at Protein of the Month: Transposase PUBMED:.

    \ ' '5546' 'IPR008789' '\ This family consists of several highly related Poxvirus sequences which are thought to be intermediate transcription factors PUBMED:1660196.\ ' '5547' 'IPR008628' '\ This family consists of several eukaryotic GPP34-like proteins. GPP34 localises to the Golgi complex and is conserved from Saccharomyces cerevisiae to humans. The cytosolically exposed location of GPP34 predicts a role for a novel coat protein in Golgi trafficking PUBMED:11042173.\ ' '5548' 'IPR008601' '\ This family is based on a group of Dictyostelium discoideum (Slime mould) proteins that are essential in early development PUBMED:2153977, are located on the cell surface and mediate cell-cell adhesion.\ ' '5549' 'IPR008775' '\ This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins as well as a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalysing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum\'s disease (RD), which is an inherited neurological syndrome biochemically characterised by the accumulation of phytanic acid in plasma and tissues PUBMED:10767344.\ ' '5550' 'IPR008888' '\ This domain consists of several Ustilago mating-type proteins. The b locus of the phytopathogenic fungus Ustilago maydis encodes a multiallelic recognition function that controls the ability of the fungus to form a dikaryon and complete the sexual stage of the life cycle. The b locus has at least 25 alleles and any combination of two different alleles, brought together by mating between haploid cells, allows the fungus to cause disease and undergo sexual development within the plant PUBMED:2227416.\ ' '5552' 'IPR008854' '\ This family consists of thiopurine S-methyltransferase proteins from both eukaryotes and prokaryotes. 
Thiopurine S-methyltransferase (TPMT) is a cytosolic enzyme that catalyses S-methylation of aromatic and heterocyclic sulphydryl compounds, including anticancer and immunosuppressive thiopurines PUBMED:9780226.\ ' '5553' 'IPR008615' '\ This repeat is approximately 22 residues long and is only found in Dictyostelium discoideum (Slime mould). It appears to be related to . The alignment consists of two tandem repeats. It is termed the FNIP repeat after the pattern of conserved residues.\ ' '5554' 'IPR008778' '\

    This entry represents C-terminal domain of Pirin proteins from both eukaryotes and prokaryotes.

    \ \

    The function of Pirin is unknown, but the gene coding for this protein is known to be expressed in all tissues in the human body, most strongly in the liver and heart. Pirin is known to be a nuclear protein, exclusively localised within the nucleoplasm and predominantly concentrated within dot-like subnuclear structures PUBMED:9079676.

    \ \

    Pirin is composed of two structurally similar domains arranged face to face. The N-terminal domain additionally features four beta-strands, and the C-terminal domain also includes four additional beta-strands and a short alpha-helix. Although the two domains are similar, the C-terminal domain of Pirin differs from the N-terminal domain as it does not contain a metal binding site and its sequence does not contain the conserved metal-coordinating residues PUBMED:1457596.

    \ \

    Pirin is confirmed to be a member of the cupin superfamily on the basis of primary sequence and structural similarity. The presence of a metal binding site in the N-terminal beta-barrel of Pirin may be significant in its role in regulating NFI DNA replication and NF-kappaB transcription factor activity PUBMED:1457596.

    \ \

    Pirin structure has been found to closely resemble members of the cupin superfamily. Pirin contains the two characteristic sequences of the cupin superfamily, namely PG-(X)5-HXH-(X)4-E-(X)6-G and G-(X)5-PXG-(X)2-H-(X)3-N separated by a variable stretch of 15-50 amino acids. These motifs are best conserved in the N-terminal where the conserved histidine and glutamic acid residues correspond to the metal-coordinating residues. The C-terminal domain motifs lack the metal binding residues normally associated with the cupin fold PUBMED:1457596.
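
    The two cupin motifs quoted above can be read as a sequence pattern. The following is a minimal sketch, assuming that X stands for any single residue and that the 15-50 residue linker sits directly between the two motifs; the names MOTIF_1, MOTIF_2, CUPIN and find_cupin_motifs are hypothetical and are not part of any InterPro tool.

```python
import re

# Motif 1 from the text: PG-(X)5-HXH-(X)4-E-(X)6-G, with X read as any residue.
MOTIF_1 = "PG.{5}H.H.{4}E.{6}G"
# Motif 2 from the text: G-(X)5-PXG-(X)2-H-(X)3-N.
MOTIF_2 = "G.{5}P.G.{2}H.{3}N"
# Assumption: the variable stretch of 15-50 residues lies directly between the motifs.
CUPIN = re.compile("(" + MOTIF_1 + ").{15,50}(" + MOTIF_2 + ")")


def find_cupin_motifs(sequence):
    """Return the (start, end) spans of both motifs, or None if the pattern is absent."""
    match = CUPIN.search(sequence)
    if match is None:
        return None
    return match.span(1), match.span(2)
```

    Applied to a one-letter protein sequence, find_cupin_motifs returns zero-based, end-exclusive spans for the two motifs. A strict pattern like this would be expected to miss the C-terminal domain, whose motifs lack the metal-binding residues.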

    \ \

    Pirin was identified as a metal-binding protein PUBMED:14573596, and the metal-binding residues of Pirins were found to be highly conserved across mammals, plants, fungi, and prokaryotic organisms. Pirin acts as a cofactor for the transcription factor NFI, the regulatory mechanism of which is generally believed to require the assistance of a metal ion PUBMED:12426136. Structural data support the hypothesis that the bound iron of Pirin may participate in this transcriptional regulation by enhancing and stabilising the formation of the p50-Bcl3-DNA complex PUBMED:14573596. Metals have been implicated directly or indirectly in the NF-kappaB family of transcription factors that control expression of a number of early response genes associated with inflammatory responses, cell growth, cell cycle progression, and neoplastic transformation PUBMED:12426136. However, most metal-dependent transcription factors are DNA-binding proteins that bind to specific sequences when the metal binds to the protein. Pirin, on the other hand, appears to function differently and to bind to the transcription factor-DNA complex PUBMED:14573596.

    \ \ ' '5555' 'IPR008887' '\ This small family of proteins is currently restricted to Methanosarcina species. Members of this family are about 200 residues in length, except for that has two copies of this region. Although the function of this region is unknown the pattern of conservation suggests that this may be an enzyme, including multiple conserved aspartate and glutamate residues. The most conserved motif in these proteins is NEL/MEXNE/D, where X can be any amino acid, and is found at the C terminus of these proteins.\ ' '5556' 'IPR008886' '\ Despite being classed as uncharacterised proteins, the members of this family are almost certainly enzymes in that they contain a domain distantly related to .\ ' '5557' 'IPR007111' '\

    The NACHT domain is a 300 to 400 residue predicted nucleoside triphosphatase (NTPase) domain, which is found in animal, fungal and bacterial proteins. The NACHT domain has been named after NAIP, CIITA, HET-E and TP1. It is found in\ association with other domains, such as the CARD domain (), the\ DAPIN domain (), the HEAT repeat (), the WD\ repeat (), the leucine-rich repeat (LRR) or the BIR repeat () PUBMED:10782090.

    \

    \ The NACHT domain consists of seven distinct conserved motifs, including the ATP/GTPase specific P-loop, the Mg(2+)-binding site (Walker\ A and B motifs, respectively) and five more specific motifs. The unique features of the NACHT domain include the prevalence of \'tiny\' residues\ (glycine, alanine or serine) directly C-terminal of the Mg(2+)-coordinating aspartate in the Walker B motif, in place of a second acidic residue prevalent\ in other NTPases. A second acidic residue is typically found in the NACHT-containing proteins two positions downstream. Furthermore, the distal motif VII contains a conserved pattern of polar, aromatic and hydrophobic residues that is not seen in any other NTPase family PUBMED:10782090.

    \ ' '5558' 'IPR008427' '\ This fungal specific cysteine rich domain is found in some proteins with proposed roles in fungal pathogenesis PUBMED:12633989.\ ' '5559' 'IPR008858' '\

    The TROVE (Telomerase, Ro and Vault) domain is a module of ~300-500 residues\ that is found in TEP1 and Ro60 the protein components of three\ ribonucleoprotein particles. The TROVE domain is also found in bacterial\ ribonucleoproteins suggesting an ancient origin of these ribonucleoproteins.\ The TROVE domain can be found associated with other domains, such as the VWFA\ domain, the TEP1 N-terminal domain, the NACHT-NTPase domain, and WD-40 repeats. The TROVE domain may\ be involved in binding the RNA components of the three RNPs, which are\ telomerase RNA, Y RNA and vault RNA PUBMED:14563212.

    \ \

    The TROVE domain contains a few absolutely conserved residues. As none of\ these conserved residues are the polar type of amino acids found in active\ sites, it seems unlikely that this region has an enzymatic function PUBMED:14563212.

    \ ' '5560' 'IPR008813' '\ This family consists of Firmicute RepL proteins which are involved in plasmid replication.\ ' '5561' 'IPR008864' '\ This family consists of several Tenuivirus nucleocapsid proteins PUBMED:2024478.\ ' '5562' 'IPR008550' '\ This family consists of several gammaherpesvirus proteins of unknown function.\ ' '5563' 'IPR008859' '\

    Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5), are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers PUBMED:11687483. EGF, TSP3 repeats and the C-terminal domain are thus the hallmark of a thrombospondin.

    \

    The globular C-terminal domain is a beta sandwich of two curved antiparallel beta-sheets PUBMED:15014436. The fold is an elaboration of the jelly roll topology, with strands B3-B7, B11 and B14-B15 forming the eight-stranded jelly roll motif. The function of the C-terminal domain is not yet known.

    \ ' '5564' 'IPR008722' '\ This domain represents the presumed membrane spanning region of the OmpF proteins. This region is involved in channel formation and is thought to form an 8-stranded beta-barrel PUBMED:11034289.\ ' '5565' 'IPR008456' '\ The domain fold is a jelly-roll, composed of two antiparallel beta-sheets and two short alpha-helices PUBMED:9334749. A groove on beta-sheet I exhibited the best surface complementarity to the collagen. This site partially overlaps with the peptide sequence previously shown to be critical for collagen binding. Recombinant proteins containing single amino acid mutations designed to disrupt the surface of the putative binding site exhibited significantly lower affinities for collagen.\ ' '5566' 'IPR008454' '\

    This entry represents a repeated B region domain found in the collagen-binding surface protein Cna in Staphylococcus aureus, as well as other related domains. The B region domain of Cna has a prealbumin-like beta-sandwich fold of seven strands in two sheets with a Greek key topology PUBMED:10673425. However, this domain does not mediate collagen binding (a separate region carries out that function); instead it appears to form a stalk that presents the ligand-binding domain away from the bacterial cell surface. Cna is a collagen-binding MSCRAMM (Microbial Surface Component Recognizing Adhesive Matrix Molecules), and is necessary and sufficient for S. aureus cells to adhere to cartilage.

    \ ' '5567' 'IPR000727' '\

    The process of vesicular fusion with target membranes depends on a set of SNAREs (SNAP-Receptors), \ which are associated with the fusing membranes PUBMED:9239749, PUBMED:9232812. Target SNAREs \ (t-SNAREs) are localised on the target membrane and belong to two different families, the \ syntaxin-like family and the SNAP-25 like family. One member of each family, together with a\ v-SNARE localised on the vesicular membrane, are required for fusion.

    The Syntaxins are type-I \ transmembrane proteins that contain several regions with coiled-coil propensity in their cytosolic \ part, the SNARE motif. SNAP-25 () is a protein consisting of two coiled-coil regions, which is associated with the \ membrane by lipid anchors. SNARE motifs assemble into parallel four-helix bundles stabilised by the burial of their hydrophobic helix faces in the bundle core. Monomeric SNARE motifs are disordered, so this assembly reaction is accompanied by a dramatic increase in alpha-helical secondary structure PUBMED:14570579. The parallel arrangement of SNARE motifs within complexes brings the transmembrane anchors, and the two membranes, into close proximity. Recently, it was shown that the two coiled-coil regions of SNAP-25 and\ one of the coiled-coil regions of the syntaxins are related PUBMED:9096343. This domain is found in both Syntaxin and SNAP-25 families as well as in other proteins.

    \ ' '5569' 'IPR008705' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This family contains a conserved novel zinc finger domain found in the eukaryotic proteins Nanos and Xcat-2. In Drosophila melanogaster, Nanos functions as a localised determinant of posterior pattern. Nanos RNA is localised to the posterior pole of the maturing egg cell and encodes a protein that emanates from this localised source. Nanos acts as a translational repressor and thereby establishes a gradient of the morphogen Hunchback PUBMED:7601003. Xcat-2 is found in the vegetal cortical region, is inherited by the vegetal blastomeres during development, and is degraded very early in development. The localised and maternally restricted expression of Xcat-2 RNA suggests a role for its protein in setting up regional differences in gene expression that occur early in development PUBMED:8223259.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '5570' 'IPR008551' '\

    This family is found in eukaryotes, prokaryotes and viruses and has no known function. One member has been found to be expressed during early embryogenesis in Mus sp. PUBMED:8268909.

    \ ' '5571' 'IPR008883' '\ This family consists of the eukaryotic tumour susceptibility gene 101 protein (TSG101). Altered transcripts of this gene have been detected in sporadic breast cancers and many other Homo sapiens malignancies. However, the involvement of this gene in neoplastic transformation and tumourigenesis is still elusive. TSG101 is required for normal cell function of embryonic and adult tissues but this gene is not a tumour suppressor for sporadic forms of breast cancer PUBMED:12482969.\ ' '5572' 'IPR008419' '\ This family consists of P25 proteins from the Beta vulgaris subsp. vulgaris necrotic yellow vein viruses. Beet necrotic yellow vein virus (BNYVV) is a plant pathogenic virus. It is characterised by a positive-stranded single stranded RNA genome that is rod-shaped and non-enveloped in nature. The virus is transmitted by Polymyxa betae, a fungus from the order Plasmodiophorales. p25 is an RNA-3-encoded protein. An estimate of the ratio between synonymous and non-synonymous substitution rates (omega) with maximum-likelihood models showed that the p25 sequences presented the highest (of three benyvirus proteins) mean omega values with strong positive selection acting on 14 amino acids, and particularly on amino acid 68, where the omega value was the highest so far encountered in plant viruses PUBMED:16186246.\ ' '5573' 'IPR008436' '\ Chlamydia is a genus of bacteria, which causes the most common bacterial sexually transmitted diseases. They are obligate intracellular bacterial pathogens. Members of this genus lack a peptidoglycan layer, but as a substitute, it has been proposed that they have several cysteine rich membrane proteins. This includes the major outer membrane protein (MOMP). These form disulphide bonds to provide rigidity to the cell wall. 
The alignment of the amino acid sequences of the MOMP from various serovars of Chlamydia show that they have between seven and ten cysteine residues; seven of which are highly conserved PUBMED:15835913. The MOMP has been the focus of efforts to produce a vaccine for Chlamydia trachomatis PUBMED:17601785.\ \ The 15 kDa cysteine-rich protein in this entry is a multi-pass outer membrane protein. They are associated with the differentiation of reticulate bodies (RBs) into elementary bodies (EBs) PUBMED:3066701. They immunolocalise to the inclusion membrane, which is the membrane that surrounds the intracellular parasite. These proteins are recognised by CD8+ T cells in both human and mouse infections, suggesting they gain access to the host cytoplasm.\ ' '5574' 'IPR008909' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases PUBMED:.

    \ This all alpha helical domain is the anticodon binding domain of Arginyl tRNA synthetase. This domain is known as the DALR domain after characteristic conserved amino acids PUBMED:10447505.\ ' '5576' 'IPR008820' '\ Rubella virus (RV), the sole member of the genus Rubivirus within the family Togaviridae, is a small enveloped, positive strand RNA virus. The nucleocapsid consists of 40S genomic RNA and a single species of capsid protein which is enveloped within a host-derived lipid bilayer containing two viral glycoproteins, E1 (58 kDa) and E2 (42-46 kDa). In virus infected cells, RV matures by budding either at the plasma membrane, or at the internal membranes depending on the cell type and enters adjacent uninfected cells by a membrane fusion process in the endosome, directed by E1-E2 heterodimers. The heterodimer formation is crucial for E1 transport out of the endoplasmic reticulum to the Golgi and plasma membrane. In RV E1, a cysteine at position 82 is crucial for the E1-E2 heterodimer formation and cell surface expression of the two proteins. E1 has been shown to be a type 1 membrane protein, rich in cysteine residues with extensive intramolecular disulphide bonds PUBMED:11682134. This family is found together with and .\ ' '5577' 'IPR008821' '\ Rubella virus (RV), the sole member of the genus Rubivirus within the family Togaviridae, is a small enveloped, positive strand RNA virus. The nucleocapsid consists of 40S genomic RNA and a single species of capsid protein which is enveloped within a host-derived lipid bilayer containing two viral glycoproteins, E1 (58 kDa) and E2 (42-46 kDa). In virus infected cells, RV matures by budding either at the plasma membrane, or at the internal membranes depending on the cell type and enters adjacent uninfected cells by a membrane fusion process in the endosome, directed by E1-E2 heterodimers. The heterodimer formation is crucial for E1 transport out of the endoplasmic reticulum to the Golgi and plasma membrane. 
In RV E1, a cysteine at position 82 is crucial for the E1-E2 heterodimer formation and cell surface expression of the two proteins PUBMED:11682134. This family is found together with and .\ ' '5578' 'IPR008819' '\ Rubella virus is an enveloped positive-strand RNA virus of the family Togaviridae. Virions are composed of three structural proteins: a capsid and two membrane-spanning glycoproteins, E2 and E1. During virus assembly, the capsid interacts with genomic RNA to form nucleocapsids. It has been discovered that capsid phosphorylation serves to negatively regulate binding of viral genomic RNA. This may delay the initiation of nucleocapsid assembly until sufficient amounts of virus glycoproteins accumulate at the budding site and/or prevent non-specific binding to cellular RNA when levels of genomic RNA are low. It follows that at a late stage in replication, the capsid may undergo dephosphorylation before nucleocapsid assembly occurs PUBMED:12525610. This family is found together with and .\ ' '5579' 'IPR008620' '\ This family consists of several Rhizobium FixH like proteins. It has been suggested that the four proteins FixG, FixH, FixI, and FixS may participate in a membrane-bound complex coupling the FixI cation pump with a redox process catalysed by FixG PUBMED:2536685.\ ' '5580' 'IPR008437' '\ This family consists of minor structural proteins largely from the Caliciviridae family of viruses, including Sapporo virus (Hu/Chiba/041413/2004/JP) and Sapporo virus (Hu/Ehime/04-1680/2004/JP). These viruses cause gastroenteritis. The function of this family is unknown.\ ' '5581' 'IPR008856' '\ This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins. The normal translocation of nascent polypeptides into the lumen of the endoplasmic reticulum (ER) is thought to be aided in part by a translocon-associated protein (TRAP) complex consisting of 4 protein subunits. 
The association of mature proteins with the ER and Golgi, or other intracellular locales, such as lysosomes, depends on the initial targeting of the nascent polypeptide to the ER membrane. A similar scenario must also exist for proteins destined for secretion PUBMED:11204460.\ ' '5582' 'IPR008552' '\ This short presumed domain is found in a large number of hypothetical plant proteins. The domain is quite rich in conserved glycine residues. It occurs in some putative transposons but currently has no known function.\ ' '5583' 'IPR008802' '\ This family consists of the highly related rubber elongation factor (REF), small rubber particle protein (SRPP) and stress-related protein (SRP) sequences. REF and SRPP are released from the rubber particle membrane into the cytosol during osmotic lysis of the sedimentable organelles (lutoids). The exact function of this family is unknown PUBMED:12461132.\ ' '5584' 'IPR008825' '\ S-antigens are heat stable proteins that are found in the blood of individuals infected with malaria.\ ' '5585' 'IPR008797' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    In PSII, the oxygen-evolving complex (OEC) is responsible for catalysing the splitting of water to O(2) and 4H+. The OEC is composed of a cluster of manganese, calcium and chloride ions bound to extrinsic proteins. In cyanobacteria there are five extrinsic proteins in OEC (PsbO, PsbP-like, PsbQ-like, PsbU and PsbV), while in plants there are only three (PsbO, PsbP and PsbQ), PsbU and PsbV having been lost during the evolution of green plants PUBMED:15258264.

    \

    This family represents the PSII OEC protein PsbQ. Both PsbQ and PsbP () are regulators that are necessary for the biogenesis of optically active PSII. The crystal structure of PsbQ from spinach revealed a 4-helical bundle polypeptide. The distribution of positive and negative charges on the protein surface might explain the ability of PsbQ to increase the binding of chloride and calcium ions and make them available to PSII PUBMED:12949587.

    \ \ ' '5586' 'IPR008896' '\ The chloroplast genomes of most higher plants contain two giant open reading frames designated ycf1 and ycf2. Although the function of Ycf1 is unknown, it is known to be an essential gene PUBMED:10792825.\ ' '5588' 'IPR008653' '\ This family consists of several eukaryotic immediate early response (IER) 2 and 5 proteins. The role of IER5 is unclear, although it may play an important role in mediating the cellular response to mitogenic signals. Again, little is known about the function of IER2, although it is thought to play a role in mediating the cellular responses to a variety of extracellular signals PUBMED:11102586, PUBMED:10049588.\ ' '5589' 'IPR008380' '\

    This family includes a 5\'-nucleotidase, , specific for purines (IMP and GMP) PUBMED:9371705. These enzymes are members of the Haloacid Dehalogenase (HAD) superfamily. HAD members are recognised by three short motifs {hhhhDxDx(T/V)}, {hhhh(T/S)}, and either {hhhh(D/E)(D/E)x(3-4)(G/N)} or {hhhh(G/N)(D/E)x(3-4)(D/E)} (where "h" stands for a hydrophobic residue). Crystal structures of many HAD enzymes have verified PSI-PRED predictions of secondary structural elements which show each of the "hhhh" sequences of the motifs as part of beta sheets. This subfamily of enzymes is part of "Subfamily I" of the HAD superfamily by virtue of a "cap" domain in between motifs 1 and 2. This subfamily\'s cap domain has a different predicted secondary structure from all other known HAD enzymes and thus has been designated "subfamily IG"; the domain appears to consist of a mixed alpha/beta fold.

    \ ' '5590' 'IPR008912' '\ This group of proteins contains a VWA type domain and the function of this family is unknown. It is found as part of a CO oxidising (Cox) system operon in several bacteria PUBMED:10433972.\ ' '5591' 'IPR008553' '\ The members of this archaebacterial protein entry are around 250-300 amino acid residues in length. The function of these proteins is not known.\ ' '5592' 'IPR008895' '\ The proteins in this family are designated YL1 PUBMED:. They have been shown to be DNA-binding and may be transcription factors PUBMED:.\ ' '5594' 'IPR008713' '\

    The ninR region of phage lambda contains two recombination genes, orf (ninB) and rap (ninG). These genes are involved in the RecF and RecBCD recombination pathways of Escherichia coli that operate on phage lambda PUBMED:2142940, PUBMED:11952832. Orf and Rap participate in Red recombination, the primary pathway operating when wild-type lambda grows lytically in rec+ cells PUBMED:11952832.

    \ \

    NinG gives a 100-fold increase in recombinant frequencies for RecABC pathway-mediated, phage-plasmid homologous recombination. It is called rap, for recombination adept with plasmid PUBMED:2963943.

    \ ' '5595' 'IPR008785' '\ This family consists of several Poxvirus virion envelope protein A14-like sequences. A14 is a component of the virion membrane and has been found to be an H1 phosphatase substrate in vivo and in vitro. A14 is hyperphosphorylated on serine residues in the absence of H1 expression PUBMED:10729144.\ ' '5596' 'IPR008554' '\

    Glutaredoxins PUBMED:3152490, PUBMED:3286320, PUBMED:2668278, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system PUBMED:14713336.

    \

    Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin, which functions in a similar way, glutaredoxin possesses an active centre disulphide bond PUBMED:14962389. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond.

    \

    Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed PUBMED:1994586 that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern, T4 thioredoxin has Val instead of Pro.

    \ \

    This family contains several viral glutaredoxins, and many related bacterial and eukaryotic proteins of unknown function. The best characterised member of this family is G4L from Vaccinia virus (strain Western Reserve/WR) (VACV), which is necessary for virion morphogenesis and virus replication PUBMED:10982364. This is a cytoplasmic protein which functions as a shuttle in a redox pathway between membrane-associated E10R and L1R or F9L PUBMED:11752136. \

    \ ' '5597' 'IPR008555' '\ This family consists of several eukaryotic proteins of unknown function. One of the family members () is a circulating cathodic antigen (CCA) found in Schistosoma mansoni (Blood fluke) PUBMED:10413050.\ ' '5598' 'IPR008656' '\ This family consists of several inositol 1,3,4-trisphosphate 5/6-kinase proteins. Inositol 1,3,4-trisphosphate is at a branch point in inositol phosphate metabolism. It is dephosphorylated by specific phosphatases to either inositol 3,4-bisphosphate or inositol 1,3-bisphosphate. Alternatively, it is phosphorylated to inositol 1,3,4,6-tetrakisphosphate or inositol 1,3,4,5-tetrakisphosphate by inositol trisphosphate 5/6-kinase PUBMED:8662638.\ ' '5599' 'IPR008786' '\

    This family contains the vaccinia virus A31R protein, the function of which is not known.

    \ ' '5600' 'IPR008711' '\

    The ninR region of Bacteriophage lambda contains two recombination genes, orf (ninB) and rap (ninG), that have roles when the RecF and RecBCD recombination pathways of Escherichia coli, respectively, operate on phage lambda PUBMED:11952832. Genetic recombination in phage lambda relies on DNA end processing by Exo to expose 3\'-tailed strands for annealing and exchange by beta protein. Phage lambda encodes an additional recombinase, NinB (Orf), which participates in the early stages of recombination by supplying a function equivalent to the E. coli RecFOR complex. These host enzymes assist loading of the RecA strand exchange protein onto ssDNA coated with ssDNA-binding protein. NinB has two structural domains with unusual folds, and exists as an intertwined dimer PUBMED:.

    \ ' '5601' 'IPR006575' '\

    The RWD eukaryotic domain is found in RING finger and WD repeat containing proteins\ and in the DEXDc-like helicase subfamily;\ it is related to the ubiquitin-conjugating enzyme domain.

    \ ' '5602' 'IPR008650' '\ This family consists of several helicase-primase complex components from the Gammaherpesviruses.\ ' '5603' 'IPR008394' '\ This family consists of several AfaD and related proteins from Escherichia coli and Salmonella bacteria. The afa gene clusters encode an afimbrial adhesive sheath produced by E. coli. The adhesive sheath is composed of two proteins, AfaD and AfaE, which are independently exposed at the bacterial cell surface. AfaE is required for bacterial adhesion to HeLa cells and AfaD for the uptake of adherent bacteria into these cells PUBMED:10981717.\ ' '5604' 'IPR008736' '\ Human papillomaviruses (HPVs) are epitheliotropic viruses, and their life cycle is intimately linked to the stratification and differentiation state of the host epithelial tissues. The kinetics of E5a protein expression during the complete viral life cycle has been studied and the highest level was found to be coincidental with the onset of virion morphogenesis PUBMED:9721230.\ ' '5605' 'IPR008392' '\ This family consists of accessory gland-specific 26Ab peptides or male accessory gland secretory protein 355B from different Drosophila species. Drosophila males, like males of most other insects, transfer a group of specific proteins (Acp26Ab and Acp26Aa in Drosophila) to the females during mating. These proteins are produced primarily in the accessory gland and are likely to influence the female\'s reproduction PUBMED:1361475.\ ' '5606' 'IPR008403' '\ This family consists of several mammalian apolipoprotein CIII (Apo-CIII) sequences. Apolipoprotein C-III is a 79-residue glycoprotein. It is synthesised in the intestine and liver as part of the very low density lipoprotein (VLDL) and the high density lipoprotein (HDL) particles. Owing to its positive correlation with plasma triglyceride (Tg) levels, Apo-CIII is suggested to play a role in Tg metabolism and is therefore of interest regarding atherosclerosis. 
However, unlike other apolipoproteins such as Apo-AI, Apo E or CII for which many naturally occurring mutations are known, the structure-function relationships of apo C-III remain a subject of debate. One possibility is that apo C-III inhibits lipoprotein lipase (LPL) activity, as shown by in vitro experiments. Another suggestion is that elevated levels of Apo-CIII displace other apolipoproteins at the lipoprotein surface, modifying their clearance from plasma PUBMED:12082170.\ ' '5609' 'IPR008677' '\ This family consists of mammalian MRVI1 proteins which are related to the lymphoid-restricted membrane protein (JAW1) and the IP3 receptor associated cGMP kinase substrates A and B (IRAGA and IRAGB). The function of MRVI1 is unknown, although mutations in the Mrvi1 gene induce myeloid leukaemia by altering the expression of a gene important for myeloid cell growth and/or differentiation, so it has been speculated that Mrvi1 is a tumour suppressor gene PUBMED:10321731. IRAG is very similar in sequence to MRVI1 and is an essential NO/cGKI-dependent regulator of IP3-induced calcium release. Activation of cGKI decreases IP3-stimulated elevations in intracellular calcium, induces smooth muscle relaxation and contributes to the antiproliferative and pro-apoptotic effects of NO/cGMP PUBMED:10724174. Jaw1 is a member of a class of proteins with COOH-terminal hydrophobic membrane anchors and is structurally similar to proteins involved in vesicle targeting and fusion. This suggests that the function and/or the structure of the ER in lymphocytes may be modified by lymphoid-restricted resident ER proteins PUBMED:8021504.\ ' '5610' 'IPR008605' '\ This family consists of several eukaryotic extracellular matrix protein 1 (ECM1) sequences. ECM1 has been shown to regulate endochondral bone formation, stimulate the proliferation of endothelial cells and induce angiogenesis. 
Mutations in the ECM1 gene can cause lipoid proteinosis, a disorder which causes generalised thickening of skin, mucosae and certain viscera. Classical features include beaded eyelid papules and laryngeal infiltration leading to hoarseness PUBMED:11929856.\ ' '5611' 'IPR008467' '\ This family consists of several eukaryotic dynein light intermediate chain proteins. The light intermediate chains (LICs) of cytoplasmic dynein consist of multiple isoforms, which undergo post-translational modification to produce a large number of species. DLIC1 is known to be involved in assembly, organisation, and function of centrosomes and mitotic spindles when bound to pericentrin. DLIC2 is a subunit of cytoplasmic dynein 2 that may play a role in maintaining Golgi organisation by binding cytoplasmic dynein 2 to its Golgi-associated cargo PUBMED:11907264.\ ' '5612' 'IPR008649' '\ This family represents the N-terminal region of the UL82 and UL83 proteins from Betaherpesvirus sp., such as Human cytomegalovirus (HHV-5) (Human herpesvirus 5). As viruses are reliant upon their host cell to serve as proper environments for their replication, many have evolved mechanisms to alter intracellular conditions to suit their own needs. HHV-5 induces quiescent cells to enter the cell cycle and then arrests them in late G(1), before they enter the S phase, a cell cycle compartment that is presumably favourable for viral replication. The protein product of the HHV-5 UL82 gene, pp71, can accelerate the movement of cells through the G(1) phase of the cell cycle. This activity would help infected cells reach the late G(1) arrest point sooner and thus may stimulate the infectious cycle. pp71 also induces DNA synthesis in quiescent cells, but a pp71 mutant protein that is unable to induce quiescent cells to enter the cell cycle still retains the ability to accelerate the G(1) phase. 
Thus, the mechanism through which pp71 accelerates G(1) cell cycle progression appears to be distinct from the one that it employs to induce quiescent cells to exit G(0) and subsequently enter the S phase PUBMED:12610120.\ ' '5613' 'IPR008430' '\

    This entry represents several bacterial cytotoxic necrotizing factor proteins as well as related dermonecrotic toxin (DNT) from Bordetella species. Cytotoxic necrotizing factor 1 (CNF1) is a toxin whose structure from Escherichia coli revealed a 4-layer alpha/beta/beta/alpha structure containing mixed beta-sheets PUBMED:11427886. CNF1 is expressed in uropathogenic strains of E. coli and in strains causing neonatal meningitis. CNF1 alters the host cell actin cytoskeleton and promotes bacterial invasion of the blood-brain barrier endothelial cells PUBMED:12622819. CNF1 belongs to a unique group of large cytotoxins that cause constitutive activation of Rho guanosine triphosphatases (GTPases), which are key regulators of the actin cytoskeleton PUBMED:.

    \

    Bordetella dermonecrotic toxin (DNT) stimulates the assembly of actin stress fibres and focal adhesions by deamidating or polyaminating Gln63 of the small GTPase Rho. DNT is an A-B toxin composed of an N-terminal receptor-binding (B) domain and a C-terminal enzymatically active (A) domain PUBMED:12065482.

    \ ' '5614' 'IPR008418' '\ This family consists of several Barren protein homologues from several eukaryotic organisms. In Drosophila Barren (barr) is required for sister-chromatid segregation in mitosis. barr encodes a novel protein that is present in proliferating cells and has homologues in Saccharomyces cerevisiae and Homo sapiens. Mitotic defects in barr embryos become apparent during cycle 16, resulting in a loss of PNS and CNS neurons. Centromeres move apart at the metaphase-anaphase transition and Cyclin B is degraded, but sister chromatids remain connected, resulting in chromatin bridging. Barren protein localises to chromatin throughout mitosis. Colocalisation and biochemical experiments indicate that Barren associates with Topoisomerase II throughout mitosis and alters the activity of Topoisomerase II. It has been suggested that this association is required for proper chromosomal segregation by facilitating the decatenation of chromatids at anaphase PUBMED:8978614.\ ' '5615' 'IPR008557' '\ This family consists of bacterial proteins of unknown function.\ ' '5616' 'IPR008723' '\ This family consists of the RNA-dependent RNA polymerase protein VP1 from the Orbivirus. VP1 may have both enzymatic and structural roles in the virus life cycle PUBMED:1846500.\ ' '5617' 'IPR008416' '\ This family consists of several VP1054 proteins from the Baculoviruses. VP1054 is a virus structural protein required for nucleocapsid assembly PUBMED:9188569.\ ' '5618' 'IPR008424' '\

    The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulphide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains: alpha, delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and three (in alpha, delta and gamma) or four (in epsilon and mu) constant domains (CH1 to CH4). Ig molecules are highly modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. The domains in Ig and Ig-like molecules are grouped into four types: V-set (variable), C1-set (constant-1), C2-set (constant-2) and I-set (intermediate) PUBMED:9417933. Structural studies have shown that these domains share a common core Greek-key beta-sandwich structure, with the types differing in the number of strands in the beta-sheets as well as in their sequence patterns PUBMED:15327963, PUBMED:11377196.

    \

    Immunoglobulin-like domains that are related in both sequence and structure can be found in several diverse protein families. Ig-like domains are involved in a variety of functions, including cell-cell recognition, cell-surface receptors, muscle structure and the immune system PUBMED:10698639.

    \ \

    This entry represents C2-set domains, which are Ig-like domains resembling the antibody constant domain. C2-set domains are found primarily in the mammalian T-cell surface antigens CD2 (Cluster of Differentiation 2), CD4 and CD80, as well as in vascular (VCAM) and intercellular (ICAM) cell adhesion molecules.

    \

    CD2 mediates T-cell adhesion via its ectodomain, and signal transduction utilising its 117-amino acid cytoplasmic tail PUBMED:11376005. CD2 displays structural and functional similarities with African swine fever virus (ASFV) LMW8-DR, a protein that is involved in cell-cell adhesion and immune response modulation, suggesting a possible role in the pathogenesis of ASFV infection PUBMED:7907198. CD4 is the primary receptor for HIV-1. CD4 has four immunoglobulin-like domains in its extracellular region that share the same structure, but can differ in sequence. Certain extracellular domains may be involved in dimerisation PUBMED:15326605.

    \ \ ' '5619' 'IPR008414' '\

    This family consists of several Bacillus haemolytic enterotoxins (HblC, HblD, HblA, NheA, and NheB), which can cause food poisoning in humans PUBMED:12039781. Haemolysin BL (encoded by HBL) and non-haemolytic enterotoxin (encoded by NHE) represent the major enterotoxins produced by Bacillus cereus. Most of the cytotoxic activity of B. cereus isolates has been attributed to the level of Nhe, which may indicate a high diarrhoeic potential PUBMED:16553866. The exact mechanism by which B. cereus causes diarrhoea is unknown. Hbl, cytotoxin K (CytK) and Nhe are all putative causes.

    \ \

    Both Hbl and Nhe are three-component cytotoxins and maximal cytotoxicity of Nhe against epithelia is dependent on all three components. Nhe has haemolytic activity against erythrocytes from a variety of species. It is possible that the common structural and functional properties of these toxins indicate that the Hbl/Nhe and ClyA families of toxins constitute a superfamily of pore-forming cytotoxins PUBMED:18310016. The high virulence of some strains is thought to be due to the greater cytotoxic activity of CytK-1 compared to CytK-2, and to a high level of cytK expression PUBMED:17517121. Haemolysin BL and non-haemolytic enterotoxin production are both influenced by pH and micro PUBMED:16906407.

    \ \

    This entry is found in cytotoxic proteins that form part of the enterotoxin complex and bind to erythrocytes. HblA is composed of a binding component, B, and two lytic components, L1 and L2. All three subunits act synergistically to cause haemolysis.

    \ ' '5620' 'IPR008440' '\ This family consists of several agglutinin-like proteins from different Candida species. ALS genes of Candida albicans encode a family of cell-surface glycoproteins with a three-domain structure. Each Als protein has a relatively conserved N-terminal domain, a central domain consisting of a tandemly repeated motif, and a serine-threonine-rich C-terminal domain that is relatively variable across the family. The ALS family exhibits several types of variability that indicate the importance of considering strain and allelic differences when studying ALS genes and their encoded proteins PUBMED:11124701.\ ' '5621' 'IPR008851' '\ Transcription initiation factor IIF, alpha subunit (TFIIF-alpha) or RNA polymerase II-associating protein 74 (RAP74) is the large subunit of transcription factor IIF (TFIIF), which is essential for accurate initiation and stimulates elongation by RNA polymerase II PUBMED:12354769.\ ' '5622' 'IPR008862' '\ This family consists of several eukaryotic T-complex protein 11 (Tcp11) related sequences. Tcp11 is only expressed in fertile adult mammalian testes and is thought to be important in sperm function and fertility. The family also contains the Saccharomyces cerevisiae Sok1 protein which is known to suppress cyclic AMP-dependent protein kinase mutants PUBMED:8065298.\ ' '5623' 'IPR008780' '\ This family consists of several Vir proteins specific to Plasmodium vivax. The vir genes are present at about 600-1,000 copies per haploid genome and encode proteins that are immunovariant in natural infections, indicating that they may have a functional role in establishing chronic infection through antigenic variation PUBMED:11298455.\ ' '5624' 'IPR008446' '\ This family consists of several Chordopoxvirus isatin-beta-thiosemicarbazone dependent protein (protein G2) sequences. 
Inactivation of the gene coding for this protein renders the virus dependent upon isatin-beta-thiosemicarbazone (IBT) for growth PUBMED:2024483.\ ' '5625' 'IPR008897' '\ This family consists of the Saccharomyces cerevisiae trans-acting factor B and C (REP1 and 2) proteins. The S. cerevisiae plasmid stability system consists of two plasmid-coded proteins, Rep1 and Rep2, and a cis-acting locus, STB. The Rep proteins show both self- and cross-interactions in vivo and in vitro, and bind to the STB DNA with assistance from host factor(s). Within the S. cerevisiae nucleus, the Rep1 and Rep2 proteins tightly associate with STB-containing plasmids into well organised plasmid foci that form a cohesive unit in partitioning. It is generally accepted that the protein-protein and DNA-protein interactions engendered by the Rep-STB system are central to plasmid partitioning. Point mutations in Rep1 that knock out interaction with Rep2 or with STB simultaneously block the ability of these Rep1 variants to support plasmid stability PUBMED:12177044.\ ' '5626' 'IPR008765' '\ This family consists of bacteriophage FRD3 proteins.\ ' '5627' 'IPR008432' '\ This family consists of plant cytochrome c oxidase subunit 5c proteins PUBMED:10586516.\ ' '5628' 'IPR008634' '\ This family consists of archaeal GvpO proteins which are required for gas vesicle synthesis PUBMED:8606186. The family also contains related sequences from bacteria.\ ' '5629' 'IPR008558' '\ This family consists of several Lagovirus sequences of unknown function, largely from Oryctolagus cuniculus hemorrhagic disease virus.\ ' '5630' 'IPR008611' '\ EspB is a type-III-secreted pore-forming protein of enteropathogenic Escherichia coli (EPEC) which is essential for EPEC pathogenesis PUBMED:12071694. EspB is also found in Citrobacter rodentium.\ ' '5631' 'IPR008447' '\ This family consists of several Chordopoxvirus L2 proteins.\ ' '5632' 'IPR008658' '\ This family consists of several eukaryotic kinesin-associated (KAP) proteins. 
Kinesins are intracellular multimeric transport motor proteins that move cellular cargo on microtubule tracks. It has been shown that the sea urchin KRP85/95 holoenzyme associates with a KAP115 non-motor protein, forming a heterotrimeric complex in vitro, called Kinesin-II PUBMED:10819327.\ ' '5633' 'IPR008661' '\ This family consists of several eukaryotic L6 membrane proteins. L6, IL-TMP, and TM4SF5 are cell surface proteins predicted to have four transmembrane domains. Previous sequence analysis led to their assignment as members of the tetraspanin superfamily; it has now been found that they are not significantly related to genuine tetraspanins, but instead constitute their own L6 family PUBMED:10975581. Several members of this family have been implicated in Homo sapiens cancer PUBMED:1565644, PUBMED:9479038.\ ' '5634' 'IPR008717' '\ This family consists of the eukaryotic Noggin proteins. Noggin is a glycoprotein that binds bone morphogenetic proteins (BMPs) selectively and, when added to osteoblasts, it opposes the effects of BMPs. It has been found that noggin arrests the differentiation of stromal cells, preventing cellular maturation PUBMED:12633782.\ ' '5635' 'IPR008694' '\ This family consists of a series of repeated 73 residue sequences from the Mycoplasma arthritidis MAA2 variable surface protein. MAA2 is implicated in cytoadherence and virulence and has been shown to exhibit both size and phase variability PUBMED:9596719.\ ' '5636' 'IPR008783' '\ This family consists of several mammalian podoplanin-like proteins which are thought to control specifically the unique shape of podocytes PUBMED:12032185.\ ' '5638' 'IPR008712' '\ This family consists of several bacteriophage NinF proteins as well as related sequences from Escherichia coli.\ ' '5639' 'IPR008560' '\ This family consists of a number of conserved eukaryotic proteins of unknown function.\ ' '5640' 'IPR008642' '\ This family consists of several herpes virus BLRF2 proteins. 
The family also contains the C-terminal region of and (hypothetical Homo sapiens and Mus musculus sequences) which align with the N terminus of the viral sequences.\ ' '5641' 'IPR008725' '\

    The function of the orthopoxvirus F7L proteins is unknown.

    \ ' '5642' 'IPR008561' '\ This family consists of several unidentified baculovirus proteins around 85 residues long with no known function.\ ' '5643' 'IPR008562' '\ This family consists of several baculovirus sequences between 350 and 380 residues long. The family has no known function.\ ' '5644' 'IPR008863' '\ This family consists of several prokaryotic TelA-like proteins. TelA and KlA are associated with tellurite resistance PUBMED:9406390 and plasmid fertility inhibition PUBMED:7665479.\ ' '5645' 'IPR008814' '\ This family consists of several eukaryotic Ribophorin II (RPN2) proteins. The mammalian oligosaccharyltransferase (OST) is a protein complex that effects the cotranslational N-glycosylation of newly synthesised polypeptides, and is composed of at least four rough ER-specific membrane proteins: ribophorins I and II (RI and RII), OST48, and Dad1. The mechanism(s) by which the subunits of this complex are retained in the ER are not well understood PUBMED:10826490.\ ' '5646' 'IPR008874' '\

    The traT gene is one of the F factor transfer genes and encodes an outer membrane protein which is involved in interactions between Escherichia coli and its surroundings PUBMED:9933744. The protein plays a role in preventing unproductive conjugation between bacteria carrying like plasmids.

    \ ' '5647' 'IPR008718' '\ This family consists of Rhizobium NolX and Xanthomonas HrpF proteins. The interaction between the plant pathogen Xanthomonas campestris pv. vesicatoria (strain 85-10) and its host plants is controlled by hrp genes (hypersensitive reaction and pathogenicity), which encode a type III protein secretion system. Among type III-secreted proteins are avirulence proteins, effectors involved in the induction of plant defence reactions. HrpF, which is dispensable for protein secretion but required for AvrBs3 recognition in planta, is thought to function as a translocator of effector proteins into the host cell PUBMED:11115117. NolX, a Glycine max (Soybean) cultivar specificity protein, is secreted by a type III secretion system (TTSS) and shows homology to HrpF. It is not known whether NolX functions at the bacterium-plant interface or acts inside the host cell. NolX is expressed in planta only during the early stages of nodule development PUBMED:11790754.\ ' '5648' 'IPR008563' '\ This family consists of several highly related baculovirus proteins of unknown function.\ ' '5649' 'IPR008699' '\ This family consists of several eukaryotic NADH-ubiquinone oxidoreductase ASHI subunit (CI-ASHI) proteins. NADH:ubiquinone oxidoreductase (complex I) is an extremely complicated multiprotein complex located in the inner mitochondrial membrane. Its main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. Human complex I appears to consist of 41 subunits PUBMED:9878551.\

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '5650' 'IPR006434' '\

    This family is a small group of metazoan sequences with one sequence from Arabidopsis thaliana (Mouse-ear cress). The sequences from mouse are annotated as pyrimidine 5-nucleotidases, apparently in reference to HSPC233, the Homo sapiens (Human) homolog. However, no such annotation can currently be found for this gene. This group of sequences was found during searches for members of the haloacid dehalogenase (HAD) superfamily (). All of the conserved catalytic motifs PUBMED:7966317 are found. The placement of the variable domain between motifs 1 and 2 indicates membership in subfamily I of the superfamily, but these sequences are sufficiently different from any of the branches of that subfamily (IA-ID) as to constitute a separate branch to now be called IE. Considering that the closest identifiable hit outside of the noise range is to a phosphoserine phosphatase, this group may be considered to be most closely allied to subfamily IB.

    \ ' '5651' 'IPR008632' '\ Parasitic nematodes produce at least two structurally novel classes of small helix-rich retinol- and fatty-acid-binding proteins that have no counterparts in their plant or animal hosts and thus represent potential targets for new nematicides. Gp-FAR-1 is a member of the nematode-specific fatty-acid- and retinol-binding (FAR) family of proteins but localises to the surface of the organism, placing it in a strategic position for interaction with the host. Gp-FAR-1 functions as a broad-spectrum retinol- and fatty-acid-binding protein, and it is thought that it is involved in the evasion of primary host plant defence systems PUBMED:11368765.\ ' '5652' 'IPR005456' '\

    Melanin-concentrating hormone (MCH) is a cyclic peptide originally identified in teleost fish PUBMED:10421367, PUBMED:10421368. In fish, MCH is released from the pituitary and causes lightening of skin pigment cells through pigment aggregation PUBMED:10996523. In mammals, MCH is predominantly expressed in the hypothalamus, and functions as a neurotransmitter in the control of a range of functions. A major role of MCH is thought to be in the regulation of feeding: injection of MCH into rat brains stimulates feeding; expression of MCH is upregulated in the hypothalamus of obese and fasting mice; and mice lacking MCH are lean and eat less PUBMED:10421367. MCH and alpha melanocyte-stimulating hormone (alpha-MSH) have antagonistic effects on a number of physiological functions. Alpha-MSH darkens pigmentation in fish and reduces feeding in mammals, whereas MCH increases feeding PUBMED:10996523.

    \ \

    MCH is derived from a pre-pro-hormone (pre-pro-MCH), which contains 1-2 hormones other than MCH, depending on the species. In all species, the 17-19 C-terminal amino acids are cleaved to release MCH. In mammals, amino acids 132-144 encode the hormone neuropeptide EI (NEI), whilst in salmonids, the analogous region encodes neuropeptide EV (NEV), and in other fish, the region determines MCH gene-related peptide (Mgrp) PUBMED:8559281. A further peptide, known as neuropeptide GE (NGE), is thought to be found in mammalian pre-pro-MCH upstream of NEI, encoded by amino acids 110-129. NEI has been shown to enhance oxytocin and reduce arginine vasopressin secretion from rat pituitary PUBMED:9175893. Two paralogues of MCH, known as pro-MCH-like 1 and 2 genes (PMCHL1 and PMCHL2), which arose recently in primate evolution, also exist. At present, it is unclear whether the PMCHL genes are functional genes or inactive pseudogenes.

    \ ' '5653' 'IPR008735' '\ This family consists of the mammalian specific protein beta-microseminoprotein. Prostatic secretory protein of 94 amino acids (PSP94), also called beta-microseminoprotein, is a small, nonglycosylated protein, rich in cysteine residues. It was first isolated as a major protein from Homo sapiens seminal plasma PUBMED:10639193. The exact function of this protein is unknown.\ ' '5654' 'IPR008774' '\ This family consists of several phospholipase A2-like proteins, mostly from arthropods PUBMED:12167627.\ ' '5655' 'IPR008388' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor-mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release PUBMED:15629643. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c\', c\'\', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins PUBMED:15907459.

    \ \

    This entry represents the S1 subunit (or subunit AC45) found in the V1 complex of V-ATPases. This subunit is synthesized as an N-glycosylated 60 kDa precursor that is intracellularly cleaved to a protein of about 45 kDa. This subunit may assist the V-ATPase in the acidification of neuroendocrine granules PUBMED:10336633.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '5657' 'IPR008393' '\

    The late transcription region 2 (L2) of Adenovirus type 2 has an ORF of 80 residues positioned between nucleotides 17,676 and 17,915. It encodes an 11K polypeptide, which has the initiating methionine residue removed, leaving a 79-residue product. The L2-encoded 11K polypeptide is arginine rich (21%) and has a predicted molecular weight of 8,715. It was cleaved by the viral endoprotease to give two products which co-migrated on sodium dodecyl sulphate-polyacrylamide gels as virion polypeptide X PUBMED:3357209.

    \ \

    The role of the L2 precursor, by virtue of its two domains, might be to condense the viral prochromatin for encapsidation. Subsequent cleavage within the particle after residue 31 releases the cross-link and prepares the viral chromatin for a relaxed conformation, which is required during infection and uncoating. Cleavage seems to be necessary for infectivity.

    \ \

    This family consists of several adenovirus late L2 mu core protein or protein X sequences PUBMED:3357209.

    \ ' '5658' 'IPR008716' '\ The nodulation genes of Rhizobia are regulated by the nodD gene product in response to host-produced flavonoids and appear to encode enzymes involved in the production of a lipo-chitose signal molecule required for infection and nodule formation. NodZ is required for the addition of a 2-O-methylfucose residue to the terminal reducing N-acetylglucosamine of the nodulation signal. This substitution is essential for the biological activity of this molecule. Mutations in nodZ result in defective nodulation. nodZ represents a unique nodulation gene that is not under the control of NodD and yet is essential for the synthesis of an active nodulation signal PUBMED:8300517.\ ' '5659' 'IPR008625' '\ This family consists of several GAGE and XAGE proteins which are found exclusively in humans. The function of this family is unknown although they have been implicated in Homo sapiens cancers PUBMED:11992404.\ ' '5660' 'IPR008564' '\ This family consists of a number of conserved eukaryotic proteins of unknown function.\ ' '5661' 'IPR008616' '\ This family consists of the N-terminal region of the prokaryotic fibronectin-binding protein, the C-terminal region is . Fibronectin binding is considered to be an important virulence factor in streptococcal infections. Fibronectin is a dimeric glycoprotein that is present in a soluble form in plasma and extracellular fluids; it is also present in a fibrillar form on cell surfaces. Both the soluble and cellular forms of fibronectin may be incorporated into the extracellular tissue matrix. While fibronectin has critical roles in eukaryotic cellular processes, such as adhesion, migration and differentiation, it is also a substrate for the attachment of bacteria. 
The binding of pathogenic Streptococcus pyogenes and Staphylococcus aureus to epithelial cells via fibronectin facilitates their internalisation and systemic spread within the host PUBMED:12055283.\ ' '5662' 'IPR008671' '\ This family consists of lycopene beta and epsilon cyclase proteins. Carotenoids with cyclic end groups are essential components of the photosynthetic membranes in all plants, algae, and cyanobacteria. These lipid-soluble compounds protect against photo-oxidation, harvest light for photosynthesis, and dissipate excess light energy absorbed by the antenna pigments. The cyclisation of lycopene (psi, psi-carotene) is a key branch point in the pathway of carotenoid biosynthesis. Two types of cyclic end groups are found in higher plant carotenoids: the beta and epsilon rings. Carotenoids with two beta rings are ubiquitous, and those with one beta and one epsilon ring are common; however, carotenoids with two epsilon rings are rare PUBMED:8837512.\ ' '5663' 'IPR008849' '\ This family consists of several eukaryotic synaphin 1 and 2 proteins. Synaphin/complexin is a cytosolic protein that preferentially binds to syntaxin within the SNARE complex. Synaphin promotes SNAREs to form precomplexes that oligomerise into higher order structures. A peptide from the central, syntaxin binding domain of synaphin competitively inhibits these two proteins from interacting and prevents SNARE complexes from oligomerising. It is thought that oligomerisation of SNARE complexes into a higher order structure creates a SNARE scaffold for efficient, regulated fusion of synaptic vesicles PUBMED:11239399. Synaphin promotes neuronal exocytosis by promoting interaction between the complementary syntaxin and synaptobrevin transmembrane regions that reside in opposing membranes prior to fusion PUBMED:12200427.\ ' '5664' 'IPR008450' '\ This family consists of several examples of the Drosophila melanogaster specific chorion protein S16. 
The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary PUBMED:1908228.\ ' '5665' 'IPR008426' '\

    CENP-H is required for the localisation of CENP-C, but not CENP-A, to the centromere. However, it may be involved in the incorporation of newly synthesised CENP-A into centromeres via its interaction with the CENP-A/CENP-HI complex. CENP-H contains a coiled-coil structure and a nuclear localisation signal. CENP-H is specifically and constitutively localised in kinetochores throughout the cell cycle, and may play a role in kinetochore organisation and function throughout the cell cycle PUBMED:10488063.

    \

    Studies show that CENP-H may be associated with certain human cancers PUBMED:18700042, PUBMED:17255272.

    \

    Chromosome segregation in eukaryotes requires the kinetochore, a multi-protein structure that assembles on centromeric DNA, and which acts to link chromosomes to spindle microtubules. Kinetochore structure and composition are highly conserved among vertebrates. The inner kinetochore is essential for kinetochore assembly, and is involved in chromosome segregation via regulation of the spindle. Inner kinetochore components include the multi-subunit CENP-H/I complex, which may function, in part, in directing centromere protein A (CENP-A) deposition to centromeres, where CENP-A is a centromere-specific histone H3 variant required for the organisation of centromeric chromatin during interphase. The CENP-H/I complex contains three functional classes of proteins PUBMED:16622420, PUBMED:18094054:

    \

    \ \ ' '5666' 'IPR008565' '\ This family consists of several hypothetical bacterial sequences as well as one viral sequence. The function of this family is unknown.\ ' '5667' 'IPR008401' '\ The anaphase-promoting complex (APC) is a conserved multi-subunit ubiquitin ligase required for the degradation of key cell cycle regulators. Members of this family are components of the anaphase-promoting complex homologous to Apc13p PUBMED:12477395.\ ' '5668' 'IPR008766' '\ This family consists of a group of bacteriophage replication gene A protein (GPA) like sequences from both viruses and bacteria. The members of this family are likely to be endonucleases PUBMED:1701261, PUBMED:7997180, PUBMED:8510152.\ ' '5669' 'IPR008402' '\

    The anaphase-promoting complex (APC) is a multi-subunit E3 protein ubiquitin ligase that is responsible for the metaphase to anaphase transition and the exit from mitosis. Anaphase is initiated when the APC triggers the destruction of securin, thereby allowing the protease, separase, to disrupt sister-chromatid cohesion. Securin ubiquitination by the APC is inhibited by cyclin-dependent kinase 1 (Cdk1)-dependent phosphorylation PUBMED:18552837.

    \ \

    Forkhead Box M1 (FoxM1), which is a transcription factor that is over-expressed in many cancers, is degraded in late mitosis and early G1 phase by the APC/cyclosome (APC/C) E3 ubiquitin ligase PUBMED:18573889. The APC/C targets mitotic cyclins for destruction in mitosis and G1 phase and is then inactivated at S phase. It thereby generates alternating states of high and low cyclin-Cdk activity, which is required for the alternation of mitosis and DNA replication PUBMED:18559889.

    \ \

    APC from Schizosaccharomyces pombe and Saccharomyces cerevisiae was previously thought to have 11 subunits, but more sensitive techniques have identified 13 subunits in both yeasts PUBMED:12477395.

    \ \

    Members of this family are components of the anaphase-promoting complex homologous to Apc15p PUBMED:12477395.

    \ ' '5670' 'IPR008612' '\ This family consists of several mating pheromone proteins from Euplotes octocarinatus. Cells of the ten mating types of the ciliate E. octocarinatus communicate by pheromones before they enter conjugation. The pheromones induce homotypic pairing when applied to mating types that do not secrete the same pheromone(s). Heterotypic pairs (i.e., those between cells of different mating types) are formed only when both mating types in a mixture secrete a pheromone that the other does not. The genetics of mating types is based on four codominant mating type alleles, each allele determining production of a different pheromone. The pheromones not only induce pair formation but also attract cells PUBMED:9018841.\ ' '5671' 'IPR008847' '\ This domain consists of several eukaryotic suppressor of forked (Suf) like proteins. The Drosophila melanogaster suppressor of forked [Su(f)] protein shares homology with the Saccharomyces cerevisiae RNA14 protein and the 77 kDa subunit of Homo sapiens cleavage stimulation factor, which are proteins involved in mRNA 3\' end formation. This suggests a role for Su(f) in mRNA 3\' end formation in Drosophila. The su(f) gene produces three transcripts; two of them are polyadenylated at the end of the transcription unit, and one is a truncated transcript, polyadenylated in intron 4. It is thought that su(f) plays a role in the regulation of poly(A) site utilisation and the GU-rich sequence is important for this regulation to occur PUBMED:9826695.\ ' '5672' 'IPR008898' '\ This family consists of several bacterial YopD like proteins. Virulent Yersinia species harbour a common plasmid that encodes essential virulence determinants (Yersinia outer proteins [Yops]), which are regulated by the extracellular stimuli Ca2+ and temperature. YopD is thought to be a possible transmembrane protein and contains an amphipathic alpha-helix in its carboxy terminus PUBMED:8418066.\ ' '5673' 'IPR008772' '\

    This family consists of several bacterial PhnH sequences. PhnH is a component of the C-P lyase system for the catabolism of phosphonate compounds PUBMED:2155230. The specific function of this component is unknown.

    \ ' '5674' 'IPR008445' '\ This family consists of several Chordopoxvirus A15 like sequences.\ ' '5675' 'IPR008415' '\

    This family consists of Orgyia pseudotsugata multicapsid polyhedrosis virus (OpMNPV) late expression factor 3 (LEF-3) sequences, which are known to be ssDNA-binding proteins PUBMED:10073712. Alkaline nuclease (AN) and LEF-3 may participate in homologous recombination of the baculovirus genome in a manner similar to that of exonuclease (Redalpha) and DNA-binding protein (Redbeta) of the Red-mediated homologous recombination system of bacteriophage lambda PUBMED:12551981.

    \ ' '5676' 'IPR008463' '\ This family consists of several Firmicute transcriptional repressor of class III stress gene (CtsR) proteins. CtsR of Listeria monocytogenes negatively regulates the clpC, clpP and clpE genes belonging to the CtsR regulon PUBMED:10692157.\ ' '5677' 'IPR008660' '\ This family consists of several moth fibroin light chain (L-fibroin) proteins. Fibroin of Bombyx mori is secreted into the lumen of posterior silk gland (PSG) from the surrounding PSG cells as a molecular complex consisting of a heavy (H)-chain of approximately 350 kDa, a light (L)-chain of 25 kDa and a P25 of about 27 kDa. The H- and L-chains are disulphide-linked but P25 is associated with the H-L complex by non-covalent force PUBMED:10366732.\ ' '5679' 'IPR008668' '\ This family consists of several feline-specific Lentivirus virion infectivity factor (VIF) proteins. VIF is essential for productive Feline immunodeficiency virus infection of host target cells in vitro PUBMED:10441553.\ ' '5680' 'IPR008566' '\ This family consists of several uncharacterised proteins from the Gammaherpesvirinae.\ ' '5681' 'IPR008567' '\ This family consists of several hypothetical prokaryotic proteins with no known function.\ ' '5682' 'IPR008674' '\ This family consists of archaeal chromosomal protein MC1 sequences which protect DNA against thermal denaturation PUBMED:2503033.\ ' '5684' 'IPR008384' '\ This family consists of several eukaryotic ARP2/3 complex 20 kDa subunit (P20-ARC) proteins. The Arp2/3 protein complex has been implicated in the control of actin polymerisation in cells. The human complex consists of seven subunits which include the actin related proteins Arp2 and Arp3 it has been suggested that the complex promotes actin assembly in lamellipodia and may participate in lamellipodial protrusion PUBMED:9230079.\ ' '5685' 'IPR008875' '\ This family consists of several bacterial TraX proteins. 
TraX is responsible for the N-terminal acetylation of F-pilin subunits PUBMED:8444800.\ ' '5686' 'IPR008411' '\

    Bovine immunodeficiency virus (BIV), like the human immunodeficiency virus, is a lentivirus. It shows a great deal of genomic diversity, mostly in the viral envelope gene PUBMED:9032387. This property of the BIV group of viruses may play an important role in the pathobiology of the virus, particularly the conserved (C) 2, hypervariable (V) 1, V2 and C3 regions PUBMED:11448023.

    \ \

    The surface protein (SU) attaches the virus to the host cell by binding to its receptor. This interaction triggers the refolding of the transmembrane protein (TM) and is thought to activate its fusogenic potential by unmasking its fusion peptide. Fusion occurs at the host cell plasma membrane.

    \ \

    The transmembrane protein (TM) acts as a class I viral fusion protein. Under the current model, the protein has at least 3 conformational states: pre-fusion native state, pre-hairpin intermediate state, and post-fusion hairpin state. During viral and target cell membrane fusion, the coiled coil regions (heptad repeats) assume a trimer-of-hairpins structure, positioning the fusion peptide in close proximity to the C-terminal region of the ectodomain. The formation of this structure appears to drive apposition and subsequent fusion of viral and target cell membranes. Membrane fusion leads to delivery of the nucleocapsid into the cytoplasm.

    \ ' '5687' 'IPR008685' '\ Kinetochores are the chromosomal sites for spindle interaction and play a vital role in chromosome segregation. The fission yeast kinetochore protein Mis12 is required for correct spindle morphogenesis, determining metaphase spindle length PUBMED:10398680. Thirty-five to sixty percent extension of metaphase spindle length takes place in Mis12 mutants PUBMED:10398680. It has been shown that Mis12 might genetically interact with Mal2p PUBMED:12242294.\ ' '5688' 'IPR008638' '\

    This entry represents a conserved domain found near the N-terminus of a number of large, repetitive bacterial proteins, including many proteins of over 2500 amino acids. A number of the members of this family have been designated adhesins, filamentous haemagglutinins, haem/haemopexin-binding protein, etc. Members generally have a signal sequence, then an intervening region, then the region described in this entry. Following this region, proteins typically have regions rich in repeats but may show no homology between the repeats of one member and the repeats of another. This domain is suggested to be a carbohydrate-dependent haemagglutination activity site PUBMED:11703654.

    \

    In Bordetella pertussis, the infectious agent in childhood whooping cough, filamentous haemagglutinin (FHA) is a surface-exposed and secreted protein that acts as a major virulence attachment factor, functioning as both a primary adhesin and an immunomodulator to bind the bacterium to cells of the respiratory epithelium PUBMED:16339899. The FHA molecule has a globular head that consists of two domains: a shaft and a flexible tail. Its sequence contains two regions of tandem 19-residue repeats, where the repeat motif consists of short beta-helical strands separated by beta-turns PUBMED:7519681.

    \ ' '5689' 'IPR008773' '\ This family consists of several proteobacterial phosphonate metabolism protein (PhnI) sequences. Bacteria that use phosphonates as a phosphorus source must be able to break the stable carbon-phosphorus bond. In Escherichia coli phosphonates are broken down by a C-P lyase that has a broad substrate specificity. The genes for phosphonate uptake and degradation in E. coli are organised in an operon of 14 genes, named phnC to phnP. Three gene products (PhnC, PhnD and PhnE) comprise a binding protein-dependent phosphonate transporter, which also transports phosphate, phosphite, and certain phosphate esters such as phosphoserine; two gene products (PhnF and PhnO) may have a role in gene regulation; and nine gene products (PhnG, PhnH, PhnI, PhnJ, PhnK, PhnL, PhnM, PhnN, and PhnP) probably comprise a membrane-associated C-P lyase enzyme complex PUBMED:1335942.\ ' '5690' 'IPR008655' '\ This family consists of several Helicobacter pylori specific IceA2 proteins. The function of this family is unknown.\ ' '5692' 'IPR008448' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \ This family consists of several Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide sequences. DNA-dependent RNA polymerase catalyses the transcription of DNA into RNA PUBMED:1560534.\ ' '5693' 'IPR008464' '\ This family consists of several Cypovirus polyhedrin proteins. Polyhedrin is known to form a crystalline matrix (polyhedra) in infected insect cells PUBMED:8286955.\ ' '5694' 'IPR008822' '\ This family consists of several bacterial and phage Holliday junction resolvase (RusA) like proteins. The RusA protein of Escherichia coli is an endonuclease that can resolve Holliday intermediates and correct the defects in genetic recombination and DNA repair associated with inactivation of RuvAB or RuvC PUBMED:7813450.\ ' '5696' 'IPR008818' '\ This family consists of several Rotavirus major outer capsid protein VP7 sequences. The rotavirus capsid is composed of three concentric protein layers. Proteins VP4 and VP7 comprise the outer layer. VP4 forms spikes and is the viral attachment protein. VP7 is a glycoprotein and the major constituent of the outer protein layer PUBMED:12050377.\ ' '5697' 'IPR008593' '\ This family consists of several bacterial and phage DNA N-6-adenine-methyltransferase (Dam) like sequences PUBMED:2180941.\ ' '5698' 'IPR008729' '\ This family consists of several bacterial phenolic acid decarboxylase proteins. Phenolic acids, also called substituted cinnamic acids, are important lignin-related aromatic acids and natural constituents of plant cell walls. These acids (particularly ferulic, p-coumaric, and caffeic acids) bind the complex lignin polymer to the hemicellulose and cellulose in plants. The phenolic acid decarboxylase (PAD) gene (pad) is transcriptionally regulated by p-coumaric, ferulic, or caffeic acid; these three acids are the three substrates of PAD PUBMED:9546183.\ ' '5699' 'IPR008570' '\

    This entry represents the vps25 subunit (vacuolar protein sorting-associated protein 25) of the endosome-associated complex ESCRT-II (Endosomal Sorting Complexes Required for Transport protein II). ESCRT (ESCRT-I, -II, -III) complexes orchestrate efficient sorting of ubiquitinated transmembrane receptors to lysosomes via multivesicular bodies (MVBs) PUBMED:17215868. ESCRT-II recruits the transport machinery for protein sorting at MVB PUBMED:15469844. In addition, the human ESCRT-II has been shown to form a complex with RNA polymerase II elongation factor ELL in order to exert transcriptional control activity. ESCRT-II transiently associates with the endosomal membrane and thereby initiates the formation of ESCRT-III, a membrane-associated protein complex that functions immediately downstream of ESCRT-II during sorting of MVB cargo. ESCRT-II in turn functions downstream of ESCRT-I, a protein complex that binds to ubiquitinated endosomal cargo PUBMED:12194858.

    \

    ESCRT-II is a trilobal complex composed of two copies of vps25, one copy of vps22 and the C-terminal region of vps36. The crystal structure of vps25 revealed two winged-helix domains, with the N-terminal domain of vps25 interacting with vps22 and vps36 PUBMED:15579210.

    \ ' '5700' 'IPR008571' '\ Members of this family have a P-loop containing nucleotide triphosphate hydrolases fold. This family is restricted to bacterial proteins, none of which have currently been characterised.\ ' '5701' 'IPR008689' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit D from the F0 complex in F-ATPases found in mitochondria. The D subunit is part of the peripheral stalk that links the F1 and F0 complexes together, and which acts as a stator to prevent certain subunits from rotating with the central rotary element. The peripheral stalk differs in subunit composition between mitochondrial, chloroplast and bacterial F-ATPases. In mitochondria, the peripheral stalk is composed of one copy each of subunits OSCP (oligomycin sensitivity conferral protein), F6, B and D PUBMED:16045926. There is no homologue of subunit D in bacterial or chloroplast F-ATPase, whose peripheral stalks are composed of one copy of the delta subunit (homologous to OSCP), and two copies of subunit B in bacteria, or one copy each of subunits B and B\' in chloroplasts and photosynthetic bacteria.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '5702' 'IPR008730' '\ This family consists of several moth pheromone biosynthesis activating neuropeptide (PBAN) sequences. Female moths produce and release species specific sex pheromones to attract males for mating. Pheromone biosynthesis is hormonally regulated by the Pheromone Biosynthesis Activating Neuropeptide (PBAN) which is biosynthesised in the subesophageal ganglion (SOG) PUBMED:12110297.\ ' '5703' 'IPR008901' '\ This family consists of several eukaryotic alkaline phytoceramidase (aPHC) sequences. Ceramidases are enzymes involved in regulating cellular levels of ceramides, sphingoid bases, and their phosphates. Alkaline phytoceramidase is responsible for the hydrolysis of phytoceramide PUBMED:11356846.\ ' '5704' 'IPR008866' '\ This entry consists of several phage terminase large subunit proteins as well as related sequences from several bacterial species. The DNA packaging enzyme of bacteriophage lambda, terminase, is a heteromultimer composed of a small subunit, gpNu1, and a large subunit, gpA, products of the Nu1 and A genes, respectively. Terminase is involved in the site-specific binding and cutting of the DNA in the initial stages of packaging. It is now known that gpA is actively involved in late stages of packaging, including DNA translocation, and that this enzyme contains separate functional domains for its early and late packaging activities PUBMED:11866517.\ ' '5706' 'IPR008776' '\ This family consists of the Phytoreovirus nonstructural proteins Pns9 and Pns10. The function of this family is unknown.\ ' '5707' 'IPR008803' '\ This family consists of several eukaryotic root hair defective 3 like GTP-binding proteins. It has been speculated that the RHD3 protein is a member of a novel class of GTP-binding proteins that is widespread in eukaryotes and required for regulated cell enlargement PUBMED:9087433. 
The family also contains the homologous Saccharomyces cerevisiae synthetic construct enhancement of YOP1 (SEY1) protein which is involved in membrane trafficking PUBMED:12427979.\ ' '5708' 'IPR008618' '\ This family consists of several Fijivirus 64 kDa capsid proteins.\ ' '5709' 'IPR008431' '\ This family consists of the eukaryotic protein 2\',3\'-cyclic nucleotide 3\'-phosphodiesterase (CNP). 2\',3\'-cyclic nucleotide 3\'-phosphodiesterase (CNP) is one of the earliest myelin-related proteins expressed in differentiating oligodendrocytes and Schwann cells. CNP is abundant in the central nervous system and in oligodendrocytes. This protein is also found in mammalian photoreceptor cells, testis and lymphocytes. Although the biological function of CNP is unknown, it is thought to play a significant role in the formation of the myelin sheath, where it comprises 4% of total protein. CNP selectively cleaves 2\',3\'-cyclic nucleotides to produce 2\'-nucleotides in vitro. Although physiologically relevant substrates with 2\',3\'-cyclic termini are still unknown, numerous cyclic phosphate containing RNAs occur transiently within eukaryotic cells. Other known protein families capable of hydrolysing 2\',3\'-cyclic nucleotides include tRNA ligases and plant cyclic phosphodiesterases. The catalytic domains from all these proteins contain two tetra-peptide motifs H-X-T/S-X, where X is usually a hydrophobic residue. Mutation of either histidine in CNP abolishes enzymatic activity PUBMED:11885989.\ ' '5711' 'IPR008573' '\

    This family consists of several Baculovirus proteins of around 130 residues in length. The function of this family is unknown, but it appears to be related to the U-box and RING finger domains by profile-profile comparison.

    \ ' '5712' 'IPR008574' '\ This family consists of proteins of unknown function found in Caenorhabditis species.\ ' '5714' 'IPR008726' '\ This family consists of several Orthopoxvirus F8 proteins. The function of this family is unknown.\ ' '5715' 'IPR008882' '\ This family consists of several Trypanosoma brucei procyclic acidic repetitive protein (PARP) like sequences. The procyclic acidic repetitive protein (parp) genes of T. brucei encode a small family of abundant surface proteins whose expression is restricted to the procyclic form of the parasite. They are found at two unlinked loci, parpA and parpB; transcription of both loci is developmentally regulated PUBMED:2342468.\ ' '5717' 'IPR008829' '\ This family consists of several eukaryotic and archaeal proteins which are related to the Homo sapiens soluble liver antigen/liver pancreas antigen (SLA/LP autoantigen). Autoantibodies are a hallmark of autoimmune hepatitis, but most are not disease specific. Autoantibodies to soluble liver antigen (SLA) and to liver and pancreas antigen (LP) have been described as disease specific, occurring in about 30% of all patients with autoimmune hepatitis PUBMED:10801173. The function of SLA/LP is unknown, however, it has been suggested that the protein may function as a serine hydroxymethyltransferase and may be an important enzyme in the thus far poorly understood selenocysteine pathway PUBMED:11481605. The archaeal sequences and are annotated as being pyridoxal phosphate-dependent enzymes.\ ' '5718' 'IPR008610' '\ This family consists of several eukaryotic rRNA processing protein EBP2 sequences. Ebp2p is required for the maturation of 25S rRNA and 60S subunit assembly. 
Ebp2p may be one of the target proteins of Rrs1p for executing the signal to regulate ribosome biogenesis PUBMED:10947841.\ ' '5719' 'IPR008576' '\ This family consists of several eukaryotic proteins of unknown function that are S-adenosyl-L-methionine-dependent methyltransferase-like.\ ' '5720' 'IPR008879' '\ This family consists of several coat proteins which are specific to the ssRNA positive-strand, no DNA stage viruses such as the Trichoviruses and Vitiviruses.\ ' '5721' 'IPR008670' '\ This family consists of several bacterial Acyl-CoA reductase (LuxC) proteins. The channelling of fatty acids into the fatty aldehyde substrate for the bacterial bioluminescence reaction is catalysed by a fatty acid reductase multienzyme complex, which channels fatty acids through the thioesterase (LuxD), synthetase (LuxE) and reductase (LuxC) components PUBMED:9128139.\ ' '5722' 'IPR008784' '\ This family consists of several DNA encapsidation protein (Gp16) sequences from the phi-29-like viruses. Gene product 16 catalyses the in vivo and in vitro genome-encapsidation reaction PUBMED:3879485.\ ' '5723' 'IPR008577' '\ This family consists of several uncharacterised proteins from a number of the Siphoviruses as well as some bacterial proteins from Streptococcus species. Some of the members of this family are described as putative minor structural proteins.\ ' '5724' 'IPR008703' '\ This family consists of several bacterial Na+-translocating NADH-quinone reductase subunit A (NQRA) proteins. The Na+-translocating NADH:ubiquinone oxidoreductase (Na+-NQR) generates an electrochemical Na+ potential driven by aerobic respiration PUBMED:10587447.\ ' '5727' 'IPR008579' '\ The function of the proteins in this entry is unknown. 
They contain the conserved barrel domain of the \'cupin\' superfamily and members are specific to plants and bacteria.\ ' '5729' 'IPR008613' '\ Extracellular Ca2+-dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a ~45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2+-binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto recognised and the evolution of EF-hand-like domains is probably more complex than previously appreciated PUBMED:12694917.\ ' '5730' 'IPR008379' '\ There is a unique sequence domain at the C terminus of all known 4.1 proteins, known as the C-terminal domain (CTD). Mammalian CTDs are associated with a growing number of protein-protein interactions, although such activities have yet to be associated with invertebrate CTDs. Mammalian CTDs are generally defined by sequence alignment as encoded by exons 18-21. Comparison of known vertebrate 4.1 proteins with invertebrate 4.1 proteins indicates that mammalian 4.1 exon 19 represents a vertebrate adaptation that extends the sequence of the CTD with a Ser/Thr-rich sequence. The CTD was first described as a 22/24 kDa domain by chymotryptic digestion of erythrocyte 4.1 (4.1R). 
CTD is thought to represent an independent folding structure which has gained function since the divergence of vertebrates from invertebrates PUBMED:11432737.\ ' '5731' 'IPR008580' '\ This domain consists of the N-terminal portion of several eukaryotic sequences. The function of this domain is unknown.\ ' '5732' 'IPR008581' '\ This family consists of a number of hypothetical proteins from plants. The function of this family is unknown.\ ' '5735' 'IPR008584' '\ This family consists of a number of hypothetical eukaryotic proteins of unknown function with an average length of around 165 residues.\ ' '5736' 'IPR008585' '\ This family consists of a number of bacterial and phage proteins with no known function and which are found in Bacillus species and the Lambda-like viruses.\ ' '5738' 'IPR008586' '\ This family consists of several hypothetical proteins from plants. The function of this family is unknown.\ ' '5739' 'IPR008587' '\ This family consists of a number of sequences found in plants. The function of this family is unknown.\ ' '5740' 'IPR008588' '\ This family consists of proteins of unknown function found in Caenorhabditis species.\ ' '5741' 'IPR008589' '\ This family consists of several conserved hypothetical proteins from bacteria and archaea. The function of this family is unknown though a number are annotated as outer surface proteins.\ ' '5742' 'IPR008805' '\ This family consists of several RIB43A-like eukaryotic proteins. Ciliary and flagellar microtubules contain a specialised set of protofilaments, termed ribbons, that are composed of tubulin and several associated proteins. RIB43A was first characterised in the unicellular biflagellate, Chlamydomonas reinhardtii although highly related sequences are present in several higher eukaryotes including humans. 
The function of this protein is unknown although the structure of RIB43A and its association with the specialised protofilament ribbons and with basal bodies is relevant to the proposed role of ribbons in forming and stabilising doublet and triplet microtubules and in organising their three-dimensional structure. Human RIB43A homologues could represent a structural requirement in centriole replication in dividing cells PUBMED:10637302.\ ' '5743' 'IPR008590' '\ This family consists of several uncharacterised eukaryotic proteins. The function of this family is unknown.\ ' '5744' 'IPR008591' '\

    DNA replication in eukaryotes results from a highly coordinated interaction between proteins, often as part of protein complexes, and the DNA template. One of the key early steps leading to DNA replication is formation of the prereplication complex, or pre-RC. The pre-RC is formed by the sequential binding of the origin recognition complex (ORC), Cdc6 and Cdt1 proteins, and the MCM complex. Activation of the pre-RC into the initiation complex (IC) is achieved via the action of S-phase kinases, eventually leading to the loading of the replication machinery.

    \

    Recently, a novel replication complex, GINS (for Go, Ichi, Nii, and San; five, one, two, and three in Japanese), has been identified PUBMED:12730133, PUBMED:12730134. \ \ The precise function of GINS is not known. However, genetic and two-hybrid interactions indicate that it mediates the loading of the enzymatic replication machinery at a step after the action of the S-phase kinases PUBMED:12730134. Furthermore, GINS may be a part of the replication machinery itself, since it is found associated with replicating DNA PUBMED:12730133, PUBMED:12730134. Electron microscopy of GINS shows that it forms a ring-like structure PUBMED:12730133, reminiscent of the structure of PCNA PUBMED:8001157, the DNA polymerase delta replication clamp. This observation, coupled with the observed interactions for GINS, indicates that the complex may represent the replication clamp for DNA polymerase epsilon PUBMED:12730133.

    \ \ \

    The GINS complex is essential for initiation of DNA replication in Xenopus egg extracts PUBMED:12730133. This 100 kDa stable complex includes Sld5, Psf1, Psf2, and Psf3. Homologues of these components are found also in other eukaryotes. This family of proteins represents the Sld5 component.

    \ ' '5745' 'IPR008592' '\ This family consists of several hypothetical proteins specific to Helicobacter pylori. The function of this family is unknown.\ ' '5746' 'IPR008383' '\ This family consists of apoptosis inhibitory protein 5 (API5) sequences from several organisms. Apoptosis or programmed cell death is a physiological form of cell death that occurs in embryonic development and organ formation. It is characterised by biochemical and morphological changes such as DNA fragmentation and cell volume shrinkage. API5 is an anti apoptosis gene located in Homo sapiens chromosome 11, whose expression prevents the programmed cell death that occurs upon the deprivation of growth factors PUBMED:9307294,PUBMED:10393420.\ ' '5747' 'IPR008686' '\ This family consists of several Mitovirus RNA-dependent RNA polymerase proteins. The family also contains fragment matches in the mitochondria of Arabidopsis thaliana PUBMED:9657003.\ ' '5748' 'IPR008422' '\

    This family consists of several mating-type alpha and beta proteins from\ Coprinus cinereus (Inky cap fungus) (Hormographiella aspergillata) as well as a related sequence from Schizophyllum commune (Bracket fungus). The A mating type locus of the fungus C. cinereus is a complex, multigenic locus which regulates\ compatibility and subsequent sexual development.

    \ ' '5750' 'IPR010259' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes PUBMED:10656993. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins, bacterial toxins, amongst others, are activated by this route PUBMED:9572109. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C-terminus.

    \

    Proteinase propeptide inhibitors (sometimes referred to as activation peptides) are responsible for the modulation of folding and activity of the pro-enzyme or zymogen. The pro-segment docks into the enzyme moiety shielding the substrate binding site, thereby promoting inhibition of the enzyme. Several such propeptides share a similar topology PUBMED:12095256, despite often low sequence identities PUBMED:9811547. The propeptide region has an open-sandwich antiparallel-alpha/antiparallel-beta fold, with two alpha-helices and four beta-strands with a (beta/alpha/beta)x2 topology.

    \ \

    This group of sequences contains the propeptide domain at the N terminus of peptidases belonging to MEROPS family S8A, subtilisins. A number of the members of this group of sequences belong to MEROPS inhibitor family I9, clan I-. The propeptide is removed by proteolytic cleavage, and its removal activates the enzyme.

    \ ' '5751' 'IPR009223' '\

    This short region is found repeated in the mid region of the adenomatous polyposis proteins (APCs). In the human protein many cancer-linked SNPs are found near the first three occurrences of the motif. These repeats bind beta-catenin PUBMED:9823329.

    \ ' '5752' 'IPR009224' '\

    This short region is found repeated in the mid region of the adenomatous polyposis proteins (APCs). This motif binds axin PUBMED:9823329.

    \ ' '5753' 'IPR008108' '\

    Some Gram-negative animal enteropathogens express a specialised secretion \ system to directly "inject" exotoxins into the cytoplasm of host cells. \ Dubbed the type III secretion system, it is of specific interest to \ researchers, as the components of such a system are only expressed in \ pathogenic strains PUBMED:11018143. The system is composed of structural proteins and exotoxin effectors; these are often encoded on large virulence plasmids or on the bacterial chromosome itself PUBMED:11018143.

    \ \

    The Shigella flexneri invasion plasmid antigen (ipa) genes are found on such a plasmid, and are arranged into an operon. Directly upstream of this operon is another cluster of type III genes, termed ipgD, E and F PUBMED:8478058. Deletion mutational studies of all three genes showed they were essential for virulence in S. flexneri, and that IpgD is secreted by the type III needle to the outside of the bacterial cell PUBMED:8478058. Further analysis of the ipg operon confirmed that the IpgD gene product is chaperoned by the IpgE protein while in the bacterial cytoplasm PUBMED:11029686.

    \ \

    More recently, a large study into the spread of the ipa/mxi/ipg\ pathogenicity islands through their relevant plasmid has revealed that \ homologues exist in many different Shigella strains, as well as \ enteroinvasive Escherichia coli and Salmonella spp. PUBMED:11553574. There is evidence that the genes were acquired from Shigella through lateral transfer, like most of the other type III secretion system virulence plasmids.

    \ ' '5754' 'IPR009225' '\

    This family consists of several phage head completion proteins (GPL) as well as related bacterial sequences. Members of this family allow the completion of filled heads by rendering newly packaged DNA in the heads resistant to DNase. The protein is thought to bind to DNA-filled capsids PUBMED:1837355.

    \ ' '5755' 'IPR009226' '\

    This family consists of several isoforms of the penaeidin protein, which is specific to shrimps. Penaeidins, a unique family of antimicrobial peptides (AMPs) with both proline and cysteine-rich domains, were initially identified in the hemolymph of the Pacific white shrimp, Penaeus vannamei PUBMED:12242595.

    \ ' '5756' 'IPR009227' '\

    This family consists of several Zea mays (Maize) specific MURB-like proteins. The transposition of Mu elements underlying Mutator activity in maize requires a transcriptionally active MuDR element. Despite variation in MuDR copy number and RNA levels in Mutator lines, transposition events are consistently late in plant development, and Mu excision frequencies are similar PUBMED:11251096.

    \ ' '5757' 'IPR009228' '\

    This family consists of several bacteriophage capsid scaffolding proteins (GPO) and some related bacterial sequences. GPO is thought to function in both the assembly of proheads and the cleavage of GPN PUBMED:1837355.

    \ ' '5758' 'IPR010260' '\

    This family consists of several short bacterial and phage proteins, which are related to the Escherichia coli protein AlpA. AlpA suppresses two phenotypes of a delta lon protease mutant, overproduction of capsular polysaccharide and sensitivity to UV light PUBMED:7511582. Several of the sequences in this family are thought to be DNA-binding proteins.

    \ ' '5759' 'IPR009229' '\

    This family consists of several AgrD proteins from many Staphylococcus species. The agr locus was initially described in Staphylococcus aureus as an element controlling the production of exoproteins implicated in virulence. Its pattern of action has been shown to be complex, upregulating certain extracellular toxins and enzymes expressed post-exponentially and repressing some exponential-phase surface components. AgrD encodes the precursor of the autoinducing peptide (AIP). The AIP derived from AgrD by the action of AgrB interacts with AgrC in the membrane to activate AgrA, which upregulates transcription both from promoter P2, amplifying the response, and from P3, initiating the production of a novel effector: RNAIII. In S. aureus, delta-hemolysin is the only translation product of RNAIII and is not involved in the regulatory functions of the transcript, which is therefore the primary agent for modulating the expression of other operons controlled by agr PUBMED:11807079.

    \ ' '5760' 'IPR010261' '\

    This family consists of a number of bacterial sequences, which are highly similar to the Tir chaperone protein in Escherichia coli. In many Gram-negative bacteria, a key indicator of pathogenic potential is the possession of a specialised type III secretion system, which is utilised to deliver virulence effector proteins directly into the host cell cytosol. Many of the proteins secreted from such systems require small cytosolic chaperones to maintain the secreted substrates in a secretion-competent state. CesT serves a chaperone function for the enteropathogenic E. coli (EPEC) translocated intimin receptor (Tir) protein, which confers upon EPEC the ability to alter host cell morphology following intimate bacterial attachment PUBMED:11849537.

    \ ' '5761' 'IPR009230' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) () are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit 8 found in the F0 complex of mitochondrial F-ATPases from fungi. This subunit appears to be an integral component of the stator stalk in yeast mitochondrial F-ATPases PUBMED:12626501. The stator stalk is anchored in the membrane, and acts to prevent futile rotation of the ATPase subunits relative to the rotor during coupled ATP synthesis/hydrolysis. This subunit differs in sequence between fungi, Metazoa () and plants ().

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '5762' 'IPR009231' '\

    This family consists of several mid-1-related chloride channels. Mid-1-related chloride channel (MCLC) proteins function as a chloride channel when incorporated in the planar lipid bilayer PUBMED:11279057.

    \ ' '5763' 'IPR010262' '\

    This family consists of several bacterial arylsulphotransferase proteins. Arylsulphotransferase (ASST) transfers a sulphate group from phenolic sulphate esters to a phenolic acceptor substrate PUBMED:8887346.

    \ ' '5764' 'IPR010263' '\

    This family consists of a series of hypothetical bacterial sequences of unknown function. They are associated with a type VI secretion locus suggesting a possible virulence role PUBMED:16763151.

    \ ' '5765' 'IPR009232' '\

    This region at the C terminus of the APC proteins binds the microtubule-associating protein EB-1 PUBMED:11514192. At the C terminus of the alignment is also a PDZ-binding domain. A short motif in the middle of the region appears to be found in the APC2 proteins (e.g. ).

    \ ' '5766' 'IPR010264' '\

    This family consists of a series of plant proteins which are related to the Papaver rhoeas S1 self-incompatibility protein. Self-incompatibility (SI) is the single most important outbreeding device found in angiosperms and is a mechanism that regulates the acceptance or rejection of pollen. S1 is known to exhibit specific pollen-inhibitory properties PUBMED:8134385.

    \ ' '5767' 'IPR010265' '\

    This family consists of a series of phage minor tail proteins and related sequences from several bacterial species.

    \ ' '5768' 'IPR010266' '\

    This family consists of several bacterial NnrS-like proteins. NnrS is a putative haem-Cu protein and a member of the short-chain dehydrogenase family PUBMED:12618453. Expression of nnrS is dependent on the transcriptional regulator NnrR, which also regulates expression of genes required for the reduction of nitrite to nitrous oxide, including nirK and nor. NnrS is a haem- and copper-containing membrane protein. Genes encoding putative orthologues of NnrS are sometimes but not always found in bacteria encoding nitrite and/or nitric oxide reductase PUBMED:11882718.

    \ ' '5769' 'IPR010267' '\

    This family consists of several Chordopoxvirus A20R proteins. The A20R protein is required for DNA replication, is associated with the processive form of the viral DNA polymerase, and directly interacts with the viral proteins encoded by the D4R, D5R, and H5R open reading frames. A20R may contribute to the assembly or stability of the multiprotein DNA replication complex PUBMED:12490386.

    \ ' '5770' 'IPR010268' '\

    This family consists of several archaeal PaREP1 proteins. The function of the family is unknown.

    \ ' '5771' 'IPR010269' '\

    This family consists of a number of uncharacterised bacterial proteins. They are associated with a type VI secretion locus suggesting a possible role in virulence PUBMED:16763151.

    \ ' '5772' 'IPR010270' '\

    This family consists of several phage small terminase subunit proteins as well as some related bacterial sequences PUBMED:1837355.

    \ ' '5774' 'IPR010271' '\

    This family consists of toxin-coregulated pilus subunit (TcpA) proteins from Vibrio cholerae and related sequences. The major virulence factors of toxigenic V. cholerae are cholera toxin (CT), which is encoded by a lysogenic bacteriophage (CTXPhi), and toxin-coregulated pilus (TCP), an essential colonisation factor which is also the receptor for CTXPhi. The genes for the biosynthesis of TCP are part of a larger genetic element known as the TCP pathogenicity island PUBMED:12540588.

    \ ' '5775' 'IPR010272' '\

    This entry represents a protein family associated with type VI secretion in a number of pathogenic bacteria PUBMED:16432199, PUBMED:16763151. Mutation is associated with impaired virulence, such as impaired infection of plants by Rhizobium leguminosarum.

    \ ' '5777' 'IPR010273' '\

    This family consists of a series of hypothetical bacterial proteins. One of the family members from Bacillus subtilis is thought to be involved in cell division and sporulation PUBMED:2556375.

    \ ' '5778' 'IPR010274' '\

    This family consists of several Orthopoxvirus A36R proteins. The A36R protein is predicted to be a type Ib membrane protein PUBMED:11017799.

    \ ' '5779' 'IPR010275' '\

    This family consists of a series of hypothetical bacterial proteins of unknown function.

    \ ' '5780' 'IPR009233' '\

    Natural genetic competence in Bacillus subtilis is controlled by quorum-sensing (QS). The ComP-ComA two-component system detects the signalling molecule ComX, and this signal is transduced by a conserved phosphotransfer mechanism. ComX is synthesised as an inactive precursor and is then cleaved and modified by ComQ before export to the extracellular environment PUBMED:12067344.

    \ ' '5781' 'IPR010276' '\

    This family consists of allatostatins, bombystatins, helicostatins, cydiastatins and schistostatin from several insect species. Allatostatins (ASTs) of the Tyr/Phe-Xaa-Phe-Gly-Leu/Ile-NH2 family are a group of insect neuropeptides that inhibit juvenile hormone biosynthesis by the corpora allata PUBMED:10098619.

    \ ' '5782' 'IPR010277' '\

    This family consists of a number of phage late control gene D proteins and related bacterial sequences.

    \ ' '5783' 'IPR010278' '\

    This family consists of a number of glycoprotein gp2 sequences from equine herpesviruses.

    \ ' '5784' 'IPR009234' '\

    This region of the APC family of proteins is known as the basic domain. It contains a high proportion of positively charged amino acids and interacts with microtubules PUBMED:9654054.

    \ ' '5785' 'IPR010279' '\

    This family consists of several bacterial proteins of unknown function that include the Escherichia coli genes for ElaB, YgaM and YqjD.

    \ ' '5786' 'IPR010280' '\

    This family consists of (uracil-5-)-methyltransferases from bacteria, archaea and eukaryotes.

    \ \

    A 5-methyluridine (m(5)U) residue at position 54 is a conserved feature of bacterial and eukaryotic tRNAs. The methylation of U54 is catalysed by the tRNA(m5U54)methyltransferase, which in Saccharomyces cerevisiae is encoded by the nonessential TRM2 gene. It is thought that tRNA modification enzymes might have a role in tRNA maturation not necessarily linked to their known catalytic activity PUBMED:12003492.

    \ \

    This protein family also contains the 23S rRNA methyltransferases, first proposed to be RNA methyltransferases by homology to the TrmA family. The member from Escherichia coli has now been shown to act as the 23S rRNA methyltransferase for the conserved U1939. The gene is now designated rumA and was previously designated ygcA PUBMED:11779873.

    \ ' '5787' 'IPR009235' '\

    This family consists of several hypothetical baculovirus proteins of unknown function.

    \ ' '5788' 'IPR010281' '\

    This family consists of hypothetical bacterial proteins.

    \ ' '5789' 'IPR009236' '\

    This family consists of A13L proteins from the Chordopoxviruses. A13L or p8 is one of the three most abundant membrane proteins of the intracellular mature Vaccinia virus PUBMED:9311819.

    \ ' '5790' 'IPR010282' '\

    HutD from Pseudomonas fluorescens is a component of the histidine uptake and utilisation operon. HutD is operonic with the well characterised repressor protein HutC. Genetic analysis using transcriptional fusions (lacZ) and deletion mutants shows that hutD is necessary to maintain fitness in environments replete with histidine. HutD probably sets an upper bound on the level of hut operon transcription PUBMED:17717196. The mechanistic basis is unknown, but in silico molecular docking studies based on the crystal structure of HutD from Pseudomonas aeruginosa show that urocanate (the first breakdown product of histidine) docks with the active site of HutD.

    \ ' '5791' 'IPR009237' '\

    US3 of human cytomegalovirus is an endoplasmic reticulum resident transmembrane glycoprotein that binds to major histocompatibility complex class I molecules and prevents their departure. The endoplasmic reticulum retention signal of the US3 protein is contained in the luminal domain of the protein PUBMED:12525649.

    \ ' '5792' 'IPR003888' '\ The "FY-rich" domain N-terminal region is sometimes closely juxtaposed with the C-terminal region (), but sometimes is far distant. It is of unknown function, but occurs frequently in chromatin-associated proteins like trithorax and its homologues.\ ' '5793' 'IPR003889' '\ The "FY-rich" domain C-terminal region is sometimes closely juxtaposed with the N-terminal region (), but sometimes is far distant. It is of unknown function, but occurs frequently in chromatin-associated proteins like trithorax and its homologues.\ ' '5794' 'IPR009238' '\

    This family consists of several Chordopoxvirus A33R proteins. A33R plays a role in promoting Ab-resistant cell-to-cell spread of virus PUBMED:11752718 and interacts with A36R to incorporate the protein into the outer membrane of intracellular enveloped virions (IEV) PUBMED:12634370.

    \ ' '5796' 'IPR009239' '\

    This family consists of the Bacillus species-specific PapR protein. The papR gene belongs to the PlcR regulon and is located 70 bp downstream from plcR. It encodes a 48-amino-acid peptide. Disruption of the papR gene abolishes expression of the PlcR regulon, resulting in a large decrease in haemolysis and virulence in insect larvae. A processed form of PapR activates the PlcR regulon by allowing PlcR to bind to its DNA target. This activating mechanism is strain specific PUBMED:12198157.

    \ ' '5797' 'IPR010284' '\

    This family consists of several short hypothetical plant and cyanobacterial proteins. In plants these proteins are localised to the chloroplast and are known as hypothetical chloroplast protein 12. This family is likely to play some role in photosynthesis.

    \ ' '5798' 'IPR010285' '\

    The majority of members in this family have no known function. Most of the sequences in the family are described as hypothetical; however, some are putative helicases and some have a nucleic acid binding fold.

    \ ' '5799' 'IPR010286' '\

    This family consists of several conserved hypothetical proteins from both eukaryotes and prokaryotes. The function of members of this family is unknown, but they are predicted to be SAM-dependent methyltransferases.

    \ ' '5800' 'IPR009240' '\

    The 15 aa repeat is found in the APC protein family. It is involved in binding beta-catenin PUBMED:9823329 along with the repeats. Many human cancer mutations map to the region around these motifs, and may be involved in disrupting their binding of beta-catenin.

    \ ' '5801' 'IPR009241' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '5802' 'IPR010287' '\

    This group consists of several hypothetical bacterial proteins of unknown function.

    \ ' '5803' 'IPR010288' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attacking water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \

    This family consists of several bacterial ABC transporter proteins which are homologous to the EcsB protein of Bacillus subtilis. EcsB is thought to encode a hydrophobic protein with six membrane-spanning helices in a pattern found in other hydrophobic components of ABC transporters PUBMED:8581172.

    \ ' '5805' 'IPR010290' '\

    This family consists of uncharacterised bacterial proteins, which are putative permeases belonging to the major facilitator superfamily. DitE is linked to the genes involved in the degradation of abietane diterpenoids in Pseudomonas abietaniphila PUBMED:10850995.

    \ ' '5806' 'IPR010291' '\

    This domain, of unknown function, associates with several hypothetical eukaryotic proteins.

    \ ' '5807' 'IPR009242' '\

    This family consists of several short, hypothetical bacterial proteins of unknown function.

    \ ' '5808' 'IPR009243' '\

    This family consists of several short spider neurotoxin proteins including many from the Funnel-web spider.

    \ ' '5809' 'IPR010292' '\

    This family consists of several bacterial CreA proteins, the function of which is unknown.

    \ ' '5810' 'IPR010293' '\

    This is a family of bacterial proteins with unknown function.

    \ ' '5811' 'IPR009244' '\

    This family consists of several eukaryotic proteins, which are homologues of the yeast MED7 protein. Activation of gene transcription in metazoans is a multistep process that is triggered by factors that recognise transcriptional enhancer sites in DNA. These factors work with co-activators such as MED7 to direct transcriptional initiation by the RNA polymerase II apparatus PUBMED:9989412.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '5812' 'IPR009245' '\

    This family consists of several Cytomegalovirus UL20A proteins. UL20A is thought to be a glycoprotein PUBMED:11928987.

    \ ' '5813' 'IPR009246' '\

    This family consists of several bacterial ethanolamine ammonia-lyase light chain (EutC) sequences. Ethanolamine ammonia-lyase is a bacterial enzyme that catalyses the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia PUBMED:2197274.

    \ ' '5814' 'IPR010294' '\

    This domain represents the Spacer-1 domain from the ADAM-TS family of metalloproteinases PUBMED:11279086.

    \ \

    A cellular disintegrin and metalloproteinase (ADAM) is a family of genes with structural homology to the snake venom metalloproteinases and disintegrins PUBMED:8995297. There is variation amongst members of the family; however, all have a similar domain organization comprising a preproregion, a reprolysin-type catalytic domain, a disintegrin-like domain, a thrombospondin type-1 (TS) module, a cysteine-rich domain, a spacer domain without cysteine residues, and a COOH-terminal TS module PUBMED:10464288, PUBMED:11279086. They are involved in embryogenesis and have been implicated in some cancers and inflammatory diseases PUBMED:11279086.

    \ ' '5815' 'IPR010295' '\

    This family consists of several bacterial proteins of unknown function. Some of the family, including YjgN, are putative transmembrane proteins.

    \ ' '5816' 'IPR010296' '\

    This family consists of uncharacterised bacterial proteins of unknown function which are thioredoxin-like.

    \ ' '5817' 'IPR009247' '\

    This family consists of several Chordopoxvirus sequences homologous to the Vaccinia virus A35R protein. The function of this family is unknown.

    \ ' '5818' 'IPR010297' '\

    This domain is associated with proteins of unknown function, which are hydrolase-like.

    \ ' '5819' 'IPR010298' '\

    This family consists of several hypothetical bacterial proteins as well as some uncharacterised sequences from Arabidopsis thaliana. The function of this family is unknown.

    \ ' '5820' 'IPR009248' '\

    The Rhizobium meliloti (Sinorhizobium meliloti) bacA gene encodes a function that is essential for bacterial differentiation into bacteroids within plant cells in the symbiosis between R. meliloti and alfalfa. An Escherichia coli homologue of BacA, SbmA, is implicated in the uptake of microcins and bleomycin. This family is likely to be a subfamily of the ABC transporter family.

    \ ' '5821' 'IPR009113' '\

    Mu1 is an outer capsid protein that acts as a reoviral penetration agent. Non-enveloped animal reoviruses must enter host cells by membrane penetration that does not involve membrane fusion, as they lack a viral membrane. Reoviruses are activated by proteolytic cleavage in the intestinal lumen, leading to infectious subviral particles. The core of the virus is coated by a layer of mu1 and sigma3 proteins. Proteases strip off sigma3, exposing mu1, which provides the membrane penetration machinery that perforates the membrane. In addition, N-terminal myristoylation of polypeptide Mu1 is required for site-specific cleavage to Mu1C in transfected cells PUBMED:. Mu1 forms a trimer, where the three mu1 molecules are coiled around one another with a right-handed twist. The mu1 chain folds into four distinct domains: three intertwined, predominantly alpha helical domains and a jelly-roll beta-sandwich PUBMED:11832217.

    \ ' '5822' 'IPR008081' '\

    Cytoplasmic fragile X mental retardation protein (FMRP) interacting protein\ belongs to a highly conserved but, as yet, functionally uncharacterised\ family. Absence of FMRP is responsible for pathologic manifestations in \ Fragile X Syndrome, the most frequent cause of inherited mental retardation\ PUBMED:10449408. FMRP is an RNA-binding protein that may have a role in local protein\ translation at neuronal dendrites and in dendritic spine maturation PUBMED:10449408.\ CYFIP1 and CYFIP2, which share a high level of sequence identity, have \ recently been identified as cytoplasmic FMRP interacting proteins PUBMED:10449408.\ CYFIP2 interacts with FMRP-related proteins FXR1P/2P, while CYFIP1 interacts\ exclusively with FMRP. The FMRP-CYFIP interaction involves the domain of\ FMRP that also mediates homo- and heteromerisation, suggesting competition\ between the various interaction partners. CYFIP1 also interacts with the \ small GTPase Rac1 implicated in development and maintenance of neuronal\ structures. CYFIP1/2 are both present in synaptosomal extracts PUBMED:10449408. \

    \

    PIR121 (121F-specific p53 inducible RNA) is another functionally\ uncharacterised member of this family. The PIR121 gene maps to human\ chromosome 5q34, a region frequently translocated in acute myeloid leukaemia\ but not known to be amplified or deleted in solid tumours. Interaction\ between PIR121 and FMRP has been demonstrated, and hence PIR121 has also \ been termed CYFIP2 (Cytoplasmic FMRP Interacting Protein 2) PUBMED:10449408, PUBMED:9756361.\

    \

    Shyc (Selective HYbridizing Clone) is a cytoplasmic protein of unknown \ function, expressed in the developing and embryonic nervous system. The\ protein has also been designated CYFIP1 due to the high sequence identity\ (98.7%) to its human orthologue. The CYFIP orthologues in Caenorhabditis elegans and Drosophila melanogaster (Fruit fly) share about 51% and 67% sequence \ identity with the human proteins, respectively PUBMED:10449408. The high level of\ conservation manifest throughout the entire CYFIP sequence between various\ orthologues suggests a number of functionally/structurally important domains.

    \ ' '5823' 'IPR010300' '\

    Cysteine dioxygenase type I () converts cysteine to cysteinesulphinic acid and is the rate-limiting step in sulphate production.

    \ ' '5824' 'IPR009249' '\

    This family consists of several different but closely related proteins which include phycocyanobilin:ferredoxin oxidoreductase (PcyA), 15,16-dihydrobiliverdin:ferredoxin oxidoreductase (PebA) and phycoerythrobilin:ferredoxin oxidoreductase (PebB). Phytobilins are linear tetrapyrrole precursors of the light-harvesting prosthetic groups of the phytochrome photoreceptors of plants and the phycobiliprotein photosynthetic antennae of cyanobacteria, red algae, and cryptomonads. It is known that phytobilins are synthesised from haem via the intermediacy of biliverdin IX alpha (BV), which is reduced subsequently by ferredoxin-dependent bilin reductases with different double-bond specificities PUBMED:11283349.

    \ ' '5825' 'IPR010301' '\

    Nop52 is believed to be involved in the generation of 28S rRNA PUBMED:10341208.

    \ ' '5827' 'IPR010302' '\

    This entry represents a family of herpesvirus proteins that includes U4, U5 and UL27.

    \ ' '5829' 'IPR010303' '\

    This domain of unknown function is found in several transcriptional co-activators including the CREB-binding protein, , which is an acetyltransferase that acetylates histones, giving a specific tag for transcriptional activation. CREB-binding protein also acetylates non-histone proteins.

    \ ' '5830' 'IPR009251' '\

    This entry represents several alpha-2,3-sialyltransferase () proteins, most of which are found in the food-borne pathogen Campylobacter jejuni. Sialyltransferases transfer a sialic acid moiety from cytidine-5\'-monophospho-N-acetyl-neuraminic acid (CMP-NeuAc) to terminal positions of various key glycoconjugates, which play critical roles in cell recognition and adherence PUBMED:14730352. The structure of Cst-II alpha-2,3-sialyltransferase from C. jejuni consists of a 3-layer alpha/beta/alpha topology. The Cst-II catalytic mechanism involves an essential histidine (general base) and two tyrosine residues (coordination of the phosphate leaving group) to carry out substrate binding and glycosyl transfer.

    \ ' '5831' 'IPR010304' '\

    This family consists of several eukaryotic survival motor neuron (SMN) proteins. The Survival of Motor Neurons (SMN) protein, the product of the spinal muscular atrophy-determining gene, is part of a large macromolecular complex (SMN complex) that functions in the assembly of spliceosomal small nuclear ribonucleoproteins (snRNPs). The SMN complex functions as a specificity factor essential for the efficient assembly of Sm proteins on U snRNAs and likely protects cells from illicit, and potentially deleterious, non-specific binding of Sm proteins to RNAs.

    \ ' '5832' 'IPR010305' '\

    This family consists of small bacterial proteins, several of which are classified as putative lipoproteins. The function of this family is unknown.

    \ ' '5833' 'IPR009252' '\

    This family consists of several bacterial and archaeal hypothetical proteins of unknown function.

    \ ' '5834' 'IPR009253' '\

    This family consists of several short hypothetical proteobacterial proteins of unknown function.

    \ ' '5835' 'IPR010306' '\

    This family consists of several bacterial phosphonate metabolism (PhnJ) sequences. The exact role that PhnJ plays in phosphonate utilisation is unknown.

    \ ' '5836' 'IPR009254' '\

    Laminins are glycoproteins that are major constituents of the basement membrane of cells. Laminins are trimeric molecules; laminin-1 is an alpha1 beta1 gamma1 trimer. It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure PUBMED:3182802. Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration and organisation of cells into tissues during embryonic development by interacting with other extracellular matrix components.

    \ ' '5837' 'IPR010307' '\

    It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure PUBMED:3182802.

    \ ' '5839' 'IPR010308' '\

    This family consists of hypothetical proteins of unknown function found in fungi.

    \ ' '5840' 'IPR010309' '\

    This is a domain of unknown function found at the N-terminus of a family of E3 ubiquitin protein ligases, including yeast TOM1, many of which appear to play a role in mRNA transcription and processing. This domain is found in association with and immediately N-terminal to another domain of unknown function: .

    \ ' '5841' 'IPR010310' '\

    This family consists of several short bacterial proteins of unknown function.

    \ ' '5842' 'IPR009256' '\

    This family consists of several short bacterial proteins of unknown function.

    \ ' '5843' 'IPR009257' '\

    This family consists of several short Chordopoxvirus proteins which are homologous to the A30L protein of Vaccinia virus. The vaccinia virus A30L protein is required for the association of electron-dense, granular, proteinaceous material with the concave surfaces of crescent membranes, an early step in viral morphogenesis. A30L is known to interact with the G7L protein and it has been shown that the stability of each is dependent on its association with the other PUBMED:12610117.

    \ ' '5844' 'IPR010311' '\

    This family consists of several Reovirus core-spike protein lambda-2 (L2) sequences. The reovirus L2 genome segment encodes the core spike protein lambda-2, which mediates enzymatic reactions in 5\' capping of the viral plus-strand transcripts PUBMED:11531411.

    \ ' '5845' 'IPR010926' '\

    These proteins share a region of sequence similarity with the tail of myosin (for example ).\ Myosins act as molecular motors.\

    \ ' '5846' 'IPR010312' '\

    This family consists of several bacterial GTP-sensing transcriptional pleiotropic repressor CodY proteins. CodY has been found to repress the dipeptide transport operon (dpp) of Bacillus subtilis in nutrient-rich conditions PUBMED:7783641. The CodY protein also has a repressor effect on many genes in Lactococcus lactis during growth in milk PUBMED:11401725.

    \ ' '5847' 'IPR009258' '\

    This family consists of several GP30.8 proteins from the T4-like phages. The function of this family is unknown.

    \ ' '5848' 'IPR009259' '\

    This family consists of several roughex (RUX) proteins specific to Drosophila species. Roughex can influence the intracellular distribution of cyclin A and is therefore defined as a distinct and specialised cell cycle inhibitor for cyclin A-dependent kinase activity PUBMED:11027291. Rux is thought to regulate the metaphase to anaphase transition during development PUBMED:11231149.

    \ ' '5849' 'IPR015938' '\

    This entry represents mammalian-specific glycine N-acyltransferase (also called aralkyl acyl-CoA:amino acid N-acyltransferase; ). Mitochondrial acyltransferases catalyse the transfer of an acyl group from acyl-CoA to the N-terminus of glycine to produce N-acylglycine. These enzymes can conjugate a multitude of substrates to form a variety of N-acylglycines. The CoA derivatives of a number of aliphatic and aromatic acids, but not phenylacetyl-CoA or (indol-3-yl)acetyl-CoA, can act as donor PUBMED:10630424, PUBMED:8660675.

    \ ' '5850' 'IPR006477' '\

    This group of sequences identifies a large paralogous family of variant antigens from several Plasmodium species (Plasmodium yoelii, Plasmodium berghei and Plasmodium chabaudi). It is not believed that there are any orthologs of this family in Plasmodium falciparum.

    \ ' '5851' 'IPR009260' '\

    This family consists of several archaeal strongly conserved proteins whose genes are associated with CRISPRs (Clustered, Regularly Interspaced Short Palindromic Repeats). The function of these proteins has not been experimentally determined, but computational analysis has suggested that they may function as nucleases in DNA repair, similar to RecB () PUBMED:15972856.

    \ \ \ ' '5852' 'IPR009261' '\

    This family consists of several baculovirus proteins of unknown function.

    \ ' '5853' 'IPR010314' '\

    This is a domain of unknown function found towards the N-terminus of a family of E3 ubiquitin protein ligases, including yeast TOM1, many of which appear to play a role in mRNA transcription and processing. This domain is found in association with and immediately C-terminal to another domain of unknown function: .

    \ ' '5854' 'IPR004788' '\

    Ribose 5-phosphate isomerase, also known as phosphoriboisomerase, catalyses the conversion of D-ribose 5-phosphate to D-ribulose 5-phosphate in the non-oxidative branch of the pentose phosphate pathway. The pentose phosphate pathway is a target for chemotherapy against Chagas disease PUBMED:18066434. This family of enzymes is coded for by two genes and is found in many taxa except the viruses. It is a highly conserved enzyme PUBMED:12517338.

    \ ' '5855' 'IPR009262' '\

    This family consists of several hypothetical proteins of unknown function. Some of the sequences in this family are annotated as putative membrane proteins.

    \ ' '5856' 'IPR010315' '\

    This family consists of bacterial proteins of unknown function, which are hydrolase-like.

    \ ' '5857' 'IPR010316' '\

    This domain is found at the N terminus of bacterial AlkA. AlkA (3-methyladenine-DNA glycosylase II) is a base excision repair glycosylase from Escherichia coli. It removes a variety of alkylated bases from DNA, primarily by removing alkylation damage from duplex and single stranded DNA. AlkA flips a 1-azaribose abasic nucleotide out of DNA. This produces a 66 degree bend in the DNA and a marked widening of the minor groove PUBMED:10675345.

    \ \

    This groove is a large hydrophobic cleft, which is unusually rich in aromatic residues. AlkA recognises electron-deficient methylated bases through pi-donor/acceptor interactions involving the electron-rich aromatic cleft. AlkA is similar in fold and active site location to the bifunctional glycosylase/lyase endonuclease III. This suggests that the two may use similar mechanisms for base excision PUBMED:8706136. The structural analysis of the AlkA and AlkA-hypoxanthine structures indicate that free hypoxanthine binding in the active site may inhibit glycosylase activity PUBMED:12009927.

    \ ' '5858' 'IPR010317' '\

    This family consists of putative cell surface proteins, from Firmicutes, of unknown function.

    \ ' '5859' 'IPR009263' '\

    This entry represents a novel motif designated as SERTA (for SEI-1, RBT1, and TARA), corresponding to the largest conserved region among TRIP-Br proteins PUBMED:11861561. The function of this motif is uncertain, but the CDK4-interacting segment of p34SEI-1 (amino acid residues 44-161) includes most of the SERTA motif PUBMED:10580009.

    \ ' '5860' 'IPR010318' '\

    This family consists of hypothetical bacterial and archaeal proteins of unknown function.

    \ ' '5861' 'IPR009264' '\

    This family consists of several nucleopolyhedrovirus proteins of unknown function.

    \ ' '5862' 'IPR009265' '\

    This family consists of several short baculovirus proteins of unknown function.

    \ ' '5863' 'IPR010319' '\

    Structural analysis predicts that this family of proteins are bacterial transglutaminase-like cysteine peptidases (BTLCPs) with an invariant Cys-His-Asp catalytic triad and an N-terminal signal sequence. They are predicted to possess the papain-like cysteine proteinase fold and catalyse post-translational protein modification through transamidase, acetylase or hydrolase activity. Inspection of neighbouring genes suggests a link between this predicted activity and a type-I secretion system resembling ATP-binding cassette exporters of toxins and proteases involved in bacterial pathogenicity PUBMED:15288868.

    \ ' '5865' 'IPR010321' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '5867' 'IPR006231' '\

    The membrane-associated enzyme, malate:quinone-oxidoreductase, is an alternative to the better-known NAD-dependent malate dehydrogenase as part of the TCA cycle. The reduction of a quinone rather than NAD+ makes the reaction essentially irreversible in the direction of malate oxidation to oxaloacetate. Both forms of malate dehydrogenase are active in Escherichia coli; disruption of this form causes less phenotypic change. In some bacteria, this form is the only or the more important malate dehydrogenase PUBMED:11092847.

    \ ' '5868' 'IPR009266' '\

    This family consists of several Adenovirus E3 proteins. The E3 protein does not seem to be essential for virus replication in cultured cells suggesting that the protein may function in virus-host interactions PUBMED:7769690.

    \ ' '5869' 'IPR010323' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '5870' 'IPR009267' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '5871' 'IPR009268' '\

    These proteins of unknown function are found in Rice black streaked dwarf virus (RBSDV) and other viruses.

    \ ' '5872' 'IPR010324' '\

    Dam-replacing protein (DRP) is a restriction endonuclease that is flanked by pseudo-transposable small repeat elements. The replacement of Dam-methylase by DRP allows phase variation through slippage-like mechanisms in several pathogenic isolates of Neisseria meningitidis PUBMED:11334887.

    \ ' '5873' 'IPR010325' '\

    Rhamnogalacturonate lyase degrades the rhamnogalacturonan I (RG-I) backbone of pectin PUBMED:12591882. This family contains mainly members from plants, but also contains the plant pathogen Erwinia chrysanthemi.

    \ ' '5874' 'IPR010326' '\

    Sec6 is a component of the multiprotein exocyst complex. Sec6 interacts with Sec8, Sec10 and Exo70. These exocyst proteins localise to regions of active exocytosis (at the growing ends of interphase cells and in the medial region of cells undergoing cytokinesis) in an F-actin-dependent and exocytosis-independent manner PUBMED:11854409.

    \ ' '5875' 'IPR009269' '\

    This is a family of eukaryotic proteins with undetermined function.

    \ ' '5876' 'IPR009270' '\

    This is a family of bacterial proteins of unknown function.

    \ ' '5877' 'IPR009271' '\

    The name LSPD derives from the conserved residues in the middle of this repeat. These repeats are found in coagulation factor V and occur in the B domain, which is cleaved prior to activation of the protein. It has been suggested that domain B brings domains A and C together for activation PUBMED:11229814.

    \ ' '5878' 'IPR010327' '\

    Degradation of glutamate via the hydroxyglutarate pathway involves the syn-elimination of water from 2-hydroxyglutaryl-CoA. This anaerobic process is catalysed by 2-hydroxyglutaryl-CoA dehydratase, an enzyme with two components (A and D) that reversibly associate during reaction cycles. This component contains one non-reducible [4Fe-4S]2+ cluster and a reduced riboflavin 5\'-monophosphate PUBMED:11980491.

    \ ' '5879' 'IPR010328' '\

    This is a family of uncharacterised bacterial proteins.

    \ ' '5880' 'IPR010329' '\

    Members of this protein family, from both bacteria and eukaryotes, are the enzyme 3-hydroxyanthranilate 3,4-dioxygenase (). It is part of the kynurenine pathway for the degradation of tryptophan and the biosynthesis of nicotinic acid PUBMED:9539135. The prokaryotic homologue is involved in the 2-nitrobenzoate degradation pathway PUBMED:12620844.

    \ \

    The enzyme acts on the tryptophan metabolite 3-hydroxyanthranilate and produces 2-amino-3-carboxymuconate semialdehyde, which can rearrange spontaneously to quinolinic acid and feed into nicotinamide biosynthesis, or undergo further enzymatic degradation.

    \ ' '5881' 'IPR009272' '\

    This is a family of proteins from the archaeon Sulfolobus, with undetermined function.

    \ ' '5882' 'IPR010330' '\

    Many of the members of this family are described as transcription factors. CoiA falls within a competence-specific operon in Streptococcus. CoiA is an uncharacterised protein.

    \ ' '5883' 'IPR010331' '\

    Among the bacterial genes required for nodule invasion are the exo genes. These genes are involved in the production of an extracellular polysaccharide. Mutations in the exoD result in altered exopolysaccharide production and defects in nodule invasion PUBMED:1987158.

    \ ' '5884' 'IPR010332' '\

    This family of proteins are annotated as ATPase subunits of phage terminase after PUBMED:10949585. Terminases are viral proteins that are involved in packaging viral DNA into the capsid.

    \ ' '5885' 'IPR010333' '\

    This entry contains several bacterial VirJ virulence proteins. VirJ is thought to be involved in the type IV secretion system. It is thought that the substrate proteins localised to the periplasm may associate with the pilus in a manner that is mediated by VirJ, suggesting a two-step process for type IV secretion in Agrobacterium PUBMED:12207700.

    \ ' '5886' 'IPR010334' '\

    An essential step in mRNA turnover is decapping. In yeast, two proteins have been identified that are essential for decapping, Dcp1 (this family) and Dcp2 (). The precise role of these proteins in the decapping reaction has not been established. Evidence suggests that Dcp1 may enhance the function of Dcp2 PUBMED:12554866.

    \ ' '5887' 'IPR009273' '\

    This is a family of bacterial proteins with undetermined function. All bacteria in this family are from the Rhizobiales order.

    \ ' '5888' 'IPR010335' '\

    This family consists of several mammalian pre-pro-megakaryocyte potentiating factor precursor (MPF) or mesothelin proteins. Mesothelin is a glycosylphosphatidylinositol-linked glycoprotein highly expressed in mesothelial cells, mesotheliomas, and ovarian cancer, but the biological function of the protein is not known PUBMED:10733593, PUBMED:10500211.

    \ ' '5889' 'IPR010336' '\

    ME53 is one of the major early-transcribed genes. The ME53 protein is reported to contain a putative zinc finger motif PUBMED:8093490.

    \ ' '5890' 'IPR008249' '\ There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '5892' 'IPR009274' '\

    The Gam protein inhibits RecBCD nuclease and is found in both bacteria and bacteriophage PUBMED:8335632.

    \ ' '5894' 'IPR009275' '\

    SepZ is a component of the type III secretion system used in bacteria. SepZ is a gene within the enterocyte effacement locus. SepZ mutants exhibit reduced invasion efficiency and lack of tyrosine phosphorylation of Hp90 PUBMED:8878013.

    \ ' '5895' 'IPR009276' '\

    Family of Proteobacteria proteins with unknown function.

    \ ' '5896' 'IPR010339' '\

    This family consists of the C-terminal region of several eukaryotic and archaeal RuvB-like 1 (Pontin or TIP49a) and RuvB-like 2 (Reptin or TIP49b) proteins. The N-terminal domain contains the AAA ATPase, central region domain. In zebrafish, the liebeskummer (lik) mutation causes development of hyperplastic embryonic hearts. lik encodes Reptin, a component of a DNA-stimulated ATPase complex. Beta-catenin and Pontin, a DNA-stimulated ATPase that is often part of complexes with Reptin, are in the same genetic pathways. The Reptin/Pontin ratio serves to regulate heart growth during development, at least in part via the beta-catenin pathway PUBMED:12464178. TBP-interacting protein 49 (TIP49) was originally identified as a TBP-binding protein, and two related proteins are encoded by individual genes, tip49a and b. Although the function of this gene family has not been elucidated, its members are thought to play a critical role in nuclear events because they interact with various kinds of nuclear factors and have DNA helicase activities. TIP49a has been suggested to act as an autoantigen in some patients with autoimmune diseases PUBMED:10902922.

    \ ' '5897' 'IPR009277' '\

    PerC is a transcriptional activator of EaeA/BfpA expression in enteropathogenic bacteria PUBMED:7729884.

    \ ' '5898' 'IPR010340' '\

    The large phosphorylated protein (UL32-like) of herpes viruses is the polypeptide most frequently reactive in immuno-blotting analyses with antisera when compared with other viral proteins PUBMED:2455019.

    \ ' '5899' 'IPR013029' '\

    This domain is found at the C-terminus of a family of conserved hypothetical proteins found in both prokaryotes and eukaryotes. While the function of these proteins is not known, the crystal structure of from Haemophilus influenzae has been determined PUBMED:12837776. This protein consists of three domains: an N-terminal domain which has a mononucleotide binding fold typical for the P-loop NTPases, a central domain which forms an alpha-helical coiled coil, and this C-terminal domain which is composed of a six-stranded half-barrel curved around an alpha helix. The central domain and this domain are topologically similar to RNA-binding proteins, while the N-terminal region contains the features typical of GTP-dependent molecular switches. The purified protein was capable of binding both double-stranded nucleic acid and GTP. It was suggested, therefore, that this protein might be part of a nucleoprotein complex and could function as a GTP-dependent translation factor.

    \ ' '5900' 'IPR009278' '\

    This family consists of several US9 and related proteins from the Alphaherpesviruses. The function of the US9 protein is unknown although in Bovine herpesvirus 5 Us9 is essential for the anterograde spread of the virus from the olfactory mucosa to the bulb PUBMED:11907224.

    \ ' '5901' 'IPR008318' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '5902' 'IPR009279' '\

    This family consists of several bacterial proteins of unknown function as well as the Bacteriophage Mu gp29 protein.

    \ ' '5903' 'IPR010341' '\

    This family consists of several hypothetical proteins from plants. The function of this family is unknown.

    \ ' '5904' 'IPR009280' '\

    This family consists of several short Orthopoxvirus F14 proteins. The function of this protein is unknown.

    \ ' '5906' 'IPR009282' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '5907' 'IPR009283' '\

    This family consists of several eukaryotic apyrase (or adenosine diphosphatase) proteins and related nucleoside diphosphatases. The salivary apyrases of blood-feeding arthropods are nucleotide hydrolysing enzymes implicated in the inhibition of host platelet aggregation through the hydrolysis of extracellular adenosine diphosphate PUBMED:12234496.

    \ ' '5908' 'IPR010342' '\

    This family consists of several hypothetical proteins from both prokaryotes and eukaryotes. The function of this family is unknown.

    \ ' '5909' 'IPR010343' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '5910' 'IPR010344' '\

    This family consists of hypothetical bacterial proteins, several of which are described as putative lipoproteins.

    \ ' '5911' 'IPR010345' '\

    Interleukins (IL) are a group of cytokines that play an important role in the immune system. They modulate inflammation and immunity by regulating growth, mobility and differentiation of lymphoid and other cells.

    Interleukin-17 (IL-17) is a potent proinflammatory cytokine produced by activated memory T cells PUBMED:11781375. The IL-17 family (of which there are 6 known members, termed IL-17A to IL-17F) is thought to represent a distinct signalling system that appears to have been highly conserved across vertebrate evolution PUBMED:11781375. Family members play an active role in inflammatory diseases, autoimmune diseases and cancer PUBMED:15485625.

    \ ' '5912' 'IPR009284' '\

    This family consists of several Cytomegalovirus TRL10 proteins. TRL10 represents a structural component of the virus particle and like the other HCMV envelope glycoproteins, is present in a disulphide-linked complex PUBMED:11773418.

    \ ' '5913' 'IPR010346' '\

    This family consists of several bacteria and phage lipoprotein Rz1 precursors. Rz1 is a proline-rich lipoprotein from bacteriophage lambda, which is known to have fusogenic properties. Rz1-induced liposome fusion is thought to be mediated primarily by the generation of local perturbation in the bilayer lipid membrane and to a lesser extent by electrostatic forces PUBMED:10651816.

    \ ' '5914' 'IPR009285' '\

    This family consists of several Orthopoxvirus A26L and A30L proteins. The Vaccinia A30L gene is regulated by a late promoter and encodes a protein of approximately 9 kDa. It is thought that the A30L protein is needed for vaccinia virus morphogenesis, specifically the association of the dense viroplasm with viral membranes PUBMED:11390577.

    \ ' '5915' 'IPR010347' '\

    Covalent intermediates between topoisomerase I and DNA can become dead-end complexes that lead to cell death. Tyrosyl-DNA phosphodiesterase can hydrolyse the bond between topoisomerase I and DNA PUBMED:10521354.

    \ ' '5916' 'IPR009092' '\

    The telokin-like protein (Tlp20) of the baculovirus Autographa californica nuclear polyhedrosis virus (AcMNPV) lies in a region of the baculoviral genome that is expressed late in the viral replication cycle; however, its function is unknown. Tlp20 was discovered using anti-telokin antibodies, telokin being the C-terminal domain of smooth-muscle myosin light-chain kinase PUBMED:7517434. Both Tlp20 and telokin display a seven-stranded antiparallel beta-barrel structure, although the 3-dimensional structures of the beta-barrels are different and there is no sequence homology between the two. Tlp20 is structurally similar to dUTPase in its fold and trimeric assembly PUBMED:15299576.

    \ ' '5917' 'IPR010349' '\

    This family consists of several bacterial L-asparaginase II proteins. L-asparaginase () catalyses the hydrolysis of L-asparagine to L-aspartate and ammonium. Rhizobium etli possesses two asparaginases: asparaginase I, which is thermostable and constitutive, and asparaginase II, which is thermolabile, induced by asparagine and repressed by the carbon source PUBMED:10930734.

    \ ' '5918' 'IPR009286' '\

    This is a family of inositol-pentakisphosphate 2-kinases (also known as inositol 1,3,4,5,6-pentakisphosphate 2-kinase, Ins(1,3,4,5,6)P5 2-kinase and InsP5 2-kinase). This enzyme phosphorylates Ins(1,3,4,5,6)P5 to form Ins(1,2,3,4,5,6)P6 (also known as InsP6 or phytate). InsP6 is involved in many processes such as mRNA export, nonhomologous end-joining, endocytosis and ion channel regulation PUBMED:10960485.

    \ ' '5920' 'IPR010351' '\

    This family consists of several hypothetical proteins from Escherichia coli, Yersinia pestis and Salmonella typhi.

    \ ' '5921' 'IPR009287' '\

    This family consists of several eukaryotic transcription initiation Spt4 proteins. Three transcription-elongation factors Spt4, Spt5, and Spt6 are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles PUBMED:11182892.

    \ ' '5922' 'IPR009288' '\

    AIG2 is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2 PUBMED:8742710. Its structure consists of a five-stranded beta-barrel surrounded by two alpha-helices and a small beta-sheet. A long flexible alpha-helix protrudes from the structure at the C-terminal end. Conserved residues in a hydrophilic cavity, which are able to bind small ligands, may act as an active site in AIG2-like proteins PUBMED:16754964.

    \ ' '5924' 'IPR009289' '\

    Family of proteins from various Baculoviruses with undetermined function.

    \ ' '5925' 'IPR010352' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '5926' 'IPR009290' '\

    This family consists of several radial spoke protein 3 (RSP3) sequences. Eukaryotic cilia and flagella present in diverse types of cells perform motile, sensory, and developmental functions in organisms from protists to humans. They are centred by precisely organised, microtubule-based structures, the axonemes. The axoneme consists of two central singlet microtubules, called the central pair, and nine outer doublet microtubules. These structures are well conserved during evolution. The outer doublet microtubules, each composed of A and B sub-fibres, are connected to each other by nexin links, while the central pair is held at the centre of the axoneme by radial spokes. The radial spokes are T-shaped structures extending from the A-tubule of each outer doublet microtubule to the centre of the axoneme. Radial spoke protein 3 (RSP3) is present at the proximal end of the spoke stalk and helps to anchor the radial spoke to the outer doublet. It is thought that radial spokes regulate the activity of inner arm dynein through protein phosphorylation and dephosphorylation PUBMED:12589069.

    \ ' '5927' 'IPR010353' '\

    This family consists of several bacterial phenol hydroxylase subunit proteins, which are part of a multicomponent phenol hydroxylase. Some bacteria can utilise phenol or some of its methylated derivatives as their sole source of carbon and energy. The first step in this process is the conversion of phenol into catechol. Catechol is then further metabolised via the meta-cleavage pathway into TCA cycle intermediates PUBMED:7753034.

    \ ' '5928' 'IPR010354' '\

    Members of this family are thought to have structural features in common with the beta chain of the class II antigens, as well as myosin, and may play an important role in pathogenesis PUBMED:8188369.

    \ ' '5929' 'IPR009291' '\

    This family consists of several hypothetical proteins from plants. The function of this family is unknown.

    \ ' '5930' 'IPR009292' '\

    This is a family of eukaryotic proteins with unknown function.

    \ ' '5931' 'IPR009293' '\

    This family consists of bacterial sequences, several of which are thought to be general stress proteins.

    \ ' '5933' 'IPR009294' '\

    This family consists of several eukaryotic Aph-1 proteins. Gamma-secretase catalyses the intramembrane proteolysis of Notch, beta-amyloid precursor protein, and other substrates as part of a new signalling paradigm and as a key step in the pathogenesis of Alzheimer\'s disease. It is thought that the presenilin heterodimer comprises the catalytic site and that a highly glycosylated form of nicastrin associates with it. Aph-1 and Pen-2, two membrane proteins genetically linked to gamma-secretase, associate directly with presenilin and nicastrin in the active protease complex. Co-expression of all four proteins leads to marked increases in presenilin heterodimers, full glycosylation of nicastrin, and enhanced gamma-secretase activity PUBMED:12740439.

    \ ' '5934' 'IPR009295' '\

    This family consists of several hypothetical proteins from different Staphylococcus species. The function of this family is unknown.

    \ ' '5935' 'IPR009296' '\

    This family consists of several short hypothetical bacterial proteins of unknown function.

    \ ' '5936' 'IPR009297' '\

    This family consists of several hypothetical bacterial and plant proteins of unknown function.

    \ ' '5937' 'IPR010356' '\

    This family consists of several enterobacterial haemolysin (HlyE) proteins. Haemolysin E (HlyE) is a novel pore-forming toxin of Escherichia coli, Salmonella typhi, and Shigella flexneri. HlyE is unrelated to the well characterised pore-forming E. coli haemolysins of the RTX family, haemolysin A (HlyA), and the enterohaemolysin encoded by the plasmid-borne ehxA gene of E. coli O157. However, it is evident that expression of HlyE in the absence of the RTX toxins is sufficient to give a haemolytic phenotype in E. coli. HlyE is a protein of 34 kDa that is expressed during anaerobic growth of E. coli. Anaerobic expression is controlled by the transcription factor FNR, such that, upon ingestion and entry into the anaerobic mammalian intestine, HlyE is produced and may then contribute to the colonisation of the host PUBMED:10660049.

    \ ' '5938' 'IPR010357' '\

    This family consists of several hypothetical eukaryotic proteins of unknown function that are thioredoxin-like.

    \ ' '5940' 'IPR009299' '\

    This family consists of several Gammaherpesvirus capsid proteins. The exact function of this family is unknown.

    \ ' '5941' 'IPR010358' '\

    This family consists of several eukaryotic brain and reproductive organ-expressed (BRE) proteins. BRE is a putative stress-modulating gene that can down-regulate TNF-alpha-induced NF-kappaB activation upon overexpression. A total of six isoforms are produced by alternative splicing predominantly at either end of the gene. Compared to normal cells, immortalised human cell lines uniformly express higher levels of BRE. Peripheral blood monocytes respond to LPS by down-regulating the expression of all the BRE isoforms. It is thought that the function of BRE and its isoforms is to regulate peroxisomal activities PUBMED:11676476.

    \ ' '5942' 'IPR010359' '\

    This is a family of bacterial and viral proteins with undetermined function. A conserved H-E-X-X-H motif is suggestive of a catalytic active site and shows similarity to .

    \ ' '5943' 'IPR010360' '\

    This is a family of bacterial sequences with undetermined function.

    \ ' '5944' 'IPR009300' '\

    This family consists of several Staphylococcus aureus bacteriophage RinB proteins and related sequences from their host. The int gene of staphylococcal bacteriophage phi 11 is the only viral gene responsible for the integrative recombination of phi 11. rinA and rinB are both required to activate expression of the int gene PUBMED:8432703.

    \ ' '5945' 'IPR009301' '\

    This family consists of several hypothetical proteins from Escherichia coli, Salmonella typhi, Shigella flexneri and Proteus vulgaris. The function of this family is unknown.

    \ ' '5947' 'IPR003886' '\

    The ~180-residue NIDO domain is an extracellular domain of unknown function, found in nidogen (entactin) and hypothetical proteins. The NIDO domain is found in association with other domains, such as nidogen G2 beta-barrel, thyroglobulin type-1, LDLRB, AMOP, EGF-like, VWFD, IPT/TIG, or sushi/CCP/SCR PUBMED:11893501, PUBMED:12084055, PUBMED:15053982, PUBMED:16500040.


    \ ' '5948' 'IPR009302' '\

    This entry consists of the tail length tape measure protein from Bacteriophage HK97 and related sequences from Escherichia coli (strain K12).

    \ ' '5949' 'IPR010363' '\

    The function of this N-terminal domain has not been characterised. The domain is not expressed in the \'short\' isoform of collagen XVIII PUBMED:9503365.

    \ ' '5950' 'IPR010927' '\

    Six Tra proteins encoded by the F plasmid and required by F(+) cells to elaborate F pili are TraH, TraF, TraW, TraU, TrbI, and TrbB. Except for TrbI, these proteins were all identified as hallmarks of F-like type IV secretion systems (TFSSs), with no homologues among TFSS genes of P-type or I-type systems. With the exception of TrbI, which is an inner membrane protein, the remaining proteins are, or are predicted to be, periplasmic. TrbI consists of one membrane-spanning segment near its N terminus and an 88-residue, hydrophilic domain that extends into the periplasm PUBMED:15292150. It has been proposed that the role of the TraH interaction group is to control F-pilus extension and retraction during conjugation PUBMED:2656408, PUBMED:11914349, PUBMED:1355084.

    \ ' '5951' 'IPR010364' '\

    This family consists of several bacterial CreD or Cet inner membrane proteins. Dominant mutations of the cet gene of Escherichia coli result in tolerance to colicin E2 and increased amounts of an inner membrane protein with an Mr of 42,000. The cet gene is in the same operon as the phoM gene, which is required in a phoR background for expression of the structural gene for alkaline phosphatase, phoA. Although the Cet protein is not required for phoA expression, it has been suggested that the Cet protein has an enhancing effect on the transcription of phoA PUBMED:2835585.

    \ ' '5952' 'IPR009303' '\

    This family consists of several hypothetical proteins from several species of Staphylococcus. The function of this family is unknown.

    \ ' '5953' 'IPR010365' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '5954' 'IPR009304' '\

    This is a family of Kaposi\'s sarcoma-associated herpesvirus (HHV8) latent membrane proteins.

    \ ' '5955' 'IPR009305' '\

    This family consists of several eukaryotic and prokaryotic proteins of unknown function. The yeast protein has been found to be non-essential for cell growth.

    \ ' '5956' 'IPR010366' '\

    This family consists of the Shigella flexneri specific protein OspC. The function of this family is unknown but it is thought that Osp proteins may be involved in postinvasion events related to virulence. Since bacterial pathogens adapt to multiple environments during the course of infecting a host, it has been proposed that Shigella evolved a mechanism to take advantage of a unique intracellular cue, which is mediated through MxiE, to express proteins when the organism reaches the eukaryotic cytosol PUBMED:12142411.

    \ ' '5957' 'IPR010367' '\

    This family consists of several Chordopoxvirus specific G3 proteins. The function of this family is unknown.

    \ ' '5958' 'IPR008300' '\

    Salmonella enterica subsp. enterica serovar Typhimurium degrades 1,2-propanediol by a pathway that requires coenzyme B12, adenosylcobalamin (AdoCbl). Proteins required for 1,2-propanediol degradation are encoded by the pdu operon PUBMED:10498708. PduL functions in this pathway, but its exact role is not yet determined.

    \ \

    Propanediol degradation is thought to be important for the natural Salmonella populations, since propanediol is produced by the fermentation of the common plant sugars rhamnose and fucose PUBMED:10498708, PUBMED:9023178. More than 1% of the Salmonella enterica genome is devoted to the utilization of propanediol and cobalamin biosynthesis. In vivo expression technology has indicated that propanediol utilization (pdu) genes may be important for growth in host tissues, and competitive index studies with mice have shown that pdu mutations confer a virulence defect PUBMED:9539791, PUBMED:9922242. The pdu operon is contiguous and coregulated with the cobalamin (B12) biosynthesis cob operon, indicating that propanediol catabolism may be the primary reason for de novo B12 synthesis in Salmonella PUBMED:1312999, PUBMED:8226666, PUBMED:1313000. Please see , and for more details on the propanediol utilization pathway and the pdu operon.

    \ ' '5959' 'IPR009306' '\

    This family consists of a series of repeated sequences from one hypothetical protein () found in Schizosaccharomyces pombe. The function of this family is unknown.

    \ ' '5961' 'IPR010368' '\

    This family consists of several relatively short bacterial and archaeal hypothetical sequences. The function of this family is unknown.

    \ ' '5962' 'IPR009308' '\

    This family consists of several bacterial L-rhamnose isomerase proteins. This enzyme interconverts L-rhamnose and L-rhamnulose. In some species, including Escherichia coli, this is the first step in rhamnose catabolism. Subsequent steps are catalyzed by rhamnulose kinase (rhaB), then rhamnulose-1-phosphate aldolase (rhaD) to yield glycerone phosphate and (S)-lactaldehyde.

    \ ' '5963' 'IPR009309' '\

    This family consists of several hypothetical bacterial proteins. The function of the family is unknown.

    \ ' '5964' 'IPR010369' '\

    This is a family of plant proteins with unknown function.

    \ ' '5966' 'IPR009201' '\ This group represents a virion core protein, vaccinia E11L type.\ ' '5967' 'IPR009310' '\

    This is a family of bacterial proteins located in the phenyl dioxygenase (bph) operon. The function of this family is unknown.

    \ ' '5968' 'IPR009311' '\

    These proteins include several that are annotated as alpha-interferon inducible proteins.

    \ ' '5969' 'IPR009312' '\

    This entry represents a tail fibre component U of bacteriophage.

    \ ' '5971' 'IPR009313' '\

    This is a family of uncharacterised Baculovirus proteins that are all about 11 kDa in size.

    \ ' '5972' 'IPR010372' '\

    The DNA polymerase III delta subunit is required, along with the delta\' subunit, for the assembly of the processivity factor beta(2) onto primed DNA in the DNA polymerase III holoenzyme-catalysed reaction PUBMED:11432857. The delta subunit is also known as HolA.

    \ ' '5973' 'IPR009314' '\

    One of the members of this family is a 4.9 kDa protein encoded by Bovine coronavirus NS1 PUBMED:2142556.

    \ ' '5974' 'IPR009315' '\

    Phosphate-starvation-inducible E (PsiE) expression is under direct positive and negative control by PhoB and cAMP-CRP, respectively PUBMED:10986267. The function of PsiE remains to be determined.

    \ ' '5975' 'IPR010373' '\

    This is a family of uncharacterised prophage proteins that are also found in bacteria and humans.

    \ ' '5976' 'IPR009316' '\

    The COG complex comprises eight proteins COG1-8. The COG complex plays critical roles in Golgi structure and function PUBMED:11980916.

    \ ' '5977' 'IPR010374' '\

    This is a family of uncharacterised bacterial membrane proteins.

    \ ' '5978' 'IPR009317' '\

    This family of proteins contains a conserved 60-residue region. This protein is known as ChaB in Escherichia coli and is found next to ChaA, which is a cation transporter protein. ChaB may regulate ChaA function in some way.

    \ ' '5979' 'IPR009318' '\

    In Drosophila, taste is perceived by gustatory neurons located in sensilla distributed on several different appendages throughout the body of the animal. This family represents the taste receptor sensitive to trehalose PUBMED:10710312, PUBMED:11516643.

    \ ' '5980' 'IPR009319' '\

    This is a family of related phage minor capsid proteins.

    \ ' '5981' 'IPR010375' '\

    This is a family of uncharacterised bacterial proteins.

    \ ' '5982' 'IPR009320' '\

    This family of proteins includes three Escherichia coli proteins: YagB, YeeU and YfjZ. The function of these proteins is unknown. They are about 120 amino acids in length.

    \ ' '5983' 'IPR010376' '\

    This family consists of several short bacterial proteins and one sequence from Oryza sativa. The function of this family is unknown.

    \ ' '5984' 'IPR010377' '\

    This family consists of several hypothetical bacterial sequences and one Caenorhabditis elegans sequence. The function of this family is unknown.

    \ ' '5985' 'IPR009321' '\

    This family consists of several hypothetical archaeal proteins of unknown function.

    \ ' '5986' 'IPR009322' '\

    This is a family of small phage tail proteins, referred to as protein E.

    \ ' '5987' 'IPR010378' '\

    This is a family of uncharacterised eukaryotic proteins.

    \ ' '5988' 'IPR010379' '\

    During the bacterial cell cycle, the tubulin-like cell-division protein FtsZ polymerises into a ring structure that establishes the location of the nascent division site. EzrA modulates the frequency and position of FtsZ ring formation PUBMED:10449747.

    \ ' '5989' 'IPR010380' '\

    This is a family of uncharacterised bacterial proteins.

    \ ' '5990' 'IPR010381' '\

    This family consists of proteins of unknown function found in Caenorhabditis species.

    \ ' '5991' 'IPR010382' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '5993' 'IPR010383' '\

    The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates, and of related proteins, into distinct sequence-based families has been described PUBMED:9334165. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form \'clans\'.

    \ \

    The glycosyltransferase family 36 includes cellobiose phosphorylase, cellodextrin phosphorylase, and chitobiose phosphorylase. Many members of this family contain two copies of the domain represented in this entry.

    \ ' '5994' 'IPR009323' '\

    This family consists of several putative bacterial membrane proteins. The function of this family is unclear.

    \ ' '5995' 'IPR010384' '\

    This is a family of uncharacterised bacterial sequences.

    \ ' '5996' 'IPR009324' '\

    This is a family of uncharacterised proteins found in bacteria and archaea.

    \ ' '5997' 'IPR010385' '\

    This family consists of several hypothetical proteins from Rhizobium meliloti (Sinorhizobium meliloti), Rhizobium loti (Mesorhizobium loti) and Agrobacterium tumefaciens. The function of this family is unknown.

    \ ' '5998' 'IPR009325' '\

    This family consists of several bacterial proteins of unknown function.

    \ ' '6000' 'IPR009327' '\

    This is a family of uncharacterised proteins found in bacteria and eukaryotes.

    \ ' '6001' 'IPR009328' '\

    This family consists of several bacterial putative membrane proteins of unknown function.

    \ ' '6002' 'IPR009329' '\

    This is a family of bacterial proteins that are related to the hypothetical protein YeeT.

    \ ' '6003' 'IPR010386' '\

    This family consists of several bacterial tRNA-(MSIO[6]A)-hydroxylase (MiaE) proteins. The modified nucleoside 2-methylthio-N-6-isopentenyl adenosine (ms2i6A) is present at position 37 (3\' of the anticodon) of tRNAs that read codons beginning with U except tRNA(I,V Ser) in Escherichia coli. In Salmonella typhimurium, 2-methylthio-cis-ribozeatin (ms2io6A) is found in tRNA, probably in the species corresponding to those that have ms2i6A in E. coli. The miaE gene is absent in E. coli, a finding consistent with the absence of the hydroxylated derivative of ms2i6A in this species PUBMED:8253666.

    \ ' '6004' 'IPR009330' '\

    This family consists of several bacterial lipopolysaccharide core biosynthesis proteins (WaaY or RfaY). The waaY, waaQ, and waaP genes are located in the central operon of the waa (formerly rfa) locus on the chromosome of Escherichia coli. This locus contains genes whose products are involved in the assembly of the core region of the lipopolysaccharide molecule. WaaY is the enzyme that phosphorylates HepII in this system PUBMED:9756860.

    \ ' '6005' 'IPR010387' '\

    This family includes the queT gene encoding a hypothetical integral membrane protein with 5 predicted transmembrane regions. The queT genes in Firmicutes are often preceded by the PreQ1 (7-aminomethyl-7-deazaguanine) riboswitches of two distinct classes PUBMED:17384645, PUBMED:18305186, suggesting involvement of the QueT transporters in uptake of a queuosine biosynthetic intermediate.

    \ ' '6006' 'IPR009331' '\

    This family consists of several bacterial proteins which are homologous to the oligogalacturonate-specific porin protein KdgM from Erwinia chrysanthemi. The phytopathogenic Gram-negative bacterium E. chrysanthemi secretes pectinases, which are able to degrade the pectic polymers of plant cell walls, and uses the degradation products as a carbon source for growth. KdgM is a major outer membrane protein, whose synthesis is strongly induced in the presence of pectic derivatives. KdgM behaves like a voltage-dependent porin that is slightly selective for anions and that exhibits fast block in the presence of trigalacturonate. In contrast to most porins, KdgM seems to be monomeric PUBMED:11773048.

    \ ' '6007' 'IPR009332' '\

    This entry represents subunit Med22 of the Mediator complex. It contains several eukaryotic Surfeit locus protein 5 (SURF5) sequences. The human Surfeit locus has been mapped on chromosome 9q34.1. The locus includes six tightly clustered housekeeping genes (Surf1-6), and the gene organisation is similar in human, mouse and chicken Surfeit loci PUBMED:11891058.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '6008' 'IPR010388' '\

    This group, typified by Salmonella typhimurium CbiK, contains anaerobic cobalt chelatases that act in the anaerobic cobalamin biosynthesis pathway PUBMED:9150215, PUBMED:11215515.

    \

    Cobalamin (vitamin B12) can be complexed with metal via ATP-dependent reactions (aerobic pathway) (e.g., in Pseudomonas denitrificans) or via ATP-independent reactions (anaerobic pathway) (e.g., in S. typhimurium) PUBMED:8905078, PUBMED:11469861. The corresponding cobalt chelatases are not homologous. This group belongs to the class of ATP-independent, single-subunit chelatases that also includes distantly related protoporphyrin IX (PPIX) ferrochelatase (HemH) (Class II chelatases) PUBMED:12686546. The structure of S. typhimurium CbiK shows that it has a remarkably similar topology to Bacillus subtilis ferrochelatase despite only weak sequence conservation PUBMED:10451360. Both enzymes contain a histidine residue identified as the metal ion ligand, but CbiK contains a second histidine in place of the glutamic acid residue identified as a general base in PPIX ferrochelatase PUBMED:10451360. Site-directed mutagenesis has confirmed a role for this histidine and a nearby glutamic acid in cobalt binding, modulating metal ion specificity as well as catalytic efficiency PUBMED:10451360.

    \

    It should be noted that CysG and Met8p, which are multifunctional proteins associated with siroheme biosynthesis, include chelatase activity and can therefore be considered as the third class of chelatases PUBMED:12686546. As with the class II chelatases, they do not require ATP for activity. However, they are not structurally similar to HemH or CbiK, and it is likely that they have arisen by the acquisition of a chelatase function within a dehydrogenase catalytic framework PUBMED:11980703, PUBMED:12686546.

    \ ' '6009' 'IPR010389' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6010' 'IPR010390' '\

    This family consists of a number of hypothetical bacterial proteins of unknown function.

    \ ' '6011' 'IPR010391' '\

    This family of short proteins includes DNA-damage-inducible protein I (DinI) and related proteins. The SOS response, a set of cellular phenomena exhibited by eubacteria, is initiated by various causes that include DNA damage-induced replication arrest, and is positively regulated by the co-protease activity of RecA. Escherichia coli DinI, a LexA-regulated SOS gene product, shuts off the initiation of the SOS response when overexpressed in vivo. Biochemical and genetic studies indicated that DinI physically interacts with RecA to inhibit its co-protease activity PUBMED:12626715. The structure of DinI is known PUBMED:11152126.

    \ ' '6012' 'IPR010392' '\

    This entry represents coat proteins found in several satellite viruses, including satellite panicum mosaic virus PUBMED:7552713, satellite tobacco mosaic virus PUBMED:9514737, and satellite tobacco necrosis virus. The coat proteins of satellite viruses consist of a beta-sandwich jelly-roll fold, with usually eight strands making up the two sheets, although some members can have an extra 1-2 strands. The characteristic interaction between the domains of this fold allows the formation of five-fold and pseudo six-fold assemblies. Although the satellite virus coat proteins share the same jelly-roll fold, they differ in the arrangement of their secondary structural elements and in the interactions of adjacent subunits PUBMED:8553559.

    \ ' '6013' 'IPR010393' '\

    This family consists of several bacterial YecM proteins of unknown function.

    \ ' '6014' 'IPR009333' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6015' 'IPR009334' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6016' 'IPR009335' '\

    This family consists of several bacterial HrpE proteins, which are believed to function in the type III secretion system, specifically in the secretion of HrpZ (harpinPss) PUBMED:9045830.

    \ ' '6017' 'IPR010394' '\

    This family consists of both eukaryotic and prokaryotic 5\'-nucleotidase sequences ().

    \ ' '6019' 'IPR009337' '\

    This is a family of uncharacterised Proteobacteria proteins.

    \ ' '6021' 'IPR010396' '\

    This family consists of several Orthopoxvirus A5L proteins. The vaccinia virus WR A5L open reading frame (corresponding to open reading frame A4L in vaccinia virus Copenhagen) encodes an immunodominant late protein found in the core of the vaccinia virion. The A5 protein appears to be required for the immature virion to form the brick-shaped intracellular mature virion PUBMED:10233918.

    \ ' '6022' 'IPR009338' '\

    This is a family of conserved bacteriophage open reading frames.

    \ ' '6023' 'IPR010397' '\

    This is a family of uncharacterised bacterial and archaeal proteins.

    \ ' '6024' 'IPR010398' '\

    This is a family of predicted bacterial membrane proteins with unknown function.

    \ ' '6025' 'IPR009339' '\

    This is a family of conserved archaeal proteins.

    \ ' '6026' 'IPR009340' '\

    This is a family of conserved Schizosaccharomyces pombe proteins with unknown function.

    \ ' '6027' 'IPR009341' '\

    This subfamily includes the major tail proteins from various phages, including Lactococcus phage TP901-1 PUBMED:8610457.

    \ ' '6028' 'IPR010399' '\

    The tify domain is a 36-amino acid domain only found among Embryophyta (land plants). It has been named after the most conserved amino acid pattern (TIF[F/Y]XG) it contains, but was previously known as the Zim domain. As the use of uppercase characters (TIFY) might imply that the domain is fully conserved across proteins, a lowercase lettering has been chosen in an attempt to highlight the reality of its natural variability.

    \ \

    Based on the domain architecture, tify domain containing proteins can be classified into two groups. Group I is formed by proteins possessing a CCT (CONSTANS, CO-like, and TOC1) domain and a GATA-type zinc finger in addition to the tify domain. Group II contains proteins characterised by the tify domain but lacking a GATA-type zinc finger. Tify domain containing proteins might be involved in developmental processes and some of them have features that are characteristic for transcription factors: a nuclear localisation and the presence of a putative DNA-binding domain PUBMED:17499004. Some proteins known to contain a tify domain include: \

    \

    \ ' '6029' 'IPR010400' '\

    This is a family of proteins of unknown function.

    \ ' '6030' 'IPR010401' '\

    This family includes human glycogen branching enzyme. This enzyme contains a number of distinct catalytic activities. It has been shown for the yeast homologue that mutations in this region disrupt the enzyme\'s amylo-alpha-1,6-glucosidase activity ().

    \ ' '6031' 'IPR010402' '\

    The CCT (CONSTANS, CO-like, and TOC1) domain is a highly conserved basic module of ~43 amino acids, which is found near the C-terminus of plant proteins often involved in light signal transduction. The CCT domain is found in association with other domains, such as the B-box zinc finger, the GATA-type zinc finger, the ZIM motif or the response regulatory domain. The CCT domain contains a putative nuclear localisation signal within the second half of the CCT motif, has been shown to be involved in nuclear localisation, and probably also has a role in protein-protein interaction PUBMED:10926537.

    \ ' '6032' 'IPR009342' '\

    This domain is conserved in enzymes that have carbohydrates as substrate, and may be a carbohydrate-binding domain.

    \ ' '6033' 'IPR010403' '\

    This domain is found in the NvdB protein (), which is involved in the production of beta-(1-->2)-glucan.

    \ ' '6034' 'IPR010404' '\

    This family consists of proteins of unknown function. These proteins are around 200 amino acids in length. The proteins contain a conserved motif PYR in the N-terminal half of the protein that may be functionally important. The species distribution of the family is interesting. So far it is restricted to cyanobacteria, cryptomonads and plants. This suggests that this protein may be involved in some aspect of a photosynthetic lifestyle.

    \ ' '6035' 'IPR009343' '\

    This protein family has no known function. Its members are about 300 amino acids in length. It has so far been detected in Firmicute bacteria and some archaebacteria.

    \ ' '6036' 'IPR009344' '\

    This family consists of Borna disease virus G glycoprotein sequences. Borna disease virus (BDV) infection produces a variety of clinical diseases, from behavioural illnesses to classical fatal encephalitis PUBMED:12163584. G protein is important for viral entry into the host cell PUBMED:8985354, PUBMED:11435588.

    \ ' '6037' 'IPR010405' '\

    This family consists of several cofactor of BRCA1 (COBRA1) like proteins. It is thought that COBRA1 along with BRCA1 is involved in chromatin unfolding. COBRA1 is recruited to the chromosome site by the first BRCT repeat of BRCA1, and is itself sufficient to induce chromatin unfolding. BRCA1 mutations that enhance chromatin unfolding also increase its affinity for, and recruitment of, COBRA1. It is thought that reorganisation of higher levels of chromatin structure is an important regulated step in BRCA1-mediated nuclear functions PUBMED:11739404.

    \ ' '6038' 'IPR010406' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6039' 'IPR009345' '\

    This family consists of several eukaryotic BMP and activin membrane-bound inhibitor (BAMBI) proteins. Members of the transforming growth factor-beta (TGF-beta) superfamily, including TGF-beta, bone morphogenetic proteins (BMPs), activins and nodals, are vital for regulating growth and differentiation. BAMBI is related to TGF-beta-family type I receptors but lacks an intracellular kinase domain. BAMBI is co-expressed with the ventralising morphogen BMP4 during Xenopus embryogenesis and requires BMP signalling for its expression. The protein stably associates with TGF-beta-family receptors and inhibits BMP and activin as well as TGF-beta signalling PUBMED:10519551.

    \ ' '6040' 'IPR009346' '\

    This family consists of several eukaryotic gene associated with retinoic-interferon-induced mortality 19 (GRIM-19) proteins. GRIM-19 was reported to encode a small protein that is primarily distributed in the nucleus and is able to promote cell death induced by IFN-beta and RA. A bovine homologue of GRIM-19 was co-purified with mitochondrial NADH:ubiquinone oxidoreductase (complex I) in bovine heart. Therefore, its exact cellular localisation and function are unclear. It has now been discovered that GRIM-19 is a specific interacting protein which negatively regulates Stat3 activity PUBMED:12628925.

    \ ' '6041' 'IPR006538' '\

    These proteins are CobT subunits of the aerobic cobalt chelatase (aerobic cobalamin biosynthesis pathway). Pseudomonas denitrificans CobT has been experimentally characterised PUBMED:1917840, PUBMED:1429466. Aerobic cobalt chelatase consists of three subunits, CobT, CobN () and CobS ().

    \

    Cobalamin (vitamin B12) can be complexed with metal via ATP-dependent reactions (aerobic pathway) (e.g., in P. denitrificans) or via ATP-independent reactions (anaerobic pathway) (e.g., in Salmonella typhimurium) PUBMED:8905078, PUBMED:11469861. The corresponding cobalt chelatases are not homologous. However, aerobic cobalt chelatase subunits CobN and CobS are homologous to Mg-chelatase subunits BchH and BchI, respectively PUBMED:11469861. CobT, too, has been found to be remotely related to the third subunit of Mg-chelatase, BchD (involved in bacteriochlorophyll synthesis, e.g., in Rhodobacter capsulatus) PUBMED:11469861.

    \

    Nomenclature note: CobT of the aerobic pathway (P. denitrificans) is not a homologue of CobT of the anaerobic pathway (Salmonella typhimurium, Escherichia coli). Therefore, annotation of any members of this family as nicotinate-mononucleotide--5,6-dimethylbenzimidazole phosphoribosyltransferases is erroneous.

    \ \ ' '6042' 'IPR010407' '\

    This entry is found in several mammalian signalling lymphocytic activation molecule (SLAM) proteins. Optimal T cell activation and expansion require engagement of the TCR plus co-stimulatory signals delivered through accessory molecules. SLAM, a 70 kDa co-stimulatory molecule belonging to the Ig superfamily, is defined as a human cell surface molecule that mediates CD28-independent proliferation of human T cells and IFN-gamma production by human Th1 and Th2 clones PUBMED:10570270. SLAM has also been recognised as a receptor for Measles virus PUBMED:12610126.

    \ ' '6043' 'IPR010408' '\

    This entry represents the haemagglutinin-esterase fusion glycoprotein (HEF) found specifically in infectious salmon anaemia virus (ISAV), an orthomyxovirus-type virus that is an important fish pathogen in marine aquaculture PUBMED:11714961, PUBMED:15824888. Other viruses, such as influenza C virus, coronaviruses and toroviruses, also contain surface HEF proteins, but whereas they usually bind 9-O-acetylsialic acid receptors, ISAV HEF appears to bind 4-O-acetylsialic acid receptors PUBMED:14990724.

    \

    Haemagglutinin-esterase fusion glycoprotein is a multi-functional protein embedded in the viral envelope of ISAV. HEF is required for infectivity, and functions to recognise the host cell surface receptor, to fuse the viral and host cell membranes, and to destroy the receptor upon host cell infection. The haemagglutinin region of HEF is responsible for receptor recognition and membrane fusion. The serine esterase region of HEF is responsible for the destruction of the receptor, though it appears to be distinct from the esterase domain found in influenza C virus.

    \

    Haemagglutinin-esterase glycoproteins must usually be cleaved by the host\'s trypsin-like proteases to produce two peptides (HEF1 and HEF2) necessary for the virus to be infectious. The cleaved HEF protein can then fuse the viral envelope to the cellular membrane of the host cell, which allows the virus to infect the host cell.

    \

    More information about haemagglutinin proteins can be found at Protein of the Month: Bird Flu, Haemagglutinin PUBMED:.

    \ ' '6044' 'IPR009347' '\

    This family consists of several Rice tungro bacilliform virus P46 proteins. The function of this family is unknown.

    \ ' '6045' 'IPR010409' '\

    This family includes gbp, a protein from soybean that binds to GAGA element dinucleotide repeat DNA PUBMED:12177492. It seems likely that the region which defines this family mediates DNA binding. This putative domain contains several conserved cysteines and a histidine, suggesting that it may be a zinc-binding DNA interaction domain.

    \ ' '6046' 'IPR009348' '\

    This family of regulators is involved in post-translational control of nitrogen permease.

    \ ' '6047' 'IPR010410' '\

    This is a family of plant proteins with undetermined function.

    \ ' '6048' 'IPR013085' '\

    C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf\'s can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 PUBMED:11361095. C2H2 Znf\'s are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes PUBMED:10664601. Transcription factors usually contain several Znf\'s (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA PUBMED:10940247. C2H2 Znf\'s can also bind to RNA and protein targets PUBMED:18253864.

    \

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a C2H2-type zinc finger motif found in several U1 small nuclear ribonucleoprotein C (U1-C) proteins. Some proteins contain multiple copies of this motif. The U1 small nuclear ribonucleoprotein (U1 snRNP) binds to the pre-mRNA 5\' splice site at early stages of spliceosome assembly. Recruitment of U1 to a class of weak 5\' splice site is promoted by binding of the protein TIA-1 to uridine-rich sequences immediately downstream from the 5\' splice site. Binding of TIA-1 in the vicinity of a 5\' splice site helps to stabilise U1 snRNP recruitment, at least in part, via a direct interaction with U1-C, thus providing one molecular mechanism for the function of this splicing regulator PUBMED:12486009.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '6049' 'IPR009349' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This zinc finger appears to be common in activating signal cointegrator 1/thyroid receptor interacting protein 4.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '6050' 'IPR010411' '\

    These proteins include several putative tail assembly chaperones encoded by phages of Gram-negative bacteria.

    \ ' '6051' 'IPR009350' '\

    This family represents the minor tail protein T of Lambda-like viruses and their prophage. The minor tail protein T is located at the distal end and is involved in the assembly of the initiator complex for tail polymerisation. The protein is essential for tail assembly but is not found in the mature virion PUBMED:6220514.

    \ ' '6052' 'IPR009351' '\

    This is a family of conserved bacterial proteins with unknown function.

    \ ' '6054' 'IPR010412' '\

    This is a family of conserved bacterial proteins with unknown function.

    \ ' '6055' 'IPR009353' '\

    This family consists of several Orthopoxvirus N1 proteins. The function of this family is unknown.

    \ ' '6056' 'IPR010413' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6057' 'IPR010414' '\

    This entry represents Frg1 (FSHD region gene 1), a protein that is considered to be a candidate for facioscapulohumeral muscular dystrophy (FSHD). FSHD is a dominant neuromuscular disorder caused by deletions in a number of tandem repeat units (called D4Z4) located on chromosome 4q35. D4Z4 contains a transcriptional silencer whose deletion causes the over-expression in skeletal muscle of 4q35 genes, including Frg1 PUBMED:, PUBMED:. Frg1 is localised to nucleoli and appears to be a component of the human spliceosome, but its exact function is unknown PUBMED:17103222.

    \ ' '6058' 'IPR010415' '\

    This is a family of uncharacterised bacterial proteins.

    \ ' '6059' 'IPR010416' '\

    This is a family of plasmid encoded proteins with unknown function.

    \ ' '6060' 'IPR010417' '\

    This is a family of plant seed-specific proteins identified in Arabidopsis thaliana (Mouse-ear cress). ATS3 is expressed in a pattern similar to the Arabidopsis seed storage protein genes PUBMED:10380802.

    \ ' '6061' 'IPR009354' '\

    This is a family of bacterial proteins, referred to as Usg. Usg is found in the same operon as trpF, trpB, and trpA and is expressed in a coupled transcription-translation system PUBMED:2828322.

    \ ' '6062' 'IPR009355' '\

    This family consists of several Toluene-4-monooxygenase system protein B (TmoB) sequences. Pseudomonas mendocina KR1 metabolises toluene as a carbon source. The initial step of the pathway is hydroxylation of toluene to form p-cresol by a multicomponent toluene-4-monooxygenase (T4MO) system PUBMED:1885512.

    \ ' '6063' 'IPR009356' '\

    This family consists of NADH dehydrogenase subunit 4L (NAD4L) proteins from the mitochondria of several parasitic flatworms.

    \ ' '6064' 'IPR010928' '\

    This family consists of several tyrosinase co-factor MELC1 proteins from a number of Streptomyces species. The melanin operon (melC) of Streptomyces antibioticus contains two genes, melC1 and melC2 (apotyrosinase). It is thought that MelC1 forms a transient binary complex with the downstream apotyrosinase MelC2 to facilitate the incorporation of copper ion and the secretion of tyrosinase, indicating that MelC1 is a chaperone for the apotyrosinase MelC2 PUBMED:8360164.

    \ ' '6065' 'IPR009357' '\

    This is a family of uncharacterised eukaryotic proteins.

    \ ' '6066' 'IPR009358' '\

    This entry consists of a number of lipoproteins conserved in Borrelia species PUBMED:8655511.

    \ ' '6067' 'IPR010418' '\

    Activation of NF-kappaB as a consequence of signalling through the Toll and IL-1 receptors is a major element of innate immune responses. ECSIT plays an important role in signalling to NF-kappaB, functioning as the intermediate in the signalling pathways between TRAF-6 and MEKK-1 PUBMED:10465784.

    \ ' '6068' 'IPR010419' '\

    The CO dehydrogenase structural genes coxMSL are flanked by nine accessory genes arranged as the cox gene cluster. The cox genes are specifically and coordinately transcribed under chemolithoautotrophic conditions in the presence of CO as carbon and energy source PUBMED:10433972.

    \ ' '6069' 'IPR010420' '\

    This is a family of uncharacterised proteins found in both eukaryotes and bacteria.

    \ ' '6070' 'IPR010421' '\

    This is a family of uncharacterised proteins found in Proteobacteria.

    \ ' '6071' 'IPR009359' '\

    Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation PUBMED:9600981, PUBMED:9748275. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA PUBMED:9748275.

    \ ' '6072' 'IPR010422' '\

    This family consists of several hypothetical eukaryotic proteins of unknown function.

    \ ' '6073' 'IPR008323' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '6074' 'IPR009360' '\

    Isy1 protein is important in the optimisation of splicing PUBMED:10094305.

    \ ' '6075' 'IPR010423' '\

    This family consists of ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals PUBMED:11738740.

    \ ' '6076' 'IPR009361' '\

    Zeste white 10 (ZW10) was initially identified as a mitotic checkpoint protein involved in chromosome segregation, and then implicated in targeting cytoplasmic dynein and dynactin to mitotic kinetochores, but it is also important in non-dividing cells. Its roles there include cytoplasmic dynein targeting to Golgi and other membranes, and SNARE-mediated ER-Golgi trafficking PUBMED:17102640, PUBMED:16505164. Dominant-negative ZW10, anti-ZW10 antibody, and ZW10 RNA interference (RNAi) cause Golgi dispersal. ZW10 RNAi also disperses endosomes and lysosomes PUBMED:16505164.

    \ \

    Drosophila kinetochore components Rough deal (Rod) and Zw10 are required for the proper functioning of the metaphase checkpoint in flies PUBMED:11146659. The eukaryotic spindle assembly checkpoint (SAC) monitors microtubule attachment to kinetochores and prevents anaphase onset until all kinetochores are aligned on the metaphase plate. It is an essential surveillance mechanism that ensures high fidelity chromosome segregation during mitosis. In higher eukaryotes, cytoplasmic dynein is involved in silencing the SAC by removing the checkpoint proteins Mad2 and the Rod-Zw10-Zwilch complex (RZZ) from aligned kinetochores PUBMED:17576797, PUBMED:18268100, PUBMED:17509882.

    \ ' '6077' 'IPR010424' '\

    The eut operon of Salmonella typhimurium encodes proteins involved in the cobalamin-dependent degradation of ethanolamine. The role of EutQ in this process is unclear PUBMED:10464203.

    \ ' '6078' 'IPR009362' '\

    This is a family of proteins which is found in viruses, archaea and bacteria. The function of these proteins has not been experimentally determined, but computational analysis suggests that they may function as nucleases which enhance the activity of nearby restriction endonucleases PUBMED:15972856. Cooperation of this kind between a restriction endonuclease and exonucleases has been shown to be essential in allowing some restriction enzymes to perform multiple rounds of DNA cleavage PUBMED:12655005.

    \ ' '6079' 'IPR010425' '\

    This is a family of uncharacterised proteins from Proteobacteria.

    \ ' '6080' 'IPR009363' '\

    This family consists of several bacterial and phage proteins of unknown function.

    \ ' '6081' 'IPR010426' '\

    This family consists of several trimethylamine methyltransferase (MTTB) proteins from numerous Rhizobium and Methanosarcina species.

    \ ' '6082' 'IPR009364' '\

    This is a family of uncharacterised proteins found in Proteobacteria.

    \ ' '6083' 'IPR008106' '\

    The pathogenic neisseriae are a small group of virulent bacteria that \ initiate infection at the human host mucosal membranes PUBMED:11173033. They are Gram-negative cocci and usually exist in pairs. Neisseria gonorrhoeae is passed through sexual transmission and can cause renal failure in extreme cases. The more extreme Neisseria meningitidis is a usually commensal nasopharynx microbe that causes meningococcemia and acute bacterial meningitis, especially in young children and teenagers PUBMED:11173033. There are several serogroups, of which types A, B and C are the most virulent. Despite recent advances in vaccinology, this pathogen is highly important to research and still poorly understood PUBMED:11173033.

    \

    N. meningitidis has many virulence factors, its major determinant being an \ antiphagocytic polysaccharide capsule that allows the bacterium to evade \ the host immune response PUBMED:11738731. Vaccines based on this polysaccharide have proven effective against serogroups A and C meningococci, but serogroup B still lacks an effective vaccine and causes the most severe form of meningitis PUBMED:11738731. It is believed that a conjugate protein vaccine derived from published neisserial genome sequences, rather than one based on polysaccharide, will be the best way of eradicating this disease PUBMED:11738731.

    \

    The focus on novel vaccine targets for N. meningitidis has shifted to the \ adhesins the bacterium secretes to colonise host mucosal epithelia before a \ serious infection takes hold PUBMED:11031243. Interaction of these adhesion molecules with their cognate host receptors allows bacterial entry to the epithelium, intracellular transport across the host cell, and exit into the bloodstream on the other side PUBMED:11031243. Following publication of the complete genome sequence of an N. meningitidis serogroup B strain PUBMED:10710307, several new adhesins have been identified, including one identical to MafB from N. gonorrhoeae.

    \ \ ' '6084' 'IPR009365' '\

    This family consists of several Nucleopolyhedrovirus late expression factor-12 (LEF-12) proteins. The function of this family is unknown PUBMED:10814576, PUBMED:12414945.

    \ ' '6085' 'IPR009366' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6086' 'IPR009367' '\

    This family consists of several hypothetical eukaryotic and prokaryotic proteins. The function of this family is unknown.

    \ ' '6087' 'IPR010427' '\

    This is a family of uncharacterised proteins found in Actinobacteria. Computational analysis suggests that they may belong to the alpha-beta hydrolase family of enzymes, as they are predicted to form the core secondary structures and catalytic machinery common to these proteins PUBMED:15688435. Genomic context suggests that they may function as lipases, controlling the concentration of their putative phospholipid substrates.

    \ ' '6088' 'IPR009368' '\

    This family consists of several hypothetical Staphylococcus aureus and Staphylococcus phage PVL proteins. The function of this family is unknown.

    \ ' '6089' 'IPR009369' '\

    This family consists of several Actinobacillus actinomycetemcomitans leukotoxin activator (LktC) proteins. Actinobacillus actinomycetemcomitans is a Gram-negative bacterium that has been implicated in the etiology of several forms of periodontitis, especially localised juvenile periodontitis. LktC along with LktB and LktD are thought to be required for activation and localisation of the leukotoxin PUBMED:2004819.

    \ ' '6090' 'IPR010428' '\

    This is a family of bacterial proteins with undetermined function.

    \ ' '6093' 'IPR009370' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6094' 'IPR009371' '\

    The species Pseudomonas syringae encompasses plant pathogens with differing host specificities and corresponding pathovar designations. P. syringae requires the Hrp (type III protein secretion) system, encoded by a 25-kb cluster of hrp and hrc genes, in order to elicit the hypersensitive response (HR) in nonhosts or to be pathogenic in hosts. The exact function of HrpF is unknown but the protein is needed for pathogenicity PUBMED:9721291.

    \ ' '6095' 'IPR010430' '\

    This is a family of bacterial and archaeal proteins with unknown function.

    \ ' '6096' 'IPR010431' '\

    This family consists of several eukaryotic fascin or singed proteins. The fascins are a structurally unique and evolutionarily conserved group of actin cross-linking proteins. Fascins function in the organisation of two major forms of actin-based structures: dynamic, cortical cell protrusions and cytoplasmic microfilament bundles. The cortical structures, which include filopodia, spikes, lamellipodial ribs, oocyte microvilli and the dendrites of dendritic cells, have roles in cell-matrix adhesion, cell interactions and cell migration, whereas the cytoplasmic actin bundles appear to participate in cell architecture PUBMED:11948621.

    \ ' '6097' 'IPR009372' '\

    This family consists of several short Chordopoxvirus proteins of unknown function.

    \ ' '6098' 'IPR009373' '\

    This family consists of several short Circovirus proteins of unknown function.

    \ ' '6099' 'IPR010432' '\

    This domain contains three highly conserved amino acids: one arginine and two aspartates, hence the name RDD domain. This region contains two predicted transmembrane regions. The arginine occurs at the N terminus of the first helix and the first aspartate occurs in the middle of this helix. The molecular function of this region is unknown. However, this region may be involved in transport of an as yet unknown set of ligands.

    \ ' '6101' 'IPR010433' '\

    This family consists of several plant-specific eukaryotic initiation factor 4B proteins.

    \ ' '6103' 'IPR009376' '\

    This family consists of several Lactococcus lactis bacteriophage and L. lactis proteins of unknown function.

    \ ' '6104' 'IPR008090' '\

    Iron is essential for growth in both bacteria and mammals. Controlling the amount of free iron in solution is often used as a tactic by hosts to limit invasion of pathogenic microbes; binding iron tightly within protein molecules can accomplish this. Such iron-protein complexes include haem in blood, lactoferrin in tears/saliva, and transferrin in blood plasma. Some bacteria express surface receptors to capture eukaryotic iron-binding compounds, while others have evolved siderophores (enterobactins) to scavenge iron from iron-binding host proteins PUBMED:8057905.

    \

    The control of such siderophore gene expression in Escherichia coli is under the regulation of the negative repressor protein FUR PUBMED:9990318. When complexed with Fe2+, it down-regulates the transcription not only of the siderophore genes, but also of the moieties that release Fe2+ ions bound to the hydroxamate enterobactin proteins in the microbial cytoplasm PUBMED:9990318. An example of the latter is FhuF from the Gram-negative microbes Yersinia pestis, Salmonella typhi, and E. coli PUBMED:9990318. In conjunction with the siderophore system, this gene has been demonstrated to be essential for growth and virulence in pathogenic enterobacteria PUBMED:9990318.

    \

    FhuF is a member of the [2Fe-2S] ferric iron reductase family. However, in place of the symmetrical tetrahedral arrangement at the ferric iron binding site, an unusual Cys-Cys C-terminal group distorts the site in this protein PUBMED:10322040. This property makes FhuF inherently unstable, and another set of regulatory genes, designated "suf", is thought to maintain its activity in the cytoplasm.

    \ ' '6105' 'IPR009377' '\

    Proteins in this entry are EutA ethanolamine utilization proteins, reactivating factors for ethanolamine ammonia lyase, encoded by the ethanolamine utilization eut operon.

    \ \

    The holoenzyme of adenosylcobalamin-dependent ethanolamine ammonia-lyase (EutBC, , ), which is part of the ethanolamine utilization pathway PUBMED:2656649, PUBMED:10464203, PUBMED:11160088, undergoes suicidal inactivation during catalysis as well as inactivation in the absence of substrate. The inactivation involves the irreversible cleavage of the Co-C bond of the coenzyme. The inactivated holoenzyme undergoes rapid and continuous reactivation in the presence of ATP, Mg2+, and free adenosylcobalamin in permeabilised cells (in situ), homogenate, and cell extracts of Escherichia coli. The EutA protein is essential for reactivation. It was demonstrated with purified recombinant EutA that both the suicidally inactivated and O2-inactivated holoethanolamine ammonia lyase underwent rapid reactivation in vitro by EutA in the presence of adenosylcobalamin, ATP, and Mg2+ PUBMED:15466038. The inactive enzyme-cyanocobalamin complex was also activated in situ and in vitro by EutA under the same conditions. Thus EutA is believed to be the only component of the reactivating factor for ethanolamine ammonia lyase. Reactivation and activation occur through the exchange of modified coenzyme for free intact adenosylcobalamin PUBMED:15466038.

    \ \

    Bacteria that harbor the ethanolamine utilization pathway can use ethanolamine as a source of carbon and nitrogen. For more information on the ethanolamine utilization pathway, please see , .

    \ ' '6106' 'IPR009378' '\

    This family consists of several conserved eukaryotic proteins of unknown function.

    \ ' '6107' 'IPR010434' '\

    This family consists of several hypothetical bacterial proteins. Many of the sequences in this family are annotated as putative DNA binding proteins but the function of this family is unknown.

    \ ' '6108' 'IPR010435' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This domain of unknown function is present in bacterial and plant peptidases belonging to MEROPS peptidase family S8 (subfamily S8A subtilisin, clan SB). It is C-terminal to and adjacent to the S8 peptidase domain and can be found in conjunction with the PA (Protease associated) domain () and additionally in Gram-positive bacteria with the surface protein anchor domain ().

    \ ' '6109' 'IPR009379' '\

    Sulfolobus virus-like particle SSV1 and its fusellovirus homologues can be found in many acidic (pH less than 4.0) hot springs (greater than 70 degrees C) around the world. SSV1 contains a 15.5-kb double-stranded DNA genome that encodes 34 proteins with greater than 50 amino acids PUBMED:1926776. A site-specific integrase and a DnaA-like protein have been previously identified by sequence homology, and three structural proteins have been isolated from purified virus and identified by N-terminal sequencing (VP1, VP2, and VP3).

    \ ' '6110' 'IPR009380' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6111' 'IPR009381' '\

    This family consists of several bacterial ThuA-like proteins. The function of the family is unknown.

    \ ' '6112' 'IPR010436' '\

    This family consists of several Cytomegalovirus UL84 proteins. The open reading frame UL84 of human cytomegalovirus encodes a multifunctional regulatory protein which is required for viral DNA replication and binds with high affinity to the immediate-early transactivator IE2-p86 PUBMED:12610148.

    \ ' '6114' 'IPR009382' '\

    This family consists of several insect coleoptericin, acaloleptin, holotricin and rhinocerosin proteins which are all known to be antibacterial proteins PUBMED:11520352. These all appear to be short, glycine-rich molecules, inducible by infection.

    \ ' '6115' 'IPR010437' '\

    This family describes a small protein, always smaller than 100 amino acids, encoded in pathogenicity islands for bacterial type III secretion systems in various strains of Yersinia, Salmonella, and enteropathogenic Escherichia coli, as well as Chromobacterium violaceum and Citrobacter rodentium. Although strictly associated with type III secretion systems, this protein seems not yet to have been characterised as part of the apparatus or as an effector protein.

    \ ' '6116' 'IPR009383' '\

    This family consists of several bacterial YihD proteins of unknown function PUBMED:9868784.

    \ ' '6117' 'IPR009384' '\

    This family consists of several bacterial FlbD flagellar proteins. The exact function of this family is unknown PUBMED:9168127.

    \ ' '6118' 'IPR009385' '\

    This family consists of several plasmid SOS inhibition protein (PsiB) sequences PUBMED:9987116.

    \ ' '6119' 'IPR010438' '\

    This family consists of several Bacteriophage lambda Bor and Escherichia coli Iss proteins. Expression of bor significantly increases the survival of the E. coli host cell in animal serum. This property is a well-known bacterial virulence determinant; indeed, bor and its adjacent sequences are highly homologous to the iss serum resistance locus of the plasmid ColV2-K94, which confers virulence in animals. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis PUBMED:2144037.

    \ ' '6120' 'IPR010439' '\

    This domain is often found in tandem repeats and co-occurs with C2 domains, Protein kinase C, phorbol ester/diacylglycerol binding regions and PH domains.

    \ ' '6121' 'IPR010440' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \

    This entry represents lipopolysaccharide kinases which are related to protein kinases. This family includes the waaP (rfaP) gene product, which is required for the addition of phosphate to O-4 of the first heptose residue of the lipopolysaccharide (LPS) inner core region. It has previously been shown that WaaP is necessary for resistance to hydrophobic and polycationic antimicrobials in E. coli and that it is required for virulence in invasive strains of Salmonella enterica PUBMED:11069912.

    \ ' '6122' 'IPR010441' '\

    This is a family of proteins of unknown function.

    \ ' '6123' 'IPR009386' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6124' 'IPR009387' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6125' 'IPR010442' '\

    The PET domain is a ~110 amino acid motif in the N-terminal part of LIM domain proteins. The domain was described in Drosophila proteins involved in cell differentiation and is named after Prickle, Espinas and Testin. PET domain proteins contain about three zinc-binding LIM domains (see , ) and are found among metazoans. The PET domain has been suggested to play a role in protein-protein interactions with proteins involved in planar polarity signalling or organisation of the cytoskeleton PUBMED:10485852. Some proteins known to contain a PET domain:

    \ \

    \ ' '6126' 'IPR009388' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein PsbY found in PSII. In higher plants, two related PsbY proteins exist, PsbY-1 and PsbY-2, which appear to function as a heterodimer. In spinach and Arabidopsis, these two proteins arise from a single-copy nuclear gene that is processed in the chloroplast. By contrast, prokaryotic and organellar chromosomes encode a single PsbY protein, as found in cyanobacteria and red algae, indicating a duplication event in the evolution of higher plants PUBMED:15042356. PsbY has two low manganese-dependent activities: a catalase-like activity and an L-arginine metabolising activity that converts L-arginine into ornithine and urea PUBMED:9829828. In addition, a redox-active group is thought to be present in the protein. In cyanobacteria, PsbY deletion mutants have a slightly impaired PSII that is less capable of coping with low levels of calcium ions than the wild-type.

    \ ' '6127' 'IPR009389' '\

    This family consists of several hypothetical proteins from Agrobacterium, Rhizobium and Brucella species. The function of this family is unknown.

    \ ' '6128' 'IPR010443' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents type II restriction endonucleases such as Tsp45I, which recognises the DNA sequence 5\' GTSAC, cleaving prior to G-1 PUBMED:9427549.
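    The recognition sequence quoted above can be located in a DNA string with a short script. The following is a minimal sketch, assuming the standard IUPAC reading of S as G or C; the function name find_gtsac_sites is hypothetical and is not taken from any database tool. Because GTSAC is its own reverse complement (GTGAC pairs with GTCAC), scanning one strand is sufficient.

```python
import re

# IUPAC ambiguity code S stands for G or C, so GTSAC means GTGAC or GTCAC.
GTSAC = re.compile("GT[GC]AC")


def find_gtsac_sites(seq):
    """Return the 0-based start index of every GTSAC match in seq.

    The start index is the position of the leading G, which is the
    position the text describes the cut as preceding.
    """
    return [match.start() for match in GTSAC.finditer(seq.upper())]


if __name__ == "__main__":
    # Illustrative sequence only; it is not drawn from any real genome.
    print(find_gtsac_sites("aaGTGACttGTCACgg"))
```

    The sketch reports positions only; it does not model the enzyme, its cofactor requirements, or any overhang produced by cleavage.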

    \ ' '6129' 'IPR010444' '\

    This family consists of several Bacteriophage lambda Kil protein-like sequences. A cessation of division, followed by one or two fairly synchronous cell divisions in Escherichia coli, is due to two genetically separable events: a temporary block of cell division and, at the same time, a block to the initiation of new rounds of DNA replication. The cell division block is a result of the transient expression of the lambda kil gene PUBMED:12441108.

    \ \

    The lambda kil gene has been shown to be responsible for premature lysis on the addition of chloramphenicol between 15 and 20 min after thermal induction of a lambda prophage PUBMED:8460474. Induction of a lambda prophage causes the death of the host cell even in the absence of phage replication and lytic functions due to expression of functions from the lambda p(L) operon. The kil gene causes cell death and filamentation PUBMED:11470529.

    \ ' '6131' 'IPR009390' '\

    This family consists of several uncharacterised bacterial proteins of unknown function.

    \ ' '6132' 'IPR008316' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '6133' 'IPR010445' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6134' 'IPR010446' '\

    This family consists of several beta-1,4-N-acetylgalactosaminyltransferase proteins from Campylobacter jejuni PUBMED:10660542.

    \ ' '6135' 'IPR010447' '\

    This family consists of several Herpesvirus IR6 proteins. The equine herpesvirus 1 (EHV-1) IR6 protein forms typical rod-like structures in infected cells, influences virus growth at elevated temperatures, and determines the virulence of EHV-1 Rac strains PUBMED:9811716.

    \ ' '6136' 'IPR009391' '\

    This family consists of several very short bacterial 23S rRNA methylase leader peptide (ErmC) sequences. ermC confers resistance to macrolide-lincosamide-streptogramin B antibiotics by specifying a ribosomal RNA methylase, which results in decreased ribosomal affinity for these antibiotics. ermC expression is induced by exposure to erythromycin PUBMED:4018035.

    \ ' '6137' 'IPR010449' '\

    This entry represents a domain found in the cell-fate determinant Numb, and in related proteins. In Drosophila, two signalling pathways, one mediated by Numb and the other by Notch, play essential but antagonistic roles in enabling the two daughters to adopt different fates after a wide variety of asymmetric cell divisions PUBMED:16508312. Numb acts to inhibit Notch signalling, this inhibition being critical for many cell fate decisions PUBMED:17116748. Mammalian Numb (mNumb) has multiple functions and plays important roles in the regulation of neural development, including maintenance of neural progenitor cells and promotion of neuronal differentiation in the central nervous system (CNS) PUBMED:16508311.

    \ ' '6138' 'IPR010450' '\

    This family consists of mammalian neurexophilin proteins. Mammalian brains contain four different neurexophilin proteins. Neurexophilins form a family of related glycoproteins that are proteolytically processed after synthesis and bind to alpha-neurexins. The structure and characteristics of neurexophilins indicate that they function as neuropeptides that may signal via alpha-neurexins PUBMED:9570794.

    \ ' '6139' 'IPR010448' '\

    This family consists of several eukaryotic torsin proteins. Torsion dystonia is an autosomal dominant movement disorder characterised by involuntary, repetitive muscle contractions and twisted postures. The most severe early-onset form of dystonia has been linked to mutations in the human DYT1 (TOR1A) gene encoding a protein termed torsinA. While causative genetic alterations have been identified, the function of torsin proteins and the molecular mechanism underlying dystonia remain unknown. Phylogenetic analysis of the torsin protein family indicates these proteins share distant sequence similarity with the large and diverse family of AAA ATPase, central region containing proteins (). It has been suggested that torsins play a role in effectively managing protein folding and that a possible breakdown in a neuroprotective mechanism that is, in part, mediated by torsins may be responsible for the neuronal dysfunction associated with dystonia PUBMED:12554684.

    \ ' '6140' 'IPR009392' '\

    This family consists of several Drosophila ACP53EA accessory gland (seminal) proteins.

    \ ' '6141' 'IPR010451' '\

    Acetoacetate decarboxylase (ADC) is involved in solventogenesis in certain bacteria, which occurs at the end of the exponential growth phase when there is a metabolic switch from classical sugar fermentation with the production of acetate and butyrate to the re-internalisation and conversion of these acids to acetone and butanol PUBMED:11824611. In Clostridium, Spo0A controls the switch from acid to solvent production. A Spo0A-binding motif occurs in the gene encoding ADC PUBMED:10972834.

    \

    This family also contains the fungal decarboxylase DEC1 encoded by the Tox1B locus, which along with the Tox1A gene product is required for the production of the polyketide T-toxin. The pathogenic fungus Cochliobolus heterostrophus (Drechslera maydis) requires the T-toxin for high virulence to maize with T-cytoplasm PUBMED:12236595.

    \ ' '6142' 'IPR010452' '\

    This family consists of several bacterial isocitrate dehydrogenase kinase/phosphatase (ICDH kinase/phosphatase) proteins. The enzyme has no activating compound but is specific for its substrate. It is a bifunctional enzyme that catalyses the reversible phosphorylation of isocitrate dehydrogenase (IDH, ) on a seryl residue. It consequently belongs to the serine/threonine protein kinase family PUBMED:11258918, and it acts in the Krebs cycle PUBMED:11751849. The catalytic activities of ICDH kinase/phosphatase constitute a moiety-conserved cycle, require ATP, and exhibit \'zero-order ultrasensitivity\' PUBMED:16415587.

    \ ' '6143' 'IPR000758' '\ Virulence-related outer membrane proteins are expressed in Gram-negative bacteria and are essential to bacterial survival within macrophages and for eukaryotic cell invasion. Members of this group include: \
  • PagC, required by Salmonella typhimurium for survival in macrophages and for virulence in mice PUBMED:1766380
  • Rck outer membrane protein of the S. typhimurium virulence plasmid PUBMED:8675302
  • Ail, a product of the Yersinia enterocolitica chromosome capable of mediating bacterial adherence to and invasion of epithelial cell lines PUBMED:1688838
  • OmpX from Escherichia coli that promotes adhesion to and entry into mammalian cells. It also has a role in the resistance against attack by the human complement system PUBMED:1987115
  • a Bacteriophage lambda outer membrane protein, Lom PUBMED:1846140

    The crystal structure of OmpX from E. coli reveals that OmpX consists of an eight-stranded antiparallel all-next-neighbour beta barrel PUBMED:10545325. The structure shows two girdles of aromatic amino acid residues and a ribbon of nonpolar residues that attach to the membrane interior. The core of the barrel consists of an extended hydrogen-bonding network of highly conserved residues. OmpX thus resembles an inverse micelle. The OmpX structure shows that the membrane-spanning part of the protein is much better conserved than the extracellular loops. Moreover, these loops form a protruding beta sheet, the edge of which presumably binds to external proteins. It is suggested that this type of binding promotes cell adhesion and invasion and helps defend against the complement system. Although OmpX has the same beta-sheet topology as the structurally related outer membrane protein A (OmpA), their barrels differ with respect to the shear numbers and internal hydrogen-bonding networks.

    \ ' '6144' 'IPR010453' '\

    This family consists of several Arenavirus RNA polymerase proteins () PUBMED:2705303.

    \ ' '6146' 'IPR009394' '\

    This family consists of several bacterial proteins of unknown function.

    \ ' '6147' 'IPR009395' '\

    This family consists of several eukaryotic GCN5-like protein 1 (GCN5L1) sequences. The function of this family is unknown PUBMED:8646881, PUBMED:9426003.

    \ ' '6148' 'IPR008110' '\

    Periodontal disease in humans is a major health problem in the developed world, and is caused by a number of specialised pathogens that inhabit the oral cavity. Amongst the bacterial species culturable from periodontal lesions are the streptococcal microbes Streptococcus mutans and Streptococcus sobrinus, and the Gram-negative anaerobe Porphyromonas gingivalis (Bacteroides gingivalis) PUBMED:2895100. The latter bacterium has been implicated as the causative agent of periodontitis, pulpal infections and tonsillar abscesses PUBMED:2895100.

    \

    Adherence by P. gingivalis to the periodontal surface is mediated by its major virulence factor, fimbriae PUBMED:1987052. This differs from other pathogenic Gram-negative bacterial polymeric Type I and IV fimbriae/pili in that it is much more simplified, consisting of only a monomeric fimbrillin repeating subunit, Fma1/FimA. Fma1/FimA has a molecular weight of 43 kDa, and can exhibit antigenic diversity in different P. gingivalis strains PUBMED:1987052. Unusually, this form of fimbrillin possesses a far longer leader peptide compared to the fimbrial subunits of other bacteria PUBMED:1987052. It has been hypothesised that this allows for the maturation of the preprotein during secretion PUBMED:1987052.

    \

    Recently, a study into the different antigenic types of P. gingivalis fimbrillin classified them into five distinct groups, depending on their gene sequences PUBMED:11748193. Investigations into the functional differences of each type revealed that in the majority of periodontitis cases, bacterial strains possessing the type II Fma1/FimA were the most prevalent PUBMED:11748193; in healthy adults, type I strains were the most common. This has implications for particular strains that are associated with periodontal disease.

    \ \ ' '6149' 'IPR010454' '\

    This family consists of several phage NinH proteins. The function of this family is unknown.

    \ ' '6150' 'IPR010455' '\

    This family consists of several phage antitermination protein Q and related bacterial sequences. Phage 82 gene Q encodes a phage-specific positive regulator of late gene expression, thought, by analogy to the corresponding gene of phage lambda, to be a transcription antiterminator PUBMED:3624233.

    \ ' '6151' 'IPR009396' '\

    This family consists of several eukaryotic pigment-dispersing hormone (PDH) proteins. The pigment-dispersing hormone (PDH) is produced in the eyestalks of Crustacea where it induces light-adapting movements of pigment in the compound eye and regulates the pigment dispersion in the chromatophores PUBMED:8477858.

    \ ' '6152' 'IPR010456' '\

    This family consists of several Ribosomal protein L11 methyltransferase sequences. Its genetic determinant is prmA, which forms a bifunctional operon with the downstream panF gene PUBMED:8226664. The role of L11 methylation in ribosome function is, as yet, unknown. Deletion of the prmA gene in Escherichia coli showed no obvious effect PUBMED:15317787 except for the production of undermethylated forms of L11 PUBMED:7715456. Methylation is the most common post-translational modification to ribosomal proteins in all organisms. PrmA is the only bacterial enzyme that catalyses the methylation of a ribosomal protein PUBMED:12777815.

    \ ' '6153' 'IPR009397' '\

    This family consists of several Vesiculovirus matrix proteins. The matrix (M) protein of vesicular stomatitis virus (VSV) expressed in the absence of other viral components causes many of the cytopathic effects of VSV, including an inhibition of host gene expression and the induction of cell rounding. It has been shown that M protein also induces apoptosis in the absence of other viral components. It is thought that the activation of apoptotic pathways causes the inhibition of host gene expression and cell rounding by M protein PUBMED:12692256.

    \ ' '6154' 'IPR009398' '\

    Cyclic AMP (cAMP) is a ubiquitous signalling molecule which mediates many cellular processes by activating cAMP-dependent kinases and also inducing protein-protein interactions. This molecule is produced by the adenylate cyclase (AC) enzyme, using ATP as its substrate. Mammalian adenylate cyclase has nine closely related membrane-bound isoforms (AC1-9) showing significant sequence homology and sharing the same overall structure: two hydrophobic transmembrane domains, and two cytoplasmic domains that are responsible for the catalytic activity. These isoforms differ in both their tissue specificity and their regulation. Regulatory factors known to influence one or more of these isoforms include G proteins, protein kinases, calcium and calmodulin. For more information see PUBMED:11264454, PUBMED:12940771.

    \ \

    This entry represents a region of unknown function found in many of these isoforms. It is part of the N-terminal cytoplasmic domain but its presence is not necessary for catalytic activity PUBMED:9417641.

    \ \ ' '6155' 'IPR010457' '\

    This entry represents a ligand-binding domain that displays similarity to C2-set immunoglobulin domains (antibody constant domain 2) PUBMED:9501088. The two cysteine residues form a disulphide bridge.

    \ ' '6157' 'IPR010458' '\

    This family consists of several fungal trichodiene synthase proteins. TRI5 encodes the enzyme trichodiene synthase, which has been shown to catalyse the first step in the trichothecene pathways of Fusarium and Trichothecium species PUBMED:9529523, PUBMED:11698643.

    \ ' '6158' 'IPR009400' '\

    This entry represents nucleotide excision repair (NER) proteins, such as the TTDA subunit of the TFIIH basal transcription factor complex (also known as subunit 5 of RNA polymerase II transcription factor B), and Rex1. These proteins have a structural motif consisting of a 2-layer sandwich structure with an alpha/beta plait topology. Nucleotide excision repair is a major pathway for repairing UV light-induced DNA damage in most organisms.

    \

    Transcription/repair factor IIH (TFIIH) is essential for RNA polymerase II transcription and nucleotide excision repair. The TFIIH complex consists of ten subunits: ERCC2, ERCC3, GTF2H1, GTF2H2, GTF2H3, GTF2H4, GTF2H5, MNAT1, CDK7 and CCNH. Defects in GTF2H5 cause the disease trichothiodystrophy (TTD); GTF2H5 (general transcription factor 2H subunit 5) is therefore also known as the TTD group A (TTDA) subunit (and as Tfb5) PUBMED:15220921. The TTDA subunit is responsible for the DNA repair function of the complex. TTDA is present both bound to TFIIH and as a free fraction that shuttles between the cytoplasm and nucleus; induction of NER-type DNA lesions shifts the balance towards TTDA\'s more stable association with TFIIH PUBMED:16669699. TTDA is also required for the stability of the TFIIH complex and for the presence of normal levels of TFIIH in the cell.

    \

    REX1 (required for excision 1) is required for DNA repair in the single-celled, photosynthetic alga Chlamydomonas reinhardtii PUBMED:12697762, and has homologues in other eukaryotes.

    \ ' '6160' 'IPR009401' '\

    This entry represents subunit Med13 of the Mediator complex, which is involved in transcriptional repression PUBMED:12738880.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '6161' 'IPR009402' '\

    This family consists of several Orthopoxvirus A47 proteins. The function of this family is unknown.

    \ ' '6162' 'IPR009403' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6163' 'IPR009404' '\

    This family consists of several Coronavirus 5a proteins. The function of this family is unknown PUBMED:9168126.

    \ ' '6164' 'IPR010460' '\

    This domain, of unknown function, is found associated with ubiquitin carboxyl-terminal hydrolase family 2 (, MEROPS peptidase family C19). These are a family of 100 to 200 kDa peptides that includes the Ubp1 ubiquitin peptidase from yeast.

    \ ' '6165' 'IPR010461' '\

    This family consists of several bacterial ComK proteins. ComK of Bacillus subtilis is a positive autoregulatory protein occupying a central position in the competence-signal-transduction network. It positively regulates the transcription of late competence genes, which specify morphogenetic and structural proteins necessary for construction of the DNA-binding and uptake apparatus, as well as the transcription of comK itself PUBMED:12761164,PUBMED:7783616. ComK specifically binds to the promoters of the genes that it affects. It has been found that ClpX plays an important role in the regulation of ComK at the post-transcriptional level PUBMED:12761164.

    \ ' '6166' 'IPR010462' '\

    This family consists of several bacterial ectoine synthase proteins. The ectABC genes encode the diaminobutyric acid acetyltransferase (EctA), the diaminobutyric acid aminotransferase (EctB), and the ectoine synthase (EctC). Together these proteins constitute the ectoine biosynthetic pathway PUBMED:11823218.

    \ ' '6167' 'IPR009405' '\

    This family consists of several Vibrio cholerae toxin co-regulated pilus biosynthesis protein F (TcpF) sequences. TcpF is known to be a secreted virulence protein but its exact function is unknown PUBMED:11466276.

    \ ' '6168' 'IPR009406' '\

    This family consists of several putative head-tail joining bacteriophage proteins.

    \ ' '6169' 'IPR010463' '\

    This family consists of uncharacterised proteins that are putative lipases.

    \ ' '6171' 'IPR009407' '\

    The viral polyprotein of parechoviruses contains: coat protein VP0 (P1AB); coat protein VP3 (P1C); coat protein VP1 (P1D); picornain 2A (, core protein P2A); core protein P2B; core protein P2C; core protein P3A; genome-linked protein VPg (P3B); picornain 3C (, MEROPS peptidase subfamily 3CF: parechovirus picornain 3C (P3C)) PUBMED:9820139.

    \ \

    This entry consists of the genome-linked protein VPg type P3B.

    \ ' '6172' 'IPR010465' '\

    This domain is found in Diaphanous-related formins (Drfs). It binds the N-terminal GTPase-binding domain; this link is broken when GTP-bound Rho binds to the GBD and activates the protein. The addition of diaphanous activating domains (DAD) to mammalian cells induces actin filament formation, stabilises microtubules, and activates serum-response mediated transcription PUBMED:12676083.

    \ ' '6173' 'IPR009408' '\

    This region is found in some of the Diaphanous related formins (Drfs) PUBMED:12676083. It consists of low complexity repeats of around 12 residues.

    \ ' '6174' 'IPR010466' '\ SH3 (src Homology-3) domains are small protein modules containing \ approximately 50 amino acid residues PUBMED:15335710, PUBMED:11256992. They are found in a \ great variety of intracellular or\ membrane-associated proteins PUBMED:1639195, PUBMED:14731533, PUBMED:7531822 for example, in a variety of\ proteins with enzymatic activity, in adaptor\ proteins that lack catalytic sequences and in cytoskeletal\ proteins, such as fodrin and yeast actin binding protein ABP-1. \

    The SH3 domain has a characteristic fold which consists of five or six beta-strands arranged as two tightly packed anti-parallel beta sheets. The linker\ regions may contain short helices PUBMED:. The surface of the SH3-domain bears a flat, hydrophobic ligand-binding pocket which consists of three shallow grooves defined by conserved aromatic residues in which the ligand adopts an extended left-handed helical arrangement. The ligand binds with low affinity but this may be enhanced by multiple interactions.\ The region bound by the SH3 domain is in all cases proline-rich and contains PXXP as a core-conserved binding motif. The function of the SH3 domain is not well understood, but it may mediate many diverse processes such as increasing the local concentration of proteins, altering their subcellular location and mediating the assembly of large multiprotein complexes PUBMED:7953536.

    \

    This family consists of several hypothetical bacterial proteins of unknown function that contain an SH3 region.

    \ ' '6175' 'IPR009409' '\

    This family consists of several short hypothetical archaeal proteins of unknown function.

    \ ' '6177' 'IPR010468' '\

    This domain is found in several mammalian hormone-sensitive lipase (HSL) proteins. Hormone-sensitive lipase, a key enzyme in fatty acid mobilisation, overall energy homeostasis, and possibly steroidogenesis, is acutely controlled via reversible phosphorylation by catecholamines and insulin PUBMED:3420405.

    \ ' '6178' 'IPR009410' '\

    This family consists of several plant specific allene oxide cyclase proteins (). The allene oxide cyclase (AOC)-catalysed step in jasmonate (JA) biosynthesis is important in the wound response of tomato PUBMED:12581315.

    \ ' '6180' 'IPR009412' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6182' 'IPR009413' '\

    This family consists of several bacterial and eukaryotic Aegerolysin-like proteins. Aegerolysin and ostreolysin are expressed during formation of primordia and fruiting bodies, and these haemolysins may play an important role in the initial phase of fungal fruiting. The bacterial members of this family are expressed during sporulation PUBMED:12020804. Ostreolysin was found to be cytolytic to various erythrocytes and tumour cells PUBMED:15912956. It forms transmembrane pores 4 nm in diameter. Its activity is inhibited by total membrane lipids, and modulated by lysophosphatides.

    \ ' '6183' 'IPR009414' '\

    This family consists of several phage and bacterial proteins whose functions have not been experimentally determined. Computational analysis involving sequence, predicted structure and genomic context suggests that these proteins may be endonucleases which function in phage genome segregation, or in the repair of double-stranded breaks introduced during either this process or DNA replication PUBMED:15972856.

    \ ' '6184' 'IPR009415' '\

    This family consists of several Hadronyche versuta (Blue mountains funnel-web spider) specific omega-atracotoxin proteins. Omega-Atracotoxin-Hv1a is an insect-specific neurotoxin whose phylogenetic specificity derives from its ability to antagonise insect, but not vertebrate, voltage-gated calcium channels. Two spatially proximal residues, Asn(27) and Arg(35), form a contiguous molecular surface that is essential for toxin activity. It has been proposed that this surface of the beta-hairpin is a key site for interaction of the toxin with insect calcium channels PUBMED:11313356.

    \ ' '6185' 'IPR010470' '\

    This family consists of several Benyvirus proteins of unknown function.

    \ ' '6187' 'IPR009064' '\

    Protozoan pheromones are cell-type specific protein signals. This entry represents a family of mating ciliate pheromones (or gamones) from the protozoan Euplotes raikovi, including Er-1, Er-2, Er-10, Er11 and Er22. These pheromones are diffusible extracellular communication signals that distinguish different intra-specific classes of cells commonly referred to as \'mating types\'. Pheromones prepare these cells for conjugation by changing their cell surface properties. The mitogenic activity induced by Er pheromone autocrine signalling can be inhibited by cAMP PUBMED:12681291. The NMR structures of several pheromones have been determined, revealing a closed up-and-down bundle of three helices with a left-handed twist PUBMED:7833812, PUBMED:7833811, PUBMED:8515452, PUBMED:8844842, PUBMED:11246857. In some cases, different pheromones can compete with each other in binding to their cell-surface receptors.

    \ ' '6188' 'IPR009417' '\

    This family consists of several Rice tungro bacilliform virus P12 proteins. The function of this family is unknown PUBMED:2041739.

    \ ' '6189' 'IPR009418' '\

    This family consists of several hypothetical Mycobacterium leprae specific proteins. The function of this family is unknown.

    \ ' '6190' 'IPR009419' '\

    The viral polyprotein of parechoviruses contains: coat protein VP0 (P1AB); coat protein VP3 (P1C); coat protein VP1 (P1D); picornain 2A (, core protein P2A); core protein P2B; core protein P2C; core protein P3A; genome-linked protein VPg (P3B); picornain 3C (, MEROPS peptidase subfamily 3CF: parechovirus picornain 3C (P3C)) PUBMED:9820139.

    \ \

    This entry consists of the parechovirus P3A protein. P3A has been identified as a genome-linked protein (VPg), which is involved in replication PUBMED:3018280.

    \ ' '6191' 'IPR010471' '\

    This family consists of several hypothetical plant proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown.

    \ ' '6192' 'IPR013836' '\

    This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation PUBMED:10722749, PUBMED:1694174. This family also contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion PUBMED:10982412.

    \ ' '6193' 'IPR009420' '\

    This family consists of several Enterobacterial FlhE flagellar proteins. The exact function of this family is unknown PUBMED:9387224.

    \ ' '6194' 'IPR010472' '\

    Formin homology (FH) proteins play a crucial role in the reorganization of the actin cytoskeleton, which mediates various functions of the cell cortex including motility, adhesion, and cytokinesis PUBMED:10631086. Formins are multidomain proteins that interact with diverse signalling molecules and cytoskeletal proteins, although some formins have been assigned functions within the nucleus. Formins are characterised by the presence of three FH domains (FH1, FH2 and FH3), although members of the formin family do not necessarily contain all three domains PUBMED:12538772. The proline-rich FH1 domain mediates interactions with a variety of proteins, including the actin-binding protein profilin, SH3 (Src homology 3) domain proteins, and WW domain proteins. The FH2 domain () is required to inhibit actin polymerisation. The FH3 domain is less well conserved and is required for directing formins to the correct intracellular location, such as the mitotic spindle PUBMED:11171383, or the projection tip during conjugation PUBMED:9606213. In addition, some formins can contain a GTPase-binding domain (GBD) () required for binding to Rho small GTPases, and a C-terminal conserved Dia-autoregulatory domain (DAD).

    \

    This entry represents the FH3 domain.

    \ ' '6195' 'IPR006396' '\

    Glutamate mutase (methylaspartate mutase) catalyses the reversible interconversion of L-glutamate and L-threo-3-methylaspartate, the first step in the pathway of glutamate fermentation PUBMED:16285720. Catalysis is initiated using the cobalamin cofactor. The E subunit is the catalytic subunit (MutE) PUBMED:14738967.

    \ ' '6196' 'IPR009104' '\

    Sea anemones are a rich source of lethal pore-forming peptides and proteins, known collectively as cytolysins or actinoporins. There are several different groups of cytolysins based on their structure and function PUBMED:11689232. This entry represents the most numerous group, the 20-kDa highly basic peptides. These cytolysins form cation-selective pores in sphingomyelin-containing membranes. Examples include equinatoxins (from Actinia equina), sticholysins (from Stichodactyla helianthus), magnificalysins (from Heteractis magnifica), and tenebrosins (from Actinia tenebrosa), which exhibit pore-forming, haemolytic, cytotoxic, and heart stimulatory activities.

    \

    Cytolysins adopt a stable soluble structure, which undergoes a conformational change when brought in contact with a membrane, leading to an active, membrane-bound form that inserts spontaneously into the membrane. They often oligomerise on the membrane surface before puncturing the lipid bilayer, causing the cell to lyse. The 20-kDa sea anemone cytolysins require a phosphocholine lipid headgroup for binding; however, sphingomyelin is required for the toxin to promote membrane permeability PUBMED:14604518. The crystal structures of equinatoxin II PUBMED:11827489 and sticholysin II PUBMED:14604522 both revealed a compact beta-sandwich consisting of ten strands in two sheets flanked on each side by two short alpha-helices, which is a similar topology to osmotin. It is believed that the beta sandwich structure attaches to the membrane, while a three-turn alpha helix lying on the surface of the beta sheet may be involved in membrane pore formation, possibly by the penetration of the membrane by the helix.

    \ \ ' '6197' 'IPR009421' '\

    This family consists of several Maize streak virus 21.7 kDa proteins. The function of this family is unknown.

    \ ' '6198' 'IPR010473' '\

    Diaphanous-related formins (Drfs) are a family of formin homology (FH) proteins that act as effectors of Rho small GTPases during growth factor-induced cytoskeletal remodelling, stress fibre formation, and cell division PUBMED:10631086. Drf proteins are characterised by a variety of shared domains: an N-terminal GTPase-binding domain (GBD), formin-homology domains FH1, FH2 () and FH3 (), and a C-terminal conserved Dia-autoregulatory domain (DAD) that binds the GBD.

    \

    This entry represents the GBD, which is a bifunctional autoinhibitory domain that interacts with and is regulated by activated Rho family members. Mammalian Drf3 contains a CRIB-like motif within its GBD for binding to Cdc42, which is required for Cdc42 to activate and guide Drf3 towards the cell cortex where it remodels the actin skeleton PUBMED:12676083.

    \ ' '6199' 'IPR009422' '\

    This family consists of several mammalian Gemin6 proteins. The exact function of Gemin6 is unknown but it has been found to form part of the SMN complex. The SMN complex plays a key role in the biogenesis of spliceosomal small nuclear ribonucleoproteins (snRNPs) and other ribonucleoprotein particles PUBMED:11748230.

    \ ' '6200' 'IPR009106' '\

    The cocaine and amphetamine regulated transcript (CART) is a brain-localised peptide that acts as a satiety factor in appetite regulation. CART was found to inhibit both normal and starvation-induced feeding, and completely blocks the feeding response induced by neuropeptide Y. CART is regulated by leptin in the hypothalamus, and can be transcriptionally induced after cocaine or amphetamine administration PUBMED:9590691. Posttranslational processing of CART produces an N-terminal CART peptide and a C-terminal CART peptide. The C-terminal CART peptide has been isolated from the hypothalamus, nucleus accumbens, and the anterior pituitary lobe in rats. C-terminal CART is the biologically active part of the molecule affecting food intake. The structure of C-terminal CART consists of a disulphide-bound fold containing a beta-hairpin and two adjacent disulphide bridges PUBMED:11478874.

    \ \ ' '6201' 'IPR009423' '\

    This family consists of several NADH-ubiquinone oxidoreductase subunit b14.5b proteins.

    \

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '6202' 'IPR010474' '\

    Bovine leukemia virus (BLV) is one of the most common infectious cattle viruses, with between 30 and 40% of cows in the United States being infected. It is closely related to the human T-cell leukaemia virus type 1 (HTLV-1) and has highly conserved envelope glycoprotein functional domains PUBMED:12368329. BLV is an oncogenic C-type retrovirus, which results in the animals developing a malignant lymphoma. BLV, like the human and simian T cell leukaemia viruses, is a deltaretrovirus. 182 residues at the amino-terminal of the BLV envelope glycoprotein surface unit encompass the receptor-binding domain. The metabolic activity in B cells, T cells, and thymocytes is indicated by the expression of the BLV-binding receptor PUBMED:18606640.

    \ \

    A candidate gene of the receptor (BLVR) is related, but unique, to a gene family of the delta subunit of the adaptor protein (AP) complex 3, AP-3 PUBMED:12692298. The AP-3 complex is not clathrin-associated but is associated with the Golgi region as well as more peripheral structures. It facilitates the budding of vesicles from the Golgi membrane and may be directly involved in trafficking to lysosomes.\

    \ ' '6203' 'IPR009424' '\

    This family consists of several short hypothetical plant proteins of unknown function.

    \ ' '6204' 'IPR010475' '\

    This family consists of several insect adipokinetic hormones as well as the related crustacean red pigment concentrating hormone. Flight activity of insects comprises one of the most intense biochemical processes known in nature, and therefore provides an attractive model system to study the hormonal regulation of metabolism during physical exercise. In long-distance flying insects, such as the migratory locust, both carbohydrate and lipid reserves are utilised as fuels for sustained flight activity. The mobilisation of these energy stores in Locusta migratoria (Migratory locust) is mediated by three structurally related adipokinetic hormones (AKHs), which are all capable of stimulating the release of both carbohydrates and lipids from the fat body PUBMED:9723879.

    \ ' '6205' 'IPR009425' '\

    This family consists of several hypothetical bacterial and phage proteins of unknown function.

    \ ' '6206' 'IPR010476' '\

    This family consists of several bacterial L-rhamnose-proton symport protein (RhaT) sequences PUBMED:1551902,PUBMED:8757746.

    \ ' '6207' 'IPR009426' '\

    This family consists of several Barley yellow dwarf virus proteins of unknown function.

    \ ' '6208' 'IPR009427' '\

    This entry represents a protein family of unknown function found in Borrelia species.

    \ ' '6209' 'IPR010477' '\

    This family consists of several proteins which appear to be specific to Drosophila melanogaster. The function of this family is unknown.

    \ ' '6211' 'IPR009428' '\

    This family consists of several eukaryotic beta-catenin-interacting (ICAT) proteins. Beta-catenin is a multifunctional protein involved in both cell adhesion and transcriptional activation. Transcription mediated by the beta-catenin/Tcf complex is involved in embryological development and is upregulated in various cancers. ICAT selectively inhibits beta-catenin/Tcf binding in vivo, without disrupting beta-catenin/cadherin interactions PUBMED:12408824.

    \ ' '6212' 'IPR009429' '\

    This family consists of several Baculovirus LEF-11 proteins. The exact function of this family is unknown although it has been shown that LEF-11 is required for viral DNA replication during the infection cycle PUBMED:11861844 and plays a role in late/very late gene activation.

    \ ' '6213' 'IPR009430' '\

    Gas vesicles provide cells with buoyancy, enabling them to remain at the water surface. These organelles are generally synthesized by halophilic archaea and cyanobacteria, as well as some other prokaryotes. A cluster of 12-14 gvp genes (gvpMLKJIHGFEDACNO) is responsible for gas vesicle synthesis in Halobacterium sp. PUBMED:15126480. GvpF and GvpL are essential for gas vesicle formation and display sequence similarity to one another. Both contain predicted coiled-coil domains, which are often involved in self-oligomerisation, and both are structural components of the vesicle PUBMED:15126480.

    \ ' '6214' 'IPR009431' '\

    This family consists of several D1 dopamine receptor-interacting (calcyon) proteins. D1/D5 dopamine receptors in the basal ganglia, hippocampus, and cerebral cortex modulate motor, reward, and cognitive behaviour. D1-like dopamine receptors likely modulate neocortical and hippocampal neuronal excitability and synaptic function via Ca2+ as well as cAMP-dependent signalling PUBMED:11929934. Defective calcyon proteins have been implicated in both attention-deficit/hyperactivity disorder (ADHD) PUBMED:11923911 and schizophrenia.

    \ ' '6215' 'IPR009432' '\

    This family consists of several eukaryotic proteins of unknown function.

    \ ' '6216' 'IPR009433' '\

    This family consists of several membrane-associated protein VP24 sequences from a variety of Ebola viruses, as well as Lake Victoria marburgvirus. The VP24 protein of Ebola virus sp. is believed to be a secondary matrix protein and minor component of virions. VP24 possesses structural features commonly associated with viral matrix proteins and may have a role in virus assembly and budding PUBMED:12525613.

    \ ' '6217' 'IPR009434' '\

    This family consists of several mammalian neuroendocrine-specific Golgi protein P55 (NESP55) sequences. NESP55 is a novel member of the chromogranin family and is a soluble, acidic, heat-stable secretory protein that is expressed exclusively in endocrine and nervous tissues, although less widely than chromogranins PUBMED:12438142.

    \ ' '6219' 'IPR015877' '\

    MAT1 (menage a trois 1) is a RING finger protein with a characteristic C3HC4 motif located in the N-terminal domain. This entry represents the central region of MAT1. MAT1 stabilises the cyclin H-CDK7 complex to form a functional CDK-activating kinase (CAK) enzymatic complex which then goes on to activate many of the CDK enzymes intimately involved in the cell cycle PUBMED:11007478. CDK7 forms a stable complex with cyclin H and MAT1 in vivo only when phosphorylated on either one of two residues (Ser164 or Thr170) in its T-loop. The requirement for MAT1 for the activation of CAK can be by-passed by the phosphorylation of CDK7 on the T-loop. The two mechanisms for CDK7 complex stabilisation and activation (MAT1 addition and T-loop phosphorylation), which can operate independently in vitro, actually cooperate under physiological conditions to maintain complex integrity. With prolonged exposure to elevated temperature, dissociation to monomeric subunits occurs in vivo when CDK7 is dephosphorylated, even in the presence of MAT1 PUBMED:11447116.

    \

    The Cyclin H-MAT1-CDK7 complex also forms part of TFIIH, a multiprotein complex required for both transcription and DNA repair.

    \ ' '6220' 'IPR009435' '\

    The Asr protein is synthesised as a precursor and the cleavage is essential for moderate to high acid tolerance PUBMED:12670971. Enterobacteria have developed numerous constitutive and inducible strategies to sense and adapt to external acidity. These molecular responses require many specific acid shock proteins (ASPs) PUBMED:12670971.

    \ ' '6221' 'IPR010479' '\

    Apoptosis, or programmed cell death (PCD), is a common and evolutionarily conserved property of all metazoans PUBMED:11341280. In many biological processes, apoptosis is required to eliminate supernumerary or dangerous (such as pre-cancerous) cells and to promote normal development. Dysregulation of apoptosis can, therefore, contribute to the development of many major diseases including cancer, autoimmunity and neurodegenerative disorders. In most cases, proteins of the caspase family execute the genetic programme that leads to cell death.

    \

    Bcl-2 proteins are central regulators of caspase activation, and play a key role in cell death by regulating the integrity of the mitochondrial and endoplasmic reticulum (ER) membranes PUBMED:12631689. At least 20 Bcl-2 proteins have been reported in mammals, and several others have been identified in viruses. Bcl-2 family proteins fall roughly into three subtypes, which either promote cell survival (anti-apoptotic) or trigger cell death (pro-apoptotic). All members contain at least one of four conserved motifs, termed Bcl-2 Homology (BH) domains. Bcl-2 subfamily proteins, which contain at least BH1 and BH2, promote cell survival by inhibiting the adapters needed for the activation of caspases.

    \ \

    Pro-apoptotic members potentially exert their effects by displacing the adapters from the pro-survival proteins; these proteins belong either to the Bax subfamily, which contain BH1-BH3, or to the BH3 subfamily, which mostly only feature BH3 PUBMED:9735050. Thus, the balance between antagonistic family members is believed to play a role in determining cell fate. Members of the wider Bcl-2 family, which also includes Bcl-x, Bcl-w and Mcl-1, are described by their similarity to Bcl-2 protein, a member of the pro-survival Bcl-2 subfamily PUBMED:9735050. Full-length Bcl-2 proteins feature all four BH domains, seven alpha-helices, and a C-terminal hydrophobic motif that targets the protein to the outer mitochondrial membrane, ER and nuclear envelope.

    \

    BID is a member of the Bcl-2 superfamily of proteins that are key regulators of programmed cell death, hence this family is related to the Apoptosis regulator Bcl-2 protein BH domain. BID is a pro-apoptotic member of the Bcl-2 superfamily and as such possesses the ability to target intracellular membranes and contains the BH3 death domain. The activity of BID is regulated by a Caspase 8-mediated cleavage event, exposing the BH3 domain and significantly changing the surface charge and hydrophobicity, which causes a change of cellular localisation PUBMED:10089878.

    \ ' '6222' 'IPR010480' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    The members of this group of proteins belong to MEROPS inhibitor family I33, clan IR; the nematode aspartyl protease inhibitors or Aspins. They are restricted to parasitic nematode species. Structural features common to the nematode Aspins include the presence of a signal peptide sequence and the conservation of all four cysteine residues in the mature protein. The Y[V.A]RDLT sequence motif has been suggested as being of crucial functional importance in several filarial nematode inhibitors PUBMED:8433724; however, this sequence is not conserved in Tco-API-1 from Trichostrongylus colubriformis (Black scour worm), and it has been demonstrated that Tco-API-1 is not an Aspin as it does not inhibit porcine pepsin PUBMED:13678638. Related inhibitors from Onchocerca volvulus, Ov33 PUBMED:9392607, and Ascaris suum (Pig roundworm), PI-3 PUBMED:9654082, inhibit the in vitro activity of aspartyl proteases such as pepsin and cathepsin E (MEROPS peptidase family A1).

    \

    Aspin may facilitate the safe passage of the eggs of Ascaris through the host stomach without digestion by pepsin PUBMED:3916913, PUBMED:9654082. The other parasitic nematodes known to express homologous proteins do not pass through the stomach of their hosts PUBMED:10896483. Several proteins in the family are potent allergens in mammals.

    \

    The three-dimensional structures of pepsin inhibitor-3 (PI-3) from A. suum and of the complex between PI-3 and porcine pepsin at 1.75 A and 2.45 A resolution, respectively, have revealed the mechanism of aspartic protease inhibition. PI-3 has a new fold consisting of two identical domains, each comprising an antiparallel beta-sheet flanked by an alpha-helix. In the enzyme-inhibitor complex, the N-terminal beta-strand of PI-3 pairs with one strand of the \'active site flap\' (residues 70-82) of pepsin, thus forming an eight-stranded beta-sheet that spans the two proteins. PI-3 has a novel mode of inhibition, using its N-terminal residues to occupy and therefore block the first three binding pockets in pepsin for substrate residues C-terminal to the scissile bond (S1\'-S3\') PUBMED:10932249.

    \ \ ' '6223' 'IPR010481' '\

    This is a calponin homology domain.

    \ ' '6224' 'IPR009436' '\

    This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the C-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology. The function of this family is unclear.

    \ ' '6225' 'IPR004462' '\ This domain is found as essentially the full length of desulforedoxin, a 37-residue homodimeric non-haem iron protein. It is also found as the N-terminal domain of desulfoferrodoxin (rbo), a homodimeric non-haem iron protein with 2 Fe atoms per monomer in different oxidation states. This domain binds the ferric rather than the ferrous Fe of desulfoferrodoxin. Neelaredoxin, a monomeric blue non-haem iron protein, lacks this domain.\ ' '6226' 'IPR010482' '\

    Peroxisomes play diverse roles in the cell, compartmentalising many activities related to lipid metabolism and functioning in the decomposition of toxic hydrogen peroxide. Sequence similarity was identified between two hypothetical proteins and the peroxin integral membrane protein Pex24p PUBMED:12707309.

    \ ' '6227' 'IPR009112' '\

    GTP cyclohydrolase I feedback regulatory protein (GFRP) in mammals helps regulate the biosynthesis of tetrahydrobiopterin through the feedback inhibition of the rate-limiting enzyme GTP cyclohydrolase I (GTPCHI). Tetrahydrobiopterin is the cofactor required for the hydroxylation of aromatic amino acids. The crystal structure of GFRP reveals that the protein forms a homopentamer PUBMED:11580249. In the presence of phenylalanine, the stimulatory complex consists of a GTPCHI decamer sandwiched by two GFRP pentamers, which is thought to enhance GTPCHI activity by locking the enzyme in the active state PUBMED:11818540. The structure of GFRP consists of two alpha/beta layers arranged beta(2)-alpha-beta(2)-alpha-beta(2), with antiparallel beta-sheets in the order 342165.

    \ ' '6228' 'IPR009066' '\

    The alpha-2-macroglobulin receptor-associated protein (RAP) is a glycoprotein that binds to the alpha-2-macroglobulin receptor, as well as to other members of the low density lipoprotein receptor family. RAP acts to inhibit the binding of all known ligands for these receptors, and may prevent receptor aggregation and degradation in the endoplasmic reticulum, thereby acting as a molecular chaperone PUBMED:9207124. RAP may be under the regulatory control of calmodulin, since it is able to bind calmodulin and be phosphorylated by calmodulin-dependent kinase II.

    \

    RAP is comprised of three domains. Both domains 1 and 3 are involved in binding to the alpha-2-macroglobulin receptor, while domain 1 is also involved in inhibiting the binding of activated alpha-2-macroglobulin (). Structural studies have revealed the RAP domain 1 to be comprised of a partly opened bundle of three helices, the first one being shorter than the other two.

    \ \ ' '6229' 'IPR010483' '\

    The alpha-2-macroglobulin receptor-associated protein (RAP) is an intracellular glycoprotein that binds to the alpha-2-macroglobulin receptor and other members of the low density lipoprotein receptor family. The protein inhibits binding of all currently known ligands of these receptors PUBMED:9207124. Two different studies have provided conflicting domain boundaries.

    \ ' '6231' 'IPR009437' '\

    This family consists of several lamprin proteins from the Sea lamprey Petromyzon marinus. Lamprin, an insoluble non-collagen, non-elastin protein, is the major connective tissue component of the fibrillar extracellular matrix of lamprey annular cartilage. Although not generally homologous to any other protein, soluble lamprins contain a tandemly repeated peptide sequence (GGLGY), which is present in both silkmoth chorion proteins and spider dragline silk. Strong homologies to this repeat sequence are also present in several mammalian and avian elastins. It is thought that these proteins share a structural motif which promotes self-aggregation and fibril formation in proteins through interdigitation of hydrophobic side chains in beta-sheet/beta-turn structures, a motif that has been preserved in recognisable form over several hundred million years of evolution PUBMED:7678258.

    \ ' '6232' 'IPR009438' '\

    This family consists of several plant-specific phytosulfokine precursor proteins. Phytosulfokines are active as either a pentapeptide or a C-terminally truncated tetrapeptide. These compounds were first isolated because of their ability to stimulate cell division in somatic embryo cultures of Asparagus officinalis PUBMED:12049922.

    \ ' '6233' 'IPR009439' '\

    This family consists of several red chlorophyll catabolite reductase (RCC reductase) proteins. Red chlorophyll catabolite (RCC) reductase (RCCR) and pheophorbide (Pheide) a oxygenase (PaO) catalyse the key reaction of chlorophyll catabolism, porphyrin macrocycle cleavage of Pheide a to a primary fluorescent catabolite (pFCC) PUBMED:10743659.

    \ ' '6234' 'IPR009440' '\

    This family consists of several bacterial StbA plasmid stability proteins PUBMED:1706707.

    \ ' '6235' 'IPR009441' '\

    This entry represents P40 nucleoproteins from several Borna disease virus (BDV) strains. BDV is an RNA virus that is a member of the order Mononegavirales, which includes such members as Measles virus and Ebola virus sp. BDV causes an infection of the central nervous system in a wide range of vertebrates, which can progress to an often fatal immune-mediated disease. Viral nucleoproteins are central to transcription, replication, and packaging of the RNA genome. P40 nucleoprotein from BDV is multi-helical in structure and can be divided into two subdomains, each of which has an alpha-bundle topology PUBMED:9882386. The nucleoprotein assembles into a planar homotetramer, with the RNA genome either wrapping around the outside of the tetramer or possibly fitting within the charged central channel of the tetramer PUBMED:.

    \ ' '6237' 'IPR009443' '\

    This family consists of a series of primate-specific nuclear pore complex interacting protein (NPIP) sequences. The function of this family is unknown, but it is well conserved from African apes to humans PUBMED:11586358.

    \ ' '6239' 'IPR010486' '\

    HNS (histone-like nucleoid structuring)-dependent expression A (HdeA) protein is a stress response protein found in highly acid resistant bacteria such as Shigella flexneri and Escherichia coli, but which is lacking in mildly acid tolerant bacteria such as Salmonella PUBMED:10623550. HdeA is one of the most abundant proteins found in the periplasmic space of E. coli, where it is one of a network of proteins that confer an acid resistance phenotype essential for the pathogenesis of enteric bacteria PUBMED:12694615. HdeA is thought to act as a chaperone, functioning to prevent the aggregation of periplasmic proteins denatured under acidic conditions. The HNS protein, a chromatin-associated protein that influences the gene expression of several environmentally-induced target genes, represses the expression of HdeA. HdeB, which is encoded within the same operon, may form heterodimers with HdeA. HdeA is a single domain protein with an overall fold that is similar to the fold of the N-terminal subdomain of the GluRS anticodon-binding domain.

    \ ' '6240' 'IPR009444' '\

    This family consists of a group of TraD conjugal transfer proteins found primarily, though not exclusively, in the alphaproteobacteria PUBMED:8763953.

    \ ' '6241' 'IPR010487' '\

    This family consists of several mouse and human neugrin proteins. Neugrin and m-neugrin are mainly expressed in neurons in the nervous system, and are thought to play an important role in the process of neuronal differentiation PUBMED:11118320.

    \ ' '6242' 'IPR010488' '\

    This family consists of several bacterial zeta toxin proteins. Zeta toxin is thought to be part of a postsegregational killing system in bacteria. It relies on antitoxin/toxin systems that secure stable inheritance of low and medium copy number plasmids during cell division and kill cells that have lost the plasmid PUBMED:12571357.

    \ ' '6243' 'IPR011258' '\

    This family represents the N-terminal region of the 2,3-bisphosphoglycerate-independent phosphoglycerate mutase (or phosphoglyceromutase or BPG-independent PGAM) protein (). The family is found in conjunction with Metalloenzyme (located in the C-terminal region of the protein).

    \ ' '6244' 'IPR010489' '\

    This family consists of several hypothetical bacterial proteins exclusive to Escherichia coli and Salmonella typhi. The function of this family is unknown.

    \ ' '6245' 'IPR009445' '\

    This family consists of several hypothetical eukaryotic proteins of unknown function.

    \ ' '6246' 'IPR017456' '\

    CTP synthase is involved in pyrimidine ribonucleotide/ribonucleoside metabolism, catalysing the synthesis of CTP from UTP by amination of the pyrimidine ring at the 4-position PUBMED:12522217. The enzyme exists as a dimer of identical chains that aggregates as a tetramer. This gene has been found roughly 500 bp upstream of enolase in both the beta (Nitrosomonas europaea) and gamma (Escherichia coli) subdivisions of the Proteobacteria PUBMED:9711852.

    \ ' '6247' 'IPR010490' '\

    COG6 is a component of the conserved oligomeric Golgi complex, which is composed of eight different subunits and is required for normal Golgi morphology and localisation.

    \ ' '6248' 'IPR009446' '\

    The mgm101 gene was identified as essential for maintenance of the mitochondrial genome in Saccharomyces cerevisiae PUBMED:10209025. Based on its DNA-binding activity, and experimental work with a temperature-sensitive mgm101 mutant, it has been proposed that the mgm101 gene product performs an essential function in the repair of oxidatively damaged mitochondrial DNA PUBMED:10209025.

    \ ' '6249' 'IPR013842' '\

    LepA (GUF1 in Saccharomyces) is a GTP-binding membrane protein related to EF-G and EF-Tu. Two types of phylogenetic tree, rooted by other GTP-binding proteins, suggest that eukaryotic homologs (including GUF1 of yeast) originated within the bacterial LepA family. The function of the proteins in this family is unknown.

    \ \

    This entry represents the C-terminal region of these proteins PUBMED:11489118.

    \ ' '6250' 'IPR010929' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \

    In yeast, the PDR and CDR ABC transporters display extensive sequence homology, and confer resistance to several anti-fungal compounds by actively transporting their substrates out of the cell. These transporters have two homologous halves, each with an N-terminal intracellular hydrophilic region that contains an ATP-binding site, followed by a C-terminal membrane-associated region containing six transmembrane segments PUBMED:12709320. This entry represents a domain of the PDR/CDR ABC transporter comprising extracellular loop 3, transmembrane segment 6 and a linker region.

    \ ' '6251' 'IPR009447' '\

    Glycosylphosphatidylinositol (GPI) is a conserved post-translational modification that anchors cell surface proteins to the plasma membrane in eukaryotes. GWT1 is involved in GPI anchor biosynthesis; it is required for inositol acylation in yeast PUBMED:12714589.

    \ ' '6252' 'IPR010491' '\

    This domain is specific to the N-terminal part of the RNA splicing factor encoded by prp1 PUBMED:9003295. The splicing factor is involved in mRNA splicing and possibly also poly(A)+ RNA nuclear export and cell cycle progression.

    \ ' '6254' 'IPR010493' '\

    The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants and bacteria PUBMED:7608200.

    \ ' '6255' 'IPR009448' '\

    The N-terminal region of this group of proteins is required for correct folding of the ER UDP-Glc: glucosyltransferase. These proteins selectively reglucosylate unfolded glycoproteins, thus providing quality control for protein transport out of the ER. Unfolded, denatured glycoproteins are substantially better substrates for glucosylation by this enzyme than are the corresponding native proteins. This protein and transient glucosylation may be involved in monitoring and/or assisting the folding and assembly of newly made glycoproteins, in order to identify glycoproteins that need assistance in folding from chaperones.

    \ ' '6256' 'IPR009449' '\

    In Saccharomyces cerevisiae, Sec2p is a GDP/GTP exchange factor for Sec4p, which is required for vesicular transport at the post-Golgi stage of yeast secretion PUBMED:9199166.

    \ ' '6257' 'IPR010930' '\

    This entry consists of a number of C-terminal domains of unknown function. This domain seems to be specific to flagellar basal-body rod and flagellar hook proteins in which is often present at the extreme N terminus.

    \ ' '6258' 'IPR010931' '\

    This entry represents the C-terminal region of RepB proteins from Lactococcus lactis (See ).

    \ ' '6259' 'IPR010932' '\

    The group of polyomaviruses is formed by the homonymous murine virus (Py) as well as other representative members such as the simian virus 40 (SV40) and the human BK and JC viruses PUBMED:8824775. Their large T antigen (T-ag) protein binds to and activates DNA replication from the origin of DNA replication (ori). Insofar as is known, the T-ag binds to the origin first as a monomer to its pentanucleotide recognition element. The monomers are then thought to assemble into hexamers and double hexamers, which constitute the form that is active in initiation of DNA replication. When bound to the ori, T-ag double hexamers encircle DNA PUBMED:17139255. T-ag is a multidomain protein that contains an N-terminal J domain, which mediates protein interactions (see , ), a central origin-binding domain (OBD), and a C-terminal superfamily 3 helicase domain (see , ) PUBMED:16611889.

    \

    This entry represents the helicase domain of LTag, which assembles into a hexameric structure containing a positively charged central channel that can bind both single- and double-stranded DNA PUBMED:12774115. ATP binding and hydrolysis trigger large conformational changes which are thought to be coupled to the melting of origin DNA and the unwinding of duplex DNA PUBMED:15454080. These conformational changes cause the angles and orientations between regions of a monomer to alter, creating what was described as an "iris"-like motion in the hexamer. In addition to this, six beta hairpins on the channel surface move longitudinally along the central channel, possibly serving as a motor for pulling DNA into the LTag double hexamer for unwinding.

    \ ' '6260' 'IPR009450' '\

    Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This step involves products of three or four genes in both yeast (GPI1, GPI2 and GPI3) and mammals (GPI1, PIG A, PIG H and PIG C), respectively.

    \ ' '6261' 'IPR009451' '\

    Methylamine dehydrogenase (MADH) is a periplasmic quinoprotein found in several methylotrophic bacteria PUBMED:8021187. It is induced when the bacteria are grown on methylamine as a carbon source, and catalyses the oxidative deamination of amines to their corresponding aldehydes. The redox cofactor of this enzyme is tryptophan tryptophylquinone (TTQ). Electrons derived from the oxidation of methylamine are passed to an electron acceptor, which is usually the blue-copper protein amicyanin.

    \ \ \ \

    MADH is a hetero-tetramer, comprised of two heavy subunits and two light subunits. The heavy subunit forms a seven-bladed beta-propeller like structure PUBMED:9514722.

    \ ' '6262' 'IPR015929' '\

    Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins, which function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop PUBMED:10087914, PUBMED:15877277. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal \'swivel\' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the \'swivel\' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3) PUBMED:9020582. Below is a description of some of the multi-functional activities associated with different aconitases.

    \ \

    \ \

    \ \

    \ \

    \ \

    This entry represents the N-terminal region of bacterial aconitase B (AcnB), which consists of both a HEAT-like domain and a \'swivel\' domain. HEAT-like domains are usually implicated in protein-protein interactions, while the \'swivel\' domain is usually a mobile unit in proteins that carry it. In AcnB, this N-terminal region was shown to be sufficient for dimerisation and for AcnB binding to mRNA. An iron-mediated dimerisation mechanism may be responsible for switching AcnB between its catalytic and regulatory roles, as dimerisation requires iron while mRNA binding is inhibited by iron.

    \

    More information about these proteins can be found at Protein of the Month: Aconitase PUBMED:.

    \ ' '6263' 'IPR010494' '\

    This entry represents several repeats of 31 residues in length and seems to be exclusive to Moraxella catarrhalis UspA proteins. The UspA1 and UspA2 proteins of M. catarrhalis are structurally related and are exposed on the bacterial cell surface, where they can function as adhesins PUBMED:10671460. This repeat is commonly found with the .

    \ ' '6264' 'IPR009452' '\

    This entry consists of several Pneumovirus matrix glycoprotein M2 sequences. This family functions as a transcription processivity factor that is essential for virus replication PUBMED:12692207.

    \ ' '6265' 'IPR009453' '\

    The Saccharomyces cerevisiae ISN1 (YOR155c) gene encodes an IMP-specific 5\'-nucleotidase, which catalyses degradation of IMP to inosine as part of the purine salvage pathway.

    \ ' '6266' 'IPR010495' '\

    Free iron is limited in vertebrate hosts, thus an alternative to siderophores has been developed by pathogenic bacteria to access host iron bound in protein complexes. HasA is a secreted haemophore that has the ability to obtain iron from haemoglobin. Once bound to HasA, the haem is shuttled to the receptor HasR, which releases the haem into the bacterium PUBMED:10360351.

    \ ' '6267' 'IPR010496' '\

    This is a family of proteins of unknown function.

    \ ' '6268' 'IPR009052' '\

    This entry represents the theta subunit of DNA polymerase III from bacteria, whose core structure consists of an irregular array of three helices PUBMED:10794414.

    \ \

    DNA polymerase III (Pol III) is the primary enzyme responsible for replication of Escherichia coli chromosomal DNA. The holoenzyme consists of 17 proteins and contains two core polymerases. The Pol III catalytic core has three tightly associated subunits: alpha, epsilon and theta. The alpha subunit is responsible for the DNA polymerase activity, while the epsilon subunit is the 3\'-5\' proofreading exonuclease. The epsilon subunit binds to both the alpha and theta subunits in the linear order alpha-epsilon-theta. The theta subunit is the smallest, and may act to enhance the proofreading activity of epsilon, especially under extreme conditions PUBMED:16753031.

    \ \

    This entry also includes a homologue of polymerase III theta called HOT (homologue of theta) from Bacteriophage P1. HOT contains three alpha-helices, as reported for theta, but the folding topology of the two is different, which could account for the suggested greater heat stability of HOT as compared to theta PUBMED:15576035.

    \ ' '6269' 'IPR010497' '\

    This entry represents the N-terminal region of the eukaryotic epoxide hydrolase protein. Epoxide hydrolases () comprise a group of functionally related enzymes that catalyse the addition of water to oxirane compounds (epoxides), thereby usually generating vicinal trans-diols. EHs have been found in all types of living organisms, including mammals, invertebrates, plants, fungi and bacteria. In animals, the major interest in EH is directed towards their detoxification capacity for epoxides since they are important safeguards against the cytotoxic and genotoxic potential of oxirane derivatives that are often reactive electrophiles because of the high tension of the three-membered ring system and the strong polarisation of the C--O bonds. This is of significant relevance because epoxides are frequent intermediary metabolites, which arise during the biotransformation of foreign compounds PUBMED:10548561. This domain is often found in conjunction with .

    \ \ ' '6270' 'IPR009159' '\

    Dihydrofolate reductase (DHFR) () catalyses the NADPH-dependent reduction of dihydrofolate to tetrahydrofolate, an essential step in de novo synthesis both of glycine and of purines and deoxythymidine phosphate (the precursors of DNA synthesis) PUBMED:2830673, and important also in the conversion of deoxyuridine monophosphate to deoxythymidine monophosphate. Although DHFR is found ubiquitously in prokaryotes and eukaryotes, and is found in all dividing cells, maintaining levels of fully reduced folate coenzymes, the catabolic steps are still not well understood PUBMED:3383852.

    \

    Bacterial species possess distinct DHFR enzymes (based on their pattern of binding diaminoheterocyclic molecules), but mammalian DHFRs are highly similar PUBMED:500653. The active site is situated in the N-terminal half of the sequence, which includes a conserved Pro-Trp dipeptide; the tryptophan has been shown PUBMED:6815178 to be involved in the binding of substrate by the enzyme. Its central role in DNA precursor synthesis, coupled with its inhibition by antagonists such as trimethoprim and methotrexate, which are used as anti-bacterial or anti-cancer agents, has made DHFR a target of anticancer chemotherapy. However, resistance has developed against some drugs, as a result of changes in DHFR itself PUBMED:2601715.

    \

    This entry represents a plasmid-encoded DHFR which shows a high level of resistance to the antibiotic trimethoprim. It is a homotetramer with an unusual pore, which contains the active site, passing through the middle of the molecule PUBMED:7583655. Its structure is unrelated to that of chromosomal DHFRs.

    \ ' '6271' 'IPR010498' '\

    This is a family of enterotoxigenic bacterial adhesins.

    \ ' '6272' 'IPR010933' '\

    This entry represents the C-terminal region specific to the eukaryotic NADH dehydrogenase subunit 2 protein and is found in conjunction with . NADH dehydrogenase (ubiquinone) is a flavoprotein (FAD) containing iron-sulphur centres. It is found across the Eukaryotes.

    \ ' '6273' 'IPR010499' '\

    This domain is found in the probable effector binding domain of a number of different bacterial transcription activators PUBMED:10802742 and is also present in some DNA gyrase inhibitors. The absence of a HTH motif in the DNA gyrase inhibitors is thought to indicate the fact that these do not bind DNA.

    \ ' '6274' 'IPR010500' '\

    Hepcidin is an antibacterial and anti-fungal protein expressed in the liver and is also a signalling molecule in iron metabolism. The hepcidin protein is cysteine-rich and forms a distorted beta-sheet with an unusual disulphide bond found at the turn of the hairpin PUBMED:12138110.

    \ ' '6276' 'IPR009454' '\

    This entry represents a conserved open beta-sheet domain found in several lipid transport proteins, including vitellogenin and apolipoprotein B-100 PUBMED:9687371.

    \

    Vitellogenin precursors provide the major egg yolk proteins that are a source of nutrients during early development of oviparous vertebrates and invertebrates. Vitellogenin precursors are multi-domain apolipoproteins that are cleaved into distinct yolk proteins. Different vitellogenin precursors exist, which are composed of variable combinations of yolk protein components; however, the cleavage sites are conserved. In vertebrates, a complete vitellogenin is composed of an N-terminal signal peptide for export, followed by four regions that can be cleaved into yolk proteins: heavy chain lipovitellin (lipovitellin-1), phosvitin, light chain lipovitellin (lipovitellin-2), and a von Willebrand factor type D domain (YGP40) PUBMED:17314313, PUBMED:12135361. In vitellogenin, this domain is often found as part of the lipovitellin-1 peptide product.

    \

    Apolipoprotein B can exist in two forms: B-100 and B-48. Apolipoprotein B-100 is present on several lipoproteins, including very low-density lipoproteins (VLDL), intermediate density lipoproteins (IDL) and low density lipoproteins (LDL), and can assemble VLDL particles in the liver PUBMED:16238675. Apolipoprotein B-100 has been linked to the development of atherosclerosis.

    \ ' '6277' 'IPR009455' '\

    The domain is found exclusively in plant mitochondria and is a putative homing endonuclease, though such a function remains to be demonstrated. The domain is found C-terminal to the plant mitochondrial ATPase subunit 8 domain.

    \ ' '6278' 'IPR004671' '\

    The Escherichia coli NhaB Na+:H+ Antiporter (NhaB) protein has 12 predicted TMS, and catalyses sodium/proton exchange. Unlike that of NhaA, this activity is not pH dependent.

    \ ' '6279' 'IPR009456' '\

    Moricin is an antibacterial peptide that is highly basic. The structure of moricin reveals that it is comprised of a long alpha-helix. The N terminus of the helix is amphipathic, and the C terminus of the helix is predominantly hydrophobic. The amphipathic N-terminal segment of the alpha-helix is mainly responsible for the increase in permeability of the bacterial membrane which kills the bacteria PUBMED:11997013.

    \ ' '6280' 'IPR010502' '\

    This entry represents the family 9 carbohydrate-binding module (CBD9), which exhibits an immunoglobulin-like beta-sandwich fold, with an additional beta-strand at the N-terminus PUBMED:12796496.

    \

    Bacterial extracellular cellulases and hemicellulases are involved in the hydrolysis of the major structural polysaccharides of plant cell walls. These are usually modular enzymes that contain catalytic and non-catalytic domains. The CBD9 domain binds to cellulose, xylan, as well as to a range of soluble di- and mono-saccharides, and is found in cellulose- and xylan-degrading enzymes, such as endo-1,4-beta-xylanase () PUBMED:9752722.

    \ ' '6281' 'IPR010503' '\

    These are B subunits from the type II heat-labile enterotoxin. The B subunits form a pentameric ring, which interacts with one A subunit. Thus, the structural arrangements of type I and type II heat-labile enterotoxins are very similar PUBMED:8805549.

    \ ' '6282' 'IPR009457' '\

    This family consists of several hypothetical plant specific proteins of unknown function.

    \ ' '6283' 'IPR010934' '\

    This entry represents the C-terminal region of several NADH dehydrogenase subunit 5 proteins and is found in conjunction with and . Subunit 5 is the core of the mitochondrial membrane respiratory chain NADH dehydrogenase.

    \ ' '6284' 'IPR010504' '\

    Arfaptin interacts with ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The structure of arfaptin shows that upon binding to a small GTPase, arfaptin forms an elongated, crescent-shaped dimer of three-helix coiled-coils PUBMED:11346801. The N-terminal region of ICA69 is similar to arfaptin PUBMED:12682071.

    \ ' '6285' 'IPR009458' '\

    Ectatomin is a toxin from the venom of the ant Ectatomma tuberculatum. Ectatomin can efficiently insert into the plasma membrane, where it can form channels. Ectatomin was shown to inhibit L-type calcium currents in isolated rat cardiac myocytes PUBMED:10336635. In these cells, ectatomin induces a gradual, irreversible increase in ion leakage across the membrane, which can lead to cell death.

    \

    Ectatomin is comprised of two subunits, A and B, which are homologous. The structure of ectatomin reveals that each subunit consists of two alpha helices with a connecting hinge region, which form a hairpin structure that is stabilised by disulphide bridges. A disulphide bridge between the hinge regions of the two subunits links the heterodimer together, forming a closed bundle of four helices with a left-handed twist PUBMED:7881269.

    \ ' '6286' 'IPR009459' '\

    This entry represents a series of repeated sequences of around 50 residues in length. The repeat is found in bacterial peptidoglycan bound proteins and is often found in conjunction with and .

    \ ' '6287' 'IPR009460' '\

    The release of Ca2+ ions from intracellular stores is a key step in a wide variety of cellular functions. In striated muscle, the release of Ca2+ from the sarcoplasmic reticulum (SR) leads to muscle contraction. Ca2+ release occurs through large, high-conductance Ca2+ release channels, also known as ryanodine receptors (RyRs) because they bind the plant alkaloid ryanodine with high affinity and specificity PUBMED:15110152.

    \

    This region covers TM regions 4-6 of the ryanodine receptor 1 family.

    \ ' '6288' 'IPR009461' '\

    This domain covers the NSP13 region of the coronavirus polyprotein. This protein has the predicted function of an mRNA cap-1 methyltransferase PUBMED:12809601. The human coronavirus 229E (HCoV-229E) replicase gene-encoded nonstructural protein 13 (nsp13) contains an N-terminal zinc-binding domain and a C-terminal superfamily 1 helicase domain PUBMED:15220459. All natural ribonucleotides and nucleotides are substrates of nsp13, with ATP, dATP, and GTP being hydrolyzed most efficiently. Using the NTPase active site, HCoV-229E nsp13 also mediates RNA 5\'-triphosphatase activity, which may be involved in the capping of viral RNAs.

    \ ' '6289' 'IPR009462' '\

    This entry represents several eukaryotic domains of unknown function, which are present in chromodomain helicase DNA binding proteins. This domain is often found in conjunction with , , , and .

    \ ' '6290' 'IPR006624' '\

    Tectonins I and II are two dominant proteins in the nuclei and nuclear matrix from plasmodia of Physarum polycephalum (Slime mold) which encode 217 and 353 amino acids, respectively. Tectonin I is homologous to the C-terminal two-thirds of tectonin II. Both proteins contain six tandem repeats that are each 33-37 amino acids in length and define a new consensus sequence. Homologous repeats are found in L-6, a bacterial lipopolysaccharide-binding lectin from horseshoe crab hemocytes. The repetitive sequences of the tectonins and L-6 are reminiscent of the WD repeats of the beta-subunit of G proteins, suggesting that they form beta-propeller domains. The tectonins may be lectins that function as part of a transmembrane signalling complex during phagocytosis PUBMED:9497393.

    \ ' '6291' 'IPR010505' '\

    The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism PUBMED:16784786, PUBMED:12114025.

    \ \

    In Escherichia coli, biosynthesis of MoCF is a three-stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg2+ ATP-dependent manner PUBMED:17198377. MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF PUBMED:12372836. Moreover, the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues to MoeA and MogA exist as fusion proteins: CNX1 () of Arabidopsis thaliana (Mouse-ear cress), mammalian Gephyrin (e.g. ) and Drosophila melanogaster (Fruit fly) Cinnamon () PUBMED:8528286.

    \ \

    This entry represents MoaA, which belongs to a family of enzymes involved in the synthesis of metallo-cofactors (). Each subunit of the MoaA dimer is comprised of an N-terminal SAM domain () that contains the [4Fe-4S] cluster typical for this family of enzymes, as well as an additional [4Fe-4S] cluster in the C-terminal domain that is unique to MoaA proteins PUBMED:15317939. The unique Fe site of the C-terminal [4Fe-4S] cluster is thought to be involved in the binding and activation of 5\'-GTP.

    \

    Mutations in the human MoCF biosynthesis proteins MOCS1, MOCS2 or GEPH cause MoCF Deficiency type A (MOCOD), which leads to the loss of activity of MoCF-containing enzymes and results in neurological abnormalities and death PUBMED:12754701.

    \ \ ' '6292' 'IPR010506' '\

    This domain binds DMAP1, a transcriptional co-repressor.

    \ ' '6293' 'IPR009463' '\

    This is a group of proteins of unknown function.

    \ ' '6294' 'IPR009464' '\

    This region is spliced out of isoform 2. It is predicted to be of a mixed alpha/beta fold - though predominantly helical.

    \ ' '6295' 'IPR010507' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    MYM-type zinc fingers were identified in MYM family proteins PUBMED:9716603. Human protein is involved in a chromosomal translocation and may be responsible for X-linked retardation in XQ13.1 PUBMED:8817323. is also involved in disease. In myeloproliferative disorders it is fused to FGF receptor 1 PUBMED:9576949; in atypical myeloproliferative disorders it is rearranged PUBMED:9694738. Members of the family generally are involved in development. This Zn-finger domain functions as a transcriptional trans-activator of late vaccinia viral genes, and orthologues are also found in all nucleocytoplasmic large DNA viruses, NCLDV. This domain is also found fused to the C termini of recombinases from certain prokaryotic transposons PUBMED:9716603.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '6296' 'IPR009465' '\

    This conserved region is found in the N-terminal half of several Spondin proteins. Spondins are involved in patterning axonal growth trajectory through either inhibiting or promoting adhesion of embryonic nerve cells PUBMED:11287656.

    \ ' '6297' 'IPR010508' '\

    This domain is found in the neurobeachins. The function of this region is not known.

    \ ' '6298' 'IPR010935' '\

    This entry represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA-interacting dimer. It is also possible that its precise structure is an essential determinant of the specificity of the DNA-protein interaction PUBMED:12411491.

    \ ' '6299' 'IPR009466' '\

    This region of coronavirus polyproteins encodes the NSP11 protein.

    \ ' '6300' 'IPR010509' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attacking water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \

    This region covers the N terminus and first two membrane regions of a small family of ABC transporters. Mutations in this domain in are believed responsible for Zellweger Syndrome-2 PUBMED:1301993; mutations in are responsible for recessive X-linked adrenoleukodystrophy PUBMED:8441467. A Saccharomyces cerevisiae protein containing this domain is involved in the import of long-chain fatty acids PUBMED:8670886.

    \ ' '6301' 'IPR010510' '\

    This family consists of several mammalian FGF-binding protein 1 sequences. Fibroblast growth factors (FGFs) play important roles during foetal and embryonic development PUBMED:11819092. Fibroblast growth factor-binding protein (FGF-BP) 1 is a secreted protein that can bind fibroblast growth factors (FGFs) 1 and 2 PUBMED:11509569.

    \ ' '6302' 'IPR010511' '\

    This entry comprises the N-terminal domain of membrane-bound lytic murein transglycosylase D PUBMED:10843862.

    \ ' '6303' 'IPR009467' '\

    This family consists of several hypothetical bacterial proteins. The function of this family is unknown.

    \ ' '6304' 'IPR009468' '\

    This family consists of several bacterial proteins of unknown function and is known as YqjC in Escherichia coli.

    \ ' '6305' 'IPR010512' '\

    This family consists of several Drosophila melanogaster specific proteins. The function of this family is unknown.

    \ ' '6306' 'IPR009469' '\

    This domain represents the N-terminal region of the coronavirus RNA-directed RNA Polymerase.

    \ ' '6307' 'IPR010513' '\

    This domain is found in a group of endoribonucleases PUBMED:9637683. Specifically, these enzymes cleave an intron from Hac1 mRNA in humans, which causes it to be much more efficiently translated.

    \ ' '6308' 'IPR011546' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This domain is found in the FtsH family of proteins, which includes FtsH, a membrane-bound ATP-dependent protease universally conserved in prokaryotes PUBMED:12732516. The FtsH peptidases, which belong to MEROPS peptidase family M41 (clan MA(E)), efficiently degrade proteins that have a low thermodynamic stability - e.g. they lack robust unfoldase activity. This feature may be key and implies that this could be a criterion for degrading a protein. In Oenococcus oeni (Leuconostoc oenos) FtsH is involved in protection against environmental stress PUBMED:12667449, and shows increased expression under heat or osmotic stress. These two lines of evidence suggest that it is a fundamental prokaryotic self-protection mechanism that checks if proteins are correctly folded. The precise function of this N-terminal region is unclear.

    \ ' '6309' 'IPR010514' '\

    COX2 (Cytochrome O ubiquinol OXidase 2) is a major component of the respiratory complex during vegetative growth. It transfers electrons from a quinol to the binuclear centre of the catalytic subunit 1. The function of this region is not known.

    \ ' '6310' 'IPR010515' '\

    NC10 stands for Non-helical region 10 and is taken from . A mutation in this region in is associated with an increased risk of prostate cancer. This domain is cleaved from the precursor and forms endostatin. Endostatin is a key tumour suppressor and has been used highly successfully to treat cancer. It is a potent angiogenesis inhibitor PUBMED:11606364. Endostatin also binds a zinc ion near the N terminus; this is likely to be of structural rather than functional importance according to PUBMED:10704302.

    \ ' '6311' 'IPR009470' '\

    This ~170 aa region is found C-terminal to the catalytic domain () found in members of glycoside hydrolase family 18.

    \ ' '6312' 'IPR009471' '\

    Teneurins are a family of phylogenetically conserved transmembrane glycoproteins expressed during pattern formation and morphogenesis PUBMED:11146505. Originally discovered as ten-m and ten-a in Drosophila melanogaster, the teneurin family is conserved from Caenorhabditis elegans (ten-1) to vertebrates, in which four paralogs exist (teneurin-1 to -4 or odz-1 to -4). Their distinct domain architecture is highly conserved between invertebrate and vertebrate teneurins, particularly in the extracellular part. The intracellular domains of Ten-a, Ten-m/Odz and C. elegans Ten-1 are significantly different, both in size and structure, from the comparable domains of vertebrate teneurins, but the extracellular domains of all of these proteins are remarkably similar.

    \

    The large C-terminal extracellular domain consists of eight EGF-like repeats (see ), a region of conserved cysteines and unique YD-repeats. The N-terminal\ intracellular domain of vertebrate teneurins contains two EF-hand-like calcium-binding motifs and two polyproline regions involved in protein-protein interactions, followed by a single-span transmembrane domain. The intracellular domain is linked to the cytoskeleton through its interaction with the adaptor protein CAP/ponsin and can be cleaved near (or possibly in)\ the transmembrane domain and transported to the nucleus PUBMED:12361962, PUBMED:10588872, giving teneurins the\ potential to act as transcription factors PUBMED:12783990. There is considerable divergence between intracellular domains of invertebrate and vertebrate teneurins as well as between different invertebrate proteins PUBMED:10341219, PUBMED:12783990, PUBMED:16406038, PUBMED:17095284, PUBMED:17502993.

    \ \

    This domain is found in the intracellular N-terminal region of the Teneurin family.

    \ ' '6313' 'IPR009472' '\

    This family consists of several hypothetical proteins of unknown function all from photosynthetic organisms including plants and cyanobacteria.

    \ ' '6314' 'IPR006542' '\

    These are a family of small (about 115 amino acids) uncharacterised proteins with N-terminal signal sequences, found exclusively in Gram-positive organisms. Most genomes that have any members of this family have at least two members.

    \ ' '6315' 'IPR010516' '\

    This family consists of several eukaryotic Sin3 associated polypeptide p18 (SAP18) sequences. SAP18 is known to be a component of the Sin3-containing complex, which is responsible for the repression of transcription via the modification of histone polypeptides PUBMED:9150135. SAP18 is also present in the ASAP complex which is thought to be involved in the regulation of splicing during the execution of programmed cell death PUBMED:12665594.

    \ ' '6316' 'IPR010517' '\

    This family consists of several Lactococcus lactis bacteriophage F4-1 major structural proteins PUBMED:8892814.

    \ ' '6317' 'IPR009473' '\

    This family consists of several Orthopoxvirus A49R proteins. The function of this family is unknown.

    \ ' '6318' 'IPR010518' '\

    This domain is found at the N terminus of a subset of sigma54-dependent transcriptional activators that are involved in regulation of flagellar motility e.g. FleQ in Pseudomonas aeruginosa. It is clearly related to , but lacks the conserved aspartate residue that undergoes phosphorylation in the classic two-component system response regulator ().

    \ ' '6319' 'IPR009474' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6321' 'IPR009475' '\

    This family represents the N-terminal region of several proteins found in Caenorhabditis elegans. The family is often found with .

    \ ' '6323' 'IPR010519' '\

    This family consists of transformer proteins from several Drosophila species and also from Ceratitis capitata (Mediterranean fruit fly). The transformer locus (tra) produces an RNA processing protein that alternatively splices the doublesex pre-mRNA in the sex determination hierarchy of Drosophila melanogaster PUBMED:8013913.

    \ ' '6324' 'IPR009476' '\

    This family consists of several bacterial putative membrane proteins.

    \ ' '6325' 'IPR009477' '\

    This family consists of several hypothetical Baculovirus proteins of unknown function.

    \ ' '6328' 'IPR010520' '\

    This family consists of bacterial proteins of unknown function, which are hydrolase-like.

    \ ' '6329' 'IPR009479' '\

    This family consists of several human herpesvirus U55 proteins. The function of this family is unknown.

    \ ' '6330' 'IPR009480' '\

    This family consists of several equine infectious anaemia virus S2 proteins. The function of this family is unknown.

    \ ' '6331' 'IPR010521' '\

    This family consists of several hypothetical Fijivirus proteins of unknown function.

    \ ' '6332' 'IPR010522' '\

    This family consists of several bacterial replication protein C (RepC) sequences.

    \ ' '6333' 'IPR010523' '\

    This domain is found at the N terminus of a subset of sigma54-dependent transcriptional activators in several proteobacteria, including activators of phenol degradation such as XylR. It is found adjacent to .

    \ ' '6334' 'IPR010524' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    This entry represents a domain found at the N terminus of several sigma54-dependent transcriptional activators including PrpR, which activates catabolism of propionate. In Salmonella enterica subsp. enterica serovar Typhimurium, PrpR acts as a sensor of 2-methylcitrate (2-MC), an intermediate of the 2-methylcitric acid cycle used by this bacterium to convert propionate to pyruvate PUBMED:15528672.

    \ ' '6335' 'IPR010525' '\

    This pattern represents a conserved region of auxin-responsive transcription factors.

    \

    The plant hormone auxin (indole-3-acetic acid) can regulate the gene expression of several families, including Aux/IAA, GH3 and SAUR families. Two related families of proteins, Aux/IAA proteins () and the auxin response factors (ARF), are key regulators of auxin-modulated gene expression PUBMED:12036262. There are multiple ARF proteins, some of which activate, while others repress transcription. ARF proteins bind to auxin-responsive cis-acting promoter elements (AuxREs) using an N-terminal DNA-binding domain. It is thought that Aux/IAA proteins activate transcription by modifying ARF activity through the C-terminal protein-protein interaction domains () found in both Aux/IAA and ARF proteins.

    \ ' '6336' 'IPR018317' '\

    This protein family is represented by a single member in nearly every completed large (> 1000 genes) prokaryotic genome.

    \ \

    In Rhizobium meliloti (Sinorhizobium meliloti), a species in which the exo genes make succinoglycan, a symbiotically important exopolysaccharide, exsB is located nearby and affects succinoglycan levels, probably through polar effects on exsA expression or the same polycistronic mRNA PUBMED:8544814, PUBMED:9045825.

    \ \

    In Arthrobacter viscosus, the homologous gene is designated alu1 and is associated with an aluminium tolerance phenotype. When expressed in Escherichia coli, it conferred aluminium tolerance PUBMED:9367855.

    \ \

    The entry also contains the gene queC, which is responsible for the conversion of GTP to 7-cyano-7-deazaguanine (preQ0). The biosynthesis of hypermodified tRNA nucleoside queuosine only occurs in eubacteria. It occupies the wobble position for all known tRNAs that are specific for Asp, Asn, His or Tyr PUBMED:7748953.

    \ ' '6338' 'IPR009482' '\

    This family consists of several hypothetical archaeal proteins of unknown function.

    \ ' '6339' 'IPR009483' '\

    This family consists of several invasion plasmid antigen IpaD proteins. Entry of Shigella flexneri into epithelial cells and lysis of the phagosome involve the IpaB, IpaC, and IpaD proteins, which are secreted by type III secretion machinery, and appear to form a multi-protein complex capable of inducing the phagocytic event which internalizes the bacterium PUBMED:11083774.

    \ ' '6340' 'IPR010526' '\

    Members of this entry contain a region found exclusively in eukaryotic sodium channels or their subunits, many of which are voltage-gated. Members very often also contain between one and four copies of and, less often, one copy of .

    \ ' '6341' 'IPR009484' '\

    This family consists of several repeats of around 30 residues in length which are found specifically in mature-parasite-infected erythrocyte surface antigen proteins from Plasmodium falciparum. This family is often found in conjunction with .

    \ ' '6342' 'IPR010527' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    In PSII, the oxygen-evolving complex (OEC) is responsible for catalysing the splitting of water to O(2) and 4H+. The OEC is composed of a cluster of manganese, calcium and chloride ions bound to extrinsic proteins. In cyanobacteria there are five extrinsic proteins in OEC (PsbO, PsbP-like, PsbQ-like, PsbU and PsbV), while in plants there are only three (PsbO, PsbP and PsbQ), PsbU and PsbV having been lost during the evolution of green plants PUBMED:15258264.

    \

    This family represents the PSII extrinsic protein PsbU, which forms part of the OEC in cyanobacteria and red algae. PsbU acts to stabilise the oxygen-evolving machinery of PSII against heat-induced inactivation, which is crucial for cellular thermo-tolerance PUBMED:10318707.

    \ \ ' '6343' 'IPR009485' '\

    This family consists of several Borna disease virus P10 (or X) proteins. Borna disease virus (BDV) is unique among the non-segmented negative-strand RNA viruses of animals and man because it transcribes and replicates its genome in the nucleus of the infected cell. It has been suggested that the p10 protein plays a role in viral RNA synthesis or ribonucleoprotein transport PUBMED:10725419.

    \ ' '6344' 'IPR009486' '\

    This family consists of several purine nucleoside permeases from both bacteria and fungi PUBMED:9802205.

    \ ' '6345' 'IPR009487' '\

    This family consists of several Orthopoxvirus A43R proteins. The function of this family is unknown.

    \ ' '6346' 'IPR009488' '\

    This family consists of several hypothetical proteins of unknown function which appear to be found exclusively in Helicobacter pylori.

    \ ' '6347' 'IPR014161' '\

    TolA couples the inner membrane complex of itself with TolQ and TolR to the outer membrane complex of TolB and OprL (also called Pal). Most of the length of the protein consists of low-complexity sequence that may differ in both length and composition from one species to another, complicating efforts to discriminate TolA (the most divergent gene in the tol-pal system) from paralogs such as TonB. Selection of members of the seed alignment and criteria for setting scoring cut-offs are based largely on conserved operon structure. The Tol-Pal complex is required for maintaining outer membrane integrity, and is also involved in transport (uptake) of colicins and filamentous DNA, and implicated in pathogenesis. Transport is energized by the proton motive force. TolA is an inner membrane protein that interacts with periplasmic TolB and with outer membrane porins OmpC, PhoE and LamB.

    \ ' '6349' 'IPR009489' '\

    This family consists of several plant specific PAR1 proteins from Nicotiana tabacum (Common tobacco) and Arabidopsis thaliana (Mouse-ear cress). The function of this family is unknown.

    \ ' '6350' 'IPR010530' '\

    This family consists of several plant specific B12D proteins. The function of this protein is unknown but in barley B12D transcripts are expressed mainly during seed maturation and germination PUBMED:11473698.

    \ ' '6351' 'IPR009490' '\

    This family consists of several hypothetical bacterial proteins found in Escherichia coli and Citrobacter rodentium. The function of this family is unknown.

    \ ' '6352' 'IPR010531' '\

    This family consists of several NOA36 proteins which contain 29 highly conserved cysteine residues. The function of this protein is unknown.

    \ ' '6353' 'IPR010532' '\

    Members of this family are blue-copper redox proteins designated sulfocyanin, from the archaeal genera Sulfolobus, Ferroplasma, and Picrophilus. The most closely related proteins characterised as functionally different are the rusticyanins.

    \ ' '6354' 'IPR009491' '\

    This family consists of several short, hypothetical bacterial proteins of unknown function.

    \ ' '6355' 'IPR009492' '\

    This family consists of several bacterial TniQ proteins. TniQ along with TniA and B is involved in the transposition of the mercury-resistance transposon Tn5053 that carries the mer operon. It has been suggested that the tni genes are involved in the dissemination of integrons PUBMED:8594337.

    \ ' '6356' 'IPR009493' '\

    This family consists of several phage and bacterial proteins which are closely related to the GpE tail protein from Phage P2.

    \ ' '6357' 'IPR010533' '\

    This entry includes vertebrate transcription factors, some of which are regulated by IL-3/adenovirus E4 promoter binding protein PUBMED:1620116. Others were found to strongly repress transcription in a DNA-binding-site-dependent manner PUBMED:1620116.

    \ ' '6358' 'IPR010534' '\

    This family consists of several phage antitermination protein Q and related bacterial sequences. Antiterminator proteins control gene expression by recognising control signals near the promoter and preventing transcriptional termination which would otherwise occur at sites that may be a long way downstream PUBMED:8332211.

    \ ' '6359' 'IPR009494' '\

    This family consists of several bacterial proteins from Staphylococcus aureus as well as a number of phage proteins. The function of this family is unknown.

    \ ' '6360' 'IPR009495' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6361' 'IPR010535' '\

    This family consists of hypothetical proteins specific to Oryza sativa. One sequence () appears to be tandemly repeated.

    \ ' '6362' 'IPR009496' '\

    This entry contains several mammalian sequences and one bird sequence from Gallus gallus (Chicken). It represents the C-terminal region of some sequences, but in others it represents the full protein. All of the mammalian proteins are hypothetical and have no known function, but the chicken sequence is annotated as being a repulsive guidance molecule (RGM). RGM is a GPI-linked axon guidance molecule of the retinotectal system. RGM is repulsive for a subset of axons, those from the temporal half of the retina. Temporal retinal axons invade the anterior optic tectum in a superficial layer, and encounter RGM expressed in a gradient with increasing concentration along the anterior-posterior axis. Temporal axons are able to receive posterior-dependent information by sensing gradients or concentrations of guidance cues. Thus, RGM is likely to provide positional information for temporal axons invading the optic tectum in the stratum opticum PUBMED:12353034.

    \ \ ' '6363' 'IPR010536' '\

    This entry represents the N-terminal region of several mammalian sequences and one bird sequence from Gallus gallus (Chicken). All of the mammalian proteins are hypothetical and have no known function, but the chicken sequence is annotated as being a repulsive guidance molecule (RGM). RGM is a GPI-linked axon guidance molecule of the retinotectal system. RGM is repulsive for a subset of axons, those from the temporal half of the retina. Temporal retinal axons invade the anterior optic tectum in a superficial layer, and encounter RGM expressed in a gradient with increasing concentration along the anterior-posterior axis. Temporal axons are able to receive posterior-dependent information by sensing gradients or concentrations of guidance cues. Thus, RGM is likely to provide positional information for temporal axons invading the optic tectum in the stratum opticum PUBMED:12353034.

    \ ' '6364' 'IPR010537' '\

    This family contains avian adenovirus fibre proteins, which have been linked to variations in virulence PUBMED:8764019. Avian adenoviruses possess penton capsomers that consist of a pentameric base associated with two fibres PUBMED:7563058.

    \ ' '6365' 'IPR010538' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6368' 'IPR013068' '\

    Galanin is a peptide hormone that controls various biological activities PUBMED:1710578. Galanin-like immuno-reactivity has been found in the central and peripheral nervous systems of mammals, with high concentrations demonstrated in discrete regions of the central nervous system, including the median eminence, hypothalamus, arcuate nucleus, septum, neuro-intermediate lobe of the pituitary, and the spinal cord. Its localisation within neurosecretory granules suggests that galanin may function as a neurotransmitter, and it has been shown to coexist with a variety of other peptide and amine neurotransmitters within individual neurons PUBMED:2448788.

    \ \

    Although the precise physiological role of galanin is uncertain, it has a number of pharmacological properties: it stimulates food intake, when injected into the third ventricle of rats; it increases levels of plasma growth hormone and prolactin, and decreases dopamine levels in the median eminence PUBMED:2448788; and infusion into humans results in hyperglycemia and glucose intolerance, and inhibits pancreatic release of insulin, somatostatin and pancreatic peptide. Galanin also modulates smooth muscle contractility within the gastro-intestinal and genito-urinary tracts, all such activities suggesting that the hormone may play an important role in the nervous modulation of endocrine and smooth muscle function PUBMED:2448788.

    \ \

    This domain represents the galanin message-associated peptide (GMAP) domain which is found C-terminal to the galanin domain in the preprogalanin precursor protein. GMAP sequences in different species show a high degree of homology, but the biological function of the GMAP peptide is not known PUBMED:9639260.

    \ \ ' '6369' 'IPR010540' '\

    This family consists of several bacterial proteins of unknown function.

    \ ' '6370' 'IPR009497' '\

    This family consists of hypothetical Caenorhabditis elegans proteins.

    \ ' '6371' 'IPR009498' '\

    This entry represents the C terminus of various Lactococcus bacteriophage repressor proteins.

    \ ' '6372' 'IPR010541' '\

    This entry represents the C terminus of several eukaryotic RWD domain-containing proteins of unknown function.

    \ ' '6373' 'IPR009499' '\

    This family contains hypothetical bacterial proteins of unknown function.

    \ ' '6374' 'IPR010542' '\

    This domain represents the C-terminal region of vertebrate heat shock transcription factors. Heat shock transcription factors regulate the expression of heat shock proteins - a set of proteins that protect the cell from damage caused by stress and aid the cell\'s recovery after the removal of stress PUBMED:11509572. This C-terminal region is found with the N-terminal , and may contain a three-stranded coiled-coil trimerisation domain and a CE2 regulatory region, the latter of which is involved in sustained heat shock response PUBMED:11509572.

    \ ' '6375' 'IPR010543' '\

    This entry represents the C terminus of a number of hypothetical plant proteins.

    \ ' '6376' 'IPR010544' '\

    This domain represents a region within kinesin-related proteins from higher plants. Many proteins containing this domain also contain the domain. Kinesins are ATP-driven microtubule motor proteins that produce directed force PUBMED:12471890. Some family members are associated with the phragmoplast, a structure composed mainly of microtubules that executes cytokinesis in higher plants PUBMED:10898978.

    \ ' '6377' 'IPR009500' '\

    This family consists of several hypothetical plant proteins of unknown function.

    \ ' '6378' 'IPR010545' '\

    This family consists of several hypothetical archaeal proteins of unknown function.

    \ ' '6379' 'IPR010546' '\

    This family consists of several bacterial proteins, at least one of which is involved in enzyme induction following nitrogen deprivation. The exact function of this family is unknown.

    \ ' '6380' 'IPR010547' '\

    This family consists of several plant specific mitochondrial import receptor subunit TOM20 (translocase of outer membrane 20 kDa subunit) proteins. Most mitochondrial proteins are encoded by the nuclear genome, and are synthesised in the cytosol. TOM20 is a general import receptor that binds to mitochondrial pre-sequences in the early step of protein import into the mitochondria PUBMED:12691756.

    \ ' '6381' 'IPR010548' '\

    This family consists of several mammalian specific BCL2/adenovirus E1B 19 kDa protein-interacting protein 3 or BNIP3 sequences. BNIP3 belongs to the Bcl-2 homology 3 (BH3)-only family, a Bcl-2-related family possessing an atypical Bcl-2 homology 3 (BH3) domain, which regulates PCD from mitochondrial sites by selective Bcl-2/Bcl-XL interactions. BNIP3 family members contain a C-terminal transmembrane domain that is required for their mitochondrial localisation, homodimerisation, as well as regulation of their pro-apoptotic activities. BNIP3-mediated apoptosis has been reported to be independent of caspase activation and cytochrome c release and is characterised by early plasma membrane and mitochondrial damage, prior to the appearance of chromatin condensation or DNA fragmentation PUBMED:12690108.

    \ ' '6382' 'IPR009103' '\

    Olfactory marker protein (OMP) is a highly expressed, cytoplasmic protein found in mature olfactory sensory receptor neurons of all vertebrates. OMP is a modulator of the olfactory signal transduction cascade. The crystal structure of OMP reveals a beta sandwich consisting of eight strands in two sheets with a jelly-roll topology PUBMED:12054873. Three highly conserved regions have been identified as possible protein-protein interaction sites in OMP, indicating a possible role for OMP in modulating such interactions, thereby acting as a molecular switch PUBMED:12054872.

    \ \ ' '6384' 'IPR010549' '\

    This entry represents the C-terminal region of the African swine fever virus (ASFV) IAP-like protein p27. This domain is found in conjunction with . It has been suggested that the domain may be encoded by the gene involved in aspects of infection in the arthropod host, ticks of the genus Ornithodoros PUBMED:9143281.

    \ ' '6385' 'IPR008304' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '6386' 'IPR009502' '\

    This family consists of several bacterial Secretion monitor precursor (SecM) proteins. SecM is known to regulate SecA expression by translational coupling of the secM secA operon. Translational pausing at a specific Pro residue 5 residues before the end of the protein may allow disruption of a mRNA repressor helix that normally suppresses secA translation initiation. The eubacterial protein secretion machinery consists of a number of soluble and membrane associated components. One critical element is SecA ATPase, which acts as a molecular motor to promote protein secretion at translocation sites that consist of SecYE, the SecA receptor, and SecG and SecDFyajC proteins, which regulate SecA membrane cycling PUBMED:10986266.

    \ ' '6387' 'IPR010550' '\

    This family consists of several bacterial 2\'-deoxycytidine 5\'-triphosphate deaminase proteins ().

    \ ' '6388' 'IPR010551' '\

    This family consists of several bacterial and archaeal glucose-6-phosphate isomerase (GPI) proteins (), which are involved in glycolysis and in gluconeogenesis and catalyse the conversion of D-glucose 6-phosphate to D-fructose 6-phosphate. The deduced amino acid sequence of the first archaeal PGI isolated from Pyrococcus furiosus revealed that it is not related to its eukaryotic and many of its bacterial counterparts. In contrast, this archaeal PGI shares similarity with the cupin superfamily that consists of a variety of proteins that are generally involved in sugar metabolism in both prokaryotes and eukaryotes PUBMED:11533028.

    \ ' '6391' 'IPR009503' '\

    This family consists of several short Lactococcus lactis and bacteriophage proteins. The function of this family is unknown.

    \ ' '6392' 'IPR009504' '\

    The YhjQ protein is encoded immediately upstream of bacterial cellulose synthase (bcs) genes in a broad range of bacteria, including both copies of the bcs locus in Klebsiella pneumoniae, and in several species is clearly part of the bcs operon. It is identified as a probable component of the bacterial cellulose metabolic process not only by gene location, but also by partial phylogenetic profiling, or Haft-Selengut algorithm PUBMED:16930487, based on a bacterial cellulose biosynthesis genome property profile. Cellulose plays an important role in biofilm formation and structural integrity in some bacteria. Mutants in yhjQ in Escherichia coli show altered morphology and growth, but the function of YhjQ has not yet been determined.

    \ \

    This entry represents a subset of the YhjQ proteins.

    \ ' '6393' 'IPR010554' '\

    This group contains several eukaryote specific repeats of around 35 residues in length. The function of this family is unknown.

    \ ' '6394' 'IPR010555' '\

    This family represents the chondroitin sulphate attachment domain of vertebrate neural transmembrane proteoglycans that contain EGF modules. Evidence has been accumulated to support the idea that neural proteoglycans are involved in various cellular events including mitogenesis, differentiation, axonal outgrowth and synaptogenesis PUBMED:9321696. This domain contains several potential sites of chondroitin sulphate attachment, as well as potential sites of N-linked glycosylation PUBMED:9950058.

    \ ' '6395' 'IPR009505' '\

    This entry represents the C-terminal cytoplasmic domain of vertebrate neural chondroitin sulphate proteoglycans that contain EGF modules. Evidence has been accumulated to support the idea that neural proteoglycans are involved in various cellular events including mitogenesis, differentiation, axonal outgrowth and synaptogenesis PUBMED:9321696. This domain contains a number of potential sites of phosphorylation by protein kinase C PUBMED:9950058.

    \ ' '6396' 'IPR009506' '\

    This family is found in several hypothetical bacterial proteins. In some cases it represents the C-terminal region, whereas in others it represents the whole sequence.

    \ ' '6397' 'IPR009507' '\

    This family consists of several short, hypothetical bacterial proteins of unknown function.

    \ ' '6398' 'IPR009214' '\ There are currently no experimental data for members of this group or their homologues. However, these proteins contain predicted integral membrane proteins (with several transmembrane segments).\ ' '6400' 'IPR010938' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6401' 'IPR009508' '\

    This family consists of several eukaryotic Churchill proteins. This protein contains a novel zinc binding region that mediates FGF signalling during neural development. The slow induction by FGF of a transcription factor (Churchill) in the neural plate in turn induces expression of Sip1 (Smad interacting protein-1), which inhibits mesodermal genes and sensitizes cells to later neural inducing factors PUBMED:14651843.

    \ ' '6402' 'IPR015864' '\

    Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase (), which converts it into FMN, and FAD synthetase (), which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme PUBMED:14580199, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family PUBMED:17049878. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases PUBMED:12517446.

    \

    This entry represents prokaryotic-type FAD synthetase, which occurs primarily as part of a bifunctional enzyme.

    \ ' '6403' 'IPR009509' '\

    This family consists of several hypothetical proteins from Neisseria meningitidis. The function of this family is unknown.

    \ ' '6404' 'IPR010557' '\

    This family consists of a number of hypothetical proteins from Escherichia coli O157:H7 and Salmonella typhi. The function of this family is unknown.

    \ ' '6405' 'IPR008325' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '6406' 'IPR009510' '\

    This family consists of several YscK proteins. The function of this protein is unknown but it belongs to an operon involved in the secretion of Yop proteins across bacterial membranes.

    \ ' '6407' 'IPR010558' '\

    This family consists of several Caenorhabditis elegans specific ly-6-related HOT and ODR proteins. These proteins are involved in the olfactory system. Odr-2 mutants are known to be defective in the ability to chemotax to odorants that are recognised by the two AWC olfactory neurons. Odr-2 encodes a membrane-associated protein related to the Ly-6 superfamily of GPI-linked signalling proteins PUBMED:11139503.

    \ ' '6408' 'IPR010559' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems comprise a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347, PUBMED:11406410. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.

    \

    HKs can be roughly divided into two classes: orthodox and hybrid kinases PUBMED:8029829, PUBMED:1482126. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK PUBMED:10966457. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain.

    \

    This family represents a region within bacterial histidine kinase enzymes. Two-component signal transduction systems such as those mediated by histidine kinase are integral parts of bacterial cellular regulatory processes, and are used to regulate the expression of genes involved in virulence. Members of this family often contain and/or .

    \ ' '6409' 'IPR009511' '\

    This family of proteins may function to silence the spindle checkpoint and allow mitosis to proceed through anaphase by binding to MAD2L1 after it has become dissociated from the MAD2L1-CDC20 complex. During early mitosis, the protein is unevenly distributed throughout the nucleoplasm. From metaphase to anaphase, it is concentrated on the spindle.

    \ ' '6410' 'IPR010939' '\

    This family consists of several eukaryote specific repeats of unknown function. This repeat seems to always be found with .

    \ ' '6411' 'IPR010560' '\

    This entry represents the C terminus of eukaryotic neogenin precursor proteins, which contains several potential phosphorylation sites PUBMED:9121761. Neogenin is a member of the N-CAM family of cell adhesion molecules (and therefore contains multiple copies of and ) and is closely related to the DCC tumour suppressor gene product - these proteins may play an integral role in regulating differentiation programmes and/or cell migration events within many adult and embryonic tissues PUBMED:9264410.

    \ ' '6412' 'IPR010561' '\

    DIRP (Domain in Rb-related Pathway) is postulated to be involved in the Rb-related pathway. It is encoded by multiple eukaryotic genomes and is present in proteins including lin-9 of Caenorhabditis elegans, aly of Drosophila melanogaster, and proteins from mustard weed. Studies of the DIRP-containing proteins lin-9 and aly suggest that this domain might be involved in development. Aly and lin-9 act in parallel to, or downstream of, activation of MAPK by the RTK-Ras signalling pathway.

    \ ' '6413' 'IPR010562' '\

    This family consists of several insect specific haemolymph juvenile hormone binding proteins (JHBP). Juvenile hormone (JH) has a profound effect on insects. It regulates embryogenesis, maintains the status quo of larval development and stimulates reproductive maturation in the adult forms. JH is transported from the sites of its synthesis to target tissues by a haemolymph carrier called juvenile hormone-binding protein (JHBP). This protects the JH molecules from hydrolysis by non-specific esterases present in the insect haemolymph PUBMED:12595713.

    \ ' '6414' 'IPR010563' '\

    This family consists of several TraK proteins from Escherichia coli, Salmonella typhi and Salmonella typhimurium. TraK is known to be essential for pilus assembly but its exact role in this process is unknown PUBMED:8655498.

    \ ' '6415' 'IPR010564' '\

    This family consists of several hypothetical proteins specific to Chlamydia species. The function of this family is unknown.

    \ ' '6416' 'IPR010565' '\

    This entry represents the N-terminal region of muskelin and is found in conjunction with several repeats. Muskelin is an intracellular, kelch repeat protein that is needed in cell-spreading responses to the matrix adhesion molecule, thrombospondin-1 PUBMED:12384287.

    \ ' '6417' 'IPR009512' '\

    The mode of chloroquine action or resistance of the malarial parasite Plasmodium falciparum is not fully elucidated and presents a huge challenge worldwide. Plasmodial EXP-1 protein, also called circumsporozoite-related antigen, changes under chloroquine treatment, making it a potential chloroquine resistance marker PUBMED:17295353.

    \ \

    Although there are no authentic repeats in this antigen, there are a number of internal homologies (N-A-N-P) and (N-A-D-P). The first of these tetramers is the dominant repeat found in the circumsporozoite protein (CSP) of P. falciparum and reacts with antibodies against circumsporozoite-related antigen (CRA). It is possible that immune responses to CRA may act against the CSP also. The CRA is found in many parasitic strains.

    \ ' '6418' 'IPR009513' '\

    This family consists of several PerB or BfpV proteins found specifically in Escherichia coli. PerB is thought to play a role in regulating the expression of BfpA PUBMED:7729884.

    \ ' '6419' 'IPR009514' '\

    This family consists of several nuclear disruption (Ndd) proteins from T4-like phages. Early in a Bacteriophage T4 infection, the phage ndd gene causes the rapid destruction of the structure of the Escherichia coli nucleoid. The targets of Ndd action may be the chromosomal sequences that determine the structure of the nucleoid PUBMED:9748458.

    \ ' '6420' 'IPR009515' '\

    This family consists of several hypothetical short plant proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown.

    \ ' '6421' 'IPR009516' '\

    This family consists of several Raspberry bushy dwarf virus (RBDV) coat proteins.

    \ ' '6422' 'IPR010566' '\

    This family consists of a number of bacteria specific domains, which are found in haemolysin-type calcium binding proteins. This family is found in conjunction with and is often found in multiple copies.

    \ ' '6423' 'IPR009517' '\

    Borna disease virus (BDV) is a non-cytolytic, neurotropic RNA virus that has a broad host range in warm-blooded animals. BDV is an enveloped virus with a non-segmented, negative-stranded RNA genome whose organization is characteristic of a member of Bornaviridae in the order Mononegavirales. This family consists of several BDV P24 (phosphoprotein 24) proteins. They are essential components of the RNA polymerase transcription and replication complex.

    \ \

    P24 is encoded by open reading frame II (ORF-II) and undergoes high rates of mutation in humans. It binds amphoterin-HMGB1, a multifunctional protein, directly and may cause deleterious effects on cellular functions by interfering with HMGB1 PUBMED:14581561. Horse and human P24 have no species-specific amino acid residues, suggesting that the two viruses are related PUBMED:8523585, PUBMED:9811743.

    \ \

    Numerous interactions of the immune system with the central nervous system have been described. Mood and psychotic disorders, such as severe depression and schizophrenia, are both heterogeneous disorders regarding clinical symptomatology, the acuity of symptoms, the clinical course and the treatment response PUBMED:18623121. BDV p24 RNA has been detected in the peripheral blood mononuclear cells (PBMCs) of psychiatric patients with such conditions PUBMED:9811743. Some studies find a significant difference in the prevalence of BDV p24 RNA in patients with mood disorders and schizophrenia PUBMED:16324750, whilst others find no difference between patients and control groups PUBMED:9811743. Consequently, debate about the role of BDV in psychiatric diseases remains alive. \

    \ ' '6424' 'IPR009518' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein PsbX found in PSII, which is associated with the oxygen-evolving complex. Its expression is light-regulated. PsbX appears to be involved in the regulation of the amount of PSII PUBMED:11202442, and may be involved in the binding or turnover of quinone molecules at the Qb (PsbA) site PUBMED:11230572.

    \ ' '6425' 'IPR010567' '\

    This family consists of several P-47 proteins from various Clostridium species as well as two related sequences from Pseudomonas putida. The function of this family is unknown.

    \ ' '6426' 'IPR010568' '\

    This entry contains a number of repeats found in Chlorovirus glycoproteins. The function of these proteins is unknown.

    \ ' '6427' 'IPR009519' '\

    This family consists of several hypothetical Fijivirus proteins of unknown function.

    \ ' '6428' 'IPR009520' '\

    This family consists of several short, hypothetical phage and bacterial proteins. The function of this family is unknown.

    \ ' '6429' 'IPR009521' '\

    This family consists of several Orthopoxvirus F6L proteins, the function of which is unknown.

    \ ' '6430' 'IPR010569' '\

    This family represents a region within eukaryotic myotubularin-related proteins that is sometimes found with . Myotubularin is a dual-specific lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bisphosphate PUBMED:12847286. Mutations in genes encoding myotubularin-related proteins have been associated with disease PUBMED:12045210.

    \ ' '6431' 'IPR010570' '\

    This family consists of several hypothetical proteins of unknown function and seems to be specific to Bacteroides species.

    \ ' '6433' 'IPR010572' '\

    This family consists of hypothetical bacterial and viral proteins of unknown function.

    \ ' '6435' 'IPR009523' '\

    This family consists of several prokineticin proteins and related BM8 sequences. The suprachiasmatic nucleus (SCN) controls the circadian rhythm of physiological and behavioural processes in mammals. It has been shown that prokineticin 2 (PK2), a cysteine-rich secreted protein, functions as an output molecule from the SCN circadian clock. PK2 messenger RNA is rhythmically expressed in the SCN, and the phase of PK2 rhythm is responsive to light entrainment. Molecular and genetic studies have revealed that PK2 is a gene that is controlled by a circadian clock PUBMED:12024206.

    \ ' '6436' 'IPR009524' '\

    This family consists of several hypothetical mammalian proteins (from mouse and human). The function of this family is unknown.

    \ ' '6437' 'IPR010573' '\

    This family consists of several fungal specific trichothecene efflux pump proteins. Many of the genes involved in trichothecene toxin biosynthesis in Fusarium sporotrichioides are present within a gene cluster. It has been suggested that TRI12 may play a role in F. sporotrichioides self-protection against trichothecenes PUBMED:10485289.

    \ ' '6438' 'IPR010574' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6439' 'IPR009525' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6440' 'IPR009526' '\

    Members of this protein family are small, typically about 80 residues in length, and are highly hydrophobic. The gene is found so far only in a subset of the firmicutes in association with genes of the ATP synthase F1 complex or NADH-quinone oxidoreductase. This family includes YwzB from Bacillus subtilis.

    \ ' '6441' 'IPR010575' '\

    This domain is found in several KorB transcriptional repressor proteins. The korB gene is a major regulatory element in the replication and maintenance of broad host-range plasmid RK2. It negatively controls the replication gene trfA, the host-lethal determinants kilA and kilB, and the korA-korB operon PUBMED:3430606. This family is found in conjunction with .

    \ ' '6442' 'IPR017454' '\ This entry represents the C-terminal domain of neuromodulin, which is a component of the motile growth cones. It is a membrane protein whose expression is widely correlated with successful axon elongation and is a crucial component of an effective regeneration response in the nervous system PUBMED:3272162, PUBMED:2641999. Although its function is uncertain, the N-terminal region is well conserved and contains both a calmodulin binding domain and sites for acylation, membrane attachment and protein kinase C phosphorylation. Structural predictions suggest that this C-terminal domain may exist as an extended, negatively-charged rod with some similarity to the side arms of neurofilaments, indicating that the biological role of neuromodulin may depend on its ability to form a dynamic membrane-cytoplasm-calmodulin complex PUBMED:2641999.\ ' '6443' 'IPR009527' '\

    This family consists of several short Circovirus proteins of unknown function.

    \ ' '6444' 'IPR009528' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the C terminus of bacterial enzymes similar to type II restriction endonucleases BsuBI and PstI (). The enzymes of the BsuBI restriction/modification (R/M) system recognise the target sequence 5\'CTGCAG and are functionally identical with those of the PstI R/M system PUBMED:1480472.

    \ ' '6445' 'IPR000751' '\

    M-phase inducer phosphatases function as dosage-dependent inducers in mitotic control PUBMED:1836978, PUBMED:2120044, PUBMED:8156993, PUBMED:1392080. They are tyrosine protein phosphatases required for progression of the cell cycle. They may directly dephosphorylate p34(cdc2) and activate p34(cdc2) kinase activity. They catalyse the reaction:

    \ \ \ ' '6446' 'IPR009529' '\

    This family consists of several Maize streak virus proteins of unknown function.

    \ ' '6447' 'IPR009530' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6448' 'IPR009531' '\

    This family consists of several hypothetical bacterial proteins of unknown function.

    \ ' '6449' 'IPR010578' '\

    In Drosophila, single-minded (sim) is a transcription factor that acts as the master regulator of neurogenesis. The two mammalian homologs of Sim that have been identified, Sim1 and Sim2, are novel heterodimerisation partners for ARNT in vitro, and may function both as positive and negative transcriptional regulators in vivo, during embryogenesis and in the adult organism PUBMED:9199934. SIM2 is thought to contribute to some specific Down syndrome phenotypes PUBMED:9020169. There is a high level of homology among mammalian and Drosophila sim proteins in their amino-terminal half where the conserved bHLH (see , ), PAS (see , ) and PAC motifs are present (). The PAC region occurs C-terminal to the PAS domains and is proposed to contribute to the PAS domain fold PUBMED:9301332, PUBMED:7756254, PUBMED:9382818. In contrast, the carboxy-terminal parts are only conserved in vertebrates PUBMED:9199934. The Sim1 C-terminus contains a Ser-rich region, whereas the Sim2 C-terminus contains Ser/Thr-rich regions, Pro/Ser-rich regions, Pro/Ala-rich regions, and positively charged regions. Sim2s, a splice variant of Sim2, still contains the Ser/Thr- and Pro/Ser-rich regions shown to harbor repressive activities, but is missing the Pro/Ala-rich repressor region PUBMED:9199934, PUBMED:11091086, PUBMED:16484282.

    \ ' '6450' 'IPR009532' '\

    This family consists of several enterobacterial SepQ proteins from Escherichia coli and Citrobacter rodentium. The function of this family is unclear.

    \ ' '6451' 'IPR010579' '\

    Major Histocompatibility Complex (MHC) glycoproteins are heterodimeric cell surface receptors that function to present antigen peptide fragments to T cells responsible for cell-mediated immune responses. MHC molecules can be subdivided into two groups on the basis of structure and function: class I molecules present intracellular antigen peptide fragments (~10 amino acids) on the surface of the host cells to cytotoxic T cells; class II molecules present exogenously derived antigenic peptides (~15 amino acids) to helper T cells. MHC class I and II molecules are assembled and loaded with their peptide ligands via different mechanisms. However, both present peptide fragments rather than entire proteins to T cells, and are required to mount an immune response.

    \

    Class I MHC glycoproteins are expressed on the surface of all somatic nucleated cells, with the exception of neurons. MHC class I receptors present peptide antigens that are synthesised in the cytoplasm, which includes self-peptides (presented for self-tolerance) as well as foreign peptides (such as viral proteins). These antigens are generated from degraded protein fragments that are transported to the endoplasmic reticulum by TAP proteins (transporter of antigenic peptides), where they can bind MHC I molecules, before being transported to the cell surface via the Golgi apparatus PUBMED:9485452, PUBMED:15526153. MHC class I receptors display antigens for recognition by cytotoxic T cells, which have the ability to destroy viral-infected or malignant (surfeit of self-peptides) cells.

    \

    MHC class I molecules are composed of two chains: an MHC alpha chain (heavy chain), and a beta2-microglobulin chain (light chain), where only the alpha chain spans the membrane. The alpha chain has three extracellular domains (alpha 1-3, with alpha1 being at the N-terminus), a transmembrane region and a C-terminal cytoplasmic tail. The soluble extracellular beta2-microglobulin chain associates primarily with the alpha3 domain and is necessary for MHC stability. The alpha1 and alpha2 domains of the alpha chain are referred to as the recognition region, because the peptide antigen binds in a deep groove between these two domains.

    \ \

    This entry represents the alpha chain C-terminal tail domain.

    \

    More information about these proteins can be found at Protein of the Month: MHC.

    \ ' '6452' 'IPR010580' '\

    This family consists of several ribosome associated membrane protein RAMP4 (or SERP1) sequences. Stabilisation of membrane proteins in response to stress involves the concerted action of a rescue unit in the ER membrane comprised of SERP1/RAMP4, other components of the translocon, and molecular chaperones in the ER PUBMED:10601334.

    \ ' '6453' 'IPR009533' '\

    This family consists of several hypothetical eukaryotic proteins of unknown function.

    \ ' '6454' 'IPR010581' '\

    This family consists of several hypothetical archaeal proteins of unknown function.

    \ ' '6455' 'IPR009534' '\

    This family consists of several short, hypothetical bacterial proteins of unknown function.

    \ ' '6456' 'IPR010582' '\

    Catalases () are antioxidant enzymes that catalyse the conversion of hydrogen peroxide to water and molecular oxygen, serving to protect cells from its toxic effects PUBMED:11351128. Hydrogen peroxide is produced as a consequence of oxidative cellular metabolism and can be converted to the highly reactive hydroxyl radical via transition metals, this radical being able to damage a wide variety of molecules within a cell, leading to oxidative stress and cell death. Catalases act to neutralise hydrogen peroxide toxicity, and are produced by all aerobic organisms ranging from bacteria to man. Most catalases are mono-functional, haem-containing enzymes, although there are also bifunctional haem-containing peroxidase/catalases () that are closely related to plant peroxidases, and non-haem, manganese-containing catalases () that are found in bacteria PUBMED:14745498.

    \ \

    This entry represents a small conserved region within catalase enzymes. This domain carries the immune-responsive amphipathic octa-peptide that is recognised by T cells PUBMED:15585332.

    \ ' '6457' 'IPR010583' '\

    This family consists of several bacterial MltA-interacting protein (MipA) like sequences. As well as interacting with the membrane-bound lytic transglycosylase MltA, MipA is known to bind to PBP1B, a bifunctional murein transglycosylase/transpeptidase. MipA is considered to be a structural protein mediating the assembly of MltA to PBP1B into a complex PUBMED:10037771.

    \ ' '6458' 'IPR010584' '\

    This family consists of several Enterobacterial exodeoxyribonuclease VIII proteins.

    \ ' '6459' 'IPR009535' '\

    This family represents a small conserved region of unknown function within eukaryotic phospholipase C (). All members also contain and .

    \ ' '6460' 'IPR010585' '\

    This family consists of several mammalian specific DNA double-strand break repair and V(D)J recombination protein XRCC4 sequences. In the non-homologous end joining pathway of DNA double-strand break repair, the ligation step is catalysed by a complex of XRCC4 and DNA ligase IV. It is thought that XRCC4 and ligase IV are essential for alignment-based gap filling, as well as for final ligation of the breaks PUBMED:12517771.

    \ ' '6461' 'IPR009536' '\

    This family consists of several Cucumber mosaic virus ORF IIB proteins. The function of this family is unknown.

    \ ' '6462' 'IPR009537' '\

    This entry represents a conserved region within hypothetical prokaryotic and archaeal proteins of unknown function.

    \ ' '6463' 'IPR010586' '\

    This family consists of several nodulation protein NolV sequences from different Rhizobium species PUBMED:8412662. The function of this family is unclear.

    \ ' '6464' 'IPR010587' '\

    This family consists of several uncharacterised proteins from Melanoplus sanguinipes entomopoxvirus (MsEPV). The function of this family is unknown.

    \ ' '6465' 'IPR009538' '\

    This family consists of several PV-1 (PLVAP) proteins, which seem to be specific to mammals. PV-1 is a novel protein component of the endothelial fenestral and stomatal diaphragms PUBMED:11401446. The function of this family is unknown.

    \ ' '6466' 'IPR009539' '\

    This family consists of several strabismus (STB) or Van Gogh-like (VANGL) proteins 1 and 2. The exact function of this family is unknown. It is thought, however, that the STB1 and STB2 genes may be potent tumour suppressor gene candidates PUBMED:12060845.

    \ ' '6467' 'IPR009540' '\

    This family consists of several basal layer antifungal peptide (BAP) sequences specific to Zea mays (Maize). The BAP2 peptide exhibits potent broad-range activity against a range of filamentous fungi, including several plant pathogens PUBMED:11319035.

    \ ' '6468' 'IPR010588' '\

    This entry represents the C terminus of plant P proteins. The maize P gene is a transcriptional regulator of genes encoding enzymes for flavonoid biosynthesis in the pathway leading to the production of a red phlobaphene pigment PUBMED:8768374, and P proteins are homologous to the DNA-binding domain of myb-like transcription factors PUBMED:8313474. This domain is associated with domain.

    \ ' '6471' 'IPR010590' '\

    This family consists of several enterobacterial YbdJ proteins. The function of this family is unknown.

    \ ' '6472' 'IPR010591' '\

    This family consists of several eukaryotic ATP11 proteins. The expression of functional F1-ATPase requires two proteins which are encoded by the ATP11 and ATP12 genes PUBMED:1532796. Atp11p is a molecular chaperone of the mitochondrial matrix that participates in the biogenesis pathway to form F1, which is the catalytic unit of ATP synthase. It binds to the free beta subunits of F1, which prevents the beta subunit from associating with itself in non-productive complexes. It also allows for the formation of an (alpha beta)3 hexamer PUBMED:12829692.

    \ ' '6473' 'IPR009542' '\

    This family consists of several microsomal signal peptidase 12 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase exists as a complex of five different polypeptide chains. This family represents the 12 kDa subunit (SPC12).

    \ ' '6474' 'IPR010592' '\

    This family consists of several high affinity transport system protein p37 sequences, which are specific to Mycoplasma species. The p37 gene is part of an operon encoding two additional proteins, which are highly similar to components of the periplasmic binding-protein-dependent transport systems of Gram-negative bacteria. It has been suggested that p37 is part of a homologous, high-affinity transport system in Mycoplasma hyorhinis, a Gram-positive bacterium PUBMED:3208756.

    \ ' '6476' 'IPR010594' '\

    This family consists of several hypothetical Baculovirus proteins of unknown function.

    \ ' '6477' 'IPR010595' '\

    This family consists of several short, hypothetical bacterial proteins of unknown function.

    \ ' '6478' 'IPR009543' '\

    Proteins in this entry may play a role in the control of protein cycling through the trans-Golgi network. Vacuolar sorting protein is an ATPase required for endosomal trafficking PUBMED:10637304. Defects in the human protein VPS13A cause chorea-acanthocytosis, an autosomal recessive neurodegenerative disorder characterised by the gradual onset of hyperkinetic movements and abnormal erythrocyte morphology PUBMED:11381253.

    \ ' '6479' 'IPR009544' '\

    This entry represents the C terminus of hypothetical Arabidopsis thaliana proteins of unknown function.

    \ ' '6480' 'IPR010596' '\

    This entry represents the N-terminal region of the Drosophila specific Methuselah protein. Drosophila Methuselah (Mth) mutants have a 35% increase in average lifespan and increased resistance to several forms of stress, including heat, starvation, and oxidative damage. The protein affected by this mutation is related to G protein-coupled receptors of the secretin receptor family. Mth, like secretin receptor family members, has a large N-terminal ectodomain, which may constitute the ligand binding site PUBMED:11274391.

    \ ' '6481' 'IPR009545' '\

    This family consists of several Caenorhabditis elegans specific proteins of unknown function.

    \ ' '6484' 'IPR009547' '\

    This family consists of several Tenuivirus PVC2 proteins from Rice grassy stunt virus, Maize stripe virus and Rice hoja blanca virus. The function of this family is unknown.

    \ ' '6485' 'IPR010597' '\

    This family consists of several uncharacterised mammalian proteins of unknown function.

    \ ' '6486' 'IPR009548' '\

    This family consists of several hypothetical eukaryotic proteins of unknown function.

    \ ' '6489' 'IPR009550' '\

    This family represents a conserved region within Agrobacterium tumefaciens VirE3. Agrobacterium tumefaciens (a plant pathogen) has a tumour-inducing (Ti) plasmid of which part, the transfer (T)-region, is transferred to plant cells during the infection process. Vir proteins mediate the processing of the T-region and the transfer of a single-stranded (ss) DNA copy of this region, the T-strand, into the recipient cells. VirE3 is a translocated effector protein, but its specific role has not been established PUBMED:12560481.

    \ ' '6490' 'IPR010598' '\

    This entry consists of known or predicted D-glucuronyl C5-epimerases which share a common C-terminal region. Glucuronyl C5-epimerases catalyse the conversion of D-glucuronic acid (GlcUA) to L-iduronic acid (IdceA) units during the biosynthesis of glycosaminoglycans PUBMED:9346972.

    \ ' '6491' 'IPR010599' '\

    This region of unknown function is situated between the and domains in a cytoplasmic and membrane associated protein which appears to function as an adapter protein or regulator of Ras signalling pathways PUBMED:14597674.

    \ ' '6492' 'IPR009551' '\

    This family represents a conserved region of unknown function within a number of hypothetical eukaryotic proteins.

    \ ' '6494' 'IPR009553' '\

    This family contains a group of hypothetical bacterial proteins that contain three conserved cysteine residues towards the N terminus. The function of these proteins is unknown.

    \ ' '6495' 'IPR009554' '\

    This family consists of several bacterial phage shock protein B (PspB) sequences. The phage shock protein (psp) operon is induced in response to heat, ethanol, osmotic shock and infection by filamentous bacteriophages PUBMED:1712397. Expression of the operon requires the alternative sigma factor sigma54 and the transcriptional activator PspF. In addition, PspA plays a negative regulatory role, and the integral-membrane proteins PspB and PspC play a positive one PUBMED:12562786.

    \ ' '6496' 'IPR010600' '\

    This entry represents the C-terminal region of inter-alpha-trypsin inhibitor heavy chains. Inter-alpha-trypsin inhibitors are glycoproteins with a high inhibitory activity against trypsin, built up from different combinations of four polypeptides: bikunin and the three heavy chains that belong to this family (HC1, HC2, HC3). The heavy chains do not have any protease inhibitory properties but have the capacity to interact in vitro and in vivo with hyaluronic acid, which promotes the stability of the extra-cellular matrix. This domain is associated with the VWA domain .

    \ ' '6497' 'IPR009555' '\

    This entry represents several Xylella fastidiosa surface protein specific repeats which are found in conjunction with , and .

    \ ' '6498' 'IPR009556' '\

    This family consists of several Microneme protein Etmic-2 sequences from Eimeria tenella. Etmic-2 is a 50 kDa acidic protein, which is found within the microneme organelles of E. tenella sporozoites and merozoites PUBMED:8855556.

    \ ' '6499' 'IPR009557' '\

    This family consists of a number of Caenorhabditis elegans specific repeats of around 36 residues in length which are found in two hypothetical proteins. This family is found in conjunction with .

    \ ' '6500' 'IPR009558' '\

    This family consists of several hypothetical bacterial proteins of around 210 residues in length. The function of this family is unknown.

    \ ' '6501' 'IPR009559' '\

    This family consists of several Lactococcus lactis bacteriophage major capsid proteins.

    \ ' '6502' 'IPR009560' '\

    This family consists of several hypothetical bacterial proteins of around 340 residues in length. Members of this family contain six highly conserved cysteine residues. The function of this family is unknown.

    \ ' '6503' 'IPR009561' '\

    This family consists of several hypothetical archaeal and bacterial proteins of around 300 residues in length. The function of this family is unknown.

    \ ' '6504' 'IPR009562' '\

    This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown.

    \ ' '6505' 'IPR009563' '\

    This family consists of several Sjogren\'s syndrome/scleroderma autoantigen 1 (Autoantigen p27) sequences. It is thought that the potential association of anti-p27 with anti-centromere antibodies suggests that autoantigen p27 might play a role in mitosis PUBMED:9486406.

    \ ' '6506' 'IPR009564' '\

    This family consists of several hypothetical Caenorhabditis elegans proteins of around 106 residues in length. The function of the family is unknown.

    \ ' '6507' 'IPR009565' '\

    This family consists of several hypothetical mammalian proteins of around 190 residues in length. The function of this family is unknown.

    \ ' '6508' 'IPR009566' '\

    This family consists of several hypothetical proteins of around 120 residues in length which are found specifically in Trypanosoma brucei. The function of this family is unknown.

    \ ' '6509' 'IPR010601' '\

    This family consists of several hypothetical proteins of around 360 residues in length and seems to be specific to Caenorhabditis elegans. The function of this family is unknown.

    \ ' '6510' 'IPR009567' '\

    This family consists of several eukaryotic proteins of around 360 residues in length. The function of this family is unknown.

    \ ' '6511' 'IPR009568' '\

    This family contains a number of hypothetical proteins of unknown function from Arabidopsis thaliana.

    \ ' '6512' 'IPR009569' '\

    This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.

    \ ' '6513' 'IPR010602' '\

    This family consists of several hypothetical bacterial proteins of around 250 residues in length and is found in several Chlamydia and Anabaena species. The function of this family is unknown.

    \ ' '6514' 'IPR009570' '\

    This family consists of several bacterial stage III sporulation protein AC (SpoIIIAC) sequences. The exact function of this family is unknown.

    \ ' '6515' 'IPR009571' '\

    This family consists of several fungal specific SUR7 proteins. In Saccharomyces cerevisiae the SUR7 gene encodes a putative integral membrane protein with four transmembrane domains. It has been suggested that the Rvs161 and Rvs167 proteins act together in relation with SUR7. The transmembranous character of SUR7 suggests a membrane localisation of the Rvs function, a localisation that is consistent with the different rvs phenotypes and the actin-Rvs167p interaction PUBMED:9219339. It has also been suggested that SUR7 may play a role in sporulation PUBMED:11784867.

    \ ' '6516' 'IPR009572' '\

    This family consists of several short, hypothetical bacterial proteins of around 62 residues in length. Members of this family are found in Escherichia coli and Salmonella typhi. The function of this family is unknown.

    \ ' '6517' 'IPR010603' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type PUBMED:11278349. This presumed zinc binding domain (ZBD) is found at the N terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. ZBD is a member of the treble clef zinc finger family, a motif known to facilitate protein-ligand, protein-DNA, and protein-protein interactions, and forms a constitutive dimer that is essential for the degradation of some, but not all, ClpX substrates PUBMED:14525985.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '6518' 'IPR009573' '\

    This family consists of several hypothetical archaeal proteins of around 260 residues in length, which seem to be specific to Methanobacterium, Methanococcus and Methanopyrus species. The function of this family is unknown.

    \ ' '6519' 'IPR009574' '\

    This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown.

    \ ' '6520' 'IPR009575' '\

    This family consists of several Melon necrotic spot virus (MNSV) P7B proteins. The function of this family is unknown.

    \ ' '6521' 'IPR009576' '\

    This family consists of several hypothetical Enterobacterial proteins of around 212 residues in length and is known as YjfM in Escherichia coli. The function of this family is unknown.

    \ ' '6522' 'IPR010604' '\

    This family consists of several plant specific nuclear matrix protein 1 (NMP1) sequences. Nuclear Matrix Protein 1 is a ubiquitously expressed 36 kDa protein, which has no homologues in animals and fungi, but is highly conserved among flowering and non-flowering plants. NMP1 is located in both the cytoplasm and the nucleus, and the nuclear fraction is associated with the nuclear matrix. NMP1 is a candidate for a plant-specific structural protein with a function both in the nucleus and cytoplasm PUBMED:12654864.

    \ ' '6523' 'IPR009577' '\

    This family contains a small number of putative small multi-drug export proteins.

    \ ' '6524' 'IPR009578' '\

    This family consists of a number of ~25 residue long repeats found commonly in Streptococcal surface antigens although one copy is present in the HPSR2-heavy chain potential motor protein of Giardia lamblia (Giardia intestinalis) (). This family is often found in conjunction with .

    \ ' '6525' 'IPR010605' '\

    This family contains hypothetical plant proteins of unknown function.

    \ ' '6526' 'IPR009579' '\

    This family consists of several short, hypothetical, bacterial proteins of around 60 residues in length. The function of this family is unknown.

    \ ' '6527' 'IPR009580' '\

    Glycosylphosphatidylinositol anchor biosynthesis protein Pig-F is involved in glycosylphosphatidylinositol (GPI) anchor biosynthesis PUBMED:10781593, PUBMED:11995915, PUBMED:12655644.

    \ ' '6529' 'IPR010606' '\

    Mib is a RING ubiquitin ligase in the Notch pathway. Mib interacts with the intracellular domain of Delta to promote its ubiquitylation and internalisation. Cell transplantation studies suggest that mib function is essential in the signalling cell for efficient activation of Notch in neighbouring cells. This domain has been named \'mib/herc2 domain\' in PUBMED:12530964, and usually the protein also contains an E3 ligase domain (either Ring or Hect).

    \ ' '6530' 'IPR009581' '\

    This family is based on the C terminus of several hypothetical eukaryotic proteins of unknown function. Proteins in this entry contain two conserved motifs: DRHHYE and QCC, as well as a number of conserved cysteine residues.

    \ ' '6531' 'IPR009582' '\

    This family consists of several microsomal signal peptidase 25 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is a complex of five different polypeptide chains. This family represents the 25 kDa subunit (SPC25).

    \ ' '6532' 'IPR009583' '\

    This family consists of several DspF and related sequences from several plant pathogenic bacteria. The \'disease-specific\' (dsp) region next to the hrp gene cluster of Erwinia amylovora (Fire blight bacteria) is required for pathogenicity but not for elicitation of the hypersensitive reaction. DspF and AvrF are small (16 kDa and 14 kDa) and acidic with predicted amphipathic alpha helices in their C termini; they resemble chaperones for virulence factors secreted by type III secretion systems of animal pathogens PUBMED:9448330.

    \ ' '6533' 'IPR008374' '\

    \ Striated fibre assemblin (SFA), an acidic 33 kDa protein, is the major\ component of striated microtubule-associated fibres (SMAFs) in the flagellar\ basal apparatus of green flagellates. In Chlamydomonas, and other green\ flagellates, the SMAFs form a cross-like pattern and run alongside the\ proximal parts of four bundles of flagellar root microtubules.\

    \

    \ The sequence of SFA contains two structurally distinct domains PUBMED:8491776. The\ head domain, with ~30 residues, contains all the prolines (3-8 depending on\ species) and is rich in hydroxyamino acids. This non-helical domain is\ further characterised by the presence of repetitive SP-motifs, some of them\ in the context SP(M/T)R, which is a putative substrate for p34-CDC2 kinase. The rod domain, with ~250 residues, is predicted to be mostly alpha-helical\ (the alpha-helix content was estimated to be 76% for the entire\ molecule or 85% for the postulated rod domain PUBMED:8491776). This domain shows a\ pronounced coiled-coil-forming ability and contains a 29-residue repeat\ pattern based on four heptads, followed by a skip residue.

    \ ' '6534' 'IPR009584' '\

    This family consists of several Citrus tristeza virus (CTV) 6 kDa, 51 residue long hydrophobic (P6) proteins. The function of this family is unknown.

    \ ' '6535' 'IPR010607' '\

    This family consists of several hypothetical Rhizobiales specific proteins of around 270 residues in length. The function of this family is unknown.

    \ ' '6536' 'IPR010608' '\

    This family consists of several plant specific hypothetical proteins of around 160 residues in length. The function of this family is unknown.

    \ ' '6537' 'IPR009585' '\

    This family consists of several hypothetical bacterial proteins of around 51 residues in length which seem to be specific to Vibrio cholerae. The function of this family is unknown.

    \ ' '6539' 'IPR009587' '\

    This family consists of several bacterial proteins of around 150 residues in length which are specific to Escherichia coli, Salmonella species and Yersinia pestis. The function of this family is unknown.

    \ ' '6540' 'IPR009588' '\

    This family consists of several hypothetical Feline immunodeficiency virus (FIV) proteins. Members of this family are typically around 67 residues long and are often annotated as ORF3 proteins. The function of this family is unknown.

    \ ' '6541' 'IPR009589' '\

    This family consists of several hypothetical proteins specific to Oceanobacillus and Bacillus species. Members of this family are typically around 130 residues in length. The function of this family is unknown.

    \ ' '6542' 'IPR009590' '\

    This domain is found at the N terminus of the Gp5 baseplate protein of Bacteriophage T4. This domain binds to the Gp27 protein PUBMED:11823865. This domain has the common OB fold PUBMED:11823865.

    \ ' '6543' 'IPR010609' '\

    This repeat composes the C-terminal part of the Bacteriophage T4 baseplate protein Gp5. This region of the protein forms a needle like projection from the baseplate that is presumed to puncture the bacterial cell membrane. Structurally three copies of the repeated region trimerise to form a beta solenoid type structure PUBMED:11823865. This family also includes repeats from bacterial Vgr proteins.

    \ ' '6544' 'IPR009591' '\

    This family consists of several Beet yellows virus (BYV) putative membrane-binding proteins of around 54 residues in length. The function of this family is unknown.

    \ ' '6545' 'IPR009592' '\

    This family consists of several hypothetical bacterial proteins of around 335 residues in length. Members of this family are found exclusively in Escherichia coli and Salmonella species and are often referred to as YggM proteins. The function of this family is unknown.

    \ ' '6546' 'IPR009593' '\

    This family consists of several hypothetical bacterial proteins of around 155 residues in length. Family members are present in Rhizobium, Agrobacterium and Streptomyces species.

    \ ' '6547' 'IPR009594' '\

    This entry represents the N terminus of bacterial ARAC-type transcriptional regulators. In Escherichia coli these regulate the L-arabinose operon through sensing the presence of arabinose, and when the sugar is present, transmitting this information from the arabinose-binding domains to the protein\'s DNA-binding domains PUBMED:12683999. This family might represent the N-terminal arm of the protein, which binds to the C-terminal DNA binding domains to hold them in a state where the protein prefers to loop and remain non-activating PUBMED:9600837. This domain is associated with the domain.

    \ ' '6548' 'IPR009595' '\

    The host of the lytic bacteriophage phi 29 is the spore-forming bacterium Bacillus subtilis. When infection occurs during early stages of sporulation, however, phi 29 development is suppressed and the infecting phage genome becomes trapped in the developing spore PUBMED:18410285. This family consists of several bacteriophage phi 29 early protein GP16.7 sequences of around 130 residues in length. The function of this family is unknown.

    \ ' '6549' 'IPR009596' '\

    This entry represents the C terminus of a number of Arabidopsis thaliana hypothetical proteins of unknown function. Family members contain a conserved DFD motif.

    \ ' '6550' 'IPR010610' '\

    This entry represents a conserved region of unknown function within bacterial glycosyl transferases. Many proteins containing this domain are members of the glycosyl transferase family 28 .

    \ ' '6551' 'IPR004753' '\

    Bacterial cell shape varies greatly between species, and characteristic\ morphologies are used for identification purposes. In addition to individual\ cell shape, the way in which groups of cells are arranged is also typical of\ some bacterial species, especially Gram-positive coccoids. For many years, it was believed that micro-organisms with other than\ spheroidal cell shapes maintained morphology by means of their external cell \ walls. Recently, however, studies of the Gram-positive rod Bacillus subtilis\ have revealed two related genes that are essential for the integrity of cell\ morphogenesis PUBMED:11290328. Termed mreB and mbl, the gene products localise close to\ the cell surface, forming filamentous helical structures. Many \ homologues have been found in diverse bacterial groups, suggesting a common \ ancestor PUBMED:11544518.

    \

    The crystal structure of MreB from Thermotoga maritima has been resolved \ using X-ray crystallography PUBMED:11544518. It consists of 19 beta-strands and 15\ alpha-helices, and shows remarkable structural similarity to eukaryotic actin. \ MreB crystals also contain proto-filaments, with individual proteins \ assembling into polymers like F-actin, in the same orientation. It is \ hypothesised, therefore, that MreB was the forerunner of actin in early \ eukaryotes PUBMED:11731313.

    \ ' '6552' 'IPR009597' '\

    This region consists of a pair of transmembrane helices and occurs three times in each of the family member proteins.

    \ ' '6553' 'IPR010611' '\

    This short presumed domain contains three conserved aspartate residues, hence the name 3D. This conservation is suggestive of a cation binding function. The central aspartate is found in a DTG motif that is suggestive of a peptidase like active site.

    \ ' '6554' 'IPR018015' '\

    This family consists of a series of short proteins of around 90 residues in length. The human protein (or BC10) has been implicated in bladder cancer where the transcription of the gene coding for this protein is nearly completely abolished in highly invasive transitional cell carcinomas (TCCs) PUBMED:15797904. The protein is a small globular protein containing two transmembrane helices, and it is a multiply edited transcript. All the editing sites are found in either the 5\'-UTR or the N-terminal section of the protein, which is predicted to be outside the membrane. The three coding edits are all non-synonymous and predicted to encode exposed residues PUBMED:11920613. The function of this family is unknown.

    \ ' '6555' 'IPR009599' '\

    This family consists of a number of hypothetical bacterial proteins of around 410 residues in length, which seem to be specific to Chlamydia species. The function of this family is unknown.

    \ ' '6556' 'IPR009600' '\

    Many eukaryotic proteins are anchored to the cell surface via glycosylphosphatidylinositol (GPI), which is posttranslationally attached to the C terminus by GPI transamidase. The mammalian GPI transamidase is a complex of at least four subunits, GPI8, GAA1, PIG-S, and PIG-T. PIG-U is thought to represent a fifth subunit in this complex and may be involved in the recognition of either the GPI attachment signal or the lipid portion of GPI PUBMED:12802054.

    \ ' '6557' 'IPR009601' '\

    This family consists of mammalian nuclear receptor co-activator NRIF3 proteins. NRIF3 exhibits a distinct receptor specificity in interacting with and potentiating the activity of only TRs and RXRs but not other examined nuclear receptors. NRIF3 is a coregulator that possesses both transactivation and transrepression domains and/or functions. Collectively, the NRIF3 family of coregulators may play dual roles in mediating both positive and negative regulatory effects on gene expression PUBMED:11713274.

    \ ' '6558' 'IPR009602' '\

    This family consists of several eukaryotic sequences of around 270 residues in length. Members of this family are found in mouse, human and Drosophila melanogaster. The function of this family is unknown.

    \ ' '6560' 'IPR010613' '\

    This entry represents the N-terminal region of Pescadillo. Pescadillo protein localises to distinct substructures of the interphase nucleus including nucleoli, the site of ribosome biogenesis. During mitosis pescadillo closely associates with the periphery of metaphase chromosomes and by late anaphase is associated with nucleolus-derived foci and prenucleolar bodies. Blastomeres in mouse embryos lacking pescadillo arrest at morula stages of development, the nucleoli fail to differentiate and accumulation of ribosomes is inhibited. It has been proposed that in mammalian cells pescadillo is essential for ribosome biogenesis and nucleologenesis and that disruption to its function results in cell cycle arrest PUBMED:12237316.

    \ ' '6561' 'IPR010614' '\

    This represents a conserved region within a number of RAD3-like DNA-binding helicases that are seemingly ubiquitous - members include proteins of eukaryotic, bacterial and archaeal origin. RAD3 is involved in nucleotide excision repair, and forms part of the transcription factor TFIIH in yeast PUBMED:10915862.

    \ ' '6562' 'IPR010615' '\

    The product of the UL97 gene is a serine/threonine kinase. Although it can phosphorylate the antiviral drug ganciclovir PUBMED:9217058, its biological function is the phosphorylation of its natural viral and cellular protein substrates, which affect viral replication at many levels.

    \ \

    The kinase phosphorylates eukaryotic elongation factor 1delta, the carboxyl terminal domain of the large subunit of RNA polymerase II, the retinoblastoma tumour suppressor and lamins A and C, performing a similar function to the cdc2/cyclin-dependent kinase 1. The activity of UL97 appears to stimulate the cell cycle to support viral DNA synthesis, enhance the expression of viral genes, promote virion morphogenesis and facilitate the egress of mature capsids from the nucleus PUBMED:19434630.

    \ \ ' '6564' 'IPR010617' '\

    This family represents a conserved region within a number of hypothetical proteins of unknown function found in eukaryotes, bacteria and archaea. These may possibly be integral membrane proteins.

    \ ' '6565' 'IPR010618' '\

    This family of proteins is very likely to act as transglycosylase enzymes related to and . These other families are weakly matched by this family, and include the known active site residues.

    \ ' '6566' 'IPR010619' '\

    This family represents a conserved region within a number of hypothetical proteins of unknown function found in eukaryotes, bacteria and archaea. Some family members are membrane proteins.

    \ ' '6567' 'IPR010620' '\

    This family is related to and is likely to also form a beta-propeller. SBBP stands for Seven Bladed Beta Propeller.

    \ ' '6568' 'IPR009603' '\

    This family represents a short conserved repeat within Drosophila melanogaster proteins of unknown function. Approximately 50 copies of this repeat are present in each protein.

    \ ' '6569' 'IPR009604' '\

    This entry represents a conserved region approximately 250 residues long located on eukaryotic ataxin-2 PUBMED:16115810. Ataxin-2 is a protein of unknown function, within which expansion of a polyglutamine tract (due to expansion of unstable CAG repeats in the coding region of the SCA2 gene) causes spinocerebellar ataxia type 2 (SCA2), a late-onset neurodegenerative disorder PUBMED:9339681. The expanded polyglutamine repeat in ataxin-2 causes disruption of the normal morphology of the Golgi complex and increased incidence of cell death PUBMED:12812977. Ataxin-2 is predicted to consist of mostly non-globular domains PUBMED:9462862.

    \ ' '6570' 'IPR010621' '\

    This entry represents the C-terminal region of several hypothetical proteins of unknown function. Proteins in this entry are mostly bacterial, but a few are also found in eukaryotes and archaea.

    \ ' '6571' 'IPR010622' '\

    This entry represents a conserved region of eukaryotic Fas-activated serine/threonine (FAST) kinases that contains several conserved leucine residues. FAST kinase is rapidly activated during Fas-mediated apoptosis, when it phosphorylates TIA-1, a nuclear RNA-binding protein that has been implicated as an effector of apoptosis PUBMED:7544399. Note that many family members are hypothetical proteins.

    \ ' '6572' 'IPR010623' '\

    This domain represents a conserved region situated towards the C-terminal end of several hypothetical bacterial proteins of unknown function. A few members resemble the ImcF protein, which has been proposed PUBMED:12127983 to be involved in Vibrio cholerae cell surface reorganisation that results in increased adherence to epithelial cells and increased conjugation frequency.

    \ ' '6573' 'IPR014774' '\

    This entry represents a conserved region within bacterial and archaeal proteins, most of which are hypothetical. More than one copy is sometimes found in each protein in this entry. These include KaiC, which is one of the Kai proteins among which direct protein-protein association may be a critical process in the generation of circadian rhythms in cyanobacteria PUBMED:10064581.

    \ \

    The circadian clock protein KaiC is encoded in the kaiABC operon that controls circadian rhythms and may be universal in\ Cyanobacteria. Each member contains two copies of this domain, which is also\ found in other proteins. KaiC performs autophosphorylation and acts as its own transcriptional repressor.

    \ ' '6574' 'IPR009605' '\

    This entry represents a conserved region within Arabidopsis thaliana proteins of unknown function. Proteins in this entry sometimes contain more than one copy of the domain.

    \ ' '6575' 'IPR010625' '\

    A conserved motif identified in the LOC118487 protein was called the CHCH motif. Alignment of this protein with related members showed the presence of three subgroups of proteins, which are called the S (Small), N (N-terminal extended) and C (C-terminal extended) subgroups. All three subgroups of proteins have in common that they contain a predicted conserved [coiled coil 1]-[helix 1]-[coiled coil 2]-[helix 2] domain (CHCH domain). Within each helix of the CHCH domain, there are two cysteines present in a C-X9-C motif. The N-group contains an additional double helix domain, and each helix contains the C-X9-C motif. This family contains a number of characterised proteins: Cox19 protein - a nuclear gene of Saccharomyces cerevisiae, codes for an 11 kDa protein (Cox19p) required for expression of cytochrome oxidase. Because cox19 mutants are able to synthesise the mitochondrial and nuclear gene products of cytochrome oxidase, Cox19p probably\ functions post-translationally during assembly of the enzyme. Cox19p is present in the cytoplasm and mitochondria, where it exists as a soluble intermembrane protein. This dual location is similar to what was previously reported for Cox17p, a low molecular weight copper protein thought to be required for maturation of the CuA centre of subunit 2 of cytochrome oxidase. Cox19p has four conserved potential metal ligands: three cysteines and one histidine. Mrp10 - belongs to the class of yeast mitochondrial ribosomal proteins that are essential for translation PUBMED:9065385. Eukaryotic NADH-ubiquinone oxidoreductase 19 kDa (NDUFA8) subunit PUBMED:9860297.

    \ \ ' '6576' 'IPR010626' '\

    This family represents a conserved region that is found within bacterial proteins, most of which are hypothetical. Some members contain multiple copies.

    \ ' '6577' 'IPR009606' '\

    This family contains hypothetical plant proteins of unknown function. Family members contain a number of conserved cysteine residues.

    \ ' '6578' 'IPR010627' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This domain is found at the N terminus of bacterial aspartic peptidases belonging to MEROPS peptidase family A24 (clan AD), subfamily A24A (type IV prepilin peptidase, ). Its function has not been specifically determined; however, some of the family have been characterised as bifunctional PUBMED:8057924, and this domain may contain the N-methylation activity. The domain consists of an intracellular region between a pair of transmembrane domains. This intracellular region contains an invariant proline and four conserved cysteines. These Cys residues are arranged in a two-pair motif, with the Cys residues of a pair separated (usually) by 2 aa and with each pair separated by 21 largely hydrophilic residues (C-X-X-C...X21...C-X-X-C); they have been shown to be essential to the overall function of the enzyme PUBMED:8340405, PUBMED:9224881.

    \ \

    The bifunctional enzyme prepilin peptidase (PilD) from Pseudomonas aeruginosa is a key determinant in both type-IV pilus biogenesis and extracellular protein secretion, in its roles as a leader peptidase and methyl transferase (MTase). It is responsible for endopeptidic cleavage of the unique leader peptides that characterise type-IV pilin precursors, as well as proteins with homologous leader sequences that are essential components of the general secretion pathway found in a variety of Gram-negative pathogens. Following removal of the leader peptides, the same enzyme is responsible for the second posttranslational modification that characterises the type-IV pilins and their homologues, namely N-methylation of the newly exposed N-terminal amino acid residue PUBMED:9224881.

    \ \ ' '6579' 'IPR010628' '\

    This family consists of several bacterial ethanolamine ammonia lyase large subunit (EutB) proteins. Ethanolamine ammonia-lyase is a bacterial enzyme that catalyses the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia. The enzyme is a heterodimer composed of subunits of Mr approximately 55,000 (EutB) and 35,000 (EutC) PUBMED:2197274.

    \ ' '6580' 'IPR009607' '\

    This entry represents the C terminus of eukaryotic enhancer of polycomb proteins, which have roles in heterochromatin formation PUBMED:9735366. This family contains several conserved motifs.

    \ ' '6581' 'IPR009608' '\

    This family consists of several Bombina species specific bradykinin sequences. The skins of anuran amphibians, in addition to mucus glands, contain highly specialised poison glands, which, in reaction to stress or attack, exude a complex noxious cocktail of biologically active molecules. These secretions often contain a plethora of peptides among which bradykinin or structural variants have been identified PUBMED:12230583.

    \ ' '6582' 'IPR009609' '\

    This family consists of several bacterial phosphonate metabolism protein PhnG sequences. In Escherichia coli, the phn operon encodes proteins responsible for the uptake and breakdown of phosphonates. The exact function of PhnG is unknown; however, it is thought likely that, along with six other proteins, PhnG makes up the C-P (carbon-phosphorus) lyase PUBMED:9882650.

    \ ' '6583' 'IPR009610' '\

    This family consists of several hypothetical proteins which seem to be specific to the enterobacteria Escherichia coli and Shigella flexneri. Family members are often known as YeeV proteins and are around 125 residues in length. The function of this family is unknown.

    \ ' '6584' 'IPR009611' '\

    This entry represents the C-terminal region of eukaryotic chorion protein S19. In Drosophilidae, the S19 gene is known to form part of an autosomal cluster that also contains s16, s15 and s18 PUBMED:11404001. Note that members of this family contain a conserved PVA motif, and many contain .

    \ ' '6585' 'IPR010629' '\

    This entry represents several insect specific allergen repeats. These repeats are commonly found in various proteins from cockroaches, fruit flies and mosquitos. It has been suggested that the repeat sequences have evolved by duplication of an ancestral amino acid domain, which may have arisen from the mitochondrial energy transfer proteins PUBMED:9804858.

    \ ' '6586' 'IPR010630' '\

    This entry represents several mammalian specific repeats of around 65 residues in length. Proteins of the neuroblastoma breakpoint family (NBPF) contain a highly conserved domain of unknown function, which has been named the NBPF domain. The NBPF domain is present in multiple copies in NBPF proteins and once, with lower homology, in mammalian myomegalin, a protein localised in the Golgi/centrosomal area which functions as an anchor to localise components of the cyclic adenosine monophosphate-dependent pathway to this region. The number of NBPF repeat copies is highly expanded in humans, reduced in African great apes, further reduced in orangutan and Old World monkeys, single-copy in non-primate mammals, and absent in non-mammalian species. The NBPF domain that is found as a single copy in non-primate mammals is the likely ancestral domain. The implications of the resemblance of NBPF proteins to myomegalin remain obscure as no functional properties have been ascribed to the NBPF domains PUBMED:16079250, PUBMED:16946073.

    \ ' '6588' 'IPR010632' '\

    This is a group of plant proteins, most of which are hypothetical and of unknown function. All members contain the domain, suggesting that they may possess kinase activity.

    \ ' '6589' 'IPR009612' '\

    This entry represents a conserved region within several bacterial proteins that resemble ImcF, which has been proposed PUBMED:12127983 to be involved in Vibrio cholerae cell surface reorganisation, resulting in increased adherence to epithelial cells and increased conjugation frequency. Note that many entry members are hypothetical proteins.

    \ ' '6590' 'IPR009613' '\

    This family, which includes bacterial and eukaryotic members, represents a conserved region located towards the C-terminal end of a number of hypothetical proteins of unknown function. These are possibly integral membrane proteins.

    \ ' '6591' 'IPR010633' '\

    This family consists of several prophage minor tail protein Z like sequences from Escherichia coli, Salmonella typhimurium and Lambda-like bacteriophages.

    \ ' '6592' 'IPR010634' '\

    This family consists of several hypothetical proteins of around 250 residues in length, which are found in both plants and bacteria. The function of this family is unknown.

    \ ' '6594' 'IPR010636' '\

    This is a domain found in fungal hydrophobins that seems to be restricted to ascomycetes. These are small, moderately hydrophobic extracellular proteins that have eight cysteine residues arranged in a strictly conserved motif. Hydrophobins are generally found on the outer surface of conidia and of the hyphal wall, and may be involved in mediating contact and communication between the fungus and its environment PUBMED:11343402. Note that some family members contain multiple copies of the domain.

    \ ' '6595' 'IPR010637' '\

    This family consists of several SifA, SifB and SseJ proteins, which seem to be specific to Salmonella species. SifA, SifB and SseJ have been demonstrated to localise to the Salmonella-containing vacuole (SCV) and to Salmonella-induced filaments (Sifs). Trafficking of SseJ and SifB away from the SCV requires the SPI-2 effector SifA. SseJ trafficking away from the SCV along Sifs is unnecessary for its virulence function PUBMED:12496192.

    \ ' '6597' 'IPR009614' '\

    The Axe-Txe pair in Enterococcus faecium (Streptococcus faecium) and the homologous YefM-YoeB pair in Escherichia coli have been shown to act as an antitoxin-toxin pair. This family describes the toxin component. Nearly every example found is next to an identifiable antitoxin, as indicated by match to PUBMED:12603745.

    \ ' '6598' 'IPR010639' '\

    This family consists of several Nucleopolyhedrovirus actin-rearrangement-inducing factor (Arif-1) proteins. In response to Autographa californica nuclear polyhedrosis virus (AcMNPV) infection, a sequential rearrangement of the actin cytoskeleton occurs; this is induced by Arif-1 PUBMED:9311884. Arif-1 is tyrosine phosphorylated and is located at the plasma membrane as a component of the actin rearrangement-inducing complex PUBMED:11264366.

    \ ' '6599' 'IPR009615' '\

    This entry represents the N terminus of viral desmoplakin. Desmoplakin is a component of mature desmosomes, which are the main adhesive junctions in epithelia and cardiac muscle. Desmoplakin is also essential for the maturation of adherens junctions PUBMED:11781580. Note that many family members are hypothetical.

    \ ' '6600' 'IPR010640' '\

    This family consists of several bacteria specific low temperature requirement A (LtrA) protein sequences which have been found to be essential for growth at low temperatures in Listeria monocytogenes PUBMED:8534098.

    \ ' '6601' 'IPR014771' '\

    Apoptosis, or programmed cell death (PCD), is a common and evolutionarily conserved property of all metazoans PUBMED:11341280. In many biological processes, apoptosis is required to eliminate supernumerary or dangerous (such as pre-cancerous) cells and to promote normal development. Dysregulation of apoptosis can, therefore, contribute to the development of many major diseases including cancer, autoimmunity and neurodegenerative disorders. In most cases, proteins of the caspase family execute the genetic programme that leads to cell death.

    \

    Bcl-2 proteins are central regulators of caspase activation, and play a key role in cell death by regulating the integrity of the mitochondrial and endoplasmic reticulum (ER) membranes PUBMED:12631689. At least 20 Bcl-2 proteins have been reported in mammals, and several others have been identified in viruses. Bcl-2 family proteins fall roughly into three subtypes, which either promote cell survival (anti-apoptotic) or trigger cell death (pro-apoptotic). All members contain at least one of four conserved motifs, termed Bcl-2 Homology (BH) domains. Bcl-2 subfamily proteins, which contain at least BH1 and BH2, promote cell survival by inhibiting the adapters needed for the activation of caspases.

    \ \

    Pro-apoptotic members potentially exert their effects by displacing the adapters from the pro-survival proteins; these proteins belong either to the Bax subfamily, which contain BH1-BH3, or to the BH3 subfamily, which mostly only feature BH3 PUBMED:9735050. Thus, the balance between antagonistic family members is believed to play a role in determining cell fate. Members of the wider Bcl-2 family, which also includes Bcl-x, Bcl-w and Mcl-1, are described by their similarity to Bcl-2 protein, a member of the pro-survival Bcl-2 subfamily PUBMED:9735050. Full-length Bcl-2 proteins feature all four BH domains, seven alpha-helices, and a C-terminal hydrophobic motif that targets the protein to the outer mitochondrial membrane, ER and nuclear envelope.

    \

    This entry represents the N-terminal region of several mammal-specific Bim proteins. The Bim protein is one of the BH3-only proteins, members of the Bcl-2 family that have only one of the Bcl-2 homology regions, BH3.

    \ ' '6603' 'IPR009617' '\

    Seipin is a protein of approximately 400 residues in humans, which is the product of a gene homologous to the murine guanine nucleotide-binding protein (G protein) gamma-3 linked gene. This gene is implicated in the regulation of body fat distribution and insulin resistance and particularly in the auto-immune disease Berardinelli-Seip congenital lipodystrophy type 2. Seipin has no similarity with other known proteins or consensus motifs that might predict its function, but it is predicted to contain two transmembrane domains at residues 28-49 and 237-258, in humans, and a third transmembrane domain might be present at residues 155-173. Seipin may also be implicated in Silver spastic paraplegia syndrome and distal hereditary motor neuropathy type V PUBMED:11479539.

    \ ' '6604' 'IPR010642' '\

    This family consists of several invasion associated locus B (IalB) proteins and related sequences. IalB is known to be a major virulence factor in Bartonella bacilliformis where it was shown to have a direct role in human erythrocyte parasitism. IalB is up-regulated in response to environmental cues signalling vector-to-host transmission. Such environmental cues would include, but not be limited to, temperature, pH, oxidative stress, and haemin limitation. It is also thought that IalB would aid B. bacilliformis survival under stress-inducing environmental conditions PUBMED:12668141. The role of this protein in other bacterial species is unknown.

    \ ' '6605' 'IPR010643' '\

    This domain represents a conserved region within a number of eukaryotic DNA repair helicases.

    \ ' '6606' 'IPR010644' '\

    This family contains chlorite dismutase enzymes of bacterial and archaeal origin. This enzyme catalyses the disproportionation of chlorite into chloride and oxygen PUBMED:8929278. Note that many family members are hypothetical proteins.

    \ ' '6607' 'IPR010645' '\

    This entry represents the N terminus of several putative bacterial membrane proteins, which may be sugar transporters. Note that many members are hypothetical proteins.

    \ ' '6608' 'IPR009618' '\

    This entry represents the C terminus of bacterial Erp proteins that seem to be specific to Borrelia burgdorferi (a causative agent of Lyme disease). Borrelia Erp proteins are particularly heterogeneous, which might enable them to interact with a wide variety of host components PUBMED:12616490.

    \ ' '6609' 'IPR009619' '\

    This is a group of proteins of unknown function.

    \ ' '6610' 'IPR009620' '\

    This is a group of proteins of unknown function.

    \ ' '6611' 'IPR009621' '\

    This is a group of transmembrane proteins of unknown function.

    \ ' '6612' 'IPR009622' '\

    This is a group of proteins of unknown function.

    \ ' '6613' 'IPR009623' '\

    This is a group of proteins of unknown function.

    \ ' '6614' 'IPR009624' '\

    This is a group of proteins of unknown function.

    \ ' '6615' 'IPR009625' '\

    This is a group of proteins of unknown function.

    \ ' '6616' 'IPR010646' '\

    This is a group of proteins of unknown function.

    \ ' '6617' 'IPR009626' '\

    This is a group of proteins of unknown function.

    \ ' '6618' 'IPR009627' '\

    This is a group of proteins of unknown function.

    \ ' '6619' 'IPR009628' '\

    This entry represents a conserved region located towards the N-terminal end of prophage tail length tape measure protein (TMP). TMP is important for assembly of phage tails and involved in tail length determination. Mutated forms of TMP cause tail fibres to be shortened PUBMED:11040123.

    \ ' '6620' 'IPR008322' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '6621' 'IPR008321' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '6622' 'IPR010648' '\

    There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '6623' 'IPR009629' '\

    This family consists of several Erythrovirus X proteins, which seem to be found exclusively in human parvovirus and human erythrovirus. The function of this family is unknown.

    \ ' '6624' 'IPR010649' '\

    This family consists of several bacterial periplasmic nitrate reductase NapE proteins. Seven genes, napKEFDABC, encoding the periplasmic nitrate reductase system were cloned from the denitrifying phototrophic bacterium Rhodobacter sphaeroides. NapE is thought to be a transmembrane protein PUBMED:10227138.

    \ ' '6625' 'IPR009630' '\

    This family consists of several hypothetical proteins of around 415 residues in length which seem to be specific to the bacterium Leptospira interrogans.

    \ ' '6626' 'IPR010650' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \

    This entry is found at the C-terminus of PrkA proteins - bacterial and archaeal serine kinases approximately 630 residues in length. PrkA possesses the A-motif of nucleotide-binding proteins and exhibits distant homology to eukaryotic protein kinases PUBMED:8626065. Note that many of these are hypothetical.

    \ ' '6627' 'IPR009631' '\

    This family consists of several hypothetical plant and photosynthetic bacterial proteins of around 160 residues in length. The function of this family is unknown, although the species distribution suggests that the protein may play a part in photosynthesis.

    \ ' '6628' 'IPR010651' '\

    This is a family of bacterial sugar transporters approximately 300 residues long. Members include glucose uptake proteins PUBMED:10438764, ribose transport proteins, and several putative and hypothetical membrane proteins probably involved in sugar transport across bacterial membranes.

    \ ' '6630' 'IPR009633' '\

    This family consists of several Orthopoxvirus-specific proteins, predominantly of around 340 residues in length. This family contains both B17 and B15 proteins, the functions of which are unknown.

    \ ' '6631' 'IPR010652' '\

    This family represents a conserved region of approximately 60 residues within a number of hypothetical bacterial and archaeal proteins of unknown function.

    \ ' '6632' 'IPR010653' '\

    This family consists of a number of bacterial lipoproteins often known as NlpB or DapX. This lipoprotein is detected in outer membrane vesicles in Escherichia coli and appears to be nonessential PUBMED:1885529.

    \ ' '6633' 'IPR010654' '\

    This family consists of several Bacteriophage lambda tail assembly protein I and related phage and bacterial sequences. Members of this family are typically around 200 residues in length. The function of this family is unknown.

    \ ' '6634' 'IPR009634' '\

    This family consists of several putative phage excisionase proteins of around 80 residues in length.

    \ ' '6635' 'IPR010655' '\

    This family consists of several pre-mRNA cleavage complex II Clp1 (or HeaB) proteins. Six different protein factors are required in vitro for 3\' end formation of mammalian pre-mRNAs by endonucleolytic cleavage and polyadenylation. Clp1 is a subunit of cleavage complex IIA, which is required for cleavage, but not for polyadenylation of pre-mRNA PUBMED:11060040.

    \ ' '6636' 'IPR010656' '\

    This domain represents a conserved region located towards the N terminus of the DctM subunit of the bacterial and archaeal TRAP C4-dicarboxylate transport (Dct) system permease. In general, C4-dicarboxylate transport systems allow C4-dicarboxylates like succinate, fumarate, and malate to be taken up. TRAP C4-dicarboxylate carriers are secondary carriers that use an electrochemical H+ gradient as the driving force for transport. DctM is an integral membrane protein that is one of the constituents of TRAP carriers PUBMED:11803016, PUBMED:11524131. Note that many family members are hypothetical proteins.

    \ ' '6637' 'IPR009635' '\

    This family consists of several neural proliferation differentiation control-1 (NPDC1) proteins. NPDC1 plays a role in the control of neural cell proliferation and differentiation. It has been suggested that NPDC1 may be involved in the development of several secretion glands. This family also contains the C-terminal region of the Caenorhabditis elegans protein CAB-1 () which is known to interact with AEX-3 PUBMED:10970871.

    \ ' '6638' 'IPR009636' '\

    This family consists of several phage minor structural protein GP20 sequences of around 180 residues in length. The function of this family is unknown.

    \ ' '6640' 'IPR010657' '\

    This entry represents a conserved region located towards the N-terminal end of ImpA and related proteins. ImpA is an inner membrane protein, which has been suggested to be involved with proteins that are exported and associated with colony variations in Actinobacillus actinomycetemcomitans PUBMED:11083768. Note that many members are hypothetical proteins.

    \ ' '6641' 'IPR010658' '\

    This entry represents a conserved region within plant nodulin-like proteins.

    \ ' '6642' 'IPR009637' '\

    This family represents a conserved region within eukaryotic lung seven transmembrane receptors and related proteins.

    \ ' '6643' 'IPR010659' '\

    This domain is known as the connection domain. This domain lies between the thumb and palm domains PUBMED:1377403.

    \ ' '6644' 'IPR010660' '\

    NOTCH signalling plays a fundamental role during a great number of developmental processes in multicellular animals PUBMED:10221902. NOD (NOTCH protein domain) represents a region present in many NOTCH proteins and NOTCH homologues in multiple species such as 0, NOTCH2 and NOTCH3, LIN12, SC1 and TAN1. The role of the NOD domain remains to be elucidated.

    \ ' '6645' 'IPR010661' '\

    This domain is known as the thumb domain. It is composed of a four helix bundle PUBMED:1377403. Reverse transcriptase converts the viral RNA genome into double-stranded viral DNA. Reverse transcriptase often occurs in a polyprotein with integrase, ribonuclease H and/or protease, which is cleaved before the enzyme takes action. The impact of antiretroviral treatment on the first 400 amino acids of HIV reverse transcriptase is well known. Little is known, however, of the antiretroviral drug impact on the C-terminal domains of Pol, which include the thumb, connection and RNase H PUBMED:18335052. Evidence suggests that these might be well conserved domains.

    \ ' '6646' 'IPR009638' '\

    This family represents the eukaryotic Fez1 protein. Fez1 contains a leucine-zipper region with similarity to the DNA-binding domain of the cAMP-responsive activating-transcription factor 5 PUBMED:10097140. There is evidence that Fez1 inhibits cancer cell growth through regulation of mitosis, and that its alterations result in abnormal cell growth PUBMED:11504921. Note that some family members contain more than one copy of this region.

    \ ' '6647' 'IPR009639' '\

    This region of unknown function is found at the C terminus of some archaeal proteins that have multiple transmembrane domains and are predicted to be aspartic peptidases belonging to the MEROPS peptidase subfamily A24A (type 4 prepilin peptidase 1).

    \ ' '6648' 'IPR009640' '\

    This entry represents the C terminus of a prophage tail fibre protein found mostly in Escherichia coli. This domain is found together with a conserved RLGP motif.

    \ ' '6649' 'IPR010662' '\

    This family contains a number of hypothetical bacterial proteins of unknown function, which may be cytosolic.

    \ ' '6650' 'IPR009641' '\

    This family contains a number of viral proteins of unknown function.

    \ ' '6651' 'IPR009642' '\

    This family contains a number of hypothetical bacterial proteins of unknown function. Some family members contain more than one copy of the region represented by this family.

    \ ' '6652' 'IPR008313' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '6653' 'IPR009643' '\

    Heat shock factor binding protein 1 (HSBP1) appears to be a negative regulator of the heat shock response PUBMED:9649501.

    \ ' '6654' 'IPR006512' '\

    These sequences contain a domain that is duplicated in HI0035 of Haemophilus influenzae, in YidE and YbjL of Escherichia coli, and\ in a number of other putative transporters. Member proteins may have 0, 1, or 2 copies of the TrkA-C potassium uptake domain () between the duplications. The duplication appears distantly related to both the N- and the C-terminal domains of the sodium/hydrogen exchanger family domain (). The domain contains several apparent transmembrane regions and is proposed here to act in transport.

    \ ' '6655' 'IPR010663' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a zinc finger domain found at the C-terminal in both DNA glycosylase/AP lyase enzymes and in isoleucyl tRNA synthetase. In these two types of enzymes, the C-terminal domain forms a zinc finger. Some related proteins may not bind zinc.

    \

    DNA glycosylase/AP lyase enzymes are involved in base excision repair of DNA damaged by oxidation or by mutagenic agents. These enzymes have both DNA glycosylase activity () and AP lyase activity () PUBMED:11912217. Examples include formamidopyrimidine-DNA glycosylases (Fpg; MutM) and endonuclease VIII (Nei). Formamidopyrimidine-DNA glycosylase (Fpg, MutM) is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidation-damaged bases (N-glycosylase activity; ) and cleaves both the 3\'- and 5\'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity; ). Fpg has a preference for oxidised purines, excising oxidized purine bases such as 7,8-dihydro-8-oxoguanine (8-oxoG). Its AP (apurinic/apyrimidinic) lyase activity introduces nicks in the DNA strand, cleaving the DNA backbone by beta-delta elimination to generate a single-strand break at the site of the removed base with both 3\'- and 5\'-phosphates. Fpg is a monomer composed of 2 domains connected by a flexible hinge PUBMED:10921868. The two DNA-binding motifs (a zinc finger and the helix-two-turns-helix motifs) suggest that the oxidized base is flipped out from double-stranded DNA in the binding mode and excised by a catalytic mechanism similar to that of bifunctional base excision repair enzymes PUBMED:10921868. Fpg binds one ion of zinc at the C-terminus, which contains four conserved and essential cysteines PUBMED:8473347. Endonuclease VIII (Nei) has the same enzyme activities as Fpg above, but with a preference for oxidized pyrimidines, such as thymine glycol, 5,6-dihydrouracil and 5,6-dihydrothymine PUBMED:11847126, PUBMED:15232006.

    \

    An Fpg-type zinc finger is also found at the C-terminus of isoleucyl tRNA synthetase () PUBMED:10446055, PUBMED:7488160. This enzyme catalyses the attachment of isoleucine to tRNA(Ile). As IleRS can inadvertently accommodate and process structurally similar amino acids such as valine, to avoid such errors it has two additional distinct tRNA(Ile)-dependent editing activities. One activity is designated as \'pre-transfer\' editing and involves the hydrolysis of activated Val-AMP. The other activity is designated \'post-transfer\' editing and involves deacylation of mischarged Val-tRNA(Ile) PUBMED:16697013.

    \ \ \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '6658' 'IPR009646' '\

    The cells at the periphery of the root cap are continuously sloughed off from the root into the mucilage, and are thought to be programmed to die PUBMED:10427770. This family represents a conserved region approximately 60 residues in length within plant root cap proteins, which may be involved in the process.

    \ ' '6659' 'IPR015886' '\

    This entry represents a helix-two-turn-helix DNA-binding domain found in DNA glycosylase/AP lyase enzymes, which are involved in base excision repair of DNA damaged by oxidation or by mutagenic agents. Most damage to bases in DNA is repaired by the base excision repair pathway PUBMED:15588838. These enzymes are primarily from bacteria, and have both DNA glycosylase activity () and AP lyase activity (). Examples include formamidopyrimidine-DNA glycosylases (Fpg; MutM) and endonuclease VIII (Nei).

    \

    Formamidopyrimidine-DNA glycosylase (Fpg, MutM) is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidation-damaged bases (N-glycosylase activity; ) and cleaves both the 3\'- and 5\'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity; ). Fpg has a preference for oxidised purines, excising oxidized purine bases such as 7,8-dihydro-8-oxoguanine (8-oxoG). Its AP (apurinic/apyrimidinic) lyase activity introduces nicks in the DNA strand, cleaving the DNA backbone by beta-delta elimination to generate a single-strand break at the site of the removed base with both 3\'- and 5\'-phosphates. Fpg is a monomer composed of 2 domains connected by a flexible hinge PUBMED:10921868. The two DNA-binding motifs (a zinc finger and the helix-two-turns-helix motifs) suggest that the oxidized base is flipped out from double-stranded DNA in the binding mode and excised by a catalytic mechanism similar to that of bifunctional base excision repair enzymes PUBMED:10921868. Fpg binds one ion of zinc at the C-terminus, which contains four conserved and essential cysteines PUBMED:8473347, PUBMED:7704272.

    \

    Endonuclease VIII (Nei) has the same enzyme activities as Fpg above (, ), but with a preference for oxidized pyrimidines, such as thymine glycol, 5,6-dihydrouracil and 5,6-dihydrothymine PUBMED:15232006.

    \

    These proteins contain three structural domains: an N-terminal catalytic core domain, a central helix-two-turn-helix (H2TH) module and a C-terminal zinc finger (see PDB:1K82) PUBMED:11912217. The N-terminal catalytic domain and the C-terminal zinc finger straddle the DNA with the long axis of the protein oriented roughly orthogonal to the helical axis of the DNA. Residues that contact DNA are located in the catalytic domain and in a beta-hairpin loop formed by the zinc finger PUBMED:12055620.

    \ \

    This entry represents the central domain containing the DNA-binding helix-two-turn-helix domain PUBMED:11912217.

    \ ' '6660' 'IPR009647' '\

    This conserved region of approximately 90 residues is found in a sub-group of bacterial Penicillin-Binding Proteins (PBPs). A variable length loop region separates this region from the transpeptidase unit (). It is predicted to be a beta fold.

    \ ' '6661' 'IPR009648' '\

    This family consists of several bacterial malonate decarboxylase gamma subunit proteins. Malonate decarboxylase of Klebsiella pneumoniae consists of four different subunits and catalyses the conversion of malonate plus H+ to acetate and CO2. The catalysis proceeds via acetyl and malonyl thioester residues with the phosphoribosyl-dephospho-CoA prosthetic group of the acyl carrier protein (ACP) subunit. MdcD and E together probably function as malonyl-S-ACP decarboxylase PUBMED:9208947.

    \ \ \

    Malonate decarboxylase may be a soluble enzyme, or linked to membrane subunits and active as a sodium pump. In the malonate decarboxylase complex, the beta subunit appears to act as a malonyl-CoA decarboxylase, while the gamma subunit appears either to mediate subunit interaction or to act as a co-decarboxylase with the beta subunit. The beta and gamma subunits exhibit some local sequence similarity.

    \ ' '6662' 'IPR009649' '\

    This family consists of several bacterial TraU proteins. TraU appears to be more essential to conjugal DNA transfer than to assembly of pilus filaments PUBMED:2198250.

    \ ' '6663' 'IPR010664' '\

    This family consists of several hypothetical bacterial proteins of around 190 residues in length. The function of this family is unknown.

    \ ' '6664' 'IPR010665' '\

    This family consists of a number of hypothetical putative membrane proteins which seem to be specific to Yersinia pestis. The function of this family is unknown.

    \ ' '6665' 'IPR009650' '\

    This family consists of several Fijivirus specific P9-2 proteins from Rice black streaked dwarf virus (RBSDV) and Fiji disease virus. The function of this family is unknown.

    \ ' '6666' 'IPR009651' '\

    This family represents the aluminium resistance protein, which confers resistance to aluminium in bacteria PUBMED:9367855.

    \ ' '6667' 'IPR010666' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This presumed zinc-binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to .

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '6668' 'IPR009652' '\

    This family consists of several programmed cell death 10 protein (PDCD10 or TFAR15) sequences. The function of this family is unknown.

    \ ' '6669' 'IPR010667' '\

    This family consists of several tail tube protein gp19 sequences from the T4-like viruses PUBMED:3363870, PUBMED:2403438.

    \ ' '6670' 'IPR009653' '\

    This family consists of a number of eukaryotic proteins of around 72 residues in length. The function of this family is unknown.

    \ ' '6672' 'IPR009654' '\

    This family consists of several short bacterial proteins of around 100 residues in length. The function of this family is unknown.

    \ ' '6675' 'IPR009655' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This region of unknown function is found at the C terminus of archaeal preflagellin aspartic acid signal peptidases PUBMED:14622420. \ The preflagellin peptidase is a membrane-bound enzyme topologically similar to its counterpart in the type IV pilus system (prepilin peptidase); the two enzymes utilize the same catalytic mechanism PUBMED:16983194. The preflagellin peptidase is required for the removal of the leader peptide from archaeal flagellin PUBMED:14622420.

    \ \ \ \

    Preflagellin aspartic acid signal peptidases belong to the MEROPS peptidase family A24B (preflagellin peptidase, clan AD).

    \ ' '6676' 'IPR010671' '\

    This entry describes several repeats which seem to be specific to the Methanosarcina archaea species and are often found in multiple copies in disaggregatase proteins. Members of this family are also found in single copies in several hypothetical proteins.

    \ ' '6677' 'IPR010672' '\

    The last two steps of de novo purine biosynthesis are: \

    \ \ In bacteria and eukaryotes, these steps are catalysed by the well-characterised bifunctional enzyme PurH PUBMED:11323713. Archaea do not appear to possess PurH, however, and perform these reactions by a different mechanism PUBMED:9150241. In archaea, step i) is catalysed by the well-conserved PurP protein, while step ii) is catalysed by the PurO enzyme in some (though not all) species PUBMED:15623504, PUBMED:11844782.

    \ \

    This entry represents the N-terminal domain of PurP. Its function is not known, though it is almost always found in association with .

    \ ' '6678' 'IPR009656' '\

    This entry represents the C terminus of bacterial poly(3-hydroxybutyrate) (PHB) de-polymerase. This degrades PHB granules to oligomers and monomers of 3-hydroxy-butyric acid.

    \ ' '6679' 'IPR009657' '\

    This family contains a number of hypothetical viral proteins of unknown function approximately 200 residues long.

    \ ' '6680' 'IPR009658' '\

    This entry represents a conserved region within a number of proteins of unknown function that seem to be specific to Caenorhabditis elegans. Note that some proteins in the entry contain more than one copy of this region.

    \ ' '6681' 'IPR009659' '\

    This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown.

    \ ' '6682' 'IPR009660' '\

    This family consists of bacteriophage Gp15 proteins and related bacterial sequences. Two proteins, gene product (gp)15 and gp3, are required to complete the assembly of the T4 tail. Gp15 forms the connector which enables the tail to bind to the head and gp3 is involved in terminating the elongation of the tail tube. Both proteins are hexamers of the respective polypeptide chains PUBMED:12591887.

    \ ' '6683' 'IPR010673' '\

    This family consists of several short hypothetical bacterial proteins of around 70 residues in length. Members of this family seem to all belong to the order Bacillales or Lactobacillales. The function of this family is unknown.

    \ ' '6684' 'IPR009661' '\

    This family consists of the N-terminal region of several hypothetical Nucleopolyhedrovirus proteins of unknown function.

    \ ' '6685' 'IPR009662' '\

    This family consists of the acyl carrier protein, also called the delta subunit, of malonate decarboxylase. This subunit has the same covalently bound prosthetic group, derived from and similar to coenzyme A, as does citrate lyase, although this protein and the acyl carrier protein of citrate lyase do not show significant sequence similarity. Both malonyl and acetyl groups are transferred to the prosthetic group for catalysis.

    \ \ \

    Malonate decarboxylase of Klebsiella pneumoniae consists of four different subunits and catalyses the conversion of malonate plus H+ to acetate and CO2. The catalysis proceeds via acetyl and malonyl thioester residues with the phosphoribosyl-dephospho-CoA prosthetic group of the acyl carrier protein (ACP) subunit. MdcC is the (apo) ACP subunit PUBMED:9208947.

    \ ' '6686' 'IPR010674' '\

    This domain represents a conserved region of approximately 60 residues in length within nucleolar GTP-binding protein 1 (NOG1). The NOG1 family includes eukaryotic, bacterial and archaeal proteins. In Saccharomyces cerevisiae, the NOG1 gene has been shown to be essential for cell viability, suggesting that NOG1 may play an important role in nucleolar functions. In particular, NOG1 is believed to be functionally linked to ribosome biogenesis, which occurs in the nucleolus. In eukaryotes, NOG1 mutants were found to disrupt the biogenesis of the 60S ribosomal subunit PUBMED:12788953.

    \

    The DRG and OBG proteins as well as the prokaryotic NOG-like proteins are homologous throughout their length to the amino half of eukaryotic NOG1, which contains the GTP binding motifs (); the N-terminal GTP-binding motif is required for function.

    \ ' '6687' 'IPR010675' '\

    This entry represents a conserved region of approximately 120 residues within eukaryotic Bicoid-interacting protein 3 (Bin3). Bin3, which shows similarity to a number of protein methyltransferases that modify RNA-binding proteins, interacts with Bicoid, which itself directs pattern formation in the early Drosophila embryo. The interaction might allow Bicoid to switch between its dual roles in transcription and translation PUBMED:10717484. Note that proteins of the entry contain a conserved HLN motif.

    \ ' '6689' 'IPR010677' '\

    Epstein-Barr virus (strain GD1) (HHV-4), a human tumour DNA virus and a prominent member of the gamma-herpesviruses, encodes two viral homologues of cellular antiapoptotic Bcl-2 proteins, BALF1 and BHRF1. They protect the virus from apoptosis in its host cell during virus synthesis PUBMED:16277553, PUBMED:16087121. The virus infects B lymphocytes to establish a latent infection and yield proliferating, growth-transformed B cells in vitro. The viral Bcl-2 genes are essential for the initial evasion of apoptosis, which allows the virus to establish a latent infection or cause cellular transformation, or both PUBMED:16277553.

    \ \

    Bcl-2 family proteins can inhibit or induce programmed cell death in part by counteracting the activity of other BCL-2 family members. BALF1 inhibits the antiapoptotic activity of EBV BHRF1 and of KSBcl-2 in several transfected cell lines. BALF1 fails, however, to inhibit the cellular BCL-2 family member BCL-x(L). Thus, BALF1 acts as a negative regulator of the survival function of BHRF1, similar to the counterbalance observed between cellular BCL-2 family members PUBMED:11836425.\

    \ ' '6690' 'IPR010678' '\

    This family is defined by a C-terminal region of approximately 500 residues, which occurs in several hypothetical eukaryotic proteins of unknown function.

    \ ' '6691' 'IPR010679' '\

    This family represents a conserved region about 130 residues long within hypothetical proteins of unknown function. Family members include eukaryotic, bacterial and archaeal proteins.

    \ ' '6692' 'IPR009663' '\

    This family consists of several enterobacterial PilO proteins. The function of PilO is unknown although it has been suggested that it is a cytoplasmic protein in the absence of other Pil proteins, but PilO protein is translocated to the outer membrane in the presence of other Pil proteins. Alternatively, PilO protein may form a complex with other Pil protein(s). PilO has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body PUBMED:11751821. This family does not seem to be related to .

    \ ' '6693' 'IPR009664' '\

    This family consists of several conserved hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown.

    \ ' '6694' 'IPR009665' '\

    This family consists of several uncharacterised bacterial proteins, which seem to be specific to the orders Clostridia and Bacillales. Family members are typically around 180 residues in length. The function of this family is unknown.

    \ ' '6696' 'IPR009666' '\

    This family contains hypothetical proteins of unknown function that are approximately 120 residues long. Family members include eukaryotic and bacterial proteins.

    \ ' '6697' 'IPR009667' '\

    This family represents a conserved region approximately 260 residues long within a number of hypothetical proteins of unknown function that seem to be specific to Caenorhabditis elegans. Note that this family contains a number of conserved cysteine and histidine residues.

    \ ' '6698' 'IPR009668' '\

    Saccharomyces cerevisiae A49 is a specific subunit associated with RNA polymerase I (Pol I) in eukaryotes. Pol I maintains transcription activities in A49 deletion mutants. However, such mutants are deficient in transcription activity at low temperatures. Deletion analysis of the fission yeast homologue indicates that only the C-terminal two thirds are required for function. Transcript analysis has demonstrated that A49 maximises transcription of ribosomal DNA PUBMED:12893961.

    \ ' '6699' 'IPR010680' '\

    This family consists of several TraH proteins, which seem to be specific to Agrobacterium and Rhizobium species. This protein is thought to be involved in conjugal transfer but its function is unknown. This family does not appear to be related to .

    \ ' '6700' 'IPR009669' '\

    This entry represents a family of bacterial virulence proteins, including EspG from Citrobacter rodentium and Escherichia coli and VirA from Shigella flexneri. Both EspG and VirA are delivered into infected host epithelial cells by a type III secretory system PUBMED:11349072, PUBMED:18312845. These proteins function through the disruption of the host cell microtubule network PUBMED:15972534. VirA acts as a cysteine protease () on alpha-tubulin, a major component of microtubules, in order to destabilise surrounding microtubules and invade the cytoplasm of their target host cells PUBMED:17095701. VirA also promotes the formation of membrane ruffles through the activation of host rac1, which is associated with the destruction of microtubule networks PUBMED:12065406. In this way, VirA creates a tunnel inside the host cell cytoplasm by breaking down the microtubule infrastructure, which facilitates the bacterium\'s movement through the cytoplasm and also helps other bacteria move faster during the invasion of the eukaryotic cell.

    \ ' '6701' 'IPR009670' '\

    This family consists of several cell surface immobilisation antigen SerH proteins which seem to be specific to Tetrahymena thermophila. The SerH locus of T. thermophila is one of several paralogous loci with genes encoding variants of the major cell surface protein known as the immobilisation antigen (i-ag) PUBMED:11973302.

    \ ' '6702' 'IPR009164' '\

    Fructose 1,6-bisphosphatase catalyses the hydrolysis of fructose 1,6-bisphosphate to fructose 6-phosphate PUBMED:3008716. This is an essential reaction in gluconeogenesis, the process by which non-carbohydrate precursors are converted to glucose, and hence this enzyme is found almost universally. Enzyme activity can be regulated by a number of different mechanisms, including AMP inhibition, cyclic AMP-dependent phosphorylation and light-dependent activation.

    \ \

    This entry represents a group of fructose 1,6-bisphosphatases found within the Firmicutes (low GC Gram-positive bacteria) which do not show any significant sequence similarity to the enzymes from other organisms. The Bacillus subtilis enzyme is inhibited by AMP, though this can be overcome by phosphoenolpyruvate, and is dependent on Mn(2+) PUBMED:221467, PUBMED:9696785. Mutants lacking this enzyme are apparently still able to grow on gluconeogenic growth substrates such as malate and glycerol.

    \ ' '6703' 'IPR010681' '\

    This family consists of several plethodontid receptivity factor (PRF) proteins which seem to be specific to Plethodon jordani (Jordan\'s salamander). PRF is a courtship pheromone produced by males to increase female receptivity PUBMED:10489368.

    \ ' '6704' 'IPR010682' '\

    This family consists of several plant self-incompatibility response (SCRL) proteins. The male component of the self-incompatibility response in Brassica has been shown to be encoded by the S locus cysteine-rich gene (SCR). SCR is related, at the sequence level, to the pollen coat protein (PCP) gene family whose members encode small, cysteine-rich proteins located in the proteo-lipidic surface layer (tryphine) of Brassica pollen grains PUBMED:11437247.

    \ ' '6705' 'IPR009671' '\

    This entry occurs in several hypothetical bacterial proteins of around 120 residues in length. The function of these proteins is unknown. The protein structure has been determined for one member of this group, the hypothetical protein VCO424 from Vibrio cholerae; it has an alpha+beta sandwich fold.

    \ ' '6706' 'IPR009672' '\

    This family consists of several Pkip-1 proteins, which seem to be specific to Nucleopolyhedroviruses. The function of this family is unknown although it has been found that Pkip-1 is not essential for virus replication in cell culture or by in vivo intrahaemocoelic injection PUBMED:12867634.

    \ ' '6707' 'IPR009673' '\

    This family contains hypothetical proteins of unknown function that are approximately 200 residues long. They seem to be specific to Caenorhabditis elegans.

    \ ' '6708' 'IPR010683' '\

    This family represents a conserved region within a number of proteins of unknown function that seem to be specific to Arabidopsis thaliana. Note that some family members contain more than one copy of this region.

    \ ' '6709' 'IPR010684' '\

    This family represents a conserved region within RNA polymerase II transcription factor SIII (Elongin) subunit A. In mammals, the Elongin complex activates elongation by RNA polymerase II by suppressing transient pausing of the polymerase at many sites within transcription units. Elongin is a heterotrimer composed of A, B, and C subunits of 110, 18, and 15 kilodaltons, respectively. Subunit A has been shown to function as the transcriptionally active component of Elongin PUBMED:7660129.

    \ ' '6710' 'IPR010685' '\

    This entry represents a conserved region located towards the C terminus of a number of proteins of unknown function that seem to be specific to Oryza sativa.

    \ ' '6711' 'IPR009674' '\

    This domain is found between domain 3 and domain 5, but shows no homology to domain 4 of Rpb2. The external domains in multisubunit RNA polymerase (those most distant from the active site) are known to demonstrate more sequence variability PUBMED:11313498.

    \ ' '6712' 'IPR010686' '\

    This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 200 residues long. Some family members are annotated as putative lipoproteins.

    \ ' '6714' 'IPR009675' '\

    This family represents a conserved region approximately 60 residues long within the eukaryotic targeting protein for Xklp2 (TPX2). Xklp2 is a kinesin-like protein localised on centrosomes throughout the cell cycle and on spindle pole microtubules during metaphase. In Xenopus, it has been shown that Xklp2 protein is required for centrosome separation and maintenance of spindle bi-polarity PUBMED:8548825. TPX2 is a microtubule-associated protein that mediates the binding of the C-terminal domain of Xklp2 to microtubules. It is phosphorylated during mitosis in a microtubule-dependent way PUBMED:10871281.

    \ ' '6715' 'IPR009676' '\

    This family represents a conserved region approximately 50 residues long within a number of proteins of unknown function that seem to be restricted to Caenorhabditis elegans.

    \ ' '6716' 'IPR006384' '\

    This group of sequences belongs to the IB subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. With exceptions from Bacillus subtilis and Clostridium acetobutylicum, the members of this group are all eukaryotic, spanning metazoa, plants and fungi.

    \ ' '6717' 'IPR009677' '\

    This family consists of several hypothetical bacterial proteins of around 235 residues in length. Members of this family seem to be found exclusively in the Enterobacteria Salmonella typhimurium and Escherichia coli. The function of this family is unknown.

    \ ' '6718' 'IPR010688' '\

    Bacteriophage Mu is a double-stranded DNA phage. It has an icosahedral head, a contractile tail with baseplate and six tail fibres. It is similar to the well-studied T-even phages. The baseplate of bacteriophage Mu, which recognises and attaches to a host cell during infection, consists of at least eight different proteins PUBMED:16125724.

    \

    Bacteriophage Mu is used as a model for DNA transposition events in other systems. This entry contains Bacteriophage Mu Gp45 and related proteins from viruses and from prophage sequences.

    \ \

    The baseplate protein, gp44, is essential for Bacteriophage Mu assembly and the generation of viable phages PUBMED:16125724. The overall structure of the gp44 trimer is similar to that of the Bacteriophage T4 gp27 trimer but they share little primary sequence homology. Gp44 forms the central hub of the T4 baseplate. DNA can pass through this hub during infection PUBMED:16125724.

    \ \

    In T4, the activated transcription of the late genes occurs in a manner distinct from the usual mechanisms of transcriptional regulation PUBMED:18455735. Gp45, the viral replisome\'s sliding clamp, activates transcription and the two sliding-clamp-binding proteins, gp33 and gp55, replace the host RNA polymerase (RNAP) sigma subunit PUBMED:16807240, PUBMED:18455735.

    \ \

    The DNA polymerase holoenzyme is responsible for accurate DNA synthesis. The holoenzyme consists of DNA polymerase gp43 and clamp protein gp45. To form a productive holoenzyme complex, the clamp loader protein gp44/62 is required, along with MgATP, for the loading of gp45 as well as for the subsequent binding of polymerase to the loaded clamp PUBMED:16800624.

    \ ' '6719' 'IPR009678' '\

    This family consists of P2 phage tail completion protein R (GpR) like sequences. GpR is thought to be a tail completion protein which is essential for stable head joining PUBMED:8178426.

    \ ' '6720' 'IPR009679' '\

    This family consists of several phage regulatory protein CII (CP76) sequences. These are thought to be DNA-binding proteins involved in the establishment of lysogeny PUBMED:3806670.

    \ ' '6722' 'IPR010027' '\

    This entry identifies a family of bacteriophage proteins including G of phage lambda. This protein has been described as undergoing a translational frameshift at a Gly-Lys dipeptide near the C terminus of protein G from phage lambda, with about 4% efficiency, to produce tail assembly protein G-T.

    \ ' '6723' 'IPR009680' '\

    This family consists of several Lactococcus lactis and Lactococcus bacteriophage proteins of around 74 residues in length. The function of this family is unknown.

    \ ' '6724' 'IPR009681' '\

    This family consists of several bacterial and phage proteins of around 115 residues in length. The function of this family is unknown.

    \ ' '6725' 'IPR009200' '\ There are currently no experimental data for members of this group or their homologues. However, these proteins are predicted to contain two or more transmembrane segments.\ ' '6726' 'IPR010690' '\

    This family consists of several putative bacterial stage IV sporulation (SpoIV) proteins. YqfD of Bacillus subtilis () is known to be essential for efficient sporulation although its exact function is unknown PUBMED:12662922.

    \ ' '6727' 'IPR010691' '\

    This family consists of several WzyE proteins, which appear to be specific to Enterobacteria. Members of this family are described as putative ECA polymerases, but this has been found to be incorrect PUBMED:11673418. The function of this family is unknown.

    \ ' '6728' 'IPR009682' '\

    This family consists of several hypothetical Staphylococcus aureus and phage proteins of 53 residues in length. The function of this family is unknown.

    \ ' '6729' 'IPR010692' '\

    This family consists of several RTX iron-regulated FrpC proteins which appear to be found exclusively in Neisseria meningitidis. FrpC has been shown to be related to the RTX family of bacterial cytotoxins. FrpC is found in the meningococcal outer membrane. The function of this family is unknown although it is thought to be a virulence factor PUBMED:12654851.

    \ ' '6730' 'IPR010693' '\

    This entry is represented by bacterial ferredoxins such as Ferredoxin-1, -2 and -soy from Streptomyces griseolus and Ferredoxin fas2 from Rhodococcus fascians, plus several bacterial hypothetical proteins that contain three highly conserved cysteine residues. These ferredoxins each bind a 3Fe-4S cluster. Ferredoxin-soy (SoyB) acts as an electron transport protein for the cytochrome P450-SOY system PUBMED:8483414. Ferredoxin-1 (SuaB) and Ferredoxin-2 (SubB) act as electron transport proteins for the herbicide-metabolising cytochrome P-450 SU1 and SU2 systems, respectively PUBMED:1551600, PUBMED:18314962. Ferredoxin-fas2 also plays a role in electron transfer; the fas operon encodes genes involved in cytokinin production and in host plant fasciation (leafy gall).

    \ ' '6731' 'IPR010694' '\

    This family consists of several bacterial VirK proteins of around 145 residues in length. The function of this family is unknown PUBMED:11434457.

    \ ' '6732' 'IPR009683' '\

    This entry represents the C terminus (approx. 120 residues) of a number of bacterial extensin-like proteins. Extensins are cell wall glycoproteins normally associated with plants, where they strengthen the cell wall in response to mechanical stress PUBMED:8148875. Many proteins in this entry are hypothetical.

    \ ' '6733' 'IPR010695' '\

    This family consists of several fas apoptotic inhibitory molecule (FAIM) proteins. FAIM expression is upregulated in B cells by anti-Ig treatment that induces Fas-resistance, and overexpression of FAIM diminishes sensitivity to Fas-mediated apoptosis of B and non-B cell lines. FAIM is highly evolutionarily conserved and is widely expressed in murine tissues, suggesting that FAIM plays an important role in cellular physiology PUBMED:11483211.

    \ ' '6734' 'IPR010696' '\

    This family consists of several hypothetical bacterial proteins of around 80 residues in length. This family contains a number of conserved cysteine residues and its function is unknown.

    \ ' '6735' 'IPR009684' '\

    This family consists of several animal specific latexin and proteins related to latexin that belong to MEROPS proteinase inhibitor family I47, clan I- PUBMED:14705960.

    \

    Latexin, a protein possessing inhibitory activity against rat carboxypeptidase A1 (CPA1) and CPA2 (MEROPS peptidase family M14A), is expressed in a neuronal subset in the cerebral cortex and in cells in other neural and non-neural tissues of rat PUBMED:10698712, PUBMED:11455960. OCX-32, the 32 kDa eggshell matrix protein, is present at high levels in the uterine fluid during the terminal phase of eggshell formation, and is localised predominantly in the outer eggshell. The timing of OCX-32 secretion into the uterine fluid suggests that it may play a role in the termination of mineral deposition PUBMED:12952168. OCX-32 protein possesses limited identity (32%) to two unrelated proteins: latexin and a skin protein that is encoded by a retinoic acid receptor-responsive gene, TIG1. Tazarotene Induced Gene 1 (TIG1) is a putative 228-amino-acid transmembrane protein with a small N-terminal intracellular region, a single membrane-spanning hydrophobic region, and a large C-terminal extracellular region containing a glycosylation signal. TIG1 is up-regulated by retinoic acid receptor-specific but not by retinoid X receptor-specific synthetic retinoids PUBMED:8601727. TIG1 may be a tumour suppressor gene whose diminished expression is involved in the malignant progression of prostate cancer PUBMED:11929948.

    \ \ ' '6736' 'IPR010697' '\

    This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.

    \ ' '6738' 'IPR009685' '\

    This family consists of several mammalian male enhanced antigen 1 (MEA1) proteins. The Mea-1 gene is found to be localised in primary and secondary spermatocytes and spermatids, but the protein products are detected only in spermatids. Intensive transcription of the Mea-1 gene and specific localisation of the gene product suggest that Mea-1 may play an important role in the late stage of spermatogenesis PUBMED:8907304.

    \ ' '6739' 'IPR009686' '\

    This family contains a number of plant senescence-associated proteins of approximately 450 residues in length. In Hemerocallis, petals have a genetically based program that leads to senescence and cell death approximately 24 hours after the flower opens, and it is believed that senescence proteins produced around that time have a role in this program PUBMED:10412903.

    \ ' '6740' 'IPR010699' '\

    This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although a few members are thought to be membrane proteins.

    \ ' '6743' 'IPR010701' '\

    This family consists of several hypothetical plant specific proteins of around 150 residues in length. Members of this family contain several conserved cysteine residues. The function of the family is unknown.

    \ ' '6744' 'IPR009688' '\

    This entry represents the C terminus (approx. 120 residues) of a number of eukaryotic proteins of unknown function.

    \ ' '6745' 'IPR010702' '\

    This family consists of several Enterobacterial periplasmic pectate lyase proteins. A major virulence determinant of the plant-pathogenic enterobacterium Erwinia chrysanthemi is the production of pectate lyase enzymes that degrade plant cell walls PUBMED:12423024.

    \ ' '6746' 'IPR009689' '\

    This family represents a conserved region approximately 200 residues long within a number of proteins of unknown function that seem to be specific to Caenorhabditis elegans.

    \ ' '6747' 'IPR009690' '\

    This family consists of several phage Gp30.7 proteins of 121 residues in length. Family members seem to be exclusively from the T4-like viruses. The function of this family is unknown.

    \ ' '6748' 'IPR010703' '\

    This family represents a conserved region of approximately 200 residues within a number of eukaryotic dedicator of cytokinesis (DOCK) proteins. These proteins are potential guanine nucleotide exchange factors that activate some small GTPases, such as Rac, by exchanging bound GDP for free GTP PUBMED:12432077. DOCK proteins are required during several cellular processes, such as cell motility and phagocytosis. For instance, DOCK2 is specifically expressed in haemopoietic cells, and plays a critical role in lymphocyte migration PUBMED:12829596.

    \ ' '6750' 'IPR009692' '\

    This entry represents the 13 kDa P13 protein from Citrus tristeza virus (CTV) strains. CTV, a member of the closterovirus group, is one of the more complex single-stranded RNA viruses PUBMED:9024813. The function of the P13 protein is unknown.

    \ ' '6751' 'IPR009693' '\

    This family consists of several glucitol operon activator (GutM) proteins. Expression of the glucitol (gut) operon in Escherichia coli is regulated by an unusual, complex system, which consists of an activator (encoded by the gutM gene) and a repressor (encoded by the gutR gene) in addition to the cAMP-CRP complex (CRP, cAMP receptor protein). Synthesis of the mRNA, which initiates at the promoter specific to the gutR gene, occurs within the gutM gene. Expressional control of the gut operon appears to occur as a consequence of the antagonistic action of the products of the autogenously regulated gutM and gutR genes PUBMED:3062173.

    \ ' '6752' 'IPR009694' '\

    This family consists of several hypothetical enterobacterial proteins of around 170 residues in length. Members of this family are found in Escherichia coli, Salmonella typhimurium and Shigella species. The function of this family is unknown.

    \ ' '6753' 'IPR009695' '\

    This family represents a conserved region of approximately 180 residues within plant and bacterial monogalactosyldiacylglycerol (MGDG) synthase (). In Arabidopsis, there are two types of MGDG synthase which differ in their N-terminal portion: type A and type B PUBMED:11553816.

    \ ' '6754' 'IPR009696' '\

    This entry represents the C terminus (approximately 100 residues) of a putative replisome organiser protein in Lactococcus bacteriophages PUBMED:11157223.

    \ ' '6757' 'IPR009697' '\

    This family consists of several Rotavirus specific VP3 proteins. VP3 is known to be a viral guanylyltransferase and is thought to possess methyltransferase activity; it is therefore predicted to be a multifunctional capping enzyme PUBMED:10603323.

    \ ' '6758' 'IPR009698' '\

    This family consists of several hypothetical proteins of around 200 residues in length. The function of this family is unknown although a number of family members are thought to be putative membrane proteins.

    \ ' '6759' 'IPR009699' '\

    This family consists of several Mastadenovirus E4 ORF3 proteins. Early proteins E4 ORF3 and E4 ORF6 have complementary functions during viral infection. Both proteins facilitate efficient viral DNA replication, late protein expression, and prevention of concatenation of viral genomes. A unique function of E4 ORF3 is the reorganisation of nuclear structures known as PML oncogenic domains (PODs). The function of these domains is unclear, but PODs have been implicated in a number of important cellular processes, including transcriptional regulation, apoptosis, transformation, and response to interferon PUBMED:12692231.

    \ ' '6760' 'IPR009700' '\

    This family consists of several hypothetical proteins of around 115 residues in length, which seem to be specific to Enterobacteria. The function of the family is unknown.

    \ ' '6761' 'IPR009701' '\

    This family consists of several special lobe-specific silk protein SSP160 sequences which appear to be specific to Chironomus (Midge) species.

    \ ' '6762' 'IPR010706' '\

    This family consists of several fatty acid cis/trans isomerase proteins, which appear to be found exclusively in bacteria of the orders Vibrionales and Pseudomonadales. Cis/trans isomerase (CTI) catalyses the cis-trans isomerisation of esterified fatty acids in phospholipids, mainly cis-oleic acid (C(16:1,9)) and cis-vaccenic acid (C(18:1,11)), in response to solvents. The CTI protein has been shown to be involved in solvent resistance in Pseudomonas putida PUBMED:10482510.

    \ ' '6763' 'IPR009702' '\

    This family consists of several hypothetical bacterial and archaeal proteins of around 130 residues in length. The function of this family is unknown, although it is thought that they may be iron-sulphur binding proteins.

    \ ' '6764' 'IPR009703' '\

    This family consists of several mammalian selenoprotein S (SelS) sequences. SelS is a plasma membrane protein and is present in a variety of tissues and cell types. These proteins are involved in the degradation of misfolded endoplasmic reticulum (ER) luminal proteins: they participate in the transfer of misfolded proteins from the ER to the cytosol, where they are destroyed by the proteasome in a ubiquitin-dependent manner PUBMED:12477932. They probably serve as a linker between DER1, which mediates the retro-translocation of misfolded proteins into the cytosol, and the ATPase complex VCP, which mediates the translocation and ubiquitination.

    \ ' '6765' 'IPR009704' '\

    This family consists of several animal EURL proteins. EURL is preferentially expressed in chick retinal precursor cells as well as in the anterior epithelial cells of the lens at early stages of development. EURL transcripts are found primarily in the peripheral dorsal retina, i.e., the most undifferentiated part of the dorsal retina. EURL transcripts are also detected in the lens at stage 18 and remain abundant in the proliferating epithelial cells of the lens until at least day 11. The distribution pattern of EURL in the developing retina and lens suggests a role before the events leading to cell determination and differentiation PUBMED:12815627.

    \ ' '6766' 'IPR010707' '\

    This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown.

    \ ' '6767' 'IPR009705' '\

    This family consists of several hypothetical archaeal proteins of around 120 residues in length. All members of this family seem to be Sulfolobus species specific. The function of this family is unknown.

    \ ' '6768' 'IPR009706' '\

    This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown.

    \ ' '6769' 'IPR010708' '\

    This family consists of several 5\' nucleotidase, deoxy (Pyrimidine), and cytosolic type C (NT5C) proteins. 5\'(3\')-deoxyribonucleotidase is a ubiquitous enzyme in mammalian cells whose physiological function is not known PUBMED:10681516.

    \ ' '6770' 'IPR009707' '\

    This family consists of several bacterial GlpM membrane proteins. GlpM is a hydrophobic protein containing 109 amino acids. It is thought that GlpM may play a role in alginate biosynthesis in Pseudomonas aeruginosa PUBMED:7642508.

    \ ' '6771' 'IPR005735' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This model describes a putative zinc finger domain found in three closely spaced copies in Arabidopsis protein LSD1 and in two copies in other proteins from the same species. The motif resembles CxxCRxxLMYxxGASxVxCxxC PUBMED:9054508. This domain may play a role in the regulation of transcription, via either repression of a prodeath pathway or activation of an antideath pathway, in response to signals emanating from cells undergoing\ pathogen-induced hypersensitive cell death.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '6773' 'IPR010710' '\

    This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N terminus. The function of this family is unknown.

    \ ' '6774' 'IPR009708' '\

    This family consists of several Listeria bacteriophage holin proteins and related bacterial sequences. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the build-up of a holin oligomer which causes the lysis PUBMED:11459934.

    \ ' '6775' 'IPR009709' '\

    This family consists of several bacterial small basic proteins of around 100 residues in length. The function of this family is unknown.

    \ ' '6777' 'IPR009711' '\

    This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown.

    \ ' '6778' 'IPR009712' '\

    This family consists of several bacterial and phage proteins of around 115 residues in length. The function of this family is unknown.

    \ ' '6779' 'IPR010711' '\

    This family consists of several group XII secretory phospholipase A2 precursor (PLA2G12) proteins. Group XII and group V PLA(2)s are thought to participate in helper T cell immune response through release of immediate second signals and generation of downstream eicosanoids PUBMED:11278438.

    \ ' '6780' 'IPR009713' '\

    This family consists of several Enterobacterial PsiA proteins. The function of PsiA is unknown although it is thought that it may affect the generation of an SOS signal in Escherichia coli PUBMED:3526338.

    \ ' '6781' 'IPR010712' '\

    This family consists of several bacterial arsenical resistance operon trans-acting repressor ArsD proteins. ArsD is a trans-acting repressor of the arsRDABC operon that confers resistance to arsenicals and antimonials in Escherichia coli. It possesses two pairs of vicinal cysteine residues, Cys(12)-Cys(13) and Cys(112)-Cys(113), that potentially form separate binding sites for the metalloids that trigger dissociation of ArsD from the operon. However, as a homodimer it has four vicinal cysteine pairs PUBMED:11980902.

    \ ' '6782' 'IPR009714' '\

    This family consists of several mammalian resistin proteins. Resistin is a 12.5 kDa cysteine-rich secreted polypeptide first reported from rodent adipocytes. It belongs to a multigene family termed RELMs or FIZZ proteins. Plasma resistin levels are significantly increased in both genetically susceptible and high-fat-diet-induced obese mice. Immunoneutralisation of resistin improves hyperglycemia and insulin resistance in high-fat-diet-induced obese mice, while administration of recombinant resistin impairs glucose tolerance and insulin action in normal mice. It has been demonstrated that increases in circulating resistin levels markedly stimulate glucose production in the presence of fixed physiological insulin levels, whereas insulin suppressed resistin expression. It has been suggested that resistin could be a link between obesity and type 2 diabetes PUBMED:12885401.

    \ ' '6783' 'IPR010713' '\

    This entry represents the C terminus (approximately 60 residues) of plant xyloglucan endo-transglycosylase (XET). Xyloglucan is the predominant hemicellulose in the cell walls of most dicotyledons. With cellulose, it forms a network that strengthens the cell wall. XET catalyses the splitting of xyloglucan chains and the linking of the newly generated reducing end to the non-reducing end of another xyloglucan chain, thereby loosening the cell wall PUBMED:9487728.

    \ ' '6784' 'IPR009715' '\

    RtcR is a sigma54-dependent enhancer binding protein PUBMED:12618438 that activates transcription of the rtcBA operon. The product of the rtcA gene is an RNA 3\'-terminal phosphate cyclase PUBMED:9738023. This domain is found at the N terminus of the RtcR sequence. RtcR, and other sigma54-dependent activators, contain in the central region of the protein sequence.

    \ ' '6785' 'IPR010714' '\

    Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer PUBMED:15261670. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins PUBMED:14690497. For example, the coatomer COPI (coat protein complex I) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi PUBMED:11208122. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes PUBMED:17041781. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta\', gamma, delta, epsilon and zeta subunits.

    \

    This entry represents the C terminus (approximately 500 residues) of the eukaryotic coatomer alpha subunit PUBMED:12893528, PUBMED:9261053. This domain is found along with the domain.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin.

    \ ' '6786' 'IPR010715' '\

    Pyocins are polypeptide toxins produced by, and active against, bacteria. S-type pyocins cause cell death by DNA breakdown due to endonuclease activity PUBMED:12423794. Pyocins S1 and S2 are S-type bacteriocins of Pseudomonas aeruginosa with different receptor recognition specificities PUBMED:8491711. The genetic determinants of these pyocins have been cloned from the NIH-H and PAO chromosomes of P. aeruginosa. The determinants each constitute an operon that encodes two proteins of molecular weight 65,600 and 10,000 (pyocin S1) or 74,000 and 10,000 (pyocin S2) with a characteristic sequence (P box), a possible regulatory element involved in the induction of pyocin production, in the 5\' upstream region PUBMED:8491711. These pyocins have almost identical sequences, except in the N-terminal portions of the large proteins, which are substantially different. This similarity suggests that S1 and S2 pyocins, like pyocin AP41, originated from a common ancestor of the E2 group colicins. Purified pyocins S1 and S2 constitute a complex of the two proteins. Both pyocins cause breakdown of chromosomal DNA and complete inhibition of lipid synthesis in sensitive cells. The large protein (not the complex) shows in vitro DNase activity. This activity is inhibited by the small protein of either pyocin PUBMED:8491711.

    \ \

    This represents a conserved region approximately 180 residues long within bacterial S-type pyocins.

    \ ' '6787' 'IPR010716' '\

    This family represents a conserved region approximately 200 residues long within eukaryotic RecQ helicase protein-like 5 (RecQ5). The RecQ helicases have been implicated in DNA repair and recombination, and RecQ5 may have an important role in DNA metabolism PUBMED:10710432.

    \ ' '6789' 'IPR010718' '\

    This family includes a number of hypothetical bacterial and archaeal proteins of unknown function.

    \ ' '6790' 'IPR010719' '\

    This family contains a number of putative rRNA methylases.

    \ ' '6791' 'IPR009716' '\

    This family represents a conserved region approximately 100 residues long within eukaryotic Ferroportin1 (FPN1), a protein that may play a role in iron export from the cell PUBMED:11809412. This family may represent a number of transmembrane regions in Ferroportin1.

    \ ' '6792' 'IPR010720' '\

    This entry represents the C terminus (approximately 200 residues) of bacterial and eukaryotic alpha-L-arabinofuranosidase. This enzyme catalyses the hydrolysis of non-reducing terminal alpha-L-arabinofuranosidic linkages in L-arabinose-containing polysaccharides PUBMED:7887599.

    \ ' '6793' 'IPR004670' '\

    The Escherichia coli NhaA Na+:H+ Antiporter (NhaA) protein probably functions in the regulation of the internal pH when the\ external pH is alkaline. It also uses the H+ gradient to expel Na+ from the cell. Its activity is highly pH dependent.

    \ \ ' '6794' 'IPR010721' '\

    This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 300 residues long.

    \ ' '6795' 'IPR009717' '\

    This entry represents the C terminus (approximately 80 residues) of a number of bacterial Mo-dependent nitrogenases. These are involved in nitrogen fixation in cyanobacteria PUBMED:7568132.

    \ ' '6796' 'IPR010722' '\

    Biotin synthase (BioB) catalyses the last step of the biotin biosynthetic pathway. The reaction consists of the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer PUBMED:12482614. Thiamine synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this entry), which form a heterodimer PUBMED:12650933. Both of these reactions are thought to involve the binding of co-factors, and both enzymes function as dimers PUBMED:12482614, PUBMED:12650933. This domain therefore may be involved in co-factor binding or dimerisation.

    \ ' '6797' 'IPR010723' '\

    Proteins containing this domain are all oxygen-independent coproporphyrinogen-III oxidases (HemN). This enzyme catalyses the oxygen-independent conversion of coproporphyrinogen-III to protoporphyrinogen-IX PUBMED:12196143, one of the last steps in haem biosynthesis. The function of this domain is unclear, but comparison to other proteins containing a radical SAM domain suggests it may be a substrate-binding domain.

    \ ' '6798' 'IPR010724' '\

    This entry represents the N terminus (approximately 80 residues) of replication initiator protein A (RepA), a DNA replication initiator in plasmids PUBMED:12637554. Most proteins in this entry are bacterial, but archaeal and eukaryotic members are also included.

    \ ' '6799' 'IPR009718' '\

    This entry represents the C terminus (approximately 30 residues) of a number of Rex proteins. These are redox-sensing repressors that appear to be widespread among Gram-positive bacteria PUBMED:12970197. They modulate transcription in response to changes in cellular NADH/NAD(+) redox state. Rex is predicted to include a pyridine nucleotide-binding domain (Rossmann fold), and residues that might play key structural and nucleotide binding roles are highly conserved.

    \ ' '6800' 'IPR009719' '\

    This family represents a conserved region approximately 60 residues long within a number of plant proteins of unknown function.

    \ ' '6801' 'IPR009720' '\

    The last two steps of de novo purine biosynthesis are: \

    \ \ In bacteria and eukaryotes, these steps are catalysed by the well-characterised bifunctional enzyme PurH PUBMED:11323713. Archaea do not appear to possess PurH, however, and perform these reactions by a different mechanism PUBMED:9150241. In archaea, step i) is catalysed by the well-conserved PurP protein, while step ii) is catalysed by the PurO enzyme in some (though not all) species PUBMED:15623504, PUBMED:11844782.

    \ \

    This entry represents the C-terminal domain of PurP, which is homologous to the ATP-GRASP fold and thus may be involved in ATP-binding. It is almost always found in association with .

    \ ' '6802' 'IPR009721' '\

    This entry represents the C terminus (approximately 170 residues) of a number of hypothetical plant proteins of unknown function.

    \ ' '6803' 'IPR010725' '\

    This entry represents a conserved region approximately 50 residues long within a number of proteins of unknown function that seem to be specific to Arabidopsis thaliana. Note that many proteins contain multiple copies of this region.

    \ ' '6805' 'IPR009722' '\

    This entry represents a conserved region approximately 100 residues long within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators PUBMED:9495757. Some proteins contain the repeat.

    \ ' '6806' 'IPR009723' '\

    This entry represents a conserved region approximately 150 residues long located towards the N terminus of the POP1 subunit that is common to both the RNase MRP and RNase P ribonucleoproteins PUBMED:7926742. These RNA-containing enzymes generate mature tRNA molecules by cleaving their 5\' ends.

    \ ' '6807' 'IPR009724' '\

    This family contains a number of eukaryotic proteins of unknown function that are approximately 160 residues long.

    \ ' '6808' 'IPR010727' '\

    This family contains a number of hypothetical bacterial proteins of unknown function that are approximately 600 residues long. Most family members seem to be from Pseudomonas.

    \ ' '6811' 'IPR009725' '\

    This entry contains a number of bacterial and archaeal 3-demethylubiquinone-9 3-methyltransferases which have a conserved region of approximately 100 residues. In some proteins this region occurs more than once.

    \ ' '6812' 'IPR010729' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry represents the N-terminal region (approximately 8 residues) of the eukaryotic mitochondrial 39-S ribosomal protein L47 (MRP-L47). Mitochondrial ribosomal proteins (MRPs) are the counterparts of the cytoplasmic ribosomal proteins, in that they fulfil similar functions in protein biosynthesis. However, they are distinct in number, features and primary structure PUBMED:9445368.

    \ ' '6813' 'IPR010730' '\

    This entry represents a conserved region approximately 150 residues long within various heterokaryon incompatibility proteins that seem to be restricted to ascomycete fungi. Genetic differences in specific het genes prevent a viable heterokaryotic fungal cell from being formed by the fusion of filaments from two different wild-type strains PUBMED:12019224. Many proteins of this entry also contain the WD domain, G-beta repeat and the NACHT domain.

    \ ' '6814' 'IPR009726' '\

    Proteins in this entry include the TraN bacterial mating pair stabilisation proteins. TraN is thought to be required for the formation of stable mating aggregates during F-directed conjugation PUBMED:1593622. These proteins share a short conserved region (approximately 40 residues) which contains five conserved cysteine residues.

    \ ' '6816' 'IPR009727' '\

    This family consists of several NifT and FixU bacterial proteins. The function of NifT is unknown. It is thought that the protein may be involved in biosynthesis of the FeMo cofactor of nitrogenase, although perturbation of nifT expression in Klebsiella pneumoniae has only a limited effect on nitrogen fixation PUBMED:9139910.

    \ ' '6817' 'IPR009728' '\

    This entry represents the mammalian BAALC proteins. BAALC (brain and acute leukaemia, cytoplasmic) is highly conserved among mammals, but is absent from lower organisms. Two isoforms are specifically expressed in neuroectoderm-derived tissues, but not in tumours or cancer cell lines of non-neural tissue origin. It has been shown that blasts from a subset of patients with acute leukaemia greatly overexpress eight different BAALC transcripts, resulting in five protein isoforms. Among patients with acute myeloid leukaemia, those overexpressing BAALC show distinctly poor prognosis, pointing to a key role of the BAALC products in leukaemia. It has been suggested that BAALC is a gene implicated in both neuroectodermal and hematopoietic cell functions PUBMED:11707601.

    \ ' '6818' 'IPR009729' '\

    This family consists of several mammalian galactose-3-O-sulphotransferase proteins. Gal-3-O-sulphotransferase is thought to play a critical role in 3\'-sulphation of N-acetyllactosamine in both O- and N-glycans PUBMED:11323440.

    \ ' '6819' 'IPR009730' '\

    This entry represents the C terminus (approximately 300 residues) of eukaryotic micro-fibrillar-associated protein 1, which is a component of elastin-associated microfibrils in the extracellular matrix PUBMED:8174780.

    \ ' '6820' 'IPR009731' '\

    This family consists of several Bacteriophage lambda replication protein P-like proteins. The bacteriophage lambda P protein promotes replication of the phage chromosome by recruiting a key component of the cellular replication machinery to the viral origin. Specifically, P protein delivers one or more molecules of Escherichia coli DnaB helicase to a nucleoprotein structure formed by the lambda O initiator at the lambda replication origin PUBMED:2165499.

    \ ' '6821' 'IPR009732' '\

    This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown.

    \ ' '6822' 'IPR009733' '\

    This family represents a conserved region approximately 60 residues long, multiple copies of which are found within eukaryotic involucrin, and which is rich in glutamine and glutamic acid residues. Involucrin forms part of the insoluble cornified cell envelope (a specialised protective barrier) of stratified squamous epithelia PUBMED:12210515. Members of this family seem to be restricted to mammals.

    \ ' '6823' 'IPR009734' '\

    This family consists of several bacterial and phage proteins of around 130 residues in length which seem to be related to the bacteriophage P2 GpU protein, which is thought to be involved in tail assembly PUBMED:12426340.

    \ ' '6824' 'IPR010732' '\

    This entry consists of several hypothetical bacterial proteins of around 300 residues in length. Their function is unknown although they are associated with type VI secretion loci, suggesting a role in virulence PUBMED:16763151, PUBMED:12437215.

    \ ' '6826' 'IPR009736' '\

    This family consists of several hypothetical bacterial proteins of around 150 residues in length. Some family members are described as putative lipoproteins but the function of the family is unknown.

    \ ' '6827' 'IPR009737' '\

    This family contains a number of bacterial and eukaryotic proteins approximately 400 residues long that resemble ferredoxin and appear to have sucrolytic activity PUBMED:7957893.

    \ ' '6828' 'IPR010733' '\

    This family consists of several hypothetical eukaryotic sequences of around 400 residues in length. The function of this family is unknown.

    \ ' '6829' 'IPR009738' '\

    This entry represents the N terminus (approximately 200 residues) of the proline-rich protein BAT2. BAT2 is similar to other proteins with large proline-rich domains, such as some nuclear proteins, collagens, elastin, and synapsin PUBMED:2156268.

    \ ' '6830' 'IPR010734' '\

    This represents a conserved region approximately 180 residues long within eukaryotic copines. Copines are Ca2+-dependent phospholipid-binding proteins that are thought to be involved in membrane-trafficking, and may also be involved in cell division and growth PUBMED:12440769.

    \ ' '6832' 'IPR010736' '\

    This represents a short conserved region (approximately 30 residues long) that is repeated in several eukaryotic proteins of unknown function. One member of this family is annotated as possibly being related to alpha collagen.

    \ ' '6833' 'IPR010737' '\

    This entry represents a conserved region found in a range of Proteobacteria as well as the Gram-positive Oceanobacillus iheyensis. This entry includes YgbK from Escherichia coli, which is dependent upon FlhDC, the master regulator of the flagellar genes. The ygbK gene appears to be regulated by sigmaF PUBMED:11520622.

    \ ' '6834' 'IPR010738' '\

    This family consists of several hypothetical proteins of around 125 residues in length. Members of this family seem to be specific to Listeria and Streptococcus species. The function of this family is unknown.

    \ ' '6835' 'IPR009739' '\

    This family consists of several bacterial proteins of around 120 residues in length. Members of this family contain four highly conserved cysteine residues. The function of this family is unknown.

    \ ' '6837' 'IPR010739' '\

    This family consists of several bacterial proteins of around 120 residues in length. The function of this family is unknown.

    \ ' '6838' 'IPR010740' '\

    This family consists of several mammalian endomucin proteins. Endomucin is an early endothelial-specific antigen that is also expressed on putative hematopoietic progenitor cells.

    \ ' '6839' 'IPR009741' '\

    This family consists of several hypothetical plant proteins of around 100 residues in length. The function of this family is unknown.

    \ ' '6840' 'IPR009742' '\

    This entry represents a bacterial repeated motif of around 30 residues in length. These repeats are often found in multiple copies in the curlin proteins CsgA and CsgB. Curli fibres are thin aggregative surface fibres, connected with adhesion, which bind laminin, fibronectin, plasminogen, human contact phase proteins, and major histocompatibility complex (MHC) class I molecules. Curli fibres are coded for by the csg gene cluster, which is comprised of two divergently transcribed operons. One operon encodes the csgB, csgA, and csgC genes, while the other encodes csgD, csgE, csgF, and csgG. The assembly of the fibres is unique and involves extracellular self-assembly of the curlin subunit (CsgA), dependent on a specific nucleator protein (CsgB). CsgD is a transcriptional activator essential for expression of the two curli fibre operons, and CsgG is an outer membrane lipoprotein involved in extracellular stabilisation of CsgA and CsgB PUBMED:11254632.

    \ ' '6841' 'IPR010741' '\

    This family consists of several Alphaherpesvirus proteins of around 200 residues in length. The function of this family is unknown.

    \ ' '6842' 'IPR009743' '\

    This entry represents the C terminus (approximately 270 residues) of a number of plant Hs1pro-1 proteins, which are believed to confer nematode resistance PUBMED:12669798.

    \ ' '6843' 'IPR009744' '\

    This family consists of several bacterial VirC1 proteins. In Agrobacterium tumefaciens, a cis-active 24-base-pair sequence adjacent to the right border of the T-DNA, called overdrive, stimulates tumour formation by increasing the level of T-DNA processing. It is thought that the virC operon, which enhances T-DNA processing, probably does so because the VirC1 protein interacts with overdrive. It has now been shown that the virC1 gene product binds to overdrive but not to the right border of T-DNA PUBMED:2592351.

    \ ' '6844' 'IPR009745' '\

    This entry represents a 24 residue repeated motif from the Trypanosoma brucei cysteine-rich, acidic integral membrane protein precursor (CRAM). CRAM is concentrated in the flagellar pocket, an invagination of the cell surface of the trypanosome where endocytosis has been documented PUBMED:1697030.

    \ ' '6845' 'IPR009746' '\

    This family consists of several bacterial antimicrobial peptide resistance and lipid A acylation (PagP) proteins. The bacterial outer membrane enzyme PagP transfers a palmitate chain from a phospholipid to lipid A. In a number of pathogenic Gram-negative bacteria, PagP confers resistance to certain cationic antimicrobial peptides produced during the host innate immune response.

    \ ' '6847' 'IPR010742' '\

    This family consists of several Rab5-interacting protein (RIP5 or Rab5ip) sequences. The ras-related GTPase rab5 is rate-limiting for homotypic early endosome fusion. Rab5ip represents a novel rab5 interacting protein that may function on endocytic vesicles as a receptor for rab5-GDP and participate in the activation of rab5 PUBMED:10818110.

    \ ' '6848' 'IPR009748' '\

    This family consists of several Orthopoxvirus C10L proteins. The C10L viral protein may play an important role in vaccinia virus evasion of the host immune system. The mechanism may consist of the blockade of IL-1 receptors by the C10L protein, a homologue of the IL-1 Ra PUBMED:12084512.

    \ ' '6849' 'IPR010743' '\

    This family consists of several bacterial and one archaeal methionine biosynthesis MetW proteins. Biosynthesis of methionine from homoserine in Pseudomonas putida takes place in three steps. The first step is the acylation of homoserine to yield an acyl-L-homoserine. This reaction is catalysed by the products of the metXW genes and is equivalent to the first step in enterobacteria, Gram-positive bacteria and fungi, except that in these microorganisms the reaction is catalysed by a single polypeptide (the product of the metA gene in Escherichia coli and the met5 gene product in Neurospora crassa). In P. putida, as in Gram-positive bacteria and certain fungi, the second and third steps are a direct sulphydrylation that converts the O-acyl-L-homoserine into homocysteine and further methylation to yield methionine. The latter reaction can be mediated by either of the two methionine synthetases present in the cells PUBMED:11479715.

    \ ' '6850' 'IPR010744' '\

    This family consists of several phage CI repressor proteins and related bacterial sequences. The CI repressor is known to function as a transcriptional switch, determining whether transcription is lytic or lysogenic PUBMED:2370665.

    \ ' '6851' 'IPR009749' '\

    This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown.

    \ ' '6852' 'IPR009211' '\

    This entry contains proteins of unknown function that occur in bacteria that interact with and manipulate eukaryotic cells PUBMED:12437215.

    \

    Salmonella enterica protein SciE is encoded in the centisome 7 genomic island (SCI) PUBMED:10417651. Deletion of the entire island affects the ability of bacteria to enter eukaryotic cells PUBMED:12437215. Therefore, SciE and other SCI proteins may be involved in virulence.

    \

    Interestingly, another member of this family, Rhizobium leguminosarum protein ImpE, has been reported to be encoded by an avirulence locus involved in temperature-dependent protein secretion PUBMED:12580282. It is believed that the imp locus is involved in the secretion to the environment of proteins, including periplasmic RbsB protein, that cause blocking of R. leguminosarum infection in plants PUBMED:12580282.

    \ ' '6854' 'IPR009750' '\

    This family consists of several hypothetical bacterial and phage proteins of around 60 residues in length. The function of this family is unknown.

    \ ' '6855' 'IPR008309' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '6856' 'IPR010746' '\

    This family contains a number of viral proteins of unknown function approximately 200 residues long. Family members seem to be restricted to badnaviruses.

    \ ' '6857' 'IPR009751' '\

    This family consists of several CryBP1-like proteins from Bacillus thuringiensis and Paenibacillus popilliae. Members of this family are thought to be involved in the overall toxicity of the bacteria to their hosts PUBMED:7730255, PUBMED:9209052.

    \ ' '6858' 'IPR009752' '\

    This family consists of both hypothetical bacterial and phage proteins of around 145 residues in length. The function of this family is unknown.

    \ ' '6860' 'IPR009753' '\

    This entry represents a family of hypothetical Borrelia proteins of around 78 residues in length. The function of this family is unknown.

    \ ' '6861' 'IPR009754' '\

    This family consists of several Orthopoxvirus B11R proteins of around 70 residues in length. The function of this family is unknown.

    \ ' '6862' 'IPR010748' '\

    This entry represents the N terminus (approximately 300 residues) of subunit 3 of the eukaryotic origin recognition complex (ORC). Origin recognition complex (ORC) is composed of six subunits that are essential for cell viability. They collectively bind to the autonomously replicating sequence (ARS) in a sequence-specific manner and lead to the chromatin loading of other replication factors that are essential for initiation of DNA replication PUBMED:11395502.

    \ ' '6863' 'IPR009755' '\

    This entry represents the C terminus (approximately 160 residues) of a number of proteins that resemble colon cancer-associated protein Mic1.

    \ ' '6865' 'IPR010749' '\

    This family consists of several hypothetical Enterobacterial proteins of around 120 residues in length. The function of this family is unknown.

    \ ' '6866' 'IPR009757' '\

    This family consists of several Circovirus proteins of around 60 residues in length. The function of this family is unknown.

    \ ' '6867' 'IPR010750' '\

    This family consists of several hypothetical eukaryotic proteins of around 300 residues in length. The function of this family is unknown.

    \ ' '6868' 'IPR009758' '\

    This family consists of several hypothetical bacterial proteins, which seem to be found exclusively in Rhizobium and Ralstonia species. Members of this family are typically around 210 residues in length and contain 5 highly conserved cysteine residues at their N terminus. The function of this family is unknown.

    \ ' '6869' 'IPR009759' '\

    This family consists of several hypothetical bacterial proteins of around 115 residues in length, which seem to be specific to Escherichia coli. The function of this family is unknown.

    \ ' '6870' 'IPR010751' '\

    This family consists of several bacterial TrfA proteins. The trfA operon of broad-host-range IncP plasmids is essential to activate the origin of vegetative replication in diverse species. The trfA operon encodes two ORFs. The first ORF is highly conserved and encodes a putative single-stranded DNA binding protein (Ssb). The second, trfA, contains two translational starts as in the IncP alpha plasmids, generating related polypeptides of 406 (TrfA1) and 282 (TrfA2) amino acids. TrfA2 is very similar to the IncP alpha product, whereas the N-terminal region of TrfA1 shows very little similarity to the equivalent region of IncP alpha TrfA1. This region has been implicated in the ability of IncP alpha plasmids to replicate efficiently in Pseudomonas aeruginosa PUBMED:8954881.

    \ ' '6871' 'IPR009760' '\

    This entry represents several hypothetical bacterial proteins of around 50 residues in length. The function of this family is unknown, but its members are thought to be membrane proteins.

    \ ' '6872' 'IPR010752' '\

    This family consists of several hypothetical bacterial proteins of around 475 residues in length. The majority of family members are from Pseudomonas species but the family also contains sequences from Shewanella oneidensis and Thauera aromatica.

    \ ' '6873' 'IPR010753' '\

    This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown.

    \ ' '6874' 'IPR009761' '\

    This family consists of several repeats of around 42 residues in length. These repeated sequences are found in multiple copies in Trypanosoma cruzi antigens; one such antigen contains 23 copies of this repeat.

    \ ' '6875' 'IPR010754' '\

    OPA3 deficiency causes type III 3-methylglutaconic aciduria (MGA) in humans. This disease manifests with early bilateral optic atrophy, spasticity, extrapyramidal dysfunction, ataxia, and cognitive deficits, but normal longevity PUBMED:12126933.

    \ \

    This family consists of several optic atrophy 3 (OPA3) proteins and related proteins from other eukaryotic species. Their function is unknown.

    \ ' '6876' 'IPR009762' '\

    This family consists of several Circovirus proteins of around 35 residues in length. Members of this family are described as ORF-10 proteins and their function is unknown.

    \ ' '6879' 'IPR009764' '\

    This family consists of several ovarian carcinoma immunoreactive antigen (OCIA) and related eukaryotic sequences. The function of this family is unknown PUBMED:11162530,PUBMED:12445744.

    \ ' '6880' 'IPR010756' '\

    This family represents a conserved region approximately 100 residues long within mammalian hepatocellular carcinoma-associated antigen 59 and similar proteins. Family members are found in a variety of eukaryotes, mainly as hypothetical proteins.

    \ ' '6882' 'IPR009765' '\

    This entry represents a repeated sequence of around 34 residues in length. This repeat is found in multiple copies in the Drosophila pericardin and other extracellular matrix proteins.

    \ ' '6883' 'IPR010758' '\

    This family contains a number of bacterial short-chain alcohol dehydrogenases that are approximately 400 residues long. Alcohol dehydrogenases display a wide variety of substrate specificities, and play an important role in a broad range of physiological processes. Short-chain alcohol dehydrogenases form part of a group of alcohol dehydrogenases that are dependent upon NADP PUBMED:11358525.

    \ ' '6884' 'IPR009766' '\

    This family represents a conserved region approximately 130 residues long within a number of proteins of unknown function that seem to be specific to the white spot syndrome virus (WSSV).

    \ ' '6885' 'IPR009767' '\

    This entry represents a conserved region approximately 130 residues long within the bacterial DNA helicase TraI. TraI is a bifunctional protein that catalyses the unwinding of duplex DNA and also acts as a sequence-specific DNA trans-esterase, providing the site- and strand-specific nick required to initiate DNA transfer PUBMED:11054423.

    \ ' '6886' 'IPR009768' '\

    This family represents a conserved region within a number of myosin II heavy chain-like proteins that seem to be specific to Arabidopsis thaliana.

    \ ' '6887' 'IPR009769' '\

    This entry represents the C terminus (approximately 250 residues) of a number of hypothetical plant proteins of unknown function.

    \ ' '6889' 'IPR010760' '\

    This is a group of proteins of unknown function.

    \ ' '6890' 'IPR010761' '\

    Clc proteins are a nine-member gene family of chloride channels that have diverse roles in the plasma membrane and in intracellular organelles, especially membrane excitability and the maintenance of osmotic balance PUBMED:16596447, PUBMED:12512775. This family contains a number of Clc-like proteins that are approximately 250 residues long and appear to be found only in nematodes.

    \ ' '6891' 'IPR009770' '\

    This entry represents the C terminus (approximately 100 residues) of a number of hypothetical bacterial proteins of unknown function.

    \ ' '6892' 'IPR009771' '\

    This family represents a conserved region approximately 300 residues long within a number of hypothetical eukaryotic proteins of unknown function. These are possibly integral membrane proteins.

    \ ' '6893' 'IPR009772' '\

    This family contains a number of eukaryotic D123 proteins approximately 330 residues long. It has been shown that mutated variants of D123 exhibit temperature-dependent differences in their degradation rate PUBMED:11699637.

    \ ' '6894' 'IPR009773' '\

    This family consists of several Lactococcus bacteriophage middle-3 (M3) proteins of around 160 residues in length. The function of this family is unknown.

    \ ' '6895' 'IPR009774' '\

    This family consists of several hypothetical Streptococcus thermophilus bacteriophage proteins of around 235 residues in length. The function of this family is unknown.

    \ ' '6896' 'IPR010762' '\

    This family contains a number of major capsid Gp23 proteins approximately 500 residues long, from T4-like bacteriophages.

    \ ' '6897' 'IPR009775' '\

    This family consists of several Porcine reproductive and respiratory syndrome virus (PRRSV) ORF2b proteins. The function of this family is unknown; however, large amounts of 2b protein are known to be present in the virion, and it is thought that this protein may be an integral component of the virion PUBMED:11504553.

    \ ' '6898' 'IPR009776' '\

    This family consists of several bacterial Spo0M proteins which are thought to control sporulation in Bacillus subtilis. Spo0M exerts certain negative effects on sporulation and its gene expression is controlled by sigmaH PUBMED:9795118.

    \ ' '6899' 'IPR010763' '\

    Members of this family of relatively uncommon proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown.

    \ ' '6900' 'IPR009777' '\

    This family consists of several hypothetical bacterial proteins of around 250 residues in length. Members of this family are often known as YacF after the Escherichia coli protein. The function of this family is unknown.

    \ ' '6901' 'IPR009778' '\

    This family consists of several bacterial modulator of Rho-dependent transcription termination (ROF) proteins. ROF binds transcription termination factor Rho and inhibits Rho-dependent termination in vivo PUBMED:9723924.

    \ ' '6902' 'IPR009779' '\

    This family consists of several eukaryotic translocon-associated protein, gamma subunit (TRAP-gamma) sequences. The translocation site (translocon), at which nascent polypeptides pass through the endoplasmic reticulum membrane, contains a component previously called \'signal sequence receptor\' that is now renamed as \'translocon-associated protein\' (TRAP). The TRAP complex is comprised of four membrane proteins alpha, beta, gamma and delta, which are present in a stoichiometric relation, and are genuine neighbours in intact microsomes. The gamma subunit is predicted to span the membrane four times PUBMED:7916687.

    \ ' '6903' 'IPR008302' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '6904' 'IPR009780' '\

    This family consists of several short, hypothetical bacterial proteins of around 80 residues in length. Members of this family are found in Rhizobium, Agrobacterium and Brucella species. The function of this family is unknown.

    \ ' '6905' 'IPR009781' '\

    This family consists of several hypothetical bacterial proteins of around 230 residues in length. The function of this family is unknown.

    \ ' '6906' 'IPR009782' '\

    This family consists of several hypothetical mammalian proteins of around 320 residues in length. The function of this family is unknown although several of the family members are annotated as putative 40-2-3 proteins.

    \ ' '6907' 'IPR010764' '\

    This family consists of several hypothetical bacterial proteins of around 610 residues in length. Members of this family are highly conserved and seem to be specific to Chlamydia species. The function of this family is unknown.

    \ ' '6908' 'IPR009783' '\

    This family consists of several highly conserved hypothetical proteins of around 150 residues in length. The function of this family is unknown.

    \ ' '6909' 'IPR009784' '\

    This family consists of several hypothetical bacterial proteins but contains one sequence from Saccharomyces cerevisiae. Members of this family are typically around 200 residues in length. The function of this family is unknown.

    \ ' '6910' 'IPR010765' '\

    This family consists of several hypothetical proteins from both cyanobacteria and plants. Members of this family are typically around 250 residues in length. The function of this family is unknown but the species distribution indicates that the family may be involved in photosynthesis.

    \ ' '6911' 'IPR009785' '\

    This family consists of several bacterial and phage proteins of around 230 residues in length. The function of this family is unknown.

    \ ' '6912' 'IPR009786' '\

    This family consists of several thyroid hormone-inducible hepatic protein (Spot 14 or S14) sequences. Mainly expressed in tissues that synthesise triglycerides, the mRNA coding for Spot 14 has been shown to be increased in rat liver by insulin, dietary carbohydrates, glucose in hepatocyte culture medium, as well as thyroid hormone. In contrast, dietary fats and polyunsaturated fatty acids have been shown to decrease the amount of Spot 14 mRNA, while an elevated level of cAMP acts as a dominant negative factor. In addition, liver-specific factors or chromatin organisation of the gene have been shown to contribute to the regulation of its expression PUBMED:9003802. Spot 14 protein is thought to be required for induction of hepatic lipogenesis PUBMED:11564699.

    \ ' '6913' 'IPR010766' '\

    This presumed domain is about 120 amino acids in length. It is found associated with CBS domains, as well as the CbiA domain. The function of this domain is unknown. It is named the DRTGG domain after some of the most conserved residues. This domain may be very distantly related to a pair of CBS domains. There are no significant sequence similarities, but its length and association with CBS domains support this idea.

    \ ' '6914' 'IPR009787' '\

    This family consists of several hypothetical eukaryotic proteins of around 190 residues in length. The function of this family is unknown.

    \ ' '6915' 'IPR010767' '\

    This family consists of several hypothetical bacterial proteins of around 100 residues in length. The function of this family is unknown.

    \ ' '6916' 'IPR009788' '\

    This family consists of several archaeal GvpD gas vesicle proteins. GvpD is thought to be involved in the regulation of gas vesicle formation PUBMED:8763925,PUBMED:12864859.

    \ ' '6918' 'IPR010768' '\

    This entry is found in several hypothetical bacterial proteins of around 250 residues in length. The function of these proteins is unknown.

    \ ' '6919' 'IPR010769' '\

    This family consists of several bacterial ribosomal RNA methyltransferase (aminoglycoside-resistance methyltransferase) proteins PUBMED:8486289,PUBMED:2013410.

    \ ' '6920' 'IPR009790' '\

    This family consists of several hypothetical mammalian proteins of around 250 residues in length. The function of this family is unknown.

    \ ' '6921' 'IPR010770' '\

    This family consists of several eukaryotic SGT1 proteins. Human SGT1 or hSGT1 is known to suppress GCR2 and is highly expressed in the muscle and heart. The function of this family is unknown although it has been speculated that SGT1 may be functionally analogous to the Gcr2p protein of Saccharomyces cerevisiae which is known to be a regulatory factor of glycolytic gene expression PUBMED:9928932.

    \ ' '6922' 'IPR009791' '\

    This entry represents a family of hypothetical proteins of around 225 residues in length found in Borrelia species. The function of this family is unknown.

    \ ' '6923' 'IPR010771' '\

    This family consists of several bacterial intracellular growth attenuator (IgaA) proteins. IgaA is involved in negative control of bacterial proliferation within fibroblasts. IgaA is homologous to the Escherichia coli YrfF and Proteus mirabilis UmoB proteins. Whereas the biological function of YrfF is currently unknown, UmoB has been shown elsewhere to act as a positive regulator of FlhDC, the master regulator of flagella and swarming. FlhDC has been shown to repress cell division during P. mirabilis swarming, suggesting that UmoB could repress cell division via FlhDC. This biological function, if maintained in Salmonella enterica, could sustain a putative negative control of cell division and growth exerted by IgaA in intracellular bacteria PUBMED:11553591.

    \ ' '6924' 'IPR009792' '\

    This family consists of several hypothetical eukaryotic proteins of around 125 residues in length. The function of this family is unknown.

    \ ' '6925' 'IPR010772' '\

    This family represents ORF4 of the abiN operon of Lactococcus lactis, and ORF27 of the temperate bacteriophage TP901 PUBMED:11312666. Members of this family are found exclusively in L. lactis and the bacteriophages that infect this species. The function of this family is unknown.

    \ ' '6926' 'IPR010773' '\

    This family consists of several bacterial proteins of around 115 residues in length. Members of this family are found in Bacillus species and Streptomyces coelicolor. The function of the family is unknown.

    \ ' '6927' 'IPR009793' '\

    This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although some members are annotated as being putative integral membrane proteins.

    \ ' '6928' 'IPR009794' '\

    This family consists of several hypothetical bacterial proteins of around 125 residues in length, including the hypothetical protein TM1070 from Thermotoga maritima, which has a beta-sandwich structure with a jelly-roll fold. The function of these proteins is unknown.

    \ ' '6929' 'IPR009795' '\

    This family consists of several Trypanosoma brucei putative variant specific antigen proteins of around 80 residues in length.

    \ ' '6930' 'IPR010774' '\

    This family consists of several bacterial and phage proteins of around 95 residues in length. The function of this family is unknown.

    \ ' '6931' 'IPR010775' '\

    This family consists of several bacterial and plant proteins of around 250 residues in length. The function of this family is unknown.

    \ ' '6932' 'IPR009796' '\

    This family consists of several hypothetical Streptococcus thermophilus bacteriophage proteins of around 130 residues in length. One of the sequences in this family, from phage Sfi11 (Swiss:O80186), is known as Gp149. The function of this family is unknown.

    \ ' '6933' 'IPR009797' '\

    This family consists of several highly conserved, hypothetical bacterial and phage proteins of around 200 residues in length. The function of this family is unknown.

    \ ' '6934' 'IPR010776' '\

    This family consists of several eukaryotic TBP-1 interacting protein (TBPIP) sequences. TBP-1 has been demonstrated to interact with the human immunodeficiency virus type 1 (HIV-1) viral protein Tat, then modulate the essential replication process of HIV. In addition, TBP-1 has been shown to be a component of the 26S proteasome, a basic multiprotein complex that degrades ubiquitinated proteins in an ATP-dependent fashion. Human TBPIP interacts with human TBP-1 then modulates the inhibitory action of human TBP-1 on HIV-Tat-mediated transactivation PUBMED:10806355.

    \ ' '6935' 'IPR009798' '\

    This entry consists of several plant wound-induced protein sequences related to WI12 from Mesembryanthemum crystallinum (Common ice plant). Wounding, methyl jasmonate, and pathogen infection are known to induce local WI12 expression. WI12 expression is also thought to be developmentally controlled in the placenta and developing seeds. WI12 preferentially accumulates in the cell wall and it has been suggested that it plays a role in the reinforcement of cell wall composition after wounding and during plant development PUBMED:11598226.

    \ ' '6936' 'IPR010777' '\

    This family consists of several Salmonella PipA (pathogenicity island-encoded protein A) and related phage sequences. PipA is thought to contribute to enteric but not to systemic salmonellosis PUBMED:9723926.

    \ ' '6937' 'IPR010940' '\

    This entry represents the C terminus (approximately 100 residues) of bacterial and eukaryotic magnesium-protoporphyrin IX methyltransferase. This enzyme converts magnesium-protoporphyrin IX to magnesium-protoporphyrin IX methyl ester using S-adenosyl-L-methionine as a cofactor PUBMED:8071204.

    \ ' '6938' 'IPR009799' '\

    This family consists of several bacterial sequences which are related to the EthD protein of Rhodococcus ruber. R. ruber (formerly Gordonia terrae) IFP 2001 is one of a few bacterial strains able to degrade ethyl tert-butyl ether (ETBE), which is a major pollutant from gasoline. This strain was found to undergo a spontaneous 14.3-kbp chromosomal deletion, which results in the loss of the ability to degrade ETBE. Sequence analysis of the region corresponding to the deletion revealed the presence of a gene cluster, ethABCD, encoding a ferredoxin reductase (EthA), a cytochrome P-450 (EthB), a ferredoxin (EthC), and a 10-kDa protein of unknown function (EthD), respectively. Upstream of ethABCD lies ethR, which codes for a putative positive transcriptional regulator of the AraC/XylS family. Transformation of the ETBE-negative mutant by a plasmid carrying the ethRABCD genes restored the ability to degrade ETBE. Complementation was abolished if the plasmid carried ethRABC only, demonstrating that EthD is essential for the ETBE degradation system PUBMED:11673424.

    \ ' '6939' 'IPR009800' '\

    This family consists of several mammalian alpha helical coiled-coil rod HCR proteins. The function of HCR is unknown but it has been implicated in psoriasis in humans and is thought to affect keratinocyte proliferation PUBMED:11875053.

    \ ' '6940' 'IPR010778' '\

    This family consists of several proteins which seem to be specific to red algae plasmids. Members of this family are typically around 415 residues in length. The function of this family is unknown.

    \ ' '6942' 'IPR009801' '\

    This family consists of several hypothetical eukaryotic proteins of around 200 residues in length. Members of this family seem to be specific to mammals and their function is unknown.

    \ ' '6943' 'IPR009802' '\

    This entry represents a family of hypothetical proteins of around 110 residues in length found in Borrelia species. The function of this family is unknown.

    \ ' '6944' 'IPR010779' '\

    This family consists of several Streptococcus bacteriophage sequences and related proteins from Streptococcus species. Members of this family are typically around 100 residues in length and their function is unknown.

    \ ' '6945' 'IPR009803' '\

    This family consists of several hypothetical proteins which seem to be specific to Oryzias latipes (Japanese ricefish). Members of this family are typically around 200 residues in length. The function of this family is unknown.

    \ ' '6946' 'IPR009804' '\

    This family consists of several hypothetical Sulfolobus virus proteins of around 100 residues in length. The function of this family is unknown.

    \ ' '6947' 'IPR010780' '\

    This family consists of several hypothetical, putative lipoproteins of around 80 residues in length. Members of this family seem to be specific to the class Gammaproteobacteria. The function of this family is unknown.

    \ ' '6948' 'IPR010781' '\

    This family consists of several hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown.

    \ ' '6950' 'IPR009805' '\

    This entry represents a 29-residue repeated sequence which seems to be specific to the Ehrlichia chaffeensis variable length PCR target (VLPT) protein. E. chaffeensis is a tick-transmitted rickettsial agent and is responsible for human monocytic ehrlichiosis (HME). The function of this family is unknown PUBMED:12496165.

    \ ' '6951' 'IPR009806' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane PUBMED:12518057, PUBMED:15100025. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection PUBMED:14871485.

    \ \ \

    This family represents the low molecular weight transmembrane protein PsbW found in PSII, where it is a subunit of the oxygen-evolving complex. PsbW appears to have several roles, including guiding PSII biogenesis and assembly, stabilising dimeric PSII PUBMED:10950961, and facilitating PSII repair after photo-inhibition PUBMED:9335523. There appear to be two classes of PsbW, class 1 being found predominantly in algae and cyanobacteria, and class 2 being found predominantly in plants. This entry represents class 2 PsbW.

    \ ' '6952' 'IPR009807' '\

    This family consists of several Phytoreovirus outer capsid protein P8 sequences PUBMED:9343255.

    \ ' '6953' 'IPR009808' '\

    This family consists of hypothetical bacterial and phage proteins of around 59 residues in length. Bacterial members of this family seem to be specific to Enterobacteria. The function of this family is unknown.

    \ ' '6954' 'IPR009809' '\

    This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.

    \ ' '6955' 'IPR009810' '\

    This family consists of several plant-specific late nodulin sequences which are homologous to the Pisum sativum (Garden pea) ENOD3 protein. ENOD3 is expressed in the late stages of root nodule formation and contains two pairs of cysteine residues toward the protein's C terminus which may be involved in metal-binding PUBMED:2152123.

    \ ' '6956' 'IPR009811' '\

    This family consists of several hypothetical bacterial proteins of around 140 residues in length. Members of this family seem to be specific to Enterobacteria. The function of this family is unknown.

    \ ' '6957' 'IPR009812' '\

    This family consists of several hypothetical Staphylococcus aureus bacteriophage proteins of around 65 residues in length. The function of this family is unknown.

    \ ' '6958' 'IPR009813' '\

    This family consists of several bacterial YebG proteins of around 75 residues in length. The exact function of this protein is unknown but it is thought to be involved in the SOS response. The induction of the yebG gene occurs as cells enter the stationary growth phase and is dependent on cyclic AMP and H-NS PUBMED:10474193.

    \ ' '6959' 'IPR009814' '\

    This family consists of several hypothetical Escherichia coli and Bacteriophage lambda-like proteins of around 60 residues in length. The function of this family is unknown.

    \ ' '6961' 'IPR010784' '\

    This family consists of several Plasmodium falciparum SPAM (secreted polymorphic antigen associated with merozoites) proteins. Variation among SPAM alleles is the result of deletions and amino acid substitutions in non-repetitive sequences within and flanking the alanine heptad-repeat domain. Heptad repeats in which the a and d positions contain hydrophobic residues generate amphipathic alpha-helices which give rise to helical bundles or coiled-coil structures in proteins. SPAM is an example of a P. falciparum antigen in which a repetitive sequence has features characteristic of a well-defined structural element PUBMED:7891748,PUBMED:7893643.

    \ ' '6962' 'IPR010785' '\

    This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 375 residues in length. The function of this family is unknown.

    \ ' '6964' 'IPR010787' '\

    This family contains a number of hypothetical bacterial proteins of unknown function approximately 300 residues in length. Some family members are predicted to be metal-dependent.

    \ ' '6965' 'IPR010788' '\

    This family represents a conserved region approximately 350 residues long within plant violaxanthin de-epoxidase (VDE). In higher plants, violaxanthin de-epoxidase forms part of a conserved system that dissipates excess energy as heat in the light-harvesting complexes of photosystem II (PSII), thus protecting them from photo-inhibitory damage PUBMED:8692813.

    \ ' '6966' 'IPR009815' '\

    This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 350 residues in length. The function of this family is unknown.

    \ ' '6967' 'IPR009816' '\

    This family represents a conserved region approximately 300 residues long within a number of hypothetical proteins of unknown function that seem to be restricted to mammals.

    \ ' '6968' 'IPR008355' '\

    Interferon (INF)-gamma is a dimeric glycoprotein produced by activated T\ cells and natural killer cells. Although originally isolated based on its\ antiviral activity, INF-gamma also displays powerful anti-proliferative and\ immuno-modulatory activities, which are essential for developing appropriate\ cellular defences against a variety of infectious agents. The first step in\ eliciting these responses is the specific high affinity interaction of INF-\ gamma with its cell-surface receptor (INF-gammaRalpha); the complex then\ interacts with at least one of a family of additional species-specific\ accessory factors (AF-1 or INF-gammabeta), which convey different cellular\ responses. One such response is the association and phosphorylation of two\ protein tyrosine kinases (Jak-1 and Jak-2), which in turn stimulate nuclear\ transcription activators PUBMED:7617032.

    \ \

    The human INF-gammaR is a member of the hematopoietic cytokine receptor\ superfamily. It is expressed in a membrane-bound form in many cell types,\ and is over-expressed in tumour cells. It comprises an extracellular portion\ of 229 residues, a single transmembrane region, and a cytoplasmic domain of\ 221 residues. As with other members of its superfamily, the cytokine-binding\ sites are formed by a small set of closely-spaced surface loops that extend\ from a beta-sheet core, much like antigen-binding sites on antibodies. The\ extracellular INF-gammaR monomer comprises two domains (domain D1 from\ residue 14-102, and domain D2 from residue 114-221), each resembling an Ig\ fold with fibronectin type III topology PUBMED:9367779.

    \ ' '6969' 'IPR010789' '\

    This family consists of several putative Lactococcus bacteriophage terminase small subunit proteins. The exact function of this family is unknown.

    \ ' '6970' 'IPR010790' '\

    This entry represents a repeated motif of around 29 residues in length. This repeat is found in the variable surface lipoproteins in Mycoplasma bovis and in mammalian neurofilament triplet H (NefH or NF-H) proteins. This repeat contains several Lys-Ser-Pro (KSP) motifs and in NefH these are thought to function as the main target for neurofilament directed protein kinases in vivo PUBMED:3138108.

    \ ' '6971' 'IPR010791' '\

    This family consists of several purple photosynthetic bacterial hydroxyneurosporene synthase (CrtC) proteins. The enzyme catalyses the conversion of various acyclic carotenes including 1-hydroxy derivatives. This broad substrate specificity reflects the participation of CrtC in 1\'-HO-spheroidene and in spirilloxanthin biosynthesis PUBMED:12745254.

    \ ' '6973' 'IPR009818' '\

    This entry represents a conserved region approximately 250 residues long located towards the C terminus of eukaryotic ataxin-2. Ataxin-2 is a protein of unknown function, within which expansion of a polyglutamine tract (due to expansion of unstable CAG repeats in the coding region of the SCA2 gene) causes spinocerebellar ataxia type 2 (SCA2), a late-onset neurodegenerative disorder PUBMED:9339681. The expanded polyglutamine repeat in ataxin-2 causes disruption of the normal morphology of the Golgi complex and increased incidence of cell death PUBMED:12812977. Ataxin-2 is predicted to consist of mostly non-globular domains PUBMED:9462862.

    \ ' '6974' 'IPR010792' '\

    This family consists of several hypothetical bacterial proteins, which seem to be specific to Chlamydia pneumoniae (Chlamydophila pneumoniae). Members of this family are typically around 400 residues in length. The function of this family is unknown.

    \ ' '6975' 'IPR010793' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family consists of several eukaryotic mitochondrial 28S ribosomal protein S30 (or programmed cell death protein 9, PDCD9) sequences. The exact function of this family is unknown, although it is known to be a component of the mitochondrial ribosome and of cellular apoptotic signalling pathways PUBMED:11248257.

    \ ' '6976' 'IPR010794' '\

    This family consists of several maltose operon periplasmic protein precursor (MalM) sequences. The function of this family is unknown PUBMED:1730061.

    \ ' '6977' 'IPR009819' '\

    This family consists of several Caenorhabditis elegans pes-10 and related proteins. Members of this family are typically around 400 residues in length. The function of this family is unknown.

    \ ' '6978' 'IPR009820' '\

    This family consists of several Paramecium bursaria Chlorella virus 1 (PBCV-1) proteins of around 250 residues in length. The function of this family is unknown.

    \ ' '6979' 'IPR009821' '\

    This family consists of several Enterobacterial proteins of around 50 residues in length. Members of this family are found in Escherichia coli and Salmonella typhi where they are often known as YdfA. The function of this family is unknown.

    \ ' '6980' 'IPR009822' '\

    This family consists of several hypothetical bacterial proteins of around 180 residues in length, which are often known as YaeQ. YaeQ is homologous to RfaH, a specialised transcription elongation protein. YaeQ is known to compensate for loss of RfaH function PUBMED:9604894.

    \ ' '6981' 'IPR009823' '\

    This family consists of several SORF3 proteins from the Marek\'s disease-like viruses (Meleagrid herpesvirus 1 (MeHV-1)). Members of this family are around 350 residues in length. The function of this family is unknown.

    \ ' '6982' 'IPR009824' '\

    This family consists of several hypothetical cyanobacterial proteins of around 150 residues in length, which seem to be specific to Anabaena species. The function of this family is unknown.

    \ ' '6983' 'IPR009825' '\

    This family consists of several bacterial proteins of around 180 residues in length that appear to be multi-pass membrane proteins. The function of this family is unknown.

    \ ' '6984' 'IPR010795' '\

    This entry represents a conserved region found in a group of prenylcysteine lyases () that are approximately 500 residues long. Prenylcysteine lyase is a FAD-dependent thioether oxidase that degrades a variety of prenylcysteines, producing free cysteine, an isoprenoid aldehyde and hydrogen peroxide as products of the reaction PUBMED:12186880. It has been noted that this enzyme has considerable homology with ClP55, a 55 kDa protein that is associated with chloride ion pumps PUBMED:11716481.

    \ ' '6985' 'IPR009826' '\

    This entry represents the N terminus (approximately 100 residues) of a number of phage DNA circulation proteins.

    \ ' '6986' 'IPR009827' '\

    This entry represents the N-terminal region of the bacterial dicarboxylate carrier protein MatC. The MatC protein is an integral membrane protein that could function as a malonate carrier PUBMED:9826185.

    \ ' '6987' 'IPR009828' '\

    This family consists of several hypothetical eukaryotic proteins of around 320 residues in length. The function of this family is unknown.

    \ ' '6988' 'IPR009829' '\

    This family consists of several hypothetical eukaryotic proteins of around 250 residues in length. The function of this family is unknown.

    \ ' '6989' 'IPR009830' '\

    This entry consists of several lipoproteins from Mycobacterium species, collectively known as the LppX/LprAFG family. The best characterised of these is LprG () from Mycobacterium tuberculosis which is an immunogenic 27 kDa membrane-associated lipoprotein PUBMED:9387238. Expression of the lprG gene encoding this protein is essential for the growth of M. tuberculosis in immunocompetent mice PUBMED:14998516. Purification of LprG showed that it inhibits MHC-II antigen processing in primary human macrophages, providing a mechanism to avoid the host MHC-II-restricted CD4+ T cell response which is considered essential for control of M. tuberculosis infection PUBMED:15294983. LppX is a secreted antigen which may be a good target for vaccine design PUBMED:14723617, while LprF is a membrane lipoprotein involved in the kdp signal transduction pathway, thought to be the primary response to osmotic stress PUBMED:12581360.

    \ ' '6990' 'IPR010796' '\

    Proteins in this entry include the MKS1 protein () and other known or predicted flagellar basal body proteome components PUBMED:16415886 or cilia-containing species. Although the function is unknown, a cilia-specific role has been suggested for the poorly characterised B9 domain PUBMED:16415886, PUBMED:17127412, PUBMED:18337471.

    \ \

    Mutations in MKS1 have been shown to cause Meckel syndrome type 1, a severe foetal development disorder that has been reported in most populations.

    \ ' '6991' 'IPR010797' '\

    This family consists of Pex26 and related mammalian proteins. Pex26 is a type II peroxisomal membrane protein that recruits Pex6-Pex1 complexes to peroxisomes PUBMED:12717447. Mutations in Pex26 can lead to human disorders PUBMED:12851857.

    \ ' '6993' 'IPR009832' '\

    This family consists of several insect specific proteins. is annotated as being a haemolymph glycoprotein precursor. The function of this family is unknown PUBMED:7742978.

    \ ' '6994' 'IPR009833' '\

    This family consists of several hypothetical Enterobacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Escherichia coli and Salmonella species. The function of this family is unknown.

    \ ' '6995' 'IPR010941' '\

    This entry represents the N-terminal region of the bacterial poly-beta-hydroxybutyrate polymerase (PhaC). Polyhydroxyalkanoic acids (PHAs) are carbon and energy reserve polymers produced in some bacteria when carbon sources are plentiful and another nutrient, such as nitrogen, phosphate, oxygen, or sulphur, becomes limiting. PHAs composed of monomeric units ranging from 3 to 14 carbons exist in nature. When the carbon source is exhausted, PHA is utilised by the bacterium. PhaC links D-(-)-3-hydroxybutyryl-CoA to an existing PHA molecule by the formation of an ester bond PUBMED:10427049.

    \ ' '6996' 'IPR009834' '\

    This family contains fatty acid elongase 3-ketoacyl-CoA synthase 1, a plant enzyme approximately 350 residues long.

    \ ' '6999' 'IPR010799' '\

    Proteins in this entry are involved in degradation of the cyanobacterial heptapeptide hepatotoxin microcystin LR, and are encoded in the mlr gene cluster PUBMED:11769251. MlrC from Sphingomonas wittichii (strain RW1 / DSM 6014 / JCM 10273) is believed to mediate the last step of peptidolytic degradation of the tetrapeptide. It is suspected to be a metallopeptidase based on homology to known peptidases and its inhibition by metal chelators. The proteins encoded by the mlr cluster may be involved in cell wall peptidoglycan cycling and subsequently act fortuitously in hydrolysis of microcystin LR.

    \ \

    This entry represents the C-terminal region of these proteins.

    \ ' '7000' 'IPR010800' '\

    This family of proteins includes several glycine-rich proteins as well as two nodulins, 16 and 24. The family also contains proteins that are induced in response to various stresses.

    \ ' '7001' 'IPR009836' '\

    This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function.

    \ ' '7002' 'IPR010801' '\

    This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 residues long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix PUBMED:9988684.

    \ ' '7003' 'IPR009837' '\

    This family represents a conserved region approximately 180 residues long within osteoregulin, a bone-remodelling protein expressed highly in osteocytes within trabecular and cortical bone. A conserved RGD motif is found towards the C-terminal end of this region, and this is potentially involved in integrin recognition PUBMED:10967096.

    \ ' '7004' 'IPR010802' '\

    This domain is specific to cyanobacterial proteins. Its function, and the function of the proteins it is associated with, are uncharacterised.

    \ ' '7005' 'IPR006573' '\

    NEUZ is a domain of unknown function found in neuralized proteins, i.e. proteins involved in the specification of the neuroblast during cellular differentiation.

    \ ' '7006' 'IPR009838' '\

    This family consists of several bacterial TraL proteins. TraL is a predicted peripheral membrane protein, which is thought to be involved in bacterial sex pilus assembly PUBMED:8655498. The exact function of this family is unclear.

    \ ' '7007' 'IPR009839' '\

    This family consists of several SseB proteins, which appear to be found exclusively in Enterobacteria. SseB is known to enhance serine-sensitivity in Escherichia coli PUBMED:7982894 and is part of the Salmonella pathogenicity island 2 (SPI-2) translocon PUBMED:12724372.

    \ ' '7008' 'IPR009840' '\

    This family consists of several hypothetical bacterial proteins of around 135 residues in length. Members of this family appear to be found exclusively in the Enterobacteria Escherichia coli, Citrobacter rodentium and Salmonella typhi. The function of this family is unknown.

    \ ' '7009' 'IPR009841' '\

    This family consists of several VirC2 proteins which seem to be found exclusively in Agrobacterium species and Rhizobium etli. VirC2 is known to be involved in virulence in Agrobacterium species but its exact function is unclear PUBMED:3584058, PUBMED:3759904.

    \ ' '7010' 'IPR009842' '\

    This family consists of several hypothetical bacterial proteins of around 310 residues in length. Members of this family seem to be found exclusively in Agrobacterium, Rhizobium and Brucella species. The function of this family is unknown.

    \ ' '7011' 'IPR009843' '\

    This family consists of several hypothetical bacterial proteins of around 320 residues in length. Members of this family are mainly found in Rhizobium and Agrobacterium species. The function of this family is unknown.

    \ ' '7012' 'IPR010803' '\

    This family consists of several Citrus tristeza virus (CTV) P33 proteins. The function of P33 is unclear although it is known that the protein is not needed for virion formation PUBMED:11112500.

    \ ' '7013' 'IPR009844' '\

    This family consists of several archaeal proteins of around 180 residues in length. Members of this family seem to be found exclusively in Sulfolobus tokodaii and Sulfolobus solfataricus. The function of this family is unknown.

    \ ' '7015' 'IPR009845' '\

    This family consists of several bacterial and related archaeal proteins of around 180 residues in length. The function of this family is unknown.

    \ ' '7016' 'IPR010805' '\

    This family consists of Human herpesvirus 8 (HHV-8, Kaposi\'s sarcoma-associated herpesvirus) K8 proteins. HHV-8 is a human Gammaherpesvirus related to Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) and Saimiriine herpesvirus 2 (Herpesvirus saimiri). HHV-8 open reading frame K8 encodes a basic region-leucine zipper protein of 237 aa that homodimerises. K8 interacts and co-localises with human SNF5 (), a cellular chromatin-remodelling factor, both in vivo and in vitro. K8 is thought to function as a transcriptional activator under specific conditions and its transactivation activity requires its interaction with the cellular chromatin remodelling factor hSNF5 PUBMED:12604819.

    \ ' '7017' 'IPR009846' '\

    This family consists of several eukaryotic splicing factor 3B subunit 5 (SF3b5) proteins. SF3b5 is a 10 kDa subunit of the splicing factor SF3b. SF3b associates with the splicing factor SF3a and a 12S RNA unit to form the U2 small nuclear ribonucleoprotein complex. SF3b5 and SF3b14b are also thought to facilitate the interaction of U2 with the branch site PUBMED:12234937. Also included in this entry is RDS3 complex subunit 10, another protein involved in mRNA splicing PUBMED:15565172.

    \ ' '7018' 'IPR010806' '\

    This family consists of several Orthopoxvirus proteins of around 185 residues in length. Members of this family seem to be exclusive to Vaccinia virus, Camelpox virus and Cowpox virus (CPV). Some family members are annotated as being C8 proteins but their function is unknown.

    \ ' '7019' 'IPR010807' '\

    This family consists of several short, hypothetical bacterial proteins of around 70 residues in length. Members of this family contain 8 highly conserved cysteine residues. The function of the family is unknown.

    \ ' '7020' 'IPR009847' '\

    This family consists of several mammalian SNRPN upstream reading frame (SNURF) proteins. SNURF or RPF4 is a RING-finger protein and a coregulator of androgen receptor-dependent transcription. It has been suggested that SNURF is involved in the regulation of processes required for late steps of spermatid maturation PUBMED:12351196, PUBMED:12874792.

    \ ' '7021' 'IPR009848' '\

    This family consists of several hypothetical Lactococcus lactis and related phage proteins of around 75 residues in length. The function of this family is unknown.

    \ ' '7022' 'IPR010808' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems comprise a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on the RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347, PUBMED:11406410. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.

    \

    HKs can be roughly divided into two classes: orthodox and hybrid kinases PUBMED:8029829, PUBMED:1482126. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK PUBMED:10966457. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain.

    \

    The response regulators for CheA bind to the P2 domain, which is found between and as either one or two copies. Highly flexible linkers connect P2 to the rest of CheA and impart remarkable mobility to the P2 domain. This feature is thought to enhance the inter CheA dimer phosphotransfer reactions within the signalling complex, thereby amplifying the phosphorylation signal PUBMED:10564504.

    \ ' '7023' 'IPR010809' '\

    The flagellar hook-associated protein 2 (HAP2 or FliD) forms the distal end of the flagella, and plays a role in mucin specific adhesion of the bacteria PUBMED:9488388. This alignment covers the C-terminal region of the flagellar hook-associated protein 2.

    \ ' '7024' 'IPR010810' '\

    The function of this region is not clear, but it is found in many flagellar hook proteins, including FliD homologues PUBMED:11230454. This motif is found in single copy or repeated in various flagellar proteins. Conserved Ile-Asn (IN) residues are seen at the centre of the motif. The diversity of these motifs makes it likely that some members of the family are not identified.

    \ ' '7025' 'IPR010811' '\

    This represents a short conserved region (approximately 50 residues long), sometimes repeated, within a number of hypothetical Oryza sativa proteins of unknown function.

    \ ' '7026' 'IPR009849' '\

    This entry represents a conserved region approximately 100 residues long, multiple copies of which are sometimes found within hypothetical Ureaplasma parvum proteins of unknown function.

    \ ' '7027' 'IPR009850' '\

    This family represents a conserved region approximately 150 residues long that is sometimes repeated within some Babesia bovis proteins of unknown function.

    \ ' '7028' 'IPR009851' '\

    This entry represents a conserved region approximately 150 residues long within a number of eukaryotic proteins that show homology with Drosophila melanogaster Modifier of rudimentary (Mod(r)) proteins. The N-terminal half of Mod(r) proteins is acidic, whereas the C-terminal half is basic PUBMED:7651329, and both of these regions are represented in this family.

    \ ' '7029' 'IPR010812' '\

    This entry represents a conserved region approximately 200 residues long within a number of bacterial hypersensitivity response secretion proteins (HrpJ) and similar proteins. HrpJ forms part of a type III secretion system through which, in phytopathogenic bacterial species, virulence factors are thought to be delivered to plant cells PUBMED:10449783.

    \ ' '7030' 'IPR009852' '\

    Proteins in this entry include T-complex 10, involved in spermatogenesis in mice, and centromere protein J, which not only inhibits microtubule nucleation from the centrosome, but also depolymerises taxol-stabilised microtubules PUBMED:12068715, PUBMED:15047868. These proteins share an approximately 180 residue C-terminal region which contains unusual G repeats PUBMED:11003675.

    \ ' '7031' 'IPR009853' '\

    This family consists of several Caenorhabditis elegans proteins of around 70-75 residues in length. The function of this family is unknown.

    \ ' '7032' 'IPR009854' '\

    This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction PUBMED:11893756.

    \ ' '7033' 'IPR010813' '\

    This family consists of several hypothetical bacterial proteins, which seem to be specific to Staphylococcus species. Members of this family are typically around 100 residues in length. The function of this family is unknown.

    \ ' '7034' 'IPR009855' '\

    This family consists of several Baculovirus specific late expression factor 10 (LEF-10) sequences. LEF-10 is thought to be a late expressed structural protein although its exact function is unknown PUBMED:12202224.

    \ ' '7035' 'IPR009856' '\

    This family consists of several plant specific light regulated Lir1 proteins. Lir1 mRNA accumulates in the light, reaching maximum and minimum steady-state levels at the end of the light and dark period, respectively. Plants germinated in the dark have very low levels of lir1 mRNA, whereas plants germinated in continuous light express lir1 at an intermediate but constant level. It is thought that lir1 expression is controlled by light and a circadian clock. The exact function of this family is unclear PUBMED:8499615.

    \ ' '7036' 'IPR009857' '\

    This family consists of several hypothetical bacterial proteins of around 70 residues in length. Members of this family are often referred to as YejL. The function of this family is unknown.

    \ ' '7037' 'IPR009858' '\

    This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.

    \ ' '7038' 'IPR010814' '\

    This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family appear to be Actinomycete specific. The function of this family is unknown.

    \ ' '7040' 'IPR009860' '\

    This family consists of several phage associated hyaluronidase proteins () which seem to be specific to Streptococcus pyogenes and its bacteriophages. The substrate of hyaluronidase is hyaluronic acid, a sugar polymer composed of alternating N-acetylglucosamine and glucuronic acid residues. Hyaluronic acid is found in the ground substance of human connective tissue and the vitreous of the eye and also is the sole component of the capsule of group A streptococci. The capsule has been shown to be an important virulence factor of this organism by virtue of its ability to resist phagocytosis. Production by S. pyogenes of both a hyaluronic acid capsule and hyaluronidase enzymatic activity capable of destroying the capsule is an interesting, yet-unexplained, phenomenon PUBMED:7622224.

    \ ' '7041' 'IPR009861' '\

    This family consists of several mammalian DAP10 membrane proteins. In activated mouse natural killer (NK) cells, the NKG2D receptor associates with two intracellular adaptors, DAP10 and DAP12, which trigger phosphatidyl inositol 3 kinase (PI3K) and Syk family protein tyrosine kinases, respectively. It has been suggested that the DAP10-PI3K pathway is sufficient to initiate NKG2D-mediated killing of target cells PUBMED:12740576.

    \ ' '7042' 'IPR010815' '\

    This family consists of several hypothetical Enterobacterial proteins of around 100 residues in length. Members of this family are often described as YbjC. In Escherichia coli the ybjC gene is located downstream of nfsA (which encodes the major oxygen-insensitive nitroreductase). It is thought that nfsA and ybjC form an operon and that its promoter is a class I SoxS-dependent promoter PUBMED:11741843. The function of this family is unknown.

    \ ' '7043' 'IPR009862' '\

    This family consists of several bacterial proteins of around 110 residues in length. Members of this family seem to be specific to Agrobacterium species and to Rhizobium loti (Mesorhizobium loti). The function of this family is unknown.

    \ ' '7044' 'IPR009863' '\

    This family consists of several bacterial LcrG proteins. Yersiniae are equipped with the Yop virulon, an apparatus that allows extracellular bacteria to deliver toxic Yop proteins inside the host cell cytosol in order to sabotage the communication networks of the host cell or even to cause cell death. LcrG is a component of the Yop virulon involved in the regulation of secretion of the Yops PUBMED:9484897.

    \ \

    This protein is found in type III secretion operons, along with LcrR, H and V. Also known as PcrG in Pseudomonas, the protein is believed to make a 1:1 complex with PcrV (LcrV) PUBMED:14565848. Mutations in LcrG cause premature secretion of effector proteins into the medium PUBMED:11443094.

    \ ' '7045' 'IPR010816' '\

    In filamentous fungi, het loci (for heterokaryon incompatibility) are believed to regulate self/nonself-recognition during vegetative growth. As filamentous fungi grow, hyphal fusion occurs within an individual colony to form a network. Hyphal fusion can occur also between different individuals to form a heterokaryon, in which genetically distinct nuclei occupy a common cytoplasm. However, heterokaryotic cells are viable only if the individuals involved have identical alleles at all het loci PUBMED:9770498.

    \ ' '7046' 'IPR009864' '\

    This family consists of several rhoptry-associated protein 1 (RAP-1) sequences which appear to be specific to Plasmodium falciparum PUBMED:11254620.

    \ ' '7047' 'IPR010817' '\

    This entry represents the N terminus (approximately 150 residues) of bacterial HemY porphyrin biosynthesis proteins. These are membrane proteins involved in a late step of protoheme IX synthesis PUBMED:7928957.

    \ ' '7048' 'IPR010818' '\

    This family consists of several putative lipoproteins which seem to be found specifically in the bacterium Leptospira interrogans. Members of this family are typically around 670 residues in length and their function is unknown.

    \ ' '7049' 'IPR010819' '\

    N-acylglucosamine 2-epimerase (AGE, ) reversibly converts N-acyl-D-glucosamine to N-acyl-D-mannosamine, the latter ultimately being converted to cytidine 5\'-monophospho-N-acetylneuraminic acid, which is used as a precursor for the synthesis of connective tissues, blood cells and cellular macromolecules. AGE is a renin-binding protein (RnBP), which might act as a cellular renin inhibitor. AGE functions as a homodimer, where each monomer has an alpha(6)/alpha(6)-barrel structure commonly found in glucoamylases and cellulases PUBMED:11061972. This family contains a number of eukaryotic and bacterial AGE enzymes.

    \ ' '7050' 'IPR009865' '\

    This family consists of several mammalian-specific proacrosin binding protein sp32 sequences. sp32 is a sperm-specific protein, which is known to bind the 55 and 53 kDa proacrosins and the 49 kDa acrosin intermediate. The exact function of sp32 is unclear; however, it is thought that the binding of sp32 to proacrosin may be involved in packaging the acrosin zymogen into the acrosomal matrix PUBMED:8144514.

    \ ' '7051' 'IPR010820' '\

    This family represents a conserved region approximately 350 residues long within a number of plant proteins of unknown function.

    \ ' '7052' 'IPR010821' '\

    This family consists of several plant-specific chlorophyllase proteins (). Chlorophyllase (Chlase) is the first enzyme involved in chlorophyll (Chl) degradation and catalyses the hydrolysis of an ester bond to yield chlorophyllide and phytol PUBMED:10611389.

    \ ' '7053' 'IPR009866' '\

    This family contains human NADH-ubiquinone oxidoreductase subunit NDUFB4 and related sequences.

    \

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '7054' 'IPR009867' '\

    This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown.

    \ ' '7055' 'IPR004082' '\ A total of 715 potential protein-coding genes have been identified in the \ nucleotide sequence of Arabidopsis thaliana chromosome 5, with an average gene density of 1 gene per 4001 bp PUBMED:10718197. Amongst the gene products is a \ well-conserved family of 130.7kDa proteins that share no sequence similarity\ with any other known proteins, other than in plants. The sequences are characterised by an N-terminal domain of variable length, a central cysteine-rich region and a relatively acidic C-terminal domain. The sequences may possess a PHD finger.\ ' '7056' 'IPR010822' '\

    This family contains a number of bacterial stage II sporulation E proteins. These are required for formation of a normal polar septum during sporulation. The N-terminal region is hydrophobic and is expected to contain up to 12 membrane-spanning segments PUBMED:8830262.

    \ ' '7057' 'IPR009868' '\

    This family consists of several VirE2 proteins which seem to be specific to Agrobacterium tumefaciens and Rhizobium etli. VirE2 is known to interact, via its C terminus, with VirD4. A. tumefaciens transfers oncogenic DNA and effector proteins to plant cells during the course of infection. Substrate translocation across the bacterial cell envelope is mediated by a type IV secretion (TFS) system composed of the VirB proteins, as well as VirD4, a member of a large family of inner membrane proteins implicated in the coupling of DNA transfer intermediates to the secretion machine. VirE2 is therefore thought to be a protein substrate of a type IV secretion system which is recruited to a member of the coupling protein superfamily PUBMED:12950931.

    \ ' '7058' 'IPR010823' '\

    This family consists of several Bacteriophage T4-like capsid assembly (or portal) proteins. The exact mechanism by which the double-stranded (ds) DNA bacteriophages incorporate the portal protein at a unique vertex of the icosahedral capsid is unknown. In phage T4, there is evidence that this vertex, constituted by 12 subunits of gp20, acts as an initiator for the assembly of the major capsid protein and the scaffolding proteins into a prolate icosahedron of precise dimensions. Portal protein gene expression is an important regulator of prohead assembly in bacteriophage T4 PUBMED:8918937.

    \ ' '7059' 'IPR009869' '\

    This entry represents the N terminus (approximately 180 residues) of plant Hs1pro-1, which is believed to confer resistance to nematodes PUBMED:12669798.

    \ ' '7060' 'IPR009870' '\

    This family consists of several archaeal proteins of around 320 residues in length. Members of this family seem to be found exclusively in Halobacterium and Haloferax species. The function of this family is unknown.

    \ ' '7061' 'IPR010824' '\

    This family consists of several hypothetical bacterial proteins of around 125 residues in length. Several members of this family are described as putative lipoproteins and are often known as YcfL. The function of this family is unknown.

    \ ' '7062' 'IPR009871' '\

    This family consists of several Banana bunchy top virus proteins of around 120 residues in length. One member is annotated as a movement protein, whereas most other family members are hypothetical. The function of this family is unknown.

    \ ' '7063' 'IPR009872' '\

    This family consists of several bacterial proteins of around 100 residues in length. The function of this family is unknown.

    \ ' '7064' 'IPR009873' '\

    This family consists of several Phytoreovirus S7 proteins which are thought to be viral core proteins PUBMED:2313270.

    \ ' '7065' 'IPR009874' '\

    This family consists of several hypothetical bacterial sequences and one archaeal sequence of around 120 residues in length. The function of this family is unknown.

    \ ' '7066' 'IPR009875' '\

    This entry is found in several bacterial type IV pilus assembly (PilZ) proteins. PilZ is thought to have a cytoplasmic location and be essential for type 4 fimbrial biogenesis but its exact function is unknown PUBMED:8550441.

    \ ' '7067' 'IPR009876' '\

    This entry represents several Neisseria species-specific OpcA-type outer membrane adhesion proteins. OpcA (formerly called 5C) was isolated from Neisseria meningitidis, causative agent of meningococcal meningitis and septicemia. An outer membrane protein embedded in the lipid bilayer, OpcA was shown to play an important role in meningococcal adhesion and invasion of both epithelial and endothelial cells, mediating attachment to host cells by binding proteoglycan cell-surface receptors PUBMED:12706886. OpcA forms a 10-stranded beta-barrel with five highly mobile extracellular loops that protrude above the surface of the membrane PUBMED:11891340. These extracellular loops combine to form a crevice in the external surface that is lined by positively charged residues, which is predicted to be a binding site for proteoglycan polysaccharides involved in pathogenesis. Conformational changes in the extracellular loops modulate the surface of OpcA, which could affect the proteoglycan binding site PUBMED:17114231. These conformational changes could also lead to pore opening.

    \ ' '7068' 'IPR010825' '\

    This family consists of several Drosophila species specific Turandot proteins. The Turandot A (TotA) gene encodes a humoral factor, which is secreted from the fat body and accumulates in the body fluids. TotA is strongly induced upon bacterial challenge, as well as by other types of stress such as high temperature, mechanical pressure, dehydration, UV irradiation, and oxidative agents. It is also upregulated during metamorphosis and at high age. Flies that overexpress TotA show prolonged survival and retain normal activity at otherwise lethal temperatures. Although TotA is only induced by severe stress, it responds to a much wider range of stimuli than heat shock genes such as hsp70 or immune genes such as Cecropin A1 PUBMED:11369236.

    \ ' '7070' 'IPR006541' '\

    These sequences represent a family of integral membrane proteins, most of which are about 650 residues in size and predicted to span the membrane seven times. Nearly half of the members of this family are found in association with a member of the lactococcin 972 family of bacteriocins PUBMED:10589723. Others may be associated with uncharacterised proteins that may also act as bacteriocins. Although this protein is suggested to be an immunity protein, and the bacteriocin is suggested to be exported by a Sec-dependent process, the role of this protein is unclear.

    \ ' '7071' 'IPR010826' '\

    This domain is found in several Phlebovirus glycoprotein G1 sequences. Members of the Bunyaviridae family acquire an envelope by budding through the lipid bilayer of the Golgi complex. The budding compartment is thought to be determined by the accumulation of the two heterodimeric membrane glycoproteins G1 and G2 in the Golgi PUBMED:9811692.

    \ ' '7072' 'IPR010827' '\

    This motif is found primarily in bacterial surface antigens, normally as variable number repeats at the N terminus. The C terminus of these proteins is normally represented by . There may also be a relationship to haemolysin activator HlyB (). The alignment centres on a -GY- or -GF- motif. Some members of this family are found in the mitochondria. It is predicted to have a mixed alpha/beta secondary structure.

    \ ' '7073' 'IPR009878' '\

    This domain is found in several Phlebovirus glycoprotein G2 sequences. Members of the Bunyaviridae family acquire an envelope by budding through the lipid bilayer of the Golgi complex. The budding compartment is thought to be determined by the accumulation of the two heterodimeric membrane glycoproteins G1 and G2 in the Golgi PUBMED:9811692.

    \ ' '7074' 'IPR009879' '\

    This entry consists of several Phlebovirus nonstructural NS-M proteins, which represent the N-terminal region of the M polyprotein precursor. The function of this family is unknown.

    \ ' '7075' 'IPR010828' '\

    This family contains a number of alcohol acetyltransferase enzymes approximately 500 residues long that seem to be restricted to Saccharomyces. These catalyse the esterification of isoamyl alcohol by acetyl coenzyme A PUBMED:7764365.

    \ ' '7076' 'IPR006611' '\

    This cysteine-rich family of proteins has currently only been identified in Drosophila species.

    \ ' '7077' 'IPR010829' '\

    Cerato-platanin (CP) is the first member of the cerato-platanin family. It is produced by the Ascomycete Ceratocystis fimbriata f. sp. platani and causes the severe plant disease canker stain. This protein occurs in the cell wall of the fungus, is involved in the host-plant interaction, and induces both cell necrosis and phytoalexin synthesis, which is one of the first plant defense-related events. CP, like other fungal surface proteins, is able to self-assemble in vitro PUBMED:17431609. CP is a 120 amino acid protein, containing 40% hydrophobic residues and two S-S bridges. It contains four cysteine residues that form two disulphide bonds PUBMED:10455173. The N-terminal region of CP is very similar to cerato-ulmin, a phytotoxic protein produced by the Ophiostoma species belonging to the hydrophobin family, which also self-assembles PUBMED:16931046. This entry also includes other precursor proteins.

    \ ' '7078' 'IPR009880' '\

    This entry represents the N terminus (approximately 300 residues) of a number of plant and fungal glyoxal oxidase enzymes. Glyoxal oxidase catalyses the oxidation of aldehydes to carboxylic acids, coupled with reduction of dioxygen to hydrogen peroxide. It is an essential component of the extracellular lignin degradation pathways of the wood-rot fungus Phanerochaete chrysosporium PUBMED:10593910.

    \ ' '7080' 'IPR009881' '\

    This family contains a number of hypothetical bacterial proteins of unknown function approximately 100 residues in length.

    \ ' '7081' 'IPR009882' '\

    This family consists of several Gypsy/Env proteins from Drosophila and Ceratitis fruit fly species. Gypsy is an endogenous retrovirus of Drosophila melanogaster. Phylogenetic studies suggest that occasional horizontal transfer events of gypsy occur between Drosophila species. gypsy possesses infective properties associated with the products of the envelope gene that might be at the origin of these interspecies transfers PUBMED:11805056.

    \ ' '7082' 'IPR009883' '\

    This family consists of several hypothetical bacterial proteins of around 135 residues in length. Members of this family all appear to be Enterobacterial proteins. The function of this family is unknown.

    \ ' '7083' 'IPR009884' '\

    This family consists of several Benyvirus specific 14 kDa proteins of around 125 residues in length. Members of this family contain 9 conserved cysteine residues. The function of this family is unknown.

    \ ' '7084' 'IPR009885' '\

    This family consists of several hypothetical Enterobacterial proteins of around 80 residues in length. The function of this family is unknown.

    \ ' '7086' 'IPR009886' '\

    This family consists of several mammalian HCaRG (hypertension-related, calcium-regulated gene) proteins. HCaRG is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals. HCaRG is a nuclear protein potentially involved in the control of cell proliferation PUBMED:10918053.

    \ ' '7087' 'IPR010832' '\

    Mature peptide hormones and neuropeptides are typically synthesised from much larger precursors and require several post-translational processing steps--including\ proteolytic cleavage--for the formation of the bioactive species. The subtilisin-related proteolytic enzymes that accomplish neuroendocrine-specific cleavages are\ known as prohormone convertases 1 and 2 (PC1 and PC2), which belong to MEROPS peptidase family S8B. The cell biology of these proteases within the regulated secretory pathway of neuroendocrine cells is\ complex, and they are themselves initially synthesised as inactive precursor molecules. ProPC1 propeptide cleavage occurs rapidly in the endoplasmic reticulum, yet its major site of action on prohormones takes place later in the secretory pathway. PC1 undergoes an interesting carboxyl terminal processing event whose function\ appears to be to activate the enzyme. ProPC2, on the other hand, exhibits comparatively long initial folding times and exits the endoplasmic reticulum without\ propeptide cleavage, in association with the neuroendocrine-specific protein 7B2. Once the proPC2/7B2 complex arrives at the trans-Golgi network, 7B2 is\ internally cleaved into two domains, the 21-kDa fragment and a carboxy-terminal 31 residue peptide. PC2 propeptide removal occurs in the maturing secretory granule, most likely through autocatalysis, and 7B2 association does not appear to be directly required for this cleavage event. However, if proPC2 has not encountered 7B2 intracellularly, it cannot generate a catalytically active mature species. The molecular mechanism behind the intriguing intracellular association of 7B2 and proPC2 is still unknown, but may involve conformational rearrangement or stabilisation of a proPC2 conformer mediated by a 36-residue internal segment of 21-kDa 7B2.

    \ \ \

    This family represents proSAAS, which belongs to MEROPS inhibitor family I49, clan I-. ProSAAS is the PC1 binding protein PUBMED:10632593, PUBMED:11742530. It exhibits both structural and functional homology to 7B2, which is the PC2 binding protein PUBMED:10812060. The CT domain of proSAAS contains the same inhibitor hexapeptide as 7B2 PUBMED:9756897, PUBMED:10812060; consequently, 7B2 and proSAAS are both members of a homologous family of prohormone convertase inhibitor proteins. However, despite their apparent similarities, there are profound differences in the evolutionary and cell biology of these two prohormone convertases, which are likely to be influenced by their binding proteins and their respective N-terminal PUBMED:12914799 and C-terminal domains.

    \ ' '7088' 'IPR009887' '\

    This family consists of several progressive ankylosis protein (ANK or ANKH) sequences. The ANK protein spans the outer cell membrane and shuttles inorganic pyrophosphate (PPi), a major inhibitor of physiologic and pathologic calcification, bone mineralisation and bone resorption PUBMED:11326272. Mutations in ANK are thought to give rise to craniometaphyseal dysplasia (CMD), a rare skeletal disorder characterised by progressive thickening and increased mineral density of craniofacial bones and abnormally developed metaphyses in long bones PUBMED:11326338.

    \ ' '7089' 'IPR010833' '\

    This family consists of several bacterial replication initiation and membrane attachment (DnaB) proteins. The DnaB protein is essential for both replication initiation and membrane attachment of the origin region of the chromosome and plasmid pUB110 in Bacillus subtilis. It is known that there are two different classes (DnaBI and DnaBII) in the DnaB mutants; DnaBI is essential for both chromosome and pUB110 replication, whereas DnaBII is necessary only for chromosome replication PUBMED:3027697.

    \ ' '7090' 'IPR009888' '\

    This family consists of several hypothetical bacterial proteins of around 160 residues in length. The function of this family is unknown.

    \ ' '7091' 'IPR009889' '\

    This family consists of several mammalian dentin matrix protein 1 (DMP1) sequences. The dentin matrix acidic phosphoprotein 1 (DMP1) gene has been mapped to human chromosome 4q21 PUBMED:9177774. DMP1 is a bone and teeth specific protein initially identified from mineralised dentin. DMP1 is primarily localised in the nuclear compartment of undifferentiated osteoblasts. In the nucleus, DMP1 acts as a transcriptional component for activation of osteoblast-specific genes like osteocalcin. During the early phase of osteoblast maturation, Ca2+ surges into the nucleus from the cytoplasm, triggering the phosphorylation of DMP1 by a nuclear isoform of casein kinase II. This phosphorylated DMP1 is then exported out into the extracellular matrix, where it regulates nucleation of hydroxyapatite. DMP1 is a unique molecule that initiates osteoblast differentiation by transcription in the nucleus and orchestrates mineralised matrix formation extracellularly, at later stages of osteoblast maturation PUBMED:12615915. The DMP1 gene has been found to be ectopically expressed in lung cancer although the reason for this is unknown PUBMED:12929940.

    \ ' '7092' 'IPR009890' '\

    This family contains a number of eukaryotic etoposide-induced 2.4 (EI24) proteins approximately 350 residues long. In cells treated with the cytotoxic drug etoposide, EI24 is induced by p53 PUBMED:8649819. It has been suggested to play an important role in negative cell growth control PUBMED:10594026.

    \ ' '7093' 'IPR009891' '\

    This family consists of several plant tapetum specific proteins. Members of this family are found in Arabidopsis thaliana, Brassica napus and Sinapis alba. Members of this family may be involved in sporopollenin formation and/or deposition PUBMED:7764317.

    \ ' '7095' 'IPR009893' '\

    This family consists of several Nucleopolyhedrovirus capsid protein P87 sequences. P87 is expressed late in infection and concentrated in infected cell nuclei PUBMED:2184573.

    \ ' '7096' 'IPR009894' '\

    This family consists of a number of exported protein precursor (EppA and BapA) sequences which seem to be specific to Borrelia burgdorferi (Lyme disease spirochete). bapA gene sequences are quite stable but the encoded proteins do not provoke a strong immune response in most individuals. Conversely, EppA proteins are much more antigenic but are more variable in sequence. It is thought that BapA and EppA play important roles during the B. burgdorferi infectious cycle PUBMED:12724373.

    \ ' '7098' 'IPR009895' '\

    This family consists of several hypothetical proteins of around 170 residues in length, which appear to be mouse specific. The function of this family is unknown.

    \ ' '7099' 'IPR009896' '\

    This family consists of several Mycoplasma species specific Cytadhesin P32 and P30 proteins. P30 has been found to be membrane associated and localised on the tip organelle. It is thought that it is important in cytadherence and virulence PUBMED:9632619.

    \ ' '7100' 'IPR009897' '\

    This family consists of several Orthoreovirus P17 proteins. P17 is specified by ORF2 of the S1 gene and represents a nonstructural protein which associates with cell membranes PUBMED:11883183.

    \ ' '7101' 'IPR010835' '\

    This family consists of several hypothetical bacterial proteins of around 190 residues in length. Several members of this family are annotated as being putative lipoproteins and are often known as YceB. The function of this family is unknown.

    \ ' '7102' 'IPR009898' '\

    This family contains a number of bacterial proteins of unknown function approximately 180 residues long. These are possibly integral membrane proteins.

    \ ' '7103' 'IPR009899' '\

    This family consists of several bacterial antirestriction (ArdA) proteins. ArdA functions in bacterial conjugation to allow an unmodified plasmid to evade restriction in the recipient bacterium and yet acquire cognate modification PUBMED:12618468.

    \ ' '7104' 'IPR009900' '\

    This entry represents a series of 13 residue repeats found in the apopolysialoglycoprotein of Oncorhynchus mykiss (Rainbow trout) and Oncorhynchus masou (Cherry salmon). Polysialoglycoprotein (PSGP) of unfertilised eggs of rainbow trout consists of tandem repeats of a glycotridecapeptide, Asp-Asp-Ala-Thr*-Ser*-Glu-Ala-Ala-Thr*-Gly-Pro-Ser-Gly (* denotes the attachment site of a polysialoglycan chain). In response to egg activation, PSGP is discharged by exocytosis into the space between the vitelline envelope and the plasma membrane, i.e. the perivitelline space, where the 200 kDa PSGP molecules undergo rapid and dramatic depolymerisation by proteolysis into glycotridecapeptides PUBMED:3182867.

    \ ' '7105' 'IPR010836' '\

    This family contains a number of bacterial SapC proteins approximately 250 residues long. In Campylobacter fetus, SapC forms part of a paracrystalline surface layer (S-layer) that confers serum resistance PUBMED:9851986.

    \ ' '7106' 'IPR009901' '\

    This family consists of several hypothetical Enterobacterial proteins of around 160 residues in length. The function of this family is unknown.

    \ ' '7107' 'IPR009902' '\

    This family consists of several hypothetical Arabidopsis thaliana proteins of around 225 residues in length. The function of this family is unknown.

    \ ' '7108' 'IPR009903' '\

    This family consists of several Baculovirus proteins of around 55 residues in length. The function of this family is unknown.

    \ ' '7109' 'IPR009904' '\

    This family contains a number of eukaryotic Insulin-induced proteins (INSIG-1 and INSIG-2) approximately 200 residues long. INSIG-1 and INSIG-2 are found in the endoplasmic reticulum and bind the sterol-sensing domain of SREBP cleavage-activating protein (SCAP), preventing it from escorting SREBPs to the Golgi. Their combined action permits feedback regulation of cholesterol synthesis over a wide range of sterol concentrations PUBMED:12202038, PUBMED:12242332.

    \ ' '7110' 'IPR010095' '\

    This entry represents a region of sequence similarity between a family of putative transposases of Thermoanaerobacter tengcongensis, smaller related proteins from Bacillus anthracis, putative transposases described by , and other proteins.

    More information about these proteins can be found at Protein of the Month: Transposase.

    \ ' '7111' 'IPR010837' '\

    This family contains TrbH, a bacterial conjugal transfer protein approximately 150 residues long. This contains a putative membrane lipoprotein lipid attachment site PUBMED:9829924.

    \ ' '7112' 'IPR009905' '\

    This family contains the bacterial enzyme 2-vinyl bacteriochlorophyllide hydratase (approximately 150 residues long). This is involved in the light-independent bacteriochlorophyll biosynthesis pathway by adding water across the 2-vinyl group PUBMED:8385667. This enzyme is apparently absent from cyanobacteria (which do not use bacteriochlorophyll).

    \ ' '7113' 'IPR010838' '\

    This family contains several hypothetical bacterial proteins of unknown function that are approximately 250 residues long.

    \ ' '7114' 'IPR009906' '\

    This family represents a conserved region approximately 150 residues long within a number of hypothetical bacterial and eukaryotic proteins of unknown function.

    \ ' '7115' 'IPR010839' '\

    This family consists of several bacterial and plant proteins of around 400 residues in length. The function of this family is unknown.

    \ ' '7116' 'IPR009907' '\

    This family consists of several bacterial proteins of around 70 residues in length. The function of this family is unknown.

    \ ' '7117' 'IPR006606' '\

    This family consists of several eukaryotic proteins of around 375 residues in length. The function of this family is unknown.

    \ ' '7118' 'IPR010840' '\

    This family consists of several bacterial proteins of around 210 residues in length. The function of this family is unknown.

    \ ' '7119' 'IPR009908' '\

    This family consists of several bacterial methylamine utilisation MauE proteins. Synthesis of enzymes involved in methylamine oxidation via methylamine dehydrogenase (MADH) is encoded by genes present in the mau cluster. MauE and MauD are specifically involved in the processing, transport, and/or maturation of the beta-subunit, and the absence of each of these proteins leads to production of a non-functional beta-subunit which becomes rapidly degraded PUBMED:9403107.

    \ ' '7120' 'IPR009909' '\

    This entry represents a domain of approximately 90 residues that is tandemly repeated within interferon-induced 35 kDa protein (IFP 35) and the homologous N-myc-interactor (Nmi). This domain mediates Nmi-Nmi protein interactions and subcellular localisation PUBMED:10950963.

    \ ' '7121' 'IPR009910' '\

    This family consists of several hypothetical bacterial proteins of around 80 residues in length. Members of this family contain four highly conserved cysteine residues. The function of this family is unknown.

    \ ' '7122' 'IPR009911' '\

    This family consists of several insect fibroin P25 proteins. Silk fibroin produced by the silkworm Bombyx mori consists of a heavy chain, a light chain, and a glycoprotein, P25. The heavy and light chains are linked by a disulphide bond, and P25 associates with disulphide-linked heavy and light chains by noncovalent interactions. P25 plays an important role in maintaining integrity of the complex PUBMED:10986287.

    \ ' '7123' 'IPR009912' '\

    This family consists of several hypothetical bacterial proteins of around 160 residues in length. Members of this family contain four highly conserved cysteine residues toward the C-terminal region of the protein. The function of this family is unknown.

    \ ' '7124' 'IPR009913' '\

    This family consists of several bacterial conjugative transfer TraP proteins from Escherichia coli and Salmonella typhimurium. TraP appears to play a minor role in conjugation and may interact with TraB, which varies in sequence along with TraP, in order to stabilise the proposed transmembrane complex formed by the tra operon products PUBMED:8655498.

    \ ' '7125' 'IPR009914' '\

    This family consists of several eukaryotic dolichol phosphate-mannose biosynthesis regulatory (DPM2) proteins. Biosynthesis of glycosylphosphatidylinositol and N-glycan precursor is dependent upon a mannosyl donor, dolichol phosphate-mannose (DPM). DPM2, an 84 amino acid membrane protein expressed in the endoplasmic reticulum (ER), makes a complex with DPM1 that is essential for the ER localisation and stable expression of DPM1. Moreover, DPM2 enhances binding of dolichol phosphate, a substrate of DPM synthase. Biosynthesis of DPM in mammalian cells is regulated by DPM2 PUBMED:9724629.

    \ ' '7126' 'IPR009915' '\

    This family consists of several plant and bacterial NnrU proteins. NnrU is thought to be involved in the reduction of nitric oxide. The exact function of NnrU is unclear. It is thought however that NnrU and perhaps NnrT are required for expression of both nirK and nor PUBMED:9171397.

    \ ' '7127' 'IPR010841' '\

    This family consists of several bacterial fibronectin-binding proteins which are thought to be involved in virulence in Listeria species PUBMED:10569795, PUBMED:11023185.

    \ ' '7129' 'IPR009916' '\

    This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. Members of this family seem to be found exclusively in the Order Bacillales.

    \ ' '7130' 'IPR010843' '\

    This family consists of several bacterial and archaeal AroM proteins. In Escherichia coli the aroM gene is cotranscribed with aroL PUBMED:3001025. The function of this family is unknown.

    \ ' '7131' 'IPR010844' '\

    This represents a conserved region approximately 100 residues long within eukaryotic occludin proteins and the RNA polymerase II elongation factor ELL. Occludin is an integral membrane protein that localises to tight junctions PUBMED:8276896, while ELL is an elongation factor that can increase the catalytic rate of RNA polymerase II transcription by suppressing transient pausing by polymerase at multiple sites along the DNA PUBMED:8596958. This shared domain is thought to mediate protein interactions PUBMED:8276896.

    \ ' '7132' 'IPR009917' '\

    This family consists of several hypothetical mammalian steroid receptor RNA activator proteins. SRA-RNAs likely to encode stable proteins are widely expressed in breast cancer cell lines. SRA-RNA is a steroid receptor co-activator which acts as a functional RNA and is classified as belonging to the growing family of functional non-coding RNAs.

    \ ' '7133' 'IPR009918' '\

    This family consists of several Enterobacterial sequences of around 200 residues in length, which are often known as YiiQ proteins. The function of this family is unknown.

    \ ' '7134' 'IPR009919' '\

    This family consists of several hypothetical putative outer membrane proteins which appear to be specific to Anaplasma marginale and Anaplasma ovis.

    \ ' '7135' 'IPR009920' '\

    This family contains subunit 1 of bacterial heptaprenyl diphosphate synthase (HEPPP synthase) (approximately 230 residues long). The enzyme consists of two subunits, both of which are required for catalysis of heptaprenyl diphosphate synthesis PUBMED:9748348.

    \ ' '7136' 'IPR009921' '\

    This domain occurs in several hypothetical bacterial proteins of around 150 residues in length. The function of this domain is unknown.

    \ ' '7137' 'IPR010845' '\

    This family consists of several bacterial FlaF flagellar proteins. FlaF and FlaG are trans-acting, regulatory factors that modulate flagellin synthesis during flagellum biogenesis PUBMED:1699845.

    \ ' '7138' 'IPR009922' '\

    This family contains a number of hypothetical bacterial proteins of unknown function approximately 200 residues long.

    \ ' '7139' 'IPR009923' '\

    This entry represents proteins with a Dodecin-like topology. Dodecin flavoprotein is a small dodecameric flavin-binding protein from Halobacterium salinarium (Halobacterium halobium) that contains two flavins stacked in a single binding pocket between two tryptophan residues to form an aromatic tetrade PUBMED:16460756. Dodecin binds riboflavin, although it appears to have a broad specificity for flavins. Lumichrome, a molecule associated with flavin metabolism, appears to be a ligand of dodecin, which could act as a waste-trapping device.

    \ ' '7140' 'IPR009924' '\

    This family consists of several hypothetical Caenorhabditis elegans proteins of around 85 residues in length. The function of this family is unknown.

    \ ' '7141' 'IPR010846' '\

    This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown.

    \ ' '7142' 'IPR010178' '\

    This entry represents a family of highly hydrophobic, uncharacterised predicted integral membrane proteins found almost entirely in low-GC Gram-positive bacteria, although a member is also found in Aquifex aeolicus.

    \ ' '7143' 'IPR009190' '\

    There are currently no experimental data for members of this group of bacterial proteins or their homologues. A crystal structure of UCP010603 revealed a thioredoxin-like fold, its core consisting of three layers alpha/beta/alpha.

    \ ' '7144' 'IPR009925' '\

    This entry represents a family of hypothetical proteins of around 140 residues in length found in Borrelia species. The function of this family is unknown.

    \ ' '7145' 'IPR009926' '\

    This family consists of several hypothetical YcgR proteins. YcgR may be involved in the flagellar motor function and may be a new member of the flagellar regulon PUBMED:11031114.

    \ ' '7146' 'IPR009927' '\

    This family consists of several hypothetical archaeal proteins of around 350 residues in length. The function of this family is unknown.

    \ ' '7147' 'IPR009928' '\

    This entry represents the N terminus (approximately 120 residues) of bacterial primosomal DnaI proteins, although one family member appears to be of viral origin. DnaI is one of the components of the Bacillus subtilis replication restart primosome, and is required for the DnaB75-dependent loading of the DnaC helicase PUBMED:11679082.

    \ ' '7149' 'IPR009929' '\

    This family contains the bacterial type III secretion protein YscO, which is approximately 150 residues long. YscO has been shown to be required for high-level expression and secretion of the anti-host proteins V antigen and Yops in Yersinia pestis PUBMED:9683485.

    \ ' '7150' 'IPR009930' '\

    This family consists of several Seadornavirus Vp10 proteins found in the Banna and Kadipiro viruses. Members of this family are typically around 240 residues in length. The function of this family is unknown.

    \ ' '7151' 'IPR010848' '\

    This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.

    \ ' '7152' 'IPR010849' '\

    This family contains DiGeorge syndrome critical region 6 (DGCR6) proteins (approximately 200 residues long) of a number of vertebrates. DGCR6 is a candidate for involvement in the DiGeorge syndrome pathology by playing a role in neural crest cell migration into the third and fourth pharyngeal pouches, the structures from which derive the organs affected in DiGeorge syndrome PUBMED:8733130. Also found in this family is the Drosophila melanogaster gonadal protein gdl.

    \ ' '7153' 'IPR009931' '\

    This family consists of several Curtovirus V2 proteins. The exact function of V2 is unclear but it is known that the protein is required for a successful host infection process PUBMED:9123819.

    \ ' '7154' 'IPR009932' '\

    This family consists of several hypothetical mammalian proteins of around 240 residues in length.

    \ ' '7155' 'IPR010850' '\

    This family consists of several locust specific neuroparsin proteins. Neuroparsins are produced by the A1 type of protocerebral median neurosecretory cells of the PI-CC system and display pleiotropic activities: inhibition of the effect of juvenile hormone, stimulation of fluid reabsorption of isolated recta, induction of an increase in hemolymph lipid and trehalose levels, and neurotrophic effects PUBMED:9114464.

    \ ' '7156' 'IPR009933' '\

    This family consists of several T-DNA border endonuclease VirD1 proteins, which appear to be found exclusively in Agrobacterium species. Agrobacterium, a plant pathogen, is capable of stably transforming the plant cell with a segment of its own DNA called T-DNA (transferred DNA). This process depends, among other factors, on the specialised bacterial virulence proteins VirD1 and VirD2, which excise the T-DNA from its adjacent sequences. VirD1 is thought to interact with VirD2 in this process PUBMED:9689041.

    \ ' '7158' 'IPR009935' '\

    This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown.

    \ ' '7159' 'IPR009936' '\

    This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown.

    \ ' '7160' 'IPR009937' '\

    This family consists of several hypothetical bacterial proteins of around 140 residues in length. Members of the family seem to be found exclusively in Actinomycetes. The function of this family is unknown.

    \ ' '7161' 'IPR010851' '\

    This family consists of a number of cysteine-rich SLR1-binding pollen coat-like proteins. Adhesion of pollen grains to the stigmatic surface is a critical step during sexual reproduction in plants. In Brassica, S locus-related glycoprotein 1 (SLR1), a stigma-specific protein belonging to the S gene family of proteins, has been shown to be involved in this step. SLR1-BP specifically binds SLR1 with high affinity. The SLR1-BP gene is specifically expressed in pollen at late stages of development and is a member of the class A pollen coat protein (PCP) family, which includes PCP-A1, an SLG (S locus glycoprotein)-binding protein PUBMED:10716697.

    \ ' '7162' 'IPR009938' '\

    This entry represents the N terminus of interferon-induced 35 kDa protein (IFP 35) (approximately 80 residues long), which contains a leucine zipper motif in an alpha helical configuration PUBMED:10950963. This group of proteins also includes N-myc-interactor (Nmi), a homologous interferon-induced protein.

    \ ' '7163' 'IPR009939' '\

    This family consists of several fungal chitosanase proteins. Chitin, xylan, 6-O-sulphated chitosan and O-carboxymethyl chitin are indigestible by chitosanase PUBMED:11115392.

    \ ' '7164' 'IPR010852' '\

    This family consists of several hypothetical bacterial proteins of around 180 residues in length. Members of this family are found in Streptomyces, Rhizobium, Ralstonia, Agrobacterium and Bradyrhizobium species. The function of this family is unknown.

    \ ' '7165' 'IPR010853' '\

    This repeat is found in CagY proteins, which are part of the CAG pathogenicity island and are involved in delivery of the protein CagA into host cells PUBMED:12823823. CagY forms part of a surface needle structure, and this repeat may form an alpha-helical rod structure PUBMED:12823823. The repeat contains conserved -DC- and -EC- motifs, which are regularly spaced in the alignment.

    \ ' '7166' 'IPR010854' '\

    This entry consists of several hypothetical Enterobacterial proteins of around 90 residues in length. Some of the proteins are annotated as ydgH precursors and contain two copies of this region, one at the N terminus and the other at the C terminus. The function of this family is unknown.

    \ ' '7167' 'IPR009940' '\

    This family consists of several Enterobacterial proteins of around 125 residues in length and contains 6 highly conserved cysteine residues. The function of this family is unknown.

    \ ' '7168' 'IPR010855' '\

    Expression from a human cytomegalovirus early promoter (E1.7) has been shown to be activated in trans by the IE2 gene product. Although the IE1 gene product alone had no effect on this early viral promoter, maximal early promoter activity was detected when both IE1 and IE2 gene products were present PUBMED:2157038. The IE1 protein from cytomegalovirus is also known as UL123.

    \ ' '7169' 'IPR009941' '\

    This entry represents a family of hypothetical proteins of around 150 residues in length found in Borrelia species. The function of this family is unknown.

    \ ' '7170' 'IPR009942' '\

    This family consists of several bacterial proteins of around 100 residues in length. Members of this family seem to be found exclusively in Staphylococcus aureus. The function of this family is unknown.

    \ ' '7171' 'IPR009943' '\

    This family consists of several hypothetical plant proteins of around 250 residues in length. Members of this family seem to be found exclusively in Arabidopsis thaliana. The function of this family is unknown.

    \ ' '7172' 'IPR009944' '\

    This family contains the eukaryotic surface glycoprotein amastin (approximately 180 residues long). In Trypanosoma cruzi, amastin is particularly abundant during the amastigote stage.

    \ ' '7173' 'IPR009945' '\

    This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family are found in Bradyrhizobium, Rhizobium, Brucella and Caulobacter species. The function of this family is unknown.

    \ ' '7174' 'IPR009946' '\

    This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 100 residues in length. The function of this family is unknown.

    \ ' '7175' 'IPR009947' '\

    This family contains the eukaryotic NADH:ubiquinone oxidoreductase subunit B14.5a (Complex I-B14.5a). This is approximately 100 residues long, and forms part of a multiprotein complex that resides on the inner mitochondrial membrane PUBMED:9878551.

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '7176' 'IPR009948' '\

    This family contains a number of bacterial Syd proteins approximately 180 residues long. It has been suggested that Syd is loosely associated with the cytoplasmic surface of the cytoplasmic membrane, and that interaction with SecY may be involved in this membrane association PUBMED:7890670.

    \ ' '7177' 'IPR009949' '\

    This family consists of several hypothetical Sapovirus proteins of around 165 residues in length. The function of this family is unknown.

    \ ' '7178' 'IPR010856' '\

    This family consists of several hypothetical Enterobacterial proteins, of around 420 residues in length. Members of this family are often known as YbiU. The function of this family is unknown.

    \ ' '7179' 'IPR009950' '\

    This family consists of several hypothetical Enterobacterial proteins of around 80 residues in length. The function of this family is unknown.

    \ ' '7180' 'IPR009951' '\

    Bacteriophage Mu is a double-stranded DNA phage. It has an icosahedral head, a contractile tail with baseplate and six tail fibres. It is similar to the well-studied T-even phages. The baseplate of bacteriophage Mu, which recognises and attaches to a host cell during infection, consists of at least eight different proteins PUBMED:16125724.

    \

    This family consists of bacterial and phage Gam proteins. The gam gene of Bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo PUBMED:2945162.

    \

    The eukaryotic Ku protein has key roles in DNA repair and in certain transposition events. It was shown through biochemical studies that Gam and the related protein of Haemophilus influenzae display DNA binding characteristics remarkably similar to those of human Ku PUBMED:12524520. In addition, Gam can interfere with Ty1 retrotransposition in Saccharomyces cerevisiae (Baker\'s yeast). These data reveal structural and functional parallels between bacteriophage Gam and eukaryotic Ku and suggest that their functions have been evolutionarily conserved PUBMED:12524520.

    \ ' '7181' 'IPR009952' '\

    This family contains uroplakin II, which is approximately 180 residues long and seems to be restricted to mammals. Uroplakin II is an integral membrane protein and is one of the components of the apical plaques of mammalian urothelium formed by the asymmetric unit membrane. This structure is believed to play a role in strengthening the urothelial apical surface to prevent the cells from rupturing during bladder distension PUBMED:8175808.

    \ ' '7182' 'IPR010857' '\

    This family contains a number of zona-pellucida-binding proteins that seem to be restricted to mammals. These are sperm proteins that bind to the 90 kDa family of zona pellucida glycoproteins in a calcium-dependent manner PUBMED:7729589. These represent some of the specific molecules that mediate the first steps of gamete interaction, allowing fertilisation to occur PUBMED:9378618.

    \ ' '7183' 'IPR010187' '\

    This entry represents selenoprotein B of glycine reductase, sarcosine reductase, betaine reductase, D-proline reductase, and perhaps others. All members are expected to contain an internal UGA codon, encoding selenocysteine, which may be misinterpreted as a stop codon.

    \ ' '7184' 'IPR010858' '\

    This family consists of several hypothetical bacterial proteins of around 230 residues in length. Members of this family are often referred to as YjaH and are found in the Orders Vibrionales and Enterobacteriales. The function of this family is unknown.

    \ ' '7185' 'IPR009953' '\

    This family consists of several bacterial dinitrogenase reductase ADP-ribosyltransferase (DRAT) proteins. Members of this family seem to be specific to Rhodospirillum, Rhodobacter and Azospirillum species. Dinitrogenase reductase ADP-ribosyl transferase (DRAT) carries out the transfer of the ADP-ribose from NAD to the Arg-101 residue of one subunit of the dinitrogenase reductase homodimer, resulting in inactivation of that enzyme. Dinitrogenase reductase-activating glycohydrolase (DRAG) removes the ADP-ribose group attached to dinitrogenase reductase, thus restoring nitrogenase activity. The DRAT-DRAG system negatively regulates nitrogenase activity in response to exogenous NH4+ or energy limitation in the form of a shift to darkness or to anaerobic conditions PUBMED:11160092.

    \ ' '7186' 'IPR009954' '\

    This family consists of several Enterobacterial proteins of around 60 residues in length. The function of this family is unknown.

    \ ' '7187' 'IPR009955' '\

    This family consists of several mammalian liver-expressed antimicrobial peptide 2 (LEAP-2) sequences. LEAP-2 is a cysteine-rich, cationic protein. LEAP-2 contains a core structure with two disulphide bonds formed by cysteine residues in relative 1-3 and 2-4 positions. LEAP-2 is synthesised as a 77-residue precursor, which is predominantly expressed in the liver and highly conserved among mammals. The largest native LEAP-2 form of 40 amino acid residues is generated from the precursor at a putative cleavage site for a furin-like endoprotease. In contrast to smaller LEAP-2 variants, this peptide exhibits dose-dependent antimicrobial activity against selected microbial model organisms PUBMED:12493837. The exact function of this family is unclear.

    \ ' '7189' 'IPR009155' '\

    Cytochrome b562 is a haem-containing protein that is expressed in the periplasm of Escherichia coli. In b-type cytochromes, the haem group is not covalently attached to the polypeptide. Cytochrome b562 has a four-helical bundle structure that is structurally similar to that found in members of the cytochrome c family. Cytochrome b562 has a reduction potential of 167 mV, which sets the energy yield possible in metabolism and is also a key determinant of the rate at which redox reactions proceed PUBMED:11914078.

    \ ' '7190' 'IPR009956' '\

    This family consists of several Enterobacterial post-segregation antitoxin CcdA proteins. The F plasmid-carried bacterial toxin, the CcdB protein, is known to act on DNA gyrase in two different ways. CcdB poisons the gyrase-DNA complex, blocking the passage of polymerases and leading to double-strand breakage of the DNA. Alternatively, in cells that overexpress CcdB, the A subunit of DNA gyrase (GyrA) has been found as an inactive complex with CcdB. Both poisoning and inactivation can be prevented and reversed in the presence of the F plasmid-encoded antidote, the CcdA protein PUBMED:10196173.

    \ ' '7191' 'IPR009957' '\

    This family consists of several hypothetical bacterial proteins of around 110 residues in length. Members of this family appear to be found exclusively in Ralstonia solanacearum. The function of this family is unknown.

    \ ' '7192' 'IPR015995' '\

    Proteins in this entry are involved in degradation of the cyanobacterial heptapeptide hepatotoxin microcystin LR, and are encoded in the mlr gene cluster PUBMED:11769251. MlrC from Sphingomonas wittichii (strain RW1 / DSM 6014 / JCM 10273) is believed to mediate the last step of peptidolytic degradation of the tetrapeptide. It is suspected to be a metallopeptidase based on homology to known peptidases and its inhibition by metal chelators. The proteins encoded by the mlr cluster may be involved in cell wall peptidoglycan cycling and subsequently act fortuitously in hydrolysis of microcystin LR.

    \ \

    This entry represents the N-terminal region of these proteins.

    \ ' '7193' 'IPR009958' '\

    This family consists of several alpha conotoxin precursor proteins from a number of Conus species. Cone snail toxins, conotoxins, are small peptides with disulphide connectivity that target ion channels or G-protein-coupled receptors. Based on the number and pattern of disulphide bonds and biological activities, conotoxins can be classified into several families PUBMED:11478951. Alpha-conotoxins are neurotoxins from the venom of fish-hunting cone snails that block nicotinic acetylcholine receptors (nAChRs) PUBMED:3196703. Omega, delta and kappa families of conotoxins have a knottin or inhibitor cystine knot scaffold. The knottin scaffold is a very special disulphide-through-disulphide knot, in which the III-VI disulphide bond crosses the macrocycle formed by two other disulphide bonds (I-IV and II-V) and the interconnecting backbone segments, where I-VI indicates the six cysteine residues starting from the N-terminus.

    \

    The disulphide bonding network as well as specific amino acids in inter-cysteine loops provide the specificity of conotoxin PUBMED:10988292. The cysteine arrangement is the same for omega, delta and kappa families, but omega conotoxins are calcium channel blockers, whereas delta conotoxins delay the inactivation of sodium channels and kappa conotoxins are potassium channel blockers PUBMED:11478951. Mu conotoxins have two types of cysteine arrangement, but the knottin scaffold is not observed. Mu conotoxins target the voltage-gated sodium channels PUBMED:11478951 and are useful probes for investigating voltage-dependent sodium channels of excitable tissues PUBMED:2410412. Alpha conotoxins have two types of cysteine arrangement PUBMED:1390774 and are competitive nicotinic acetylcholine receptor antagonists.

    \ ' '7194' 'IPR009959' '\

    This family consists of several hypothetical bacterial proteins of around 125 residues in length. The function of this family is unknown.

    \ ' '7195' 'IPR009960' '\

    This family consists of several fungal fruit body lectin proteins. Fruit body lectins are thought to have insecticidal activity PUBMED:12787928 and may also function in capturing nematodes PUBMED:12450118. One member of this family, the lectin XCL from Xerocomus chrysenteron, induces drastic changes in the actin cytoskeleton after sugar binding at the cell surface and internalization, and has potent insecticidal activity. The fold of lectin XCL is not related to any of several lectin folds, but shows significant structural similarity to cytolysins PUBMED:15561152.

    \ ' '7196' 'IPR009961' '\

    This family consists of several uncharacterised proteins from Drosophila melanogaster. The function of this family is unknown.

    \ ' '7197' 'IPR009962' '\

    This family consists of several hypothetical bacterial proteins of around 85 residues in length. The function of this family is unknown.

    \ ' '7198' 'IPR008320' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '7199' 'IPR009963' '\

    This family consists of several hypothetical bacterial proteins of around 90 residues in length. Members of the family seem to be found exclusively in Mycobacterium species. The function of this family is unknown.

    \ ' '7200' 'IPR009964' '\

    This family consists of several bacterial proteins of around 115 residues in length. Members of this family seem to be found exclusively in the alphaproteobacteria. The function of this family is unknown.

    \ ' '7201' 'IPR010860' '\

    This family consists of several bacterial CAMP factor (Cfa) proteins, which seem to be specific to Streptococcus species. The CAMP reaction is a synergistic lysis of erythrocytes by the interaction of an extracellular protein (CAMP factor) produced by some streptococcal species with the Staphylococcus aureus sphingomyelinase C (beta-toxin) PUBMED:10456923.

    \ ' '7202' 'IPR010861' '\

    This family consists of several hypothetical, highly conserved Streptococcal and related phage proteins of around 100 residues in length. The function of this family is unknown.

    \ ' '7203' 'IPR009965' '\

    This family consists of several Tenuivirus PV2 proteins. PV2 is thought to be a membrane associated protein PUBMED:8883361. The function of this family is unclear.

    \ ' '7204' 'IPR009966' '\

    This family consists of several plant specific prosystemin proteins. Prosystemin is the precursor protein of the 18 amino acid wound signal systemin which activates systemic defence in plant leaves against insect herbivores PUBMED:9484462.

    \ ' '7205' 'IPR010862' '\

    This family consists of several bacterial proteins of around 115 residues in length. Members of this family are largely found in Salmonella and Yersinia species and several have been described as being putative cytoplasmic proteins. The function of this family is unknown.

    \ ' '7206' 'IPR009967' '\

    This family consists of several FlbT proteins. FlbT is a post-transcriptional regulator of flagellin. FlbT is associated with the 5\' untranslated region (UTR) of fljK (25 kDa flagellin) mRNA, and this association requires a predicted loop structure in the transcript. Mutations within this loop abolish FlbT association and result in increased mRNA stability. It is therefore thought that FlbT promotes the degradation of flagellin mRNA by associating with the 5\' UTR PUBMED:11029689.

    \ ' '7207' 'IPR009968' '\

    This family consists of several bacterial proteins of around 175 residues in length. Members of this family seem to be found exclusively in Chlamydia species. The function of this family is unknown.

    \ ' '7208' 'IPR009969' '\

    This family consists of several Pneumovirus M2 proteins. The M2-1 protein of respiratory syncytial virus (RSV) is a transcription processivity factor that is essential for virus replication PUBMED:12692207.

    \ ' '7209' 'IPR010863' '\

    This family consists of several hypothetical archaeal proteins of around 110 residues in length. The function of this family is unknown, although one sequence is described as a putative HTH transcription regulator.

    \ ' '7210' 'IPR009970' '\

    This family contains the bacterial histone H1-like nucleoprotein HC2 (approximately 200 residues long), which seems to be found mostly in Chlamydia. HC2 functions in DNA condensation, although it has been suggested that it also has other roles PUBMED:8733229.

    \ ' '7211' 'IPR009971' '\

    This family consists of several bacterial proteins of around 90 residues in length. Members of this family seem to be found exclusively in the Orders Vibrionales and Enterobacteriales. The function of this family is unknown.

    \ ' '7212' 'IPR009972' '\

    This family consists of several phage and bacterial proteins of around 59 residues in length. Members of this family seem to be found exclusively in Lactococcus lactis and the bacteriophages that infect this organism. The function of this family is unknown.

    \ ' '7213' 'IPR010864' '\

    This family consists of several hypothetical bacterial proteins of around 225 residues in length. The function of this family is unknown.

    \ ' '7214' 'IPR010865' '\

    This family consists of several hypothetical bacterial and plant proteins of around 125 residues in length. The function of this family is unknown.

    \ ' '7215' 'IPR009973' '\

    This family consists of several Seadornavirus specific VP7 proteins of around 305 residues in length. The function of this family is unknown.

    \ ' '7216' 'IPR010866' '\

    This family contains the bacterial enzyme alpha-2,8-polysialyltransferase (approximately 500 residues long). This catalyses the polycondensation of alpha-2,8-linked sialic acid required for the synthesis of polysialic acid (PSA) PUBMED:12578835.

    \ ' '7217' 'IPR009974' '\

    This family consists of several Orthopoxvirus specific proteins of around 100 residues in length. The function of this family is unknown.

    \ ' '7218' 'IPR009975' '\

    This family consists of several P30 proteins which seem to be specific to Mycoplasma agalactiae. P30 is a 30 kDa immunodominant antigen and is known to be a transmembrane protein PUBMED:11473997.

    \ ' '7219' 'IPR010867' '\

    This is a nine residue repeat, which was called NPR after NonaPeptide Repeat. It is found in two malarial proteins and has the consensus EEhhEEhhP where h stands for a hydrophobic amino acid.

    \ ' '7220' 'IPR010868' '\

    This entry represents the N terminus (approximately 50 residues) of cyclin-dependent kinase inhibitor 2a p19Arf, which seems to be restricted to mammals. This is a tumour-suppressor protein that has been shown to inhibit the growth of human tumour cells lacking functional p53 by inducing a transient G2 arrest and subsequently apoptosis PUBMED:12660818.

    \ ' '7221' 'IPR009976' '\

    This family contains the Sec10 component (approximately 650 residues long) of the eukaryotic exocyst complex, which specifically affects the synthesis and delivery of secretory and basolateral plasma membrane proteins PUBMED:12665531.

    \ ' '7222' 'IPR010869' '\

    This family contains a number of hypothetical bacterial proteins of unknown function approximately 400 residues long.

    \ ' '7223' 'IPR009977' '\

    This family contains a number of bacterial mig-14 proteins (approximately 270 residues long). In Salmonella, mig-14 contributes to resistance to antimicrobial peptides, although the mechanism is not fully understood PUBMED:12029036.

    \ ' '7224' 'IPR010870' '\

    This family represents a conserved region approximately 400 residues long within the bacterial phosphate-selective porins O and P. These are anion-specific porins, the binding site of which has a higher affinity for phosphate than chloride ions. Porin O has a higher affinity for polyphosphates, while porin P has a higher affinity for orthophosphate PUBMED:1370289. In Pseudomonas aeruginosa, porin O was found to be expressed only under phosphate-starvation conditions during the stationary growth phase PUBMED:1406271.

    \ ' '7225' 'IPR010871' '\

    This family consists of a number of repeats of around 34 residues in length. Members of this family seem to be found exclusively in three hypothetical Murid herpesvirus 4 (MuHV-4) proteins. The function of this family is unknown.

    \ ' '7226' 'IPR010872' '\

    This family consists of several hypothetical bacterial proteins of around 250 residues in length. Members of this family seem to be found exclusively in Streptomyces coelicolor and Mycobacterium tuberculosis. The function of this family is unknown.

    \ ' '7227' 'IPR009978' '\

    This family consists of several hypothetical bacterial proteins of around 440 residues in length. The function of this family is unknown.

    \ ' '7228' 'IPR020412' '\

    Although IL-11 was initially believed to be restricted to mammals, subsequent studies demonstrated it to be expressed in fish PUBMED:15720388, PUBMED:16003467. Despite close similarity in gene structure and conservation of key amino acids between fish and mammalian IL-11, they share relatively low overall amino acid identity and may not necessarily be functionally analogous PUBMED:16003467.

    \

    Interleukins (IL) are a group of cytokines that play an important role in the immune system. They modulate inflammation and immunity by regulating growth, mobility and differentiation of lymphoid and other cells.

    Interleukin-11 (IL-11) is a pleiotropic cytokine that stimulates megakaryocytopoiesis, resulting in increased production of platelets, as well as activating osteoclasts, inhibiting epithelial cell proliferation and apoptosis, and inhibiting macrophage mediator production. These functions may be particularly important in mediating the hematopoietic, osseous and mucosal protective effects of IL-11 PUBMED:9416001. The cytokine also possesses anti-inflammatory activity, and has been proposed as a therapeutic agent in the treatment of chronic inflammatory diseases, such as Crohn\'s disease and rheumatoid arthritis PUBMED:15992047.

    \ ' '7229' 'IPR009979' '\

    This family consists of several Lentivirus viral infectivity factor (VIF) proteins. VIF is known to be essential for the ability of cell-free virus preparations to infect cells PUBMED:9440006. Members of this family are specific to Bovine immunodeficiency virus (BIV) and Jembrana disease virus (JDV), which also infects cattle.

    \ ' '7230' 'IPR009980' '\

    This family consists of several Human herpesvirus U26 proteins of around 300 residues in length. The function of this family is unknown.

    \ ' '7231' 'IPR009981' '\

    This family consists of several uncharacterised Caenorhabditis elegans proteins of around 115 residues in length. Members of this family contain 6 highly conserved cysteine residues. The function of this family is unknown.

    \ ' '7232' 'IPR010874' '\

    This family consists of several telomere-binding protein beta subunits, which appear to be specific to the family Oxytrichidae. Telomeres are specialised protein-DNA complexes that compose the ends of eukaryotic chromosomes. Telomeres protect chromosome termini from degradation and recombination and act together with telomerase to ensure complete genome replication. TEBP beta forms a complex with TEBP alpha and this complex is able to recognise and bind ssDNA to form a sequence-specific, telomeric nucleoprotein complex that caps the very 3\' ends of chromosomes PUBMED:9875850.

    \ ' '7233' 'IPR010875' '\

    This entry represents a family of proteins of around 130 residues in length found primarily in Borrelia species. The function of this family is unknown.

    \ ' '7234' 'IPR010876' '\

    This family consists of several eukaryotic NICE-3 and related proteins. The gene coding for NICE-3 is part of the epidermal differentiation complex (EDC), which comprises a large number of genes that are of crucial importance for the maturation of the human epidermis PUBMED:11230159. The function of NICE-3 is unknown.

    \ ' '7235' 'IPR009982' '\

    This family consists of several VP6 proteins from the Banna virus as well as a related protein VP5 from the Kadipiro virus. Members of this family are typically of around 420 residues in length. The function of this family is unknown.

    \ ' '7236' 'IPR009983' '\

    This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown.

    \ ' '7237' 'IPR010877' '\

    This family contains GP46 phage proteins (approximately 120 residues long).

    \ ' '7238' 'IPR010878' '\

    This family consists of several Streptococcus thermophilus bacteriophage Gp111 proteins of around 110 residues in length. The function of this family is unknown.

    \ ' '7239' 'IPR010879' '\

    This family represents a series of bacterial domains of unknown function of around 50 residues in length. Members of this family are often found as tandem repeats and in some cases represent the whole protein. All member proteins are described as being hypothetical.

    \ ' '7240' 'IPR009984' '\

    This family contains the eukaryotic protein geminin (approximately 200 residues long). Geminin inhibits DNA replication by preventing the incorporation of the MCM complex into the prereplication complex, and is degraded during the mitotic phase of the cell cycle. It has been proposed that geminin inhibits DNA replication during S, G2, and M phases and that geminin destruction at the metaphase-anaphase transition permits replication in the succeeding cell cycle PUBMED:9635433.

    \ ' '7241' 'IPR010880' '\

    This family consists of several Betaherpesvirus immediate-early glycoprotein UL37 sequences. The human cytomegalovirus (HCMV) UL37 immediate-early regulatory protein is a type I integral membrane N-glycoprotein which traffics through the ER and the Golgi network PUBMED:8794367.

    \ ' '7243' 'IPR010881' '\

    This family consists of several Gammaherpesvirus latent membrane protein (LMP2) proteins. Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) is a human gammaherpesvirus that infects and establishes latency in B lymphocytes in vivo. The latent membrane protein 2 (LMP2) gene is expressed in latently infected B cells and encodes two protein isoforms, LMP2A and LMP2B, that are identical except for an additional N-terminal 119 aa cytoplasmic domain which is present in the LMP2A isoform. LMP2A is thought to play a key role in either the establishment or the maintenance of latency and/or the reactivation of productive infection from the latent state. The significance of LMP2B and its role in pathogenesis remain unclear PUBMED:11961256.

    \ ' '7244' 'IPR009985' '\

    This family consists of several Crinivirus P26 proteins which seem to be found exclusively in the Lettuce infectious yellows virus. The function of this family is unknown.

    \ ' '7245' 'IPR009986' '\

    This family contains the bacterial transcriptional regulator Crl (approximately 130 residues long). This is a transcriptional regulator of the csgA curlin subunit gene for curli fibres that are found on the surface of certain bacteria PUBMED:1357528.

    \ ' '7246' 'IPR010882' '\

    This family consists of several acidic phosphoprotein precursor PCEMA1 sequences which appear to be found exclusively in Plasmodium chabaudi. PCEMA1 is an antigen that is associated with the membrane of the infected erythrocyte throughout the entire intraerythrocytic cycle PUBMED:1475002. The exact function of this family is unclear.

    \ ' '7247' 'IPR009987' '\

    This family contains the bacterial protein PilM (approximately 150 residues long). PilM is an inner membrane protein that has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body PUBMED:11751821.

    \ ' '7248' 'IPR010883' '\

    This family consists of several uncharacterised viral proteins from the Marek\'s disease-like viruses (Meleagrid herpesvirus 1 (MeHV-1)). Members of this family are typically around 400 residues in length. The function of this family is unknown.

    \ ' '7249' 'IPR008055' '\

    Neurotensin is a 13-residue peptide transmitter, sharing significant\ similarity in its 6 C-terminal amino acid residues with several other\ neuropeptides, including neuromedin N (which is derived from the same\ precursor). This C-terminal region is responsible for the full biological\ activity, the N-terminal portion having a modulatory role. \

    \

    Neurotensin is distributed throughout the central nervous system, with\ highest levels in the hypothalamus, amygdala and nucleus accumbens. It\ induces a variety of effects, including: analgesia, hypothermia and \ increased locomotor activity. It is also involved in regulation of dopamine\ pathways. In the periphery, neurotensin is found in endocrine cells of the\ small intestine, where it leads to secretion and smooth muscle contraction\ PUBMED:11811984. The neurotensin/neuromedin N precursor can also be processed to\ produce large 125-138 amino acid peptides with the neurotensin or neuromedin\ N sequence at their C-terminus. These large peptides appear to be less\ potent than their smaller counterparts, but are also less sensitive to \ degradation and may represent endogenous, long-lasting activators in a\ number of pathophysiological situations.

    \ ' '7250' 'IPR010884' '\

    This family contains sexual stage s48/45 antigens from Plasmodium (approximately 450 residues long). These are surface proteins expressed by Plasmodium male and female gametes that have been shown to play a conserved and important role in fertilisation PUBMED:11163248.

    \ ' '7251' 'IPR009988' '\

    This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown.

    \ ' '7252' 'IPR009989' '\

    This family contains the bacterial protein TrbM (approximately 180 residues long). In Comamonas testosteroni T-2, TrbM is derived from the IncP1beta plasmid pTSA, which encodes the widespread genes for p-toluenesulphonate (TSA) degradation PUBMED:11282598.

    \ ' '7253' 'IPR009990' '\

    This family consists of several Pardaxin proteins. Pardaxin, a 33-amino-acid pore-forming polypeptide toxin isolated from the Red Sea Moses sole Pardachirus marmoratus, has a helix-hinge-helix structure. This is a common structural motif found both in antibacterial peptides that can act selectively on bacterial membranes (e.g., cecropin), and in cytotoxic peptides that can lyse both mammalian and bacterial cells (e.g., melittin). Pardaxin possesses a high antibacterial activity with a significantly reduced haemolytic activity towards human red blood cells compared with melittin PUBMED:8620888. Pardaxin has also been found to have a shark repellent action PUBMED:3996550.

    \ ' '7254' 'IPR009991' '\

    This family contains p22, the smallest subunit of dynactin, a complex that binds to cytoplasmic dynein and is a required activator for cytoplasmic dynein-mediated vesicular transport. Dynactin localises to the cleavage furrow and to the midbodies of dividing cells, suggesting that it may function in cytokinesis PUBMED:9722614. Family members are approximately 170 residues long and seem to be restricted to mammals.

    \ ' '7256' 'IPR009992' '\

    This family represents a conserved region approximately 400 residues long within 15-O-acetyltransferase (Tri3), which seems to be restricted to ascomycete fungi. In Fusarium sporotrichioides, this is required for acetylation of the C-15 hydroxyl group of trichothecenes in the biosynthesis of T-2 toxin PUBMED:8593041.

    \ ' '7257' 'IPR009993' '\

    This family contains the bacterial enzyme 4-alpha-L-fucosyltransferase (Fuc4NAc transferase) (approximately 360 residues long). This catalyses the synthesis of Fuc4NAc-ManNAcA-GlcNAc-PP-Und (lipid III) as part of the biosynthetic pathway of enterobacterial common antigen (ECA), a polysaccharide comprised of the trisaccharide repeat unit Fuc4NAc-ManNAcA-GlcNAc PUBMED:11673418.

    \ ' '7258' 'IPR009994' '\

    This domain represents a conserved region approximately 200 residues long, four copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem PUBMED:9263452.

    \ ' '7259' 'IPR009995' '\

    This family consists of several archaeal proteins of around 370 residues in length. The function of this family is unknown.

    \ ' '7260' 'IPR010886' '\

    This family consists of several bacterial histone H1-like Hc1 proteins, which appear to be specific to Chlamydia species. Chlamydiae are prokaryotic obligate intracellular parasites that undergo a biphasic life cycle involving an infectious, extracellular form known as elementary bodies and an intracellular, replicating form termed reticulate bodies. The gene coding for Hc1 is expressed only during the late stages of the chlamydial life cycle concomitant with the reorganisation of chlamydial reticulate bodies into elementary bodies, suggesting that the Hc1 protein plays a role in the condensation of chlamydial chromatin during intracellular differentiation PUBMED:2023942.

    \ ' '7261' 'IPR008311' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '7262' 'IPR010888' '\

    This family consists of several minor pilin proteins including CblD from Burkholderia cepacia, which is known to be the initiator of pilus biogenesis PUBMED:12686638. The family also contains a variety of Enterobacterial minor pilin proteins.

    \ ' '7263' 'IPR009996' '\

    This family contains the bacterial protein YycH (approximately 450 residues long). The function of this protein is not known PUBMED:9829949.

    \ ' '7264' 'IPR009997' '\

    This family consists of several Curtovirus V3 proteins of around 90 residues in length. The function of this family is unknown.

    \ ' '7265' 'IPR009998' '\

    This family contains the precursor of the bacterial protein YfaZ (approximately 180 residues long). Many members of this family are hypothetical proteins.

    \ ' '7266' 'IPR009999' '\

    This family consists of several Staphylococcus aureus and related bacteriophage proteins of around 65 residues in length. The function of this family is unknown.

    \ ' '7267' 'IPR010889' '\

    This family consists of several hypothetical bacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Rhizobium species. The function of this family is unknown.

    \ ' '7268' 'IPR010000' '\

    This family consists of several caerin 1 proteins from Litoria species, Australian tree frogs. The caerin 1 peptides are among the most powerful of the broad-spectrum antibiotic amphibian peptides PUBMED:12717721. These peptides are excreted from amphibian skin, and can interact with and disrupt bacterial membranes, leading to the permeabilisation of the cell membrane. Caerin 1.1 forms a helix-bend-helix structure, where both helices are required for activity, as well as the bend region for flexibility.

    \ ' '7269' 'IPR010001' '\

    This family contains the sigmaK-factor processing regulatory protein BofA (Bypass-of-forespore protein A) (approximately 80 residues long). During sporulation in Bacillus subtilis, transcription is controlled in the developing sporangium by a cascade of sporulation-specific transcription factors (sigma factors). Following engulfment, processing of sigmaK is inhibited by BofA. It has been suggested that this effect is exerted by alteration of the level of the SpoIVFA protein PUBMED:10464210.

    \ ' '7270' 'IPR010002' '\

    This family contains a number of ponericin peptides (approximately 30 residues long) from the venom of the predatory ant Pachycondyla goeldii (Ponerine ant). These peptides exhibit antibacterial and insecticidal properties, and may adopt an amphipathic alpha-helical structure in polar environments such as cell membranes PUBMED:11279030.

    \ ' '7272' 'IPR010004' '\

    This entry represents the N terminus (approximately 80 residues) of Ycf66, a protein that seems to be restricted to eukaryotes that contain chloroplasts and to cyanobacteria.

    \ ' '7273' 'IPR010890' '\

    This family contains the bacterial primosomal replication proteins priB and priC (approximately 180 residues long). In Escherichia coli, these function in the assembly of the primosome PUBMED:10613856.

    \ ' '7274' 'IPR010891' '\

    This family contains the bacterial protein GumN (approximately 330 residues long). Note that many members of this family are hypothetical proteins.

    \ ' '7275' 'IPR008986' '\

    Ebola virus sp. are non-segmented, negative-strand RNA viruses that cause severe haemorrhagic fever in humans with high rates of mortality. The virus matrix protein VP40 is a major structural protein that plays a central role in virus assembly and budding at the plasma membrane of infected cells. VP40 proteins associate with cellular membranes, interact with the cytoplasmic tails of glycoproteins, and bind to the ribonucleoprotein complex. The VP40 monomer consists of two domains, the N-terminal oligomerization domain and the C-terminal membrane-binding domain, connected by a flexible linker. Both the N- and C-terminal domains fold into beta sandwich structures of similar topology PUBMED:10944105. Within the N-terminal domain are two overlapping L-domains with the sequences PTAP and PPEY at residues 7 to 13, which are required for efficient budding PUBMED:12559917. L-domains are thought to mediate their function in budding through their interaction with specific host cellular proteins, such as tsg101 and vps-4 PUBMED:12525615.

    \ ' '7276' 'IPR010892' '\

    This family represents a conserved region approximately 140 residues long within secreted phosphoprotein 24 (Spp-24), which seems to be restricted to vertebrates PUBMED:15062857. This is a non-collagenous protein found in bone that is related in sequence to the cystatin family of thiol protease inhibitors. This suggests that Spp-24 could function to modulate the thiol protease activities known to be involved in bone turnover. It is also possible that the intact form of Spp-24 found in bone could be a precursor to a biologically active peptide that coordinates an aspect of bone turnover PUBMED:7814406.

    \ ' '7277' 'IPR010893' '\

    This family contains bacterial hydrogenase-1 expression proteins approximately 120 residues long. This includes the Escherichia coli protein HyaE, and the homologous proteins HoxO of Ralstonia eutropha (Alcaligenes eutrophus) and HupG of Rhizobium leguminosarum. Deletion of the hoxO gene in R. eutropha led to complete loss of the uptake [NiFe] hydrogenase activity, suggesting that it has a critical role in hydrogenase assembly PUBMED:12914940.

    \ ' '7278' 'IPR010005' '\

    This family contains the bacterial formate hydrogenlyase maturation protein HycH, which is approximately 140 residues long. This may be required for the conversion of a precursor form of the large subunit of hydrogenlyase 3 into a mature form PUBMED:1625581.

    \ ' '7279' 'IPR010894' '\

    This family contains the bacterial stage V sporulation protein AD (SpoVAD), which is approximately 340 residues long. This is one of six proteins encoded by the spoVA operon, which is transcribed exclusively in the forespore at about the time of dipicolinic acid (DPA) synthesis in the mother cell. The functions of the proteins encoded by the spoVA operon are unknown, but it has been suggested they are involved in DPA transport during sporulation PUBMED:11751839.

    \ ' '7280' 'IPR010895' '\

    CHRD (after SWISS-PROT abbreviation for chordin) is a novel domain identified in chordin, an inhibitor of bone morphogenetic proteins. This family includes bacterial homologues. It is anticipated to have an immunoglobulin-like beta-barrel structure based on limited similarity to superoxide dismutases but, as yet, no clear functional prediction can be made PUBMED:13678956.

    \ ' '7281' 'IPR010896' '\

    This helix-turn-helix-containing DNA-binding domain is found in\ homing nucleases PUBMED:13678957.

    \ \ ' '7282' 'IPR010897' '\

    This family contains the bacterial stage II sporulation protein P (SpoIIP) (approximately 350 residues long). It has been shown that a block in polar cytokinesis in Bacillus subtilis is mediated partly by transcription of spoIID, spoIIM and spoIIP. This inhibition of polar division is involved in the locking in of asymmetry after the formation of a polar septum during sporulation PUBMED:11886548.

    \

    SpoIIP is one of the three genes (spoIID, spoIIM and spoIIP, PUBMED:8501064, PUBMED:7836306, PUBMED:3011962), under the control of sigma E, that have been shown to be essential for the engulfment of the forespore by the mother cell. Their products are involved in degradation of the septal peptidoglycan and mutations in spoIID, spoIIM or spoIIP block sporulation at morphological stage II, prior to the stage of engulfment. These three genes are absolutely conserved (sometimes even duplicated) in all endospore formers PUBMED:12662922.

    \ ' '7283' 'IPR010006' '\

    This family contains a number of phage polarity suppression proteins (Psu) (approximately 190 residues long). The Psu protein of Bacteriophage P4\ causes suppression of transcriptional polarity in Escherichia coli by overcoming Rho termination factor activity PUBMED:9007066.

    \ ' '7284' 'IPR010898' '\

    This family contains component I of bacterial heptaprenyl diphosphate synthase () (approximately 170 residues long). This is one of the two dissociable subunits that form the enzyme, both of which are required for the catalysis of the biosynthesis of the side chain of menaquinone-7 PUBMED:9748348.

    \ ' '7285' 'IPR010899' '\

    This family contains a number of hypothetical bacterial proteins of unknown function approximately 120 residues long.

    \ ' '7286' 'IPR010007' '\

    This entry represents SPANX (Sperm Protein Associated with the Nucleus on the X chromosome) family proteins, including N1, N2, N3 and N5. These human sperm proteins are associated with the nucleus and mapped to the X chromosome (SPAN-X) (approximately 100 residues long). SPAN-X proteins are cancer-testis antigens (CTAs), and thus represent potential targets for cancer immunotherapy because they are widely distributed in tumours but not in normal tissues, except testes. They are highly insoluble, acidic, and polymorphic PUBMED:11133693.

    \ ' '7287' 'IPR010008' '\

    This family contains a number of RstB proteins approximately 120 residues long, including RstB1 and RstB2, from the Vibrio cholerae phage CTX. Functional analyses indicate that rstB2 is required for integration of the CTXphi phage into the V. cholerae chromosome PUBMED:9220000.

    \ ' '7288' 'IPR003611' '\ This is a short helical motif of unknown function found in intron-associated nuclease 2, which is involved in intron homing.\ ' '7289' 'IPR010900' '\

    This family consists of several bacterial nicotinamide adenine dinucleotide glycohydrolase (NGA) proteins which appear to be specific to Streptococcus pyogenes. NAD glycohydrolase (NADase) is a potential virulence factor. Streptococcal NADase may contribute to virulence by its ability to cleave beta-NAD at the ribose-nicotinamide bond, depleting intracellular NAD pools and producing the potent vasoactive compound nicotinamide PUBMED:10979908.

    \ ' '7290' 'IPR010901' '\

    This entry represents the C-terminal region of merozoite surface protein 1 (MSP1), which is found in a number of Plasmodium species. MSP-1 is a 200 kDa protein expressed on the surface of the Plasmodium vivax merozoite. MSP-1 of Plasmodium species is synthesised as a high-molecular-weight precursor and then processed into several fragments. At the time of red cell invasion by the merozoite, only the 19 kDa C-terminal fragment (MSP-1(19)), which contains two epidermal growth factor-like domains, remains on the surface. Antibodies against MSP-1(19) inhibit merozoite entry into red cells, and immunisation with MSP-1(19) protects monkeys from challenging infections. Hence, MSP-1(19) is considered a promising vaccine candidate PUBMED:12466500.

    \ ' '7291' 'IPR010902' '\

    NUMOD4 is a putative DNA-binding motif found in homing endonucleases and related proteins PUBMED:13678957.

    \ ' '7292' 'IPR010009' '\

    This family consists of several insect apolipoprotein-III sequences. Exchangeable apolipoproteins constitute a functionally important family of proteins that play critical roles in lipid transport and lipoprotein metabolism. Apolipophorin III (apoLp-III) is a prototypical exchangeable apolipoprotein found in many insect species that functions in transport of diacylglycerol (DAG) from the fat body lipid storage depot to flight muscles in the adult life stage PUBMED:11818551.

    \ ' '7293' 'IPR010010' '\

    Members of this protein family are PsaM, which is subunit XII of the photosystem I reaction centre. PsaM forms part of the photosystem I complex and its binding is stabilised by PsaI PUBMED:8787020. This protein is found in both the Cyanobacteria and the chloroplasts of plants, but is absent from non-oxygenic photosynthetic bacteria such as Rhodobacter sphaeroides. Species that contain photosystem I also contain photosystem II, which splits water and releases molecular oxygen.

    \ ' '7294' 'IPR010903' '\

    This family consists of several hypothetical glycine-rich plant and bacterial proteins of around 300 residues in length. The function of this family is unknown.

    \ ' '7295' 'IPR009099' '\

    The beta-lactamase-inhibitor protein (BLIP) is produced by Streptomyces species. BLIP acts as a potent inhibitor of beta-lactamases such as TEM-1, which is the most widespread resistance enzyme to penicillin antibiotics. BLIP binds competitively to TEM-1 and makes direct contacts with TEM-1 active site residues. BLIP is able to inhibit a variety of class A beta-lactamases, possibly through flexibility of its two domains. The two tandemly repeated domains of BLIP have an alpha(2)-beta(4) structure, the beta-hairpin loop from domain 1 inserting into the active site of beta-lactamase PUBMED:8605632. BLIP shows no sequence similarity with BLIP-II, even though both bind to and inhibit TEM-1 PUBMED:11573088.

    \ ' '7296' 'IPR008998' '\

    Agglutinins are sugar-specific lectins that can agglutinate erythrocytes and other cell types. Lectins occur widely in plants, as well as in some microorganisms and animals PUBMED:15229195. Agglutinin from Amaranthus caudatus (amaranthin) is a lectin from the ancient South American crop, amaranth grain. Although its biological function is unknown, it can agglutinate A, B and O red blood cells, and has a carbohydrate-binding site that is specific for the methyl-glycoside of the T-antigen found linked to serine or threonine residues of cell surface glycoproteins PUBMED:2271665. The protein is a homodimer, with each monomer consisting of two beta-trefoil domains PUBMED:2777780. Lectin B chains from ricin and related toxins also contain beta-trefoil domains; however, they are not related to agglutinin, showing little sequence similarity PUBMED:9334739.

    \ ' '7297' 'IPR010011' '\

    This domain, which is usually found tandemly repeated, is found in various receptor co-activating proteins.

    \ ' '7298' 'IPR010905' '\

    Unsaturated glucuronyl hydrolase catalyses the hydrolytic release of unsaturated glucuronic acids from oligosaccharides produced by the reactions of polysaccharide lyases PUBMED:12777820.

    \ ' '7299' 'IPR010906' '\

    Terminase, the DNA packaging enzyme of bacteriophage lambda, is a heteromultimer composed of subunits Nu1 and A. The smaller Nu1 terminase subunit has a low-affinity ATPase stimulated by non-specific DNA PUBMED:10600592.

    \ ' '7300' 'IPR010907' '\

    This entry represents calcium-mediated lectins. Structures have been determined for both fucose-binding lectin II (PA-IIL) PUBMED:12909014 and mannose-specific lectin II (RS-IIL) PUBMED:15101976. These proteins have homologous structures, their monomers consisting of a 9-stranded beta sandwich with Greek-key topology. Each monomer contains two calcium ions that mediate an exceptionally high binding affinity to the monosaccharide ligand in a recognition mode unique among carbohydrate-protein interactions. In Pseudomonas aeruginosa, PA-IIL contributes to the pathogenic virulence of the bacterium, functioning as a tetramer when binding fucose PUBMED:12415289. In the plant pathogen Ralstonia solanacearum (Pseudomonas solanacearum), RS-IIL recognises fucose, but displays much higher affinity to mannose and fructose, which is opposite to the preference of PA-IIL.

    \ ' '7301' 'IPR010012' '\

    This family consists of several spasmodic peptide gm9a sequences. Conotoxin gm9a is a putative 27-residue polypeptide encoded by Conus gloriamaris and is known to be a homologue of the \'spasmodic peptide\', tx9a, isolated from the venom of the mollusk-hunting cone shell Conus textile PUBMED:12193600. Upon injection of this venom component, normal mice are converted into behavioural phenocopies of a well-known mutant, the spasmodic mouse PUBMED:10677206.

    \ ' '7302' 'IPR006605' '\

    Basement membranes are sheet-like extracellular matrices found at the basal\ surfaces of epithelia and condensed mesenchyma. By preventing cell mixing and\ providing a cell-adhesive substrate, they play crucial roles in tissue\ development and function. Basement membranes are composed of an evolutionarily\ ancient set of large glycoproteins, which includes members of the laminin\ family, collagen IV, perlecan and nidogen/entactin. Nidogen/entactin is an\ important basement membrane component, which promotes cell attachment,\ neutrophil chemotaxis, trophoblast outgrowth, and angiogenesis. It consists of\ three globular regions, G1-G3. G1 and G2 are connected by a thread-like\ structure, whereas that between G2 and G3 is rod-like PUBMED:9633511, PUBMED:11427896.

    \ \

    The nidogen G2 region binds to collagen IV and perlecan. The nidogen G2\ structure is composed of two domains, an N-terminal EGF-like domain and a much larger beta-barrel domain of ~230 residues. The nidogen G2 beta-barrel consists of an 11-stranded beta-barrel\ of complex topology, the interior of which is traversed by the hydrophobic,\ predominantly alpha helical segment connecting strands C and D. The N-terminal\ half of the barrel comprises two beta-meanders (strands A-C and D-F) linked by\ the buried alpha-helical segment. The polypeptide chain then crosses the\ bottom of the barrel and forms a five-stranded Greek key motif in the C-\ terminal half of the domain. Helix alpha3 caps the top of the barrel and forms\ the interface to the EGF-like domain. The nidogen G2 beta-barrel domain has\ unexpected structural similarity to green fluorescent protein, suggesting that\ they derive from a common ancestor. A large surface patch on the barrel\ surface is strikingly conserved in all metazoan nidogens. Site-directed\ mutagenesis demonstrates that the conserved residues in the conserved patch\ are involved in the binding of perlecan, and possibly also of collagen IV PUBMED:11427896.

    \ ' '7303' 'IPR011104' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    This entry represents the C-terminal kinase domain of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phosphorelay system in control of carbon catabolic repression in bacteria PUBMED:9570401. This kinase is unusual in that it recognises the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes PUBMED:9570401. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller PUBMED:11904409.

    \ ' '7304' 'IPR006395' '\

    These sequences describe methylaspartate ammonia-lyase, also called beta-methylaspartase. It follows methylaspartate mutase (composed of S and E subunits) in one of several possible pathways of glutamate fermentation.

    \ ' '7305' 'IPR011099' '\

    Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the C-terminal region of alpha-glucuronidase, which is mainly alpha-helical. It wraps around the catalytic domain (), making additional interactions both with the N-terminal domain () of its parent monomer and also forming the majority of the dimer-surface with the equivalent C-terminal domain of the other monomer of the dimer PUBMED:11937059.

    \ ' '7306' 'IPR011095' '\

    This entry represents the C-terminal, catalytic domain of the D-alanine--D-alanine ligase enzyme. D-Alanine is one of the central molecules of the cross-linking step of peptidoglycan assembly. There are three enzymes involved in the D-alanine branch of peptidoglycan biosynthesis: the pyridoxal phosphate-dependent D-alanine racemase (Alr), the ATP-dependent D-alanine:D-alanine ligase (Ddl), and the ATP-dependent D-alanine:D-alanine-adding enzyme (MurF) PUBMED:12499203.

    \ ' '7307' 'IPR006109' '\

    NAD-dependent glycerol-3-phosphate dehydrogenase () (GPD) catalyzes the reversible reduction of dihydroxyacetone phosphate to glycerol-3-phosphate. It is a cytoplasmic protein, active as a homodimer PUBMED:2500660, each monomer containing an N-terminal NAD binding site PUBMED:6773774. In insects, it acts in conjunction with a mitochondrial alpha-glycerophosphate oxidase in the alpha-glycerophosphate cycle, which is essential for the production of energy used in insect flight PUBMED:2500660.

    \ ' '7308' 'IPR011085' '\

    The proteins in this entry are of unknown function. Members are restricted to Proteobacteria.

    \ ' '7309' 'IPR011086' '\

    This domain of unknown function is found in a limited set of Bradyrhizobium proteins. There appears to be a periodic -DG- motif in the domain.

    \ ' '7310' 'IPR011087' '\

    The function of these proteins is unknown. All are from Bradyrhizobium japonicum.

    \ ' '7311' 'IPR011121' '\

    This is a C-terminal tryptophan-rich domain, normally present in 2 to 3 copies, found in membrane proteins of Synechocystis and Bradyrhizobium.

    \ ' '7312' 'IPR011083' '\

    This region is occasionally found in conjunction with . Most of the proteins appear to be phage tail proteins; however, some appear to be involved in other processes. For instance, the RhiB protein () from Rhizobium leguminosarum may be involved in plant-microbe interactions PUBMED:1597418. A related protein, microcystin related protein (MrpB, ) is involved in the pathogenicity of Microcystis aeruginosa. The finding of this family in a structural component of the phage tail fibre baseplate () suggests that its function is structural rather than enzymatic. Structural studies show this region consists of a helix and a loop PUBMED:12888344 and three beta-strands. This alignment does not catch the third strand as it is separated from the rest of the structure by around 100 residues. This strand is conserved in homologues but the intervening sequence is not. Much of the function of appears to reside in this intervening region. In the tertiary structure of the phage baseplate this domain forms part of the collar and may bind SO4. The long unconserved region may be due to domain swapping in and out of a loop or due to rapid evolution.

    \ ' '7313' 'IPR011094' '\

    This family is the lppY/lpqO homologue family. They are related to \'probable conserved lipoproteins\' LppY and LpqO from Mycobacterium bovis. \

    \ ' '7314' 'IPR011105' '\

    These enzymes have been implicated in cell wall hydrolysis, most extensively in Bacillus subtilis. For instance is expressed during sporulation in an inactive form and deposited on the cell outer cortex. During germination the enzyme is activated and hydrolyses the cortex PUBMED:10658652. A similar role is carried out by the partially redundant PUBMED:9515903.

    \ \

    The sleB gene () encodes a germination-specific N-acetylmuramyl-L-alanine amidase in B. subtilis and Bacillus cereus PUBMED:10197998. It is synthesized with a putative signal sequence and hydrolyses the spore cortex in situ, during germination. In dormant spores it exists in a mature but inactive state.

    \ ' '7315' 'IPR016019' '\

    The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell PUBMED:9618447 and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. \ secrete an effector protein called SopE that is responsible for stimulating \ the reorganisation of the host cell actin cytoskeleton, and ruffling of the \ cellular membrane PUBMED:9482928. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium \ to revert the cell back to its "normal" state as quickly as possible, \ another tyrosine phosphatase effector called SptP reverses the actions \ brought about by SopE PUBMED:11316807.

    \ \

    Recently, it has been found that SopE and its protein homologue SopE2 can\ activate different sets of Rho-GTPases in the host cell PUBMED:11316807. Far from being a redundant set of two similar type III effectors, they both act in unison \ to specifically activate different Rho-GTPase signalling cascades in the\ host cell during infection.\

    \ \

    This entry represents the guanine nucleotide exchange factor domain of SopE. This domain has an alpha-helical structure consisting of two three-helix bundles arranged in a lambda shape PUBMED:12093730, PUBMED:15379540.

    \ ' '7316' 'IPR011100' '\

    Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the central catalytic domain of alpha-glucuronidase PUBMED:11937059.

    \ ' '7317' 'IPR003536' '\ Secretion of virulence factors in Gram-negative bacteria involves \ transportation of the protein across two membranes to reach the cell \ exterior. There have been four secretion systems described in \ animal enteropathogens, such as Salmonella and Yersinia, with further \ sequence similarities in plant pathogens like Ralstonia and Erwinia PUBMED:9618447.\ \

    The type III secretion system is of great interest, as it is used to \ transport virulence factors from the pathogen directly into the host cell \ and is only triggered when the bacterium comes into close contact with\ the host. The protein subunits of the system are very similar to those of \ bacterial flagellar biosynthesis. However, while the latter forms a\ ring structure to allow secretion of flagellin and is an integral part of\ the flagellum itself PUBMED:9618447, type III subunits in the outer membrane \ translocate secreted proteins through a channel-like structure.

    \ \

    Exotoxins secreted by the type III system do not possess a secretion signal,\ and are considered unique for this reason PUBMED:9618447. Enteropathogenic and enterohaemorrhagic Escherichia coli secrete the bacterial adhesion mediation\ molecule intimin PUBMED:10835344, which targets the translocated intimin receptor, Tir. Tir is secreted by the bacteria and is embedded in the target cell\'s plasma membrane PUBMED:10835344. This facilitates bacterial cell attachment to the host.

    \ ' '7318' 'IPR003536' '\ Secretion of virulence factors in Gram-negative bacteria involves \ transportation of the protein across two membranes to reach the cell \ exterior. There have been four secretion systems described in \ animal enteropathogens, such as Salmonella and Yersinia, with further \ sequence similarities in plant pathogens like Ralstonia and Erwinia PUBMED:9618447.\ \

    The type III secretion system is of great interest, as it is used to \ transport virulence factors from the pathogen directly into the host cell \ and is only triggered when the bacterium comes into close contact with\ the host. The protein subunits of the system are very similar to those of \ bacterial flagellar biosynthesis. However, while the latter forms a\ ring structure to allow secretion of flagellin and is an integral part of\ the flagellum itself PUBMED:9618447, type III subunits in the outer membrane \ translocate secreted proteins through a channel-like structure.

    \ \

    Exotoxins secreted by the type III system do not possess a secretion signal,\ and are considered unique for this reason PUBMED:9618447. Enteropathogenic and enterohaemorrhagic Escherichia coli secrete the bacterial adhesion mediation\ molecule intimin PUBMED:10835344, which targets the translocated intimin receptor, Tir. Tir is secreted by the bacteria and is embedded in the target cell\'s plasma membrane PUBMED:10835344. This facilitates bacterial cell attachment to the host.

    \ ' '7319' 'IPR011107' '\

    These proteins include Ypi1, a novel Saccharomyces cerevisiae type 1 protein phosphatase inhibitor PUBMED:14506263 and ppp1r11/hcgv (), annotated as having protein phosphatase inhibitor activity PUBMED:8781118.

    \ ' '7320' 'IPR011120' '\

    Neutral trehalases mobilise trehalose accumulated by fungal cells as a protective and storage carbohydrate. This family represents a calcium-binding domain similar to EF hand. Residues 97 and 108 in have been implicated in this interaction. It is thought that this domain may provide a general mechanism for regulating neutral trehalase activity in yeasts and filamentous fungi PUBMED:12943532.

    \ ' '7322' 'IPR011110' '\

    A large group of two component regulator proteins appear to have the same N-terminal structure of 14 tandem repeats. These repeats show homology to members of and indicating that they are likely to form a beta-propeller. This family has been built with artificially high cut-offs in order to avoid overlaps with other beta-propeller families. The fourteen repeats are likely to form two propellers; it is not clear if these structures are likely to recruit other proteins or interact with DNA.

    \ ' '7323' 'IPR011123' '\

    This region is mostly found at the end of the beta propellers () in a family of two component regulators. However, it is also found tandemly repeated in without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.

    \ ' '7324' 'IPR011124' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a CW-type zinc finger motif, named for its conserved cysteine and tryptophan residues. It is predicted to be a highly specialised mononuclear four-cysteine (C4) zinc finger that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including chromatin methylation status and early embryonic development. Weak homology to members of further evidences these predictions. The domain is found exclusively in vertebrates, vertebrate-infecting parasites and higher plants PUBMED:14607086.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '7325' 'IPR011113' '\

    The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers PUBMED:10230401.

    \ ' '7326' 'IPR011112' '\

    The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers PUBMED:10230401. This domain is found N-terminal to the RNA-binding domain ().

    \ ' '7327' 'IPR011114' '\

    In prokaryotes, RuvA, RuvB, and RuvC process the universal DNA intermediate of homologous recombination, termed Holliday junction. The tetrameric DNA helicase RuvA specifically binds to the Holliday junction and facilitates the isomerization of the junction from the stacked folded configuration to the square-planar structure PUBMED:12408833. In the RuvA tetramer, each subunit consists of three domains, I, II and III, where I and II form the major core that is responsible for Holliday junction binding and base pair rearrangements of Holliday junction executed at the crossover point, whereas domain III regulates branch migration through direct contact with RuvB.

    \ \ \

    The domain represents the C-terminal domain III of RuvA. This domain plays a significant role in the ATP-dependent branch migration of the hetero-duplex through direct contact with RuvB PUBMED:10890893. Within the Holliday junction, this domain makes no interaction with the DNA.

    \ ' '7328' 'IPR003618' '\

    Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II PUBMED:3346229. It is widely distributed, being found in mammals, Drosophila, yeast and the archaebacterium Sulfolobus acidocaldarius PUBMED:8502569. S-II proteins have a relatively conserved C-terminal region but a variable N-terminal region, and some members of this family are expressed in a tissue-specific manner PUBMED:1917889, PUBMED:8566795.

    \

    TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III PUBMED:12914699. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

    \

    This domain is found in the central region of transcription elongation factor S-II and in several hypothetical proteins.

    \ ' '7329' 'IPR011098' '\

    The G5 domain (named after its conserved glycine residues) is a module of ~80 residues that is found in a variety of enzymes such as Streptococcal IgA peptidases and various glycosyl hydrolases in bacteria. It is found in one to seven copies in association with other domains, such as LysM, bacterial Ig-like, M23 and M26 peptidases, F5/8 type C, vanW or transglycosylase-like. The G5 domain contains a few highly conserved residues. None of these conserved residues are the polar types of amino acids found in active sites, so it seems unlikely this region has an enzymatic function. However, in nearly all cases the G5 domain is associated with a known enzymatic domain. Therefore, the G5 domain may confer localization or substrate specificity on the proteins in which it is found. As a common feature of the proteins containing G5 domains is N-acetylglucosamine binding, it has been suggested that this function might be attributed to the G5 domain. Other alternative functions could be allosteric regulation of the enzymatic domain or cofactor binding PUBMED:15598841.

    \ ' '7330' 'IPR011106' '\

    The MANSC (motif at N terminus with seven cysteines) domain is a module with a\ well-conserved seven cysteine motif that is present at the N terminus of\ higher multicellular animal membrane and extracellular proteins. It is\ possible that some of the cysteine residues in the MANSC domain form\ structurally important disulphide bridges.\ All of the MANSC-containing proteins contain predicted transmembrane regions\ and signal peptides. It has been proposed that the MANSC domain in HAI-1 might\ function through binding with hepatocyte growth factor activator and\ matriptase PUBMED:15124631.

    \ ' '7331' 'IPR011125' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    Proteins of the HypF family are involved in the maturation and regulation of hydrogenase PUBMED:9492269. In the N terminus they appear to have two zinc finger domains that are similar to those found in the DnaJ chaperone PUBMED:12206761.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '7332' 'IPR011096' '\

    The FTP domain is found in the propeptide region of bacterial and fungal metallopeptidases belonging to MEROPS peptidase families M4 and M36 respectively. In bacteria the FTP domain is N-terminal to this entry, the PepSY domain; in fungi the M36 peptidases do not contain the PepSY domain. Propeptide swapping experiments have shown that the propeptides of the M4 and M36 families are not functionally interchangeable PUBMED:12589825.

    \ \

    The function of the propeptide in M36 peptidases has not been described, but it is likely, as in other related peptidases, to have targeting and chaperone activity and to inhibit peptidase activity, so as to prevent premature activation PUBMED:12589825, PUBMED:8636020.

    \ \ ' '7333' 'IPR011101' '\

    This family contains phage proteins Gp37 (bacteriophage phiE125) and Gp68 (mycobacteriophage Che9c) and bacterial homologues.

    \ ' '7334' 'IPR011111' '\

    This family includes proteins with sequence similarity to the RepB partitioning protein of the large Ti (tumour-inducing) plasmids of Agrobacterium tumefaciens PUBMED:10613878, PUBMED:9524202.

    \ ' '7335' 'IPR011122' '\

    These proteins are encoded by putative wav gene clusters, which are responsible for the synthesis of the core oligosaccharide (OS) region of Vibrio cholerae lipopolysaccharide PUBMED:11953379.

    \ ' '7336' 'IPR011109' '\

    This domain is usually found associated with in putative integrases/recombinases of mobile genetic elements of diverse bacteria and phages.

    \ ' '7337' 'IPR011088' '\

    The members of this family are restricted to the Gammaproteobacteria and \ Epsilonproteobacteria; the function of these proteins is unknown.

    \ ' '7338' 'IPR011089' '\

    The family contains RloF from Campylobacter jejuni; its function and those of the other members are unknown.

    \ ' '7339' 'IPR011090' '\

    This family of proteins is restricted to the Gammaproteobacteria; their function is unknown.

    \ ' '7341' 'IPR011092' '\

    The members of this family are primarily from the Gammaproteobacteria. The function of these proteins is unknown.

    \ ' '7342' 'IPR011119' '\

    The members of this family are restricted to the proteobacteria. Some members have been annotated as helicase, conjugative relaxase or nickase. The majority contain an HD domain, which is found in a superfamily of enzymes with a predicted or known phosphohydrolase activity. These enzymes appear to be involved in nucleic acid metabolism, signal transduction and possibly other functions in bacteria.

    \ ' '7343' 'IPR011093' '\

    This entry contains proteins, some of which are from pathogenic strains of Gammaproteobacteria. Though the function of these proteins is unknown, they could be involved in pathogenesis. This domain is found at the C terminus of proteins that contain an N-terminal metal-dependent phosphohydrolase (HD) region and are considered to be helicases/relaxases.

    \ ' '7344' 'IPR011116' '\

    SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP-dependent manner. This domain is composed of two C-terminal alpha helical subdomains: the wing and scaffold subdomains.

    \ ' '7345' 'IPR011115' '\

    SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP-dependent manner PUBMED:9644254,PUBMED:2542029. This domain represents the N-terminal ATP-dependent helicase domain, which is related to the .

    \ ' '7347' 'IPR011118' '\

    This family includes fungal tannase PUBMED:8917102 and feruloyl esterase PUBMED:11931668, PUBMED:8679110. It also includes several bacterial homologues of unknown function.

    \ ' '7348' 'IPR009216' '\ This entry represents proteins of unknown function. It has been shown in Salmonella enterica that srfB is one of the genes activated by the global signal transduction/regulatory system SsrA/B PUBMED:10844662. This activation takes place within eukaryotic cells. The activated genes include pathogenicity island 2 (SPI-2) genes and at least 10 other genes (srfB is one of them) which are believed to be horizontally acquired, and to be involved in virulence/pathogenicity PUBMED:10844662.\ ' '7349' 'IPR011108' '\

    The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in RNA metabolism PUBMED:12177301.

    \ ' '7350' 'IPR011084' '\

    The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in DNA repair PUBMED:12177301.

    \ ' '7351' 'IPR011080' '\

    This entry represents bacterial domains with an Ig-like fold. These domains are found in a variety of bacterial surface proteins.

    \ ' '7352' 'IPR006565' '\

    This bromodomain is found in eukaryotic transcription factors and PHD domain containing proteins (). The tandem PHD finger-bromodomain is found in many chromatin-associated proteins. It is involved in gene silencing by the human co-repressor KRAB-associated protein 1 (KAP1). The tandem PHD finger-bromodomain of KAP1 has a distinct structure that joins the two protein modules. The first helix, alpha(Z), of an atypical bromodomain forms the central hydrophobic core that anchors the other three helices of the bromodomain on one side and the zinc binding PHD finger on the other PUBMED:18488044. \

    \

    The Rap1 GTPase-activating protein, Sipa1, is modulated by the cellular bromodomain protein, Brd4. Brd4 belongs to the BET family and is a multifunctional protein involved in transcription, replication, the signal transduction pathway, and cell cycle progression. All of these functions are linked to its association with acetylated chromatin. It has tandem bromodomains PUBMED:18500820. The dysregulation of the Brd4-associated pathways may play an important role in breast cancer progression PUBMED:18427120. Bovine papillomavirus type 1 E2 also binds to chromosomes in a complex with Brd4. Interaction with Brd4 is additionally important for E2-mediated transcriptional regulation PUBMED:18495759, PUBMED:18513937.\

    \ ' '7353' 'IPR001496' '\

    The SOCS box was first identified in SH2-domain-containing proteins of the suppressor of cytokines signalling (SOCS) family PUBMED:9202125 but was later also found in:

    \ \

    \ \

    The SOCS box found in these proteins is a carboxy-terminal domain of about 50 amino acids, composed of two blocks of well-conserved residues separated by between 2 and 10 non-conserved residues PUBMED:9419338. The C-terminal conserved region is an L/P-rich sequence of unknown function, whereas the N-terminal conserved region is a consensus BC box PUBMED:9869640, which binds to the Elongin BC complex PUBMED:9869640, PUBMED:10051596. It has been proposed that this association could couple bound proteins to the ubiquitination or proteasomal compartments PUBMED:10051596.

    \ ' '7354' 'IPR006563' '\

    This domain is found exclusively in plant proteins, associated with HOX domains, which may suggest these proteins are\ homeodomain transcription factors.

    \ ' '7355' 'IPR003650' '\ This domain confers specificity among members of the Hairy/E(SPL) family. HES-2 (hairy and enhancer of split 2) is a transcription factor, and the hairy protein is a pair-rule protein that regulates embryonic segmentation and adult bristle patterning. These proteins are transcriptional repressors of genes that require the BHLH protein for their transcription.\ ' '7356' 'IPR006561' '\

    This domain is found in proteins containing the double-stranded RNA-binding motif, DSRM (), or the zinc finger domain C2H2 (). This domain is found\ exclusively in the metazoa.

    \ ' '7357' 'IPR006562' '\

    This domain of unknown function is found in helicases and other DNA-binding proteins of eukaryotes PUBMED:11779830.

    \ ' '7358' 'IPR006579' '\

    This domain is present in proteins found exclusively in the arthropods, including a number of Drosophila\ species, the silk moth and the gypsy moth. These proteins are possibly\ involved in RNA binding or single strand DNA binding.

    \ ' '7359' 'IPR003894' '\

    The TAF homology (TAFH) or Nervy homology region 1 (NHR1) domain is a domain of 95-100 amino acids present in eukaryotic proteins of the MTG/ETO family; its core of ~75-80 residues also occurs in TAF proteins. The transcription initiation TFIID complex is composed of TATA binding protein (TBP) and a number of TBP-associated factors (TAFs). The TAFH/NHR1 domain is named after fruit fly TATA-box-associated factor 110 (TAF110), human TAF105 and TAF130, and the fruit fly protein Nervy, which is a homologue of human MTG8/ETO PUBMED:9447981, PUBMED:9790752. The human eight twenty-one (ETO or MTG8) and related myeloid transforming gene products MTGR1 and MTG16 as well as the Nervy protein contain the NHR1-4 domains. The NHR1/TAFH domain occurs in the N-terminal part of these proteins, while a MYND-type zinc finger forms the NHR4 domain PUBMED:12559562. The TAFH/NHR1 domain can be involved in protein-protein interactions, e.g. in MTG8/ETO with HSP90 and Gfi-1 PUBMED:10076566.

    \ \ \ ' '7360' 'IPR011081' '\

    This entry represents bacterial domains with an Ig-like fold. These domains are found in a variety of bacterial surface proteins.

    \ ' '7361' 'IPR006576' '\

    BRK is a domain of unknown function found only in the metazoa and in association with CHROMO domain () and DEAD/DEAH box helicase domain ().

    \ ' '7362' 'IPR006571' '\

    TLDc is a domain of unknown function, restricted to eukaryotes, and commonly found in TBC () and LysM () domain containing proteins.

    \ ' '7363' 'IPR006572' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    In eukaryotes, initiation of DNA replication requires the assembly of pre-replication complexes (pre-RCs) on chromatin during the G1 phase. In the S phase, pre-RCs are activated by two protein kinases, Cdk2 and Cdc7, which results in the loading of replication factors and the unwinding of replication origins by the MCM helicase complex PUBMED:8984634. Cdc7 is a serine/threonine kinase that is conserved from yeast to human. It is regulated by its association with a regulatory subunit, the Dbf4 protein. This complex is often referred to as DDK (Dbf4-dependent kinase) PUBMED:14643426.

    \ \

    DBF4 contains an N-terminal BRCT domain and a C-terminal conserved region that could potentially coordinate one zinc atom, the DBF4-type zinc finger. This entry represents the zinc finger, which is important for the interaction with Cdc7 PUBMED:8943332, PUBMED:8066465.

    \ \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '7364' 'IPR011102' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems are composed of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347, PUBMED:11406410. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.

    \

    HKs can be roughly divided into two classes: orthodox and hybrid kinases PUBMED:8029829, PUBMED:1482126. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK PUBMED:10966457. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain.

    \

    The HWE domain is found in a subset of two-component system kinases, belonging to the same superfamily as PUBMED:14702314. In PUBMED:14702314, the HWE family was defined by the presence of a conserved H residue and a WXE motif and was limited to members of the proteobacteria. However, many homologues of this domain lack the WXE motif. Furthermore, homologues are found in a wide range of Gram-positive and Gram-negative bacteria as well as in several archaea.

    \ ' '7365' 'IPR011426' '\

    This family includes CamS (), from which Staphylococcus aureus sex pheromone staph-cAM373 is processed. It also includes a number of uncharacterised bacterial proteins.

    \ ' '7366' 'IPR006637' '\

    This hydrophobic repeat is found in a number of Clostridium proteins. It contains a conserved tryptophan residue.

    \ ' '7367' 'IPR011430' '\

    These eukaryotic proteins include DRIM (Down-Regulated In Metastasis) (), which is differentially expressed in metastatic and non-metastatic human breast carcinoma cells PUBMED:9673349. It is believed to be involved in processing of non-coding RNA PUBMED:12837249.

    \ ' '7368' 'IPR011501' '\

    Nucleolar complex-associated protein (Noc3p, ) is conserved in eukaryotes and plays essential roles in replication and rRNA processing in Saccharomyces cerevisiae PUBMED:12110182.

    \ ' '7369' 'IPR011488' '\

    These proteins share a region of similarity that falls towards the C terminus from .

    \ ' '7370' 'IPR011419' '\

    This entry represents a group of ATPase F1F0-assembly proteins, including ATP12 and ATPAF2 (ATP synthase mitochondrial F1 complex assembly factor 2). These proteins are essential for the assembly of the mitochondrial F1-F0 complex.

    \

    Mitochondrial F1-ATPase is an oligomeric enzyme composed of five distinct subunit polypeptides. The alpha and beta subunits make up the bulk of protein mass of F1. In Saccharomyces cerevisiae both subunits are synthesised as precursors with N-terminal targeting signals that are removed upon translocation of the proteins to the matrix compartment PUBMED:1826907. These proteins include examples from eukaryotes and bacteria and may have chaperone activity, being involved in F1 ATPase complex assembly.

    \ \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '7371' 'IPR011431' '\

    A Saccharomyces cerevisiae (Baker\'s yeast) member of this family (PGA2, ) is a single pass membrane protein which has been implicated in protein trafficking PUBMED:16943325, PUBMED:14690591.

    \ ' '7372' 'IPR011425' '\

    This entry includes subunits Med9 and Med21 of the Mediator complex. Subunits Med9 and Med21 are part of the middle module of the Mediator complex; this associates with the core polymerase subunits to form the RNA polymerase II holoenzyme.

    \ \

    Med9, alternatively known as the chromosome segregation protein CSE2 (), is required, along with CSE1 (), for accurate mitotic chromosome segregation in Saccharomyces cerevisiae (Baker\'s yeast) PUBMED:8336709.

    \ \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '7373' 'IPR011520' '\

    The mammalian TEF and the Drosophila scalloped genes belong to a conserved family of transcriptional factors that possesses a TEA/ATTS DNA-binding domain. Transcriptional activation by these proteins likely requires interactions with specific coactivators. In Drosophila, Scalloped (Sd) interacts with Vestigial (Vg) to form a complex, which binds DNA through the Sd TEA/ATTS domain. The Sd-Vg heterodimer is a key regulator of wing development, which directly controls several target genes and is able to induce wing outgrowth when ectopically expressed. This short conserved region is needed for interaction with Sd PUBMED:10518497.

    \ ' '7374' 'IPR011489' '\

    The EMI domain, first named after its presence in proteins of the EMILIN family, is a small cysteine-rich module of around 75 amino acids. The EMI domain is most often found at the N-terminus of metazoan extracellular proteins that form multimers or are compatible with multimer formation PUBMED:11068053. It is found in association with other domains, such as C1q, laminin-type EGF-like, collagen-like, FN3, WAP, ZP or FAS1 PUBMED:12507493. It has been suggested that the EMI domain could be a protein-protein interaction module, as the EMI domain of EMILIN-1 was found to interact with the C1q domain of EMILIN-2 PUBMED:11068053.

    \

    The EMI domain possesses six highly conserved cysteine residues, which likely form disulphide bonds. Other key features of the EMI domain are the C-C-x-G-[WYFH] pattern, a hydrophobic position just preceding the first cysteine (Cys1) of the domain and a cluster of hydrophobic residues between Cys3 and Cys4. The EMI domain could be made of two sub-domains, the fold of the second one sharing similarities with the C-terminal sub-module characteristic of EGF-like domains PUBMED:12507493.

    \

    Proteins known to contain an EMI domain include:

    \ \

    \ \

    The Pfam alignment for this domain is truncated at the C terminus and does not include the final cysteine PUBMED:12507493. This is to stop the family overlapping with other domains.

    \ ' '7375' 'IPR011508' '\

    This domain is found in three copies at the N terminus of the Caenorhabditis elegans RSD-2 protein. RSD-2 (RNAi spreading defective) is involved in systemic RNAi PUBMED:14738731. Mutations in the rsd-2 gene do not affect somatic genes but only germline expressed genes PUBMED:14738731.

    \ ' '7376' 'IPR011427' '\

    This domain is found in several Chlamydia polymorphic membrane proteins PUBMED:11254597. Chlamydia pneumoniae (Chlamydophila pneumoniae) is an obligate intracellular bacterium and a common human pathogen causing infection of the upper and lower respiratory tract. This domain is found between the beta-helical repeats () and the C-terminal .

    \ ' '7377' 'IPR003335' '\

    Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase\ pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to\ the translocase component PUBMED:2202721. From there, the mature proteins are either targeted to the outer\ membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial\ chromosome.

    \

    \ The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral\ membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of\ the mature peptide into the periplasm (SecD and SecF) PUBMED:2202721. The chaperone protein SecB PUBMED:11336818 is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm.\ SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane\ protein ATPase SecA for secretion PUBMED:10418149. Together with SecY and SecG, SecE forms a multimeric\ channel through which preproteins are translocated, using both proton motive forces and ATP-driven secretion. The\ latter is mediated by SecA. The structure of the\ Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic\ domains PUBMED:12167867. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15\ transmembrane helices. \

    \

    This family consists of various prokaryotic SecD and SecF protein export membrane proteins. The SecD and SecF equivalents of the\ Gram-positive bacterium Bacillus subtilis are jointly present in one polypeptide,\ denoted SecDF, that is required to maintain a high capacity for protein secretion.\ Unlike the SecD subunit of the pre-protein translocase of E. coli, SecDF\ of B. subtilis was not required for the release of a mature secretory protein from\ the membrane, indicating that SecDF is involved in earlier translocation steps PUBMED:9694879.\ Comparison with SecD and\ SecF proteins from other organisms revealed the presence of 10 conserved\ regions in SecDF, some of which appear to be important for SecDF function.\ Interestingly, the SecDF protein of B. subtilis has 12 putative transmembrane\ domains. Thus, SecDF does not only show sequence similarity but also structural\ similarity to secondary solute transporters PUBMED:9694879.

    \ ' '7378' 'IPR011432' '\

    This domain is found duplicated in proteins of unknown function. The proteins typically also contain leucine-rich repeats.

    \ ' '7379' 'IPR011433' '\

    This family is found in a group of small bacterial proteins. Its function is not known.

    \ ' '7380' 'IPR011428' '\

    This domain is found in the Bacilli coat protein X as a tandem repeat and as a single domain in coat protein V. The proteins are found in the insoluble fraction PUBMED:8509331.

    \ ' '7381' 'IPR011434' '\

    This domain is found as 1-3 copies in a small family of proteins of unknown function.

    \ ' '7382' 'IPR011490' '\

    This domain is found in a wide variety of contexts, but mostly occurring in cell wall associated proteins. A lack of conserved catalytic residues suggests that it is a binding domain. From context, possible substrates are hyaluronate or fibronectin. This is further evidenced by PUBMED:12438356. Possibly the exact substrate is N-acetyl glucosamine. Finding it in the same protein as further supports this proposal. It is found in the C-terminal part of , which is removed during maturation PUBMED:14759609. Some of the proteins it is found in (e.g. ) are involved in methicillin resistance PUBMED:10896508. The name FIVAR derives from Found In Various Architectures.

    \ ' '7383' 'IPR011496' '\

    This family consists of both eukaryotic and prokaryotic hyaluronidases. Human is expressed during meningioma PUBMED:9811929. Clostridium perfringens, , is involved in pathogenesis and is likely to act on connective tissue during gas gangrene PUBMED:8177218. It catalyses the random hydrolysis of 1->4-linkages between N-acetyl-beta-D-glucosamine and D-glucuronate residues in hyaluronate.

    \ ' '7384' 'IPR011435' '\

    This family of proteins of unknown function contains several conserved glycines and phenylalanines.

    \ ' '7385' 'IPR011515' '\

    This entry represents the C-terminal domain of Shugoshin (Sgo1) kinetochore-attachment proteins. Shugoshin has a conserved coiled-coil N-terminal domain and a highly conserved C-terminal basic region (). Shugoshin is a crucial target of Bub1 kinase that plays a central role in chromosome cohesion during mitotic and meiotic divisions by preventing premature dissociation of the cohesin complex from centromeres after prophase, when most of the cohesin complex dissociates from chromosome arms PUBMED:14730319, PUBMED:18987869. Shugoshin is thought to act by protecting Rec8 and Rad21 at the centromeres from separase degradation during anaphase I (during meiosis) so that sister chromatids remain tethered PUBMED:14730319. Shugoshin also acts as a spindle checkpoint component required for sensing tension between sister chromatids during mitosis; its degradation when they separate prevents cell cycle arrest and chromosome loss in anaphase, a time when sister chromatids are no longer under tension. Human shugoshin is diffusible and mediates kinetochore-driven formation of kinetochore-microtubules during bipolar spindle assembly PUBMED:16687935. Further, the primary role of shugoshin is to ensure bipolar attachment of kinetochores, and its role in protecting cohesion has co-developed to facilitate this process PUBMED:17322402.

    \ ' '7386' 'IPR011516' '\

    This entry represents the N-terminal domain of Shugoshin (Sgo1) kinetochore-attachment proteins. Shugoshin has a conserved coiled-coil N-terminal domain and a highly conserved C-terminal basic region (). Shugoshin is a crucial target of Bub1 kinase that plays a central role in chromosome cohesion during mitotic and meiotic divisions by preventing premature dissociation of the cohesin complex from centromeres after prophase, when most of the cohesin complex dissociates from chromosome arms PUBMED:14730319, PUBMED:18987869. Shugoshin is thought to act by protecting Rec8 and Rad21 at the centromeres from separase degradation during anaphase I (during meiosis) so that sister chromatids remain tethered PUBMED:14730319. Shugoshin also acts as a spindle checkpoint component required for sensing tension between sister chromatids during mitosis; its degradation when they separate prevents cell cycle arrest and chromosome loss in anaphase, a time when sister chromatids are no longer under tension. Human shugoshin is diffusible and mediates kinetochore-driven formation of kinetochore-microtubules during bipolar spindle assembly PUBMED:16687935. Further, the primary role of shugoshin is to ensure bipolar attachment of kinetochores, and its role in protecting cohesion has co-developed to facilitate this process PUBMED:17322402.

    \ ' '7387' 'IPR011491' '\

    This domain is found in several bacterial FlaE flagellar proteins. These proteins are part of the flagellar basal body rod complex.

    \ ' '7388' 'IPR011436' '\

    This domain is found in a small number of Chlamydia proteins of unknown function. It occurs together with .

    \ ' '7389' 'IPR011437' '\

    These proteins have four conserved cysteines, which is suggestive of a metal binding function. This domain may be found on its own or duplicated in the proteins.

    \ ' '7390' 'IPR011500' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    GPCR family 3 receptors (also known as family C) are structurally similar to other GPCRs, but do not show any significant sequence similarity and thus represent a distinct group. Structurally they are composed of four elements: an N-terminal signal sequence; a large hydrophilic extracellular agonist-binding region containing several conserved cysteine residues which could be involved in disulphide bonds; a shorter region containing seven transmembrane domains; and a C-terminal cytoplasmic domain of variable length PUBMED:17266540. Family 3 members include the metabotropic glutamate receptors, the extracellular calcium-sensing receptors, the gamma-amino-butyric acid (GABA) type B receptors, and the vomeronasal type-2 receptors PUBMED:1309649, PUBMED:8255296, PUBMED:10773016, PUBMED:9292726. As these receptors regulate many important physiological processes they are potentially promising targets for drug development.

    \

    This entry represents a conserved sequence, found in the extracellular region, that contains several highly-conserved Cys residues that are predicted to form disulphide bridges.

    \ ' '7391' 'IPR011438' '\

    This domain is found in several hypothetical bacterial proteins as a tandem repeat.

    \ ' '7392' 'IPR011439' '\

    This domain is found in several cell surface proteins. Some are involved in antibiotic resistance (e.g. and ) PUBMED:10332717 and/or cellular adhesion (e.g. ) PUBMED:12438342. In some proteins it is repeated more than fifteen times.

    \ ' '7393' 'IPR013769' '\

    Bicarbonate (HCO3-) transport mechanisms are the principal regulators of pH in animal cells. Such transport also plays a vital role in acid-base movements in the stomach, pancreas, intestine, kidney, reproductive organs and the central nervous system. Functional studies have suggested four different HCO3- transport modes. Anion exchanger proteins exchange HCO3- for Cl- in a reversible, electroneutral manner PUBMED:2289848. Na+/HCO3- co-transport proteins mediate the coupled movement of Na+ and HCO3- across plasma membranes, often in an electrogenic manner. Na+-driven Cl-/HCO3- exchange and K+/HCO3- exchange activities have also been detected in certain cell types, although the molecular identities of the proteins responsible remain to be determined.

    \ \

    Sequence analysis of the two families of HCO3- transporters that have been cloned to date (the anion exchangers and Na+/HCO3- co-transporters) reveals that they are homologous. This is not entirely unexpected, given that they both transport HCO3- and are inhibited by a class of pharmacological agents called disulphonic stilbenes PUBMED:9235899. They share ~25-30% sequence identity, which is distributed along their entire sequence length, and have similar predicted membrane topologies, suggesting they have ~10 transmembrane (TM) domains.

    \ ' '7394' 'IPR011440' '\

    This domain is found as 1-2 copies in a small family of proteins of unknown function.

    \ ' '7396' 'IPR011495' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems consist of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on the RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347, PUBMED:11406410. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.

    \

    HKs can be roughly divided into two classes: orthodox and hybrid kinases PUBMED:8029829, PUBMED:1482126. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK PUBMED:10966457. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain.

    \

    This is the dimerisation and phosphoacceptor domain of a subfamily of histidine kinases. It shares sequence similarity with and . It is usually found adjacent to a C-terminal ATPase domain (). This domain is found in a wide range of bacteria and also several archaea.

    \ ' '7397' 'IPR011494' '\

    The Hira proteins are found in a range of eukaryotes and are implicated in the assembly of repressive chromatin. These proteins also contain .

    \ ' '7399' 'IPR011442' '\

    These proteins are associated with in transcription initiation factor TFIID subunit 6 (TAF6).

    \ ' '7400' 'IPR011421' '\

    Vertebrate BCNT (named after Bucentaur) or human craniofacial development protein 1 (CFDP1) are characterised by an N-terminal acidic region, a central and single IR element (inverted repeat) from the retrotransposable element-1 family (RTE-1) and a highly conserved 82-amino acid region at the C terminus.

    \ \

    This entry represents the BCNT C-terminal domain that is also found in Drosophila YETI, a protein that binds to a microtubule-based motor kinesin-1, and the yeast SWR1-complex protein 5 (SWC5), a component of the SWR1 chromatin remodeling complex PUBMED:16384818, PUBMED:14720462.

    \ \

    In the bovine genome, recombination of BCNT through the IR element with a member of the retrotransposable element-1 family leads to gene duplications and to the insertion of the RTE-1 apurinic/apyrimidinic endonuclease (APE)-like domain (see ), with the concomitant loss of the conserved C-terminal domain of BCNT and the additional recruitment of either 2 (p97bcnt) or 3 (p97bcnt-2) C-terminal IR-elements PUBMED:19393175.

    \ \ \ \ ' '7401' 'IPR011420' '\

    The AreA nitrogen regulatory proteins (which are GATA type transcription factors) share a highly conserved N terminus and have at the C terminus.

    \ ' '7402' 'IPR011513' '\

    Saccharomyces cerevisiae Nse1 () forms part of a complex with SMC5-SMC6. This non-structural maintenance of chromosomes (SMC) complex plays an essential role in genomic stability, being involved in DNA repair and DNA metabolism PUBMED:12966087, PUBMED:11927594. It is conserved in eukaryotes from yeast to human.

    \ ' '7403' 'IPR011502' '\

    This is a family of nucleoporins conserved from yeast to human.

    \ ' '7404' 'IPR011422' '\

    These proteins include BRCA1-associated protein 2 (BRAP2), which binds nuclear localisation signals (NLSs) in vitro and in yeast two-hybrid screening PUBMED:9497340. These proteins share a region of sequence similarity at their N terminus. They also have at the C terminus.

    \ ' '7405' 'IPR011443' '\

    This domain appears to be found only in a small family of Chlamydia species. It is usually found repeated. The function of these proteins is not known.

    \ ' '7406' 'IPR011499' '\

    This domain is found at the N terminus of a group of Chlamydial lipid A biosynthesis proteins. It is also found by itself in a family of proteins of unknown function.

    \ ' '7407' 'IPR013044' '\

    This domain is found in a small number of Chlamydia proteins of unknown function. It occurs together with .

    \ ' '7408' 'IPR011505' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This entry represents metallopeptidases belonging to MEROPS peptidase family M26 (IgA1-specific metallopeptidase, clan MA). They are extracellular enzymes, which cleave mammalian IgA. They are only found in Gram-positive bacteria and are often found associated with ; they may be attached to the cell wall.

    \

    This entry also contains the metallopeptidases ZmpB and ZmpC from Streptococcus pneumoniae. These metallopeptidases are thought to contribute to the inflammatory response to Streptococcal infection PUBMED:12933834, PUBMED:12841855.

    \ ' '7409' 'IPR011493' '\

    This domain is found in the IgA1-specific metalloendopeptidases, which attach to the cell wall peptidoglycan by an amide bond PUBMED:8926055. IgA1 protease selectively cleaves human IgA1 and is likely to be a pathogenicity factor in some pathogens including Giardia spp PUBMED:12841855. This domain is also found in various other contexts, including with . It is named GLUG after the mostly conserved G-L-any-G motif.

    \

    The IgA1-specific metalloendopeptidases belong to MEROPS peptidase family M26, clan MA(E).

    \ \ ' '7410' 'IPR011418' '\

    DNA damaging agents such as the anti-tumour drugs bleomycin and neocarzinostatin or those that generate oxygen radicals produce a variety of lesions in DNA. Amongst these is base-loss which forms apurinic/apyrimidinic (AP) sites or strand breaks with atypical 3\' termini. DNA repair at the AP sites is initiated by specific endonuclease cleavage of the phosphodiester backbone. Such endonucleases are also generally capable of removing blocking groups from the 3\' terminus of DNA strand breaks. AP endonucleases can be classified into two families based on sequence similarity PUBMED:7661852.

    \

    This entry represents a highly-conserved sequence found at the C terminus of several apurinic/apyrimidinic (AP) endonucleases in a range of Gram-positive and Gram-negative bacteria.

    \ ' '7411' 'IPR011444' '\

    This domain is found in a family of paralogues in the planctomycetes. The function is not known. It is found associated with the Planctomycete cytochrome C domain .

    \ ' '7412' 'IPR011445' '\

    These proteins share a highly-conserved sequence at their N terminus. They include several proteins from Rhodopirellula baltica and also several from proteobacteria.

    \ ' '7413' 'IPR011446' '\

    This is a family of proteins identified in Rhodopirellula baltica. The function is not known.

    \ ' '7414' 'IPR011447' '\

    This is a family of proteins identified in Rhodopirellula baltica.

    \ ' '7415' 'IPR011444' '\

    This domain is found in a family of paralogues in the planctomycetes. The function is not known. It is found associated with the Planctomycete cytochrome C domain .

    \ ' '7416' 'IPR011448' '\

    This is a domain that occurs in 1-2 copies in a family of proteins identified in Leptospira interrogans. The function of the proteins is not known.

    \ ' '7417' 'IPR011449' '\

    This entry represents a bacterial subgroup and describes a 25-residue region including an invariant Pro-Glu-Pro (PEP) motif, a thirteen-residue strongly hydrophobic sequence likely to span the membrane, and a five-residue strongly basic motif that often contains four Arg residues. In most cases, this motif is found within nine residues of the C-terminal end of the protein.

    \ \

    This domain is found in proteins from Rhodopirellula baltica. The domain is also found in the proteobacteria, chlorobiaceae (green sulphur bacteria), and in the betaproteobacteria e.g. Nitrosomonas europaea. One protein, from R. baltica (), shows some similarity to M12B zinc peptidases.

    \ ' '7418' 'IPR011450' '\

    This is a family of proteins of unknown function.

    \ ' '7419' 'IPR011451' '\

    This entry represents a cluster of homologous proteins identified in Leptospira interrogans. One member () has been predicted to be a phenazine biosynthesis family protein.

    \ ' '7420' 'IPR011518' '\

    These transposases are found in the planctomycete Rhodopirellula baltica, the cyanobacterium Nostoc, and the Gram-positive bacterium Streptomyces.

    \

    More information about these proteins can be found at Protein of the Month: Transposase.

    \ ' '7421' 'IPR011519' '\

    This conserved sequence is found associated with in several paralogous proteins in Rhodopirellula baltica. It is also found associated with in several eukaryotic integrin-like proteins (e.g. human ASPIC ) and in several other bacterial proteins (e.g. ) PUBMED:12536216.

    \ ' '7423' 'IPR011506' '\

    This motif is conserved at the N terminus of several Rhodopirellula baltica proteins predicted to be extracellular.

    \ ' '7424' 'IPR011453' '\

    This is a large family of paralogous proteins apparently unique to planctomycetes.

    \ ' '7425' 'IPR011454' '\

    This is a small family of short hypothetical proteins in Rhodopirellula baltica.

    \ ' '7426' 'IPR011455' '\

    This is a family of paralogous proteins in Leptospira interrogans.

    \ ' '7427' 'IPR011457' '\

    This is a small family of short hypothetical proteins in Leptospira interrogans.

    \ ' '7428' 'IPR011458' '\

    This is a family of paralogous proteins in Leptospira interrogans. Several have been annotated as possible CopG-like transcriptional regulators.

    \ ' '7429' 'IPR011456' '\

    This is a large family of short hypothetical proteins in Leptospira interrogans.

    \ ' '7430' 'IPR011459' '\

    These proteins share a region of homology in their N termini, and are found in several phylogenetically diverse bacteria and in the archaeon Methanosarcina acetivorans. Some of these proteins also contain other characterised domains.

    \ ' '7431' 'IPR011460' '\

    These proteins of unknown function are found in Leptospira interrogans and in several gamma proteobacteria.

    \ ' '7432' 'IPR011461' '\

    This is a large family of short hypothetical proteins in Leptospira interrogans.

    \ ' '7434' 'IPR011463' '\

    This is a family of hypothetical proteins identified in Rhodopirellula baltica.

    \ ' '7435' 'IPR011464' '\

    This is a family of hypothetical proteins from Rhodopirellula baltica.

    \ ' '7436' 'IPR011465' '\

    This is a family of paralogous proteins found in Planctomycetacia, Betaproteobacteria and Methanomicrobia.

    \ ' '7437' 'IPR011466' '\

    These proteins from several diverse bacteria share a short conserved sequence towards their N termini.

    \ ' '7438' 'IPR011467' '\

    These hypothetical proteins from bacteria, such as Rhodopirellula baltica, Bacteroides thetaiotaomicron and Porphyromonas gingivalis, share a region of conserved sequence towards their N termini.

    \ ' '7439' 'IPR011468' '\

    This is a family of hypothetical proteins found in Leptospira interrogans.

    \ ' '7441' 'IPR011470' '\

    This small family is found in several undescribed proteins. The alignment is distinguished by the frequent occurrence of conserved glycine and aromatic residues.

    \ ' '7442' 'IPR011471' '\

    This is a family of hypothetical proteins found in Leptospira interrogans.

    \ ' '7443' 'IPR011522' '\

    This entry represents YkoF-related proteins. YkoF is involved in the hydroxymethyl pyrimidine (HMP) salvage pathway PUBMED:15451668. The domain is found in pairs in these proteins.

    \ ' '7445' 'IPR011473' '\

    This is a family of paralogous hypothetical proteins identified in Rhodopirellula baltica that also has members in Gloeobacter violaceus, Rhizobium meliloti and Agrobacterium tumefaciens.

    \ ' '7446' 'IPR011474' '\

    This is a family of short hypothetical proteins found in Rhodopirellula baltica.

    \ ' '7447' 'IPR011475' '\

    Several Rhodopirellula baltica proteins share this probable domain. Most of these proteins are predicted to be secreted or membrane-associated.

    \ ' '7448' 'IPR011512' '\

    This is a family of hypothetical proteins from Leptospira interrogans which share a highly conserved sequence motif at the C terminus.

    \ ' '7449' 'IPR011476' '\

    This is a family of hypothetical proteins found in Rhodopirellula baltica.

    \ ' '7450' 'IPR011475' '\

    Several Rhodopirellula baltica proteins share this probable domain. Most of these proteins are predicted to be secreted or membrane-associated.

    \ ' '7451' 'IPR011477' '\

    This sequence motif is highly conserved in several short hypothetical proteins from Rhodopirellula baltica. It is also associated with in .

    \ ' '7452' 'IPR011478' '\

    This entry represents a conserved region at the C-terminus of a family of cytochrome-like proteins in Rhodopirellula baltica and Solibacter usitatus. These proteins also contain , , and .

    \ ' '7453' 'IPR011479' '\

    This is a family of short hypothetical proteins found in Rhodopirellula baltica.

    \ ' '7454' 'IPR013036' '\

    A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also contain , , and .

    \ ' '7455' 'IPR013039' '\

    A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also contain , , and .

    \ ' '7456' 'IPR011480' '\

    This is a family of short hypothetical proteins found in Rhodopirellula baltica.

    \ ' '7457' 'IPR011481' '\

    These hypothetical proteins in Rhodopirellula baltica have a conserved C-terminal region.

    \ ' '7459' 'IPR013042' '\

    A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also contain , , and .

    \ ' '7460' 'IPR011483' '\

    This is a family of proteins found in Rhodopirellula baltica that are predicted to be secreted. Also, a member has been identified in Caulobacter crescentus (Caulobacter vibrioides) (). These proteins may be related to .

    \ ' '7462' 'IPR011509' '\

    This short repeat is found in the RtxA toxin family PUBMED:9927695.

    \ ' '7463' 'IPR011429' '\

    These proteins share a region of homology at their N terminus that contains the C-{CPWHF}-{CPWR}-C-H-{CFYW} motif typical of cytochromes C.

    \ ' '7464' 'IPR011504' '\

    This motif is found at the N terminus of several short hypothetical proteins in Rhodopirellula baltica and the predicted Arylsulphatase B.

    \ ' '7465' 'IPR013043' '\

    A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also contain , , and .

    \ ' '7466' 'IPR011517' '\

    The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes.

    With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into subregions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase; and sub-region 4.2, which seems to harbor a DNA-binding \'helix-turn-helix\' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors PUBMED:3092189, PUBMED:1597408. \

    \

    This entry represents a group of sigma factors that regulate extracytoplasmic function (ECF) PUBMED:12073657. Eubacteria display considerable genetic diversity between ECF-sigma factors, but all retain two features: the ability to respond to extracytoplasmic functions; and regulation by anti-sigma and anti-anti-sigma factors PUBMED:15374527. This family shows sequence similarity to and .

    \ ' '7467' 'IPR011521' '\

    These hypothetical proteins in Rhodopirellula baltica contain several repeats of a sequence whose core contains the residues YTV.

    \ ' '7468' 'IPR011507' '\

    These Rhodopirellula baltica proteins share a highly conserved sequence, centred around an invariant QPP motif, at their N termini. This motif may represent an export signal.

    \ ' '7469' 'IPR011485' '\

    This is a family of proteins for which no function is known yet.

    \ ' '7470' 'IPR011486' '\

    This is a family of proteins for which no function is known yet.

    \ ' '7471' 'IPR011487' '\

    This is a family of Rhodopirellula baltica hypothetical proteins of about 500 amino acids in length.

    \ ' '7472' 'IPR011503' '\

    This conserved sequence is centred around an invariant motif of PGAMP in several short hypothetical proteins from the planctomycete Rhodopirellula baltica. The motif also occurs twice in .

    \ ' '7473' 'IPR013091' '\

    A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown PUBMED:2288911, PUBMED:6334307, PUBMED:3534958, PUBMED:6607417, PUBMED:3282918 to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N-terminus of some EGF-like domains PUBMED:1527084. Calcium-binding may be crucial for numerous protein-protein interactions.

    \

    For human coagulation factor IX it has been shown PUBMED:7606779 that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) PUBMED:1527084. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site.

    \

    As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes PUBMED:1527084.

    \
    \
                                 +------------------+        +---------+\
                                 |                  |        |         |\
                   nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx\
                       |                  | \
                       +------------------+\
    \
    \'n\': negatively charged or polar residue [DEQN]\
    \'b\': possibly beta-hydroxylated residue [DN]\
    \'a\': aromatic amino acid\
    \'C\': cysteine, involved in disulphide bond\
    \'x\': any amino acid.\
    
    \ ' '7474' 'IPR011498' '\

    Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified PUBMED:8453663. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP PUBMED:8453663 and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin PUBMED:7593276, PUBMED:7822422, and in galactose oxidase from the fungus Dactylium dendroides PUBMED:8126718, PUBMED:2002850. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold PUBMED:8182749.

    \ \

    The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila PUBMED:7593276. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase PUBMED:8126718.

    \ \

    This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.

    \ ' '7475' 'IPR011510' '\

    This entry represents a second domain related to the SAM domain. Sterile alpha motif (SAM) domains are known to be involved in diverse protein-protein interactions, associating with both SAM-containing and non-SAM-containing protein pathways.

    \ ' '7476' 'IPR011497' '\

    This domain is usually indicative of serine protease inhibitors that belong to Merops inhibitor families I1, I2, I17 and I31. However, Kazal-like domains are also seen in the extracellular part of agrins, which are not known to be protease inhibitors. Kazal domains often occur in tandem arrays and have a central alpha-helix, a short two-stranded antiparallel beta-sheet and several disulphide bonds PUBMED:12051857, PUBMED:9242660, PUBMED:7489704. The amino-terminal segment of this domain binds to the active site of its target proteases, thus inhibiting their function.

    \ ' '7477' 'IPR011424' '\

    This short domain is rich in cysteines and histidines. The pattern of conservation is similar to that found in . C1 domains are protein kinase C-like zinc finger structures. Diacylglycerol (DAG) kinases (DGKs) have two or three commonly conserved cysteine-rich C1 domains PUBMED:18691010. DGKs modulate the balance between the two signaling lipids, DAG and phosphatidic acid (PA), by phosphorylating DAG to yield PA PUBMED:17512245.

    \ \

    The PKD (protein kinase D) family are novel DAG receptors. They have twin C1 domains, designated C1a and C1b, which bind DAG or phorbol esters. Individual C1 domains differ in ligand-binding activity and selectivity PUBMED:18076381.\

    \ ' '7478' 'IPR004044' '\

    The K homology (KH) domain was first identified in the human heterogeneous\ nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids\ that is present in a wide variety of quite diverse nucleic acid-binding\ proteins PUBMED:8036511. It has been shown to bind RNA PUBMED:9302998, PUBMED:10369774. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently PUBMED:8036511.

    \

    According to structural analyses PUBMED:9302998, PUBMED:10369774, PUBMED:11160884, KH domains can be separated into two groups. The first group, type-1, contains a beta-alpha-alpha-beta-beta-alpha structure, whereas in type-2 the last two beta-strands are located in the N-terminal part of the domain (alpha-beta-beta-alpha-alpha-beta). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-2 KH domain include the eukaryotic and prokaryotic S3 family of ribosomal proteins, and the prokaryotic GTP-binding protein, era.

    \ ' '7479' 'IPR011492' '\

    This is the Flavivirus DEAD domain. The domain is related to the DEAD/DEAH box helicase domain which is found in a large family of ATPases.

    \ ' '7480' 'IPR011417' '\

    AP180 is an endocytotic accessory protein that has been implicated in the formation of clathrin-coated pits. The domain is involved in phosphatidylinositol 4,5-bisphosphate binding and is a universal adaptor for nucleation of clathrin coats PUBMED:12740367, PUBMED:12742163.

    \ ' '7481' 'IPR011511' '\ SH3 (src Homology-3) domains are small protein modules containing \ approximately 50 amino acid residues PUBMED:15335710, PUBMED:11256992. They are found in a \ great variety of intracellular or\ membrane-associated proteins PUBMED:1639195, PUBMED:14731533, PUBMED:7531822 for example, in a variety of\ proteins with enzymatic activity, in adaptor\ proteins that lack catalytic sequences and in cytoskeletal\ proteins, such as fodrin and yeast actin binding protein ABP-1. \

    The SH3 domain has a characteristic fold which consists of five or six beta-strands arranged as two tightly packed anti-parallel beta sheets. The linker\ regions may contain short helices. The surface of the SH3 domain bears a flat, hydrophobic ligand-binding pocket which consists of three shallow grooves defined by conserved aromatic residues in which the ligand adopts an extended left-handed helical arrangement. The ligand binds with low affinity but this may be enhanced by multiple interactions.\ The region bound by the SH3 domain is in all cases proline-rich and contains PXXP as a core-conserved binding motif. The function of SH3 domains is not well understood, but they may mediate many diverse processes such as increasing local concentration of proteins, altering their subcellular location and mediating the assembly of large multiprotein complexes PUBMED:7953536.

    \

    This entry represents a variant of the SH3 domain.

    \ ' '7482' 'IPR003597' '\

    The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulphide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains: alpha, delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and three (in alpha, delta and gamma) or four (in epsilon and mu) constant domains (CH1 to CH4). Ig molecules are highly modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. The domains in Ig and Ig-like molecules are grouped into four types: V-set (variable), C1-set (constant-1), C2-set (constant-2) and I-set (intermediate) PUBMED:9417933. Structural studies have shown that these domains share a common core Greek-key beta-sandwich structure, with the types differing in the number of strands in the beta-sheets as well as in their sequence patterns PUBMED:15327963, PUBMED:11377196.

    \

    Immunoglobulin-like domains that are related in both sequence and structure can be found in several diverse protein families. Ig-like domains are involved in a variety of functions, including cell-cell recognition, cell-surface receptors, muscle structure and the immune system PUBMED:10698639.

    \ \

    This entry represents C1-set domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules PUBMED:9597133, PUBMED:12637770, and in various T-cell receptors.

    \ ' '7483' 'IPR006209' '\ A sequence of about thirty to forty amino-acid residues found in epidermal growth factor (EGF)\ has been shown PUBMED:3282918, PUBMED:6607417, PUBMED:2288911, PUBMED:6334307 to be present, in a more\ or less conserved form, in a large number of other, mostly animal, proteins. The list of proteins currently known to\ contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in\ what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in\ the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin\ G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide\ bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet.\ Subdomains between the conserved cysteines vary in length.\ ' '7484' 'IPR011514' '\

    This is a short domain found in bacterial type II/III secretory system proteins. The architecture of these proteins suggests that this family may be functionally analogous to .

    \ ' '7485' 'IPR010003' '\

    This entry represents a conserved region approximately 60 residues long within eukaryotic HepA-related protein (HARP). HARP exhibits single-stranded DNA-dependent ATPase activity, and is ubiquitously expressed in human and mouse tissues PUBMED:10857751. Family members may contain more than one copy of this region.

    \ ' '7486' 'IPR011647' '\ This motif occurs in multiple copies in Leptospira interrogans proteins.\ ' '7487' 'IPR011651' '\ This entry represents a region of conserved sequence at the N terminus of several Notch ligand proteins.\ ' '7489' 'IPR011630' '\ These proteins have no known function.\ ' '7490' 'IPR011662' '\

    This is a conserved region found at the N-terminal region of bacterial proteins involved in either protein secretion or the uptake of selective substrates, including:

    \ \

    \ ' '7491' 'IPR011652' '\ This entry represents an apparent variant of the repeat.\ ' '7492' 'IPR011657' '\ This entry consists of nucleoside transport proteins. is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane PUBMED:7775409. is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine. It also transports the anti-viral nucleoside analogues AZT and ddC PUBMED:8027026. This entry covers the C terminus of this family of transporters.\ ' '7493' 'IPR011638' '\

    The Gut family consists only of glucitol-specific permeases, but these occur both in Gram-negative and Gram-positive bacteria. Escherichia coli contains IIA protein, IIC protein and IIBC protein.

    This entry represents the C-terminal conserved region of the IIBC component.

    \ ' '7494' 'IPR011640' '\ Escherichia coli has an iron(II) transport system (feo) which may make an important contribution to the iron supply of the cell under anaerobic conditions PUBMED:8407793. FeoB has been identified as part of this transport system. FeoB is a large 700-800 amino acid integral membrane protein. The N terminus has been previously erroneously described as being ATP-binding PUBMED:8407793. Recent work shows that it is similar to eukaryotic G-proteins and that it is a GTPase PUBMED:12446835.\ ' '7496' 'IPR011655' '\ These proteins include those ascribed to Mycoplasma penetrans paralogue family 26 in PUBMED:12466555.\ ' '7497' 'IPR011631' '\ These proteins appear to be specific to Mycoplasma species. They are of unknown function.\ ' '7498' 'IPR011653' '\ This group of paralogous proteins identified in Mycoplasma penetrans includes homologues of lipoprotein p35 PUBMED:12466555.\ ' '7499' 'IPR011639' '\

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693.

    \

    \ \

    This entry represents the restriction endonuclease Eco57I, which recognises the asymmetric DNA sequence 5\'-CTGAAG and has both restriction (DNA cleavage a short distance away from the recognition site) and modification (methylation) activities residing in a single polypeptide chain PUBMED:15134658, PUBMED:11124947. It cleaves 22 bases after C-1. As a methylase, it causes specific methylation on A-5 on one strand, the other strand being methylated by the Eco57IB methylase. Homologues of the Escherichia coli Eco57I restriction endonuclease are found in several phylogenetically diverse bacteria.

    \ ' '7500' 'IPR011642' '\ This region of the nucleoside transporter proteins is responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins PUBMED:10455109. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity PUBMED:12781516. This family may represent the pore and gate, with a wide potential range of specificity, hence its name, Gate.\ ' '7501' 'IPR011632' '\ This repeat is found in a small number of proteins and is apparently limited to Coxiella burnetii.\ ' '7502' 'IPR011699' '\ These proteins share some similarity with members of the Major Facilitator Superfamily (MFS).\ ' '7503' 'IPR011633' '\ These proteins have no known function.\ ' '7505' 'IPR011628' '\

    This conserved region is found in a group of haemagglutinins and peptidases that, in Porphyromonas gingivalis (Bacteroides gingivalis), form components of the major extracellular virulence complex RgpA-Kgp, a mixture of proteinases and adhesins PUBMED:10858222. These domains are cleaved from the original polyprotein and form part of the adhesins PUBMED:9245829.

    \ ' '7506' 'IPR011659' '\

    WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase PUBMED:11814058, PUBMED:10322433. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.

    \ This region appears to be related to the WD40 repeat. This model is likely to miss copies within a sequence.\ ' '7507' 'IPR009048' '\

    This entry represents the receptor-binding domain (RBD) of alpha-2-macroglobulin proteins. The RBD is located at the C-terminus and has an immunoglobulin-like fold consisting of a sandwich of nine strands in two sheets with a Greek-key topology PUBMED:11106161, PUBMED:9634697.

    \

    The alpha-macroglobulin (aM) family of proteins includes protease inhibitors PUBMED:2473064, typified by the human tetrameric a2-macroglobulin (a2M); they belong to the MEROPS proteinase inhibitor family I39, clan IL. These protease inhibitors share several defining properties, which include (i) the ability to inhibit proteases from all catalytic classes, (ii) the presence of a \'bait region\' and a thiol ester, (iii) a similar protease inhibitory mechanism and (iv) the inactivation of the inhibitory capacity by reaction of the thiol ester with small primary amines. aM protease inhibitors inhibit by steric hindrance PUBMED:2472396. The mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates. Two additional events occur as a consequence of bait region cleavage, namely (i) the beta-cysteinyl-gamma-glutamyl thiol ester becomes highly reactive and (ii) a major conformational change exposes a conserved COOH-terminal receptor binding domain (RBD) PUBMED:2469470. RBD exposure allows the aM protease complex to bind to clearance receptors and be removed from circulation PUBMED:2430968. Tetrameric, dimeric, and, more recently, monomeric aM protease inhibitors have been identified PUBMED:9914899, PUBMED:10426429.

    \ ' '7508' 'IPR011626' '\

    This domain covers the complement component region of the alpha-2-macroglobulin family.

    \

    The alpha-macroglobulin (aM) family of proteins includes protease inhibitors PUBMED:2473064, typified by the human tetrameric a2-macroglobulin (a2M); they belong to the MEROPS proteinase inhibitor family I39, clan IL. These protease inhibitors share several defining properties, which include (i) the ability to inhibit proteases from all catalytic classes, (ii) the presence of a \'bait region\' and a thiol ester, (iii) a similar protease inhibitory\ mechanism and (iv) the inactivation of the inhibitory capacity by reaction of the thiol ester with small primary amines. \ aM protease inhibitors inhibit by steric hindrance PUBMED:2472396. The mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates. Two additional events occur as a consequence of bait region cleavage, namely (i) the beta-cysteinyl-gamma-glutamyl thiol ester becomes highly reactive and (ii) a major conformational change exposes a conserved COOH-terminal receptor binding domain (RBD) PUBMED:2469470. RBD exposure allows the aM protease complex to bind to clearance receptors and be removed from circulation PUBMED:2430968. Tetrameric, dimeric, and, more recently, monomeric aM protease inhibitors have been identified PUBMED:9914899, PUBMED:10426429.

    \ \ ' '7509' 'IPR013098' '\

    The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulphide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains: alpha, delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and three (in alpha, delta and gamma) or four (in epsilon and mu) constant domains (CH1 to CH4). Ig molecules are highly modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. The domains in Ig and Ig-like molecules are grouped into four types: V-set (variable), C1-set (constant-1), C2-set (constant-2) and I-set (intermediate) PUBMED:9417933. Structural studies have shown that these domains share a common core Greek-key beta-sandwich structure, with the types differing in the number of strands in the beta-sheets as well as in their sequence patterns PUBMED:15327963, PUBMED:11377196.

    \

    Immunoglobulin-like domains that are related in both sequence and structure can be found in several diverse protein families. Ig-like domains are involved in a variety of functions, including cell-cell recognition, cell-surface receptors, muscle structure and the immune system PUBMED:10698639.

    \ \

    This entry represents I-set domains, which are found in several cell adhesion molecules, including vascular (VCAM), intercellular (ICAM), neural (NCAM) and mucosal addressin (MADCAM) cell adhesion molecules, as well as junction adhesion molecules (JAM). I-set domains are also present in several other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1 PUBMED:10830169, and the signalling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis PUBMED:16369100.

    \ ' '7510' 'IPR011636' '\ Thiosulphate:quinone oxidoreductase (TQO) catalyses one of the early steps in elemental sulphur oxidation. A novel TQO enzyme was purified from the thermo-acidophilic archaeon Acidianus ambivalens and shown to consist of a large subunit (DoxD) and a smaller subunit (DoxA). The DoxD- and DoxA-like two subunits are fused together in a single polypeptide in .\ ' '7511' 'IPR011637' '\ These proteins appear to have some sequence similarity with but their function is unknown PUBMED:15306018.\ ' '7512' 'IPR011661' '\

    The crystal structure of the sulphur oxygenase/reductase (SOR) of the thermo-acidophilic archaeon Acidianus ambivalens has been determined to 1.7-A resolution PUBMED:16484493. Twenty-four monomers form a large hollow sphere enclosing a positively charged nanocompartment. Apolar channels provide access for linear sulphur species. A cysteine persulphide and a low-potential mononuclear non-heme iron site ligated by a 2-His-1-carboxylate facial triad in a pocket of each subunit constitute the active sites, accessible from the inside of the sphere. The iron is likely the site of both sulphur oxidation and sulphur reduction.

    \ \ \ \

    At 85 degrees C in vitro, elemental sulphur is oxidised to sulphite, thiosulphate and hydrogen sulphide with no external cofactors needed. The proposed equation is: 4S + O2 + 4 H2O ---> 2 HSO3- + 2 H2S + 2 H+.

    \ ' '7513' 'IPR011629' '\

    Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adocobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase PUBMED:12869542. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase PUBMED:17163662.

    \ \

    There are at least two distinct cobalamin biosynthetic pathways in bacteria PUBMED:11153269:

    \ \

    Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways) PUBMED:11215515. There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.

    \ \

    CobW proteins are generally found proximal to the trimeric cobaltochelatase subunit CobN, which is essential for vitamin B12 (cobalamin) biosynthesis PUBMED:12869542. They contain a P-loop nucleotide-binding motif in the N-terminal domain and a histidine-rich region in the C-terminal portion, suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. CobW might be involved in cobalt reduction leading to cobalt(I) corrinoids.

    \

    This entry represents the C-terminal domain found in CobW, as well as in P47K (), a Pseudomonas chlororaphis protein needed for nitrile hydratase expression PUBMED:7765511.

    \ \ ' '7514' 'IPR011656' '\ NOTCH signalling plays a fundamental role during a great number of developmental processes in multicellular animals PUBMED:10221902. NOD and NODP represent a region present in many NOTCH proteins and NOTCH homologs in multiple species such as NOTCH2 and NOTCH3, LIN12, SC1 and TAN1. The role of the NOD and NODP domains remains to be elucidated.\ ' '7515' 'IPR011698' '\ This group of enzymes was suggested to be related to the MinD family of ATPases involved in regulation of cell division in bacteria and archaea PUBMED:10966576. Further sequence analysis suggests a model for the interaction of CobB and CobQ with their respective substrates PUBMED:10966576. CobB and CobQ were also found to contain unusual Triad family (class I) glutamine amidotransferase domains with conserved Cys and His residues, but lacking the Glu residue of the catalytic triad PUBMED:10966576. \ ' '7516' 'IPR013106' '\

    The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulphide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains: alpha, delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and three (in alpha, delta and gamma) or four (in epsilon and mu) constant domains (CH1 to CH4). Ig molecules are highly modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. The domains in Ig and Ig-like molecules are grouped into four types: V-set (variable; ), C1-set (constant-1; ), C2-set (constant-2; ) and I-set (intermediate; ) PUBMED:9417933. Structural studies have shown that these domains share a common core Greek-key beta-sandwich structure, with the types differing in the number of strands in the beta-sheets as well as in their sequence patterns PUBMED:15327963, PUBMED:11377196.

    \

    Immunoglobulin-like domains that are related in both sequence and structure can be found in several diverse protein families. Ig-like domains are involved in a variety of functions, including cell-cell recognition, cell-surface receptors, muscle structure and the immune system PUBMED:10698639.

    \ \

    This entry represents the V-set domains, which are Ig-like domains resembling the antibody variable domain. V-set domains are found in diverse protein families, including immunoglobulin light and heavy chains; in several T-cell receptors such as CD2 (Cluster of Differentiation 2), CD4, CD80, and CD86; in myelin membrane adhesion molecules; in junction adhesion molecules (JAM); in tyrosine-protein kinase receptors; and in the programmed cell death protein 1 (PD1).

    \ ' '7518' 'IPR011648' '\

    KaiA is a component of the kaiABC clock protein complex, which constitutes the main circadian regulator in cyanobacteria. The kaiABC complex may act as a promoter-nonspecific transcription regulator that represses transcription, possibly by acting on the state of chromosome compaction. In the complex, KaiA enhances the phosphorylation status of kaiC. In contrast, the presence of kaiB in the complex decreases the phosphorylation status of kaiC, suggesting that kaiB acts by antagonizing the interaction between kaiA and kaiC. The activity of KaiA activates kaiBC expression, while KaiC represses it. The overall fold of the KaiA monomer is that of a four-helix bundle, which forms a dimer in the known structure PUBMED:15071498. KaiA functions as a homodimer. Each monomer is composed of three functional domains: the N-terminal amplitude-amplifier domain, the central period-adjuster domain and the C-terminal clock-oscillator domain. The N-terminal domain of KaiA, from cyanobacteria, acts as a pseudo-receiver domain, but lacks the conserved aspartyl residue required for phosphotransfer in response regulators PUBMED:12438647. The C-terminal domain is responsible for dimer formation, binding to KaiC, enhancing KaiC phosphorylation and generating the circadian oscillations PUBMED:15170179. The KaiA protein from Anabaena sp. (strain PCC 7120) lacks the N-terminal CheY-like domain.

    \ ' '7519' 'IPR011649' '\ The cyanobacterial clock proteins KaiA and KaiB are proposed as regulators of the circadian rhythm in cyanobacteria. Mutations in both proteins have been reported to alter or abolish circadian rhythmicity. KaiB adopts an alpha-beta meander motif and is found to be a dimer PUBMED:15071498.\ ' '7520' 'IPR011701' '\ Among the different families of transporter only two occur ubiquitously in all classifications of organisms. These are the ATP-Binding Cassette (ABC) superfamily and the Major Facilitator Superfamily (MFS). The MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients PUBMED:9529885, PUBMED:9868370.\ ' '7521' 'IPR011658' '\

    The PA14 domain forms an insert in bacterial beta-glucosidases, other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins and bacterial toxins, including anthrax protective antigen (PA). The domain also occurs in a Dictyostelium pre-spore cell-inducing factor Psi and in fibrocystin, the mammalian protein whose mutation leads to polycystic kidney and hepatic disease. The crystal structure of PA shows that this domain (named PA14 after its location in the PA20 pro-peptide) has a beta-barrel structure. The PA14 domain sequence suggests a binding function, rather than a catalytic role. The PA14 domain distribution is compatible with carbohydrate binding PUBMED:15236739.

    \ ' '7522' 'IPR011650' '\ This domain consists of 4 beta strands and two alpha helices which make up the dimerisation surface of members of the MEROPS peptidase family M20 PUBMED:9083113. This family includes a range of zinc exopeptidases: carboxypeptidases, dipeptidases and specialised aminopeptidases PUBMED:7674922.\ ' '7523' 'IPR011643' '\ In Chlamydomonas reinhardtii, the gene encoding is induced by iron deficiency PUBMED:12012236. Its product complements et3fet4 or yeast ftr1 mutation, enabling assimilation of iron. In green algae, this protein is secreted, and in Chlorococcum littorale it is periplasmic.\ ' '7524' 'IPR011646' '\ The KAP (after Kidins220/ARMS and PifA) family of predicted NTPases are sporadically distributed across a wide phylogenetic range in bacteria and in animals. Many of the prokaryotic KAP NTPases are encoded in plasmids and tend to undergo disruption to form pseudogenes. A unique feature of all eukaryotic and certain bacterial KAP NTPases is the presence of two or four transmembrane helices inserted into the P-loop NTPase domain. These transmembrane helices anchor KAP NTPases in the membrane such that the P-loop domain is located on the intracellular side PUBMED:15128444.\ ' '7525' 'IPR011620' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems consist of a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347, PUBMED:11406410. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.

    \

    HKs can be roughly divided into two classes: orthodox and hybrid kinases PUBMED:8029829, PUBMED:1482126. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK PUBMED:10966457. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain.

    \

    This entry represents the transmembrane region of the 5TM-Lyt (5TM Receptors of the LytS-YhcK type) histidine kinase PUBMED:12914674. The two-component regulatory system LytS/LytT probably regulates genes involved in cell wall metabolism.

    \ ' '7526' 'IPR011623' '\ This entry represents the transmembrane region of the 7TM-DISM (7TM Receptors with Diverse Intracellular Signalling Modules) PUBMED:12914674.\ ' '7527' 'IPR011622' '\ This entry represents one of two distinct types of extracellular domain found in the 7TM-DISM (7TM Receptors with Diverse Intracellular Signalling Modules) bacterial transmembrane proteins PUBMED:12914674. It is possible that this domain adopts a jelly roll fold and acts as a receptor for carbohydrates and their derivatives PUBMED:12914674.\ ' '7528' 'IPR011624' '\ This entry represents the extracellular domain of the 7TM-HD (7TM Receptors with HD hydrolase) PUBMED:12914674.\ ' '7529' 'IPR011621' '\ These bacterial 7TM receptor proteins have an intracellular domain . This entry corresponds to the 7 helix transmembrane domain. These proteins also contain an N-terminal extracellular domain.\ ' '7530' 'IPR011641' '\

    This domain contains 5 conserved cysteine residues, that are likely to\ participate in disulphide bonds. They are found in a wide variety of\ extracellular proteins. Their function is currently unknown.

    \ ' '7531' 'IPR011644' '\ The HNOB (Haem NO Binding) domain is a predominantly alpha-helical domain and binds haem via a covalent linkage to histidine. The HNOB domain is predicted to function as a haem-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals.\ ' '7532' 'IPR011645' '\ The HNOBA (Haem NO Binding Associated) domain is found associated with the HNOB domain and in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a haem-dependent sensor for gaseous ligands, and transduce diverse downstream signals in both bacteria and animals.\ ' '7533' 'IPR011663' '\ The UbiC transcription regulator-associated (UTRA) domain is a conserved ligand-binding domain that has a similar fold to PUBMED:12757941. It is believed to modulate activity of bacterial transcription factors in response to binding small molecules PUBMED:12757941.\ ' '7534' 'IPR011625' '\

    This is a domain of the alpha-2-macroglobulin family.

    \

    The alpha-macroglobulin (aM) family of proteins includes protease inhibitors PUBMED:2473064, typified by the human tetrameric a2-macroglobulin (a2M); they belong to the MEROPS proteinase inhibitor family I39, clan IL. These protease inhibitors share several defining properties, which include (i) the ability to inhibit proteases from all catalytic classes, (ii) the presence of a \'bait region\' and a thiol ester, (iii) a similar protease inhibitory\ mechanism and (iv) the inactivation of the inhibitory capacity by reaction of the thiol ester with small primary amines. aM protease inhibitors inhibit by steric hindrance PUBMED:2472396. The mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates. Two additional events occur as a consequence of bait region cleavage, namely (i) the beta-cysteinyl-gamma-glutamyl thiol ester becomes highly reactive and (ii) a major conformational change exposes a conserved COOH-terminal receptor binding domain PUBMED:2469470 (RBD). RBD exposure allows the aM protease complex to bind to clearance receptors and be removed from circulation PUBMED:2430968. Tetrameric, dimeric, and, more recently, monomeric aM protease inhibitors have been identified PUBMED:9914899, PUBMED:10426429.

    \ \ ' '7535' 'IPR011660' '\ This entry represents the Rv0623 ()-like group of transcription factors associated with the PSK operon PUBMED:14659018.\ ' '7536' 'IPR011635' '\ The APHP (acidic peptide-dependent hydrolases/peptidase) domain is found in a variety of different proteins.\ ' '7537' 'IPR011715' '\

    This region contains a probable site of ubiquitination that ensures rapid degradation of tyrosine aminotransferase in rats. The half life of the enzyme in vivo is about 2-4 hours. The enzyme contains at least 2 phosphorylation sites including CAPK at Ser29 and, at the other end of the protein, a casein kinase II site at S*QEECDK. This region of TAT is probably primarily related to regulatory events. Most other transaminases are much more stable and are not phosphorylated.

    \ ' '7538' 'IPR011705' '\

    This domain is found associated with () and (). BTB (broad-complex, tramtrack and bric a brac) is a Kelch related domain, also known as the POZ domain PUBMED:16207353. BTB proteins are divided into subgroups depending on what domain lies at the C-terminus. Despite the divergence in sequences, the BTB fold is highly conserved.

    \ \

    BTB-Kelch proteins have Kelch repeats that form a beta-propeller that can interact with actin filaments PUBMED:15544948. BTB and C-terminal Kelch (BACK) together constitute a novel conserved domain, which is thought to have a possible role in substrate orientation in Cullin3-based E3 ligase complexes.

    \ \

    Four domains, namely the BTB domain, a kelch domain, a BACK domain, and an intervening region (IVR), make up the aryl hydrocarbon receptor (AHR), a ligand-activated transcription factor PUBMED:16582008. This entry represents the domain associated with BTB and Kelch.

    \ ' '7539' 'IPR011695' '\

    The PEST motif is found in one or more copies in Tash AT-hook proteins from Theileria annulata. Tash proteins are transported to the host nucleus and are thought to be involved in pathogenesis PUBMED:11683409. The PEST motif is often found in conjunction with the (), whose function is unknown. These repeats may be part of the PEST motif (a signal for rapid proteolytic degradation) PUBMED:15075278, though this is not proven. This motif is also found in other T. annulata proteins, which have no other known domains.

    \ ' '7540' 'IPR011714' '\

    This repeat is found in some Plasmodium and Theileria proteins.

    \ ' '7541' 'IPR010991' '\

    The p53 protein is a tetrameric transcription factor that plays a central role in the prevention of neoplastic transformation PUBMED:7878469. Oligomerization appears to be essential for the tumour suppressing activity of p53. p53 can be divided into different functional domains: an N-terminal transactivation domain, a proline-rich domain, a DNA-binding domain (), a tetramerisation domain and a C-terminal regulatory region. The tetramerisation domain of human p53 extends from residues 325 to 356, and has a 4-helical bundle fold. The tetramerisation domain is essential for DNA binding, protein-protein interactions, post-translational modifications, and p53 degradation PUBMED:11420672.

    \ \ ' '7542' 'IPR009087' '\

    Rab geranylgeranyltransferase (RabGGT) catalyses the transfer of geranylgeranyl groups to the C-terminal cysteine residues of Rab proteins, Ras-related small GTPases that function in intracellular vesicular transport PUBMED:10745007. RabGGT is only able to prenylate Rab when it is complexed to the Rab escort protein (REP), after which REP remains bound to the prenylated Rab and delivers it to its target membrane. RabGGT is a member of the protein prenyltransferase family (), all of which are heterodimers consisting of alpha and beta subunits. RabGGT is distinct from other members of the prenyltransferase family because of the presence of an Ig-like insert domain in the alpha subunit that is folded into an eight-stranded sandwich between two helices in the helical domain.

    \ \ ' '7543' 'IPR011692' '\

    This family of plant proteins has been implicated in nodule development PUBMED:8634476 in the legume Medicago truncatula (Barrel medic). MtN-19 was shown by Northern blot to be induced during nodulation PUBMED:8634476. The molecular function of these proteins is unknown.

    \ ' '7544' 'IPR011666' '\ This domain is found at the N terminus of several eukaryotic RNA processing proteins (e.g ).\ ' '7545' 'IPR001245' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases PUBMED:15320712.

    \ \

    Tyrosine phosphorylating activity was originally detected in two viral transforming \ proteins PUBMED:, but many retroviral transforming \ proteins and their cellular counterparts have since been shown to possess such activity. \ The growth factor receptors, which are activated by ligand binding, and the\ insulin-related peptide receptor, are also family members.

    \ ' '7546' 'IPR012910' '\

    In Escherichia coli the TonB protein interacts with outer membrane receptor proteins that carry out high-affinity binding and energy-dependent uptake of specific substrates into the periplasmic space PUBMED:14499604. These substrates are either poorly permeable through the porin channels or are encountered at very low concentrations. In the absence of TonB, these receptors bind their substrates but do not carry out active transport. TonB-dependent regulatory systems consist of six components: a specialised outer membrane-localized TonB-dependent receptor (TonB-dependent transducer) that interacts with its energizing TonB-ExbBD protein complex, a cytoplasmic membrane-localized anti-sigma factor and an extracytoplasmic function (ECF)-subfamily sigma factor PUBMED:15993072. The TonB complex senses signals from outside the bacterial cell and transmits them via two membranes into the cytoplasm, leading to transcriptional activation of target genes. The proteins that are currently known or presumed to interact with TonB include BtuB PUBMED:12652322, CirA, FatA, FcuT, FecA PUBMED:11872840, FhuA PUBMED:9865695, FhuE, FepA PUBMED:9886293, FptA, HemR, IrgA, IutA, PfeA, PupA and Tbp1. The TonB protein also interacts with some colicins. Most of these proteins contain a short conserved region at their N-terminus PUBMED:12957833.

    \

    This entry represents the plug domain, which has been shown to be an independently folding subunit of the TonB-dependent receptors PUBMED:15111112. It acts as the channel gate, blocking the pore until the channel is bound by a ligand. At this point it undergoes conformational changes and opens the channel.

    \ ' '7547' 'IPR011700' '\

    The basic-leucine zipper (bZIP) transcription factors PUBMED:7780801, PUBMED: of eukaryotes are proteins that contain a basic region mediating sequence-specific DNA-binding, followed by a leucine zipper region (see ), which is required for dimerization.

    \ ' '7548' 'IPR011709' '\

    This domain is found towards the C terminus of the DEAD-box helicases (). In these helicases it is, apparently, always found in association with . There do seem to be a couple of instances where it occurs by itself - e.g. .

    \ ' '7549' 'IPR011710' '\

    Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer PUBMED:15261670. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins PUBMED:14690497. For example, the coatomer COP1 (coat protein complex 1) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi PUBMED:11208122. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes PUBMED:17041781. Activated small guanosine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta\', gamma, delta, epsilon and zeta subunits.

    \

    This entry represents the C-terminal domain of the beta subunit from coatomer proteins (Beta-coat proteins). The C-terminal domain probably adapts the function of the N-terminal domain. Coatomer protein complex I (COPI)-coated vesicles are involved in transport between the endoplasmic reticulum and the Golgi but also participate in transport from early to late endosomes within the endocytic pathway PUBMED:12893528.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin PUBMED:.

    \ ' '7550' 'IPR013105' '\

    The tetratricopeptide repeat (TPR) is a structural motif present in a wide range of proteins PUBMED:7667876, PUBMED:9482716, PUBMED:1882418. It mediates protein-protein interactions and the assembly of multiprotein complexes PUBMED:14659697. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding.

    \

    This repeat includes outlying Tetratricopeptide-like repeats (TPR) that are not matched by .

    \ ' '7551' 'IPR011716' '\

    This entry includes tetratricopeptide-like repeats found in the LcrH/SycD-like chaperones PUBMED:12799000.

    \ ' '7552' 'IPR011717' '\

    This entry includes tetratricopeptide-like repeats not detected by the , and models. The tetratricopeptide repeat (TPR) motif is a protein-protein interaction module found in multiple copies in a number of functionally different proteins that facilitates specific interactions with a partner protein(s) PUBMED:10517866.

    \ ' '7553' 'IPR011697' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    These peptidases have gamma-glutamyl hydrolase activity; that is, they catalyse the cleavage of the gamma-glutamyl bond in poly-gamma-glutamyl substrates. They are structurally related to , but contain extensions in four loops and at the C terminus PUBMED:11953431. They belong to MEROPS peptidase family C26 (gamma-glutamyl hydrolase family), clan PC. The majority of the sequences are classified as unassigned peptidases.

    \ ' '7554' 'IPR013101' '\

    Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively PUBMED:10357231). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.

    \

    Reaction of amidotransferase domain:

    \ \ \

    Reactions of FMN-binding domain:

    \ \ \

    This entry includes some LRRs that fail to be detected by PUBMED:7817399, PUBMED:8264799.

    \ ' '7555' 'IPR013093' '\

    ATPases Associated to a variety of cellular Activities (AAA) are a family distinguished by a highly conserved module of 230 amino acids PUBMED:7646486. The highly conserved nature of this module across taxa suggests that it has a key cellular role. Members of the family are involved in diverse cellular functions including gene expression, peroxisome assembly and vesicle mediated transport. Although the role of ATPase AAA-2 domain is not, as yet, clear, the AAA+ superfamily of proteins to which the AAA ATPases belong has a chaperone-like function in the assembly, operation or disassembly of proteins PUBMED:9927482. Some of these ATPases function as a chaperone subunit of a proteasome-like degradation complex. This ATPase family includes some proteins not detected by .

    \ ' '7556' 'IPR011703' '\

    This entry includes some of the AAA proteins not detected by the model.

    \

    AAA ATPases form a large, functionally diverse protein family belonging to the AAA+ superfamily of ring-shaped P-loop NTPases, which exert their activity through the energy-dependent unfolding of macromolecules. AAA ATPases contain a P-loop NTPase domain, which is the most abundant class of NTP-binding protein fold, and is found throughout all kingdoms of life PUBMED:15037234. P-loop NTPase domains act to hydrolyse the beta-gamma phosphate bond of bound nucleoside triphosphate. There are two classes of P-loop domains: the KG (kinase-GTPase) division, and the ASCE division, the latter including the AAA+ group as well as several other ATPases.

    \

    There are at least six major clades of AAA domains (metalloproteases, meiotic proteins, D1 and D2 domains of ATPases with two AAA domains, proteasome subunits, and BCS1), as well as several minor clades, some of which consist of hypothetical proteins PUBMED:15037233. The domain organisation of AAA ATPases consists of a non-ATPase N-terminal domain that acts in substrate recognition, followed by one or two AAA domains (D1 and D2), one of which may be degenerate.

    \ ' '7557' 'IPR013103' '\

    A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This entry includes reverse transcriptases not recognised by PUBMED:1698615.

    \ ' '7558' 'IPR011704' '\

    The ATPases Associated to a variety of cellular Activities (AAA) are a family distinguished by a highly conserved module of 230 amino acids PUBMED:7646486. The highly conserved nature of this module across taxa suggests that it has a key cellular role. Members of the family are involved in diverse cellular functions including gene expression, peroxisome assembly and vesicle mediated transport. Although the role of ATPase AAA-5 domain is not, as yet, clear, the AAA+ superfamily of proteins to which the AAA ATPases belong has a chaperone-like function in the assembly, operation or disassembly of proteins PUBMED:9927482. This ATPase domain includes some proteins not detected by the model.

    \ ' '7559' 'IPR011711' '\

    Many bacterial transcription regulation proteins bind DNA through a helix-turn-helix (HTH) motif, which can be classified into subfamilies on the basis of sequence similarities. The HTH GntR family has many members distributed among diverse bacterial groups that regulate various biological processes. It was named GntR after the Bacillus subtilis repressor of the gluconate operon PUBMED:2060763. In general, these proteins contain a DNA-binding HTH domain at the N terminus, and an effector binding or oligomerisation domain at the C terminus. The winged-helix DNA-binding domain is well conserved in structure for the whole of the GntR family, and is similar in structure to other transcriptional regulator families. The C-terminal effector-binding and oligomerisation domains are more variable and are consequently used to define the subfamilies. Based on the sequence and structure of the C-terminal domains, the GntR family can be divided into four major groups, as represented by FadR, HutC, MocR and YtrA, as well as some minor groups such as those represented by AraR and PlmA PUBMED:11756427.

    \

    This entry represents the C-terminal ligand binding domain of many members of the GntR family. This domain probably binds to a range of effector molecules that regulate the transcription of genes through the action of the N-terminal DNA-binding domain. This domain is found in and that are regulators of sugar biosynthesis operons.

    \ ' '7560' 'IPR011712' '\

    Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions PUBMED:16176121. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk PUBMED:18076326. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more PUBMED:12372152. Two-component systems comprise a sensor histidine kinase (HK) and its cognate response regulator (RR) PUBMED:10966457. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.

    \

    A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response PUBMED:11934609, PUBMED:11489844.

    \ \

    Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms PUBMED:8868347, PUBMED:11406410. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation PUBMED:10426948, and CheA, which plays a central role in the chemotaxis system PUBMED:9989504. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water PUBMED:11145881. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.

    \

    HKs can be roughly divided into two classes: orthodox and hybrid kinases PUBMED:8029829, PUBMED:1482126. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK PUBMED:10966457. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain.

    \

    This entry represents the dimerisation and phosphoacceptor domain of a sub-family of histidine kinases. It shares sequence similarity with and .

    \ ' '7561' 'IPR011706' '\

    Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper (II) ion is reduced to copper (I). In blue copper proteins such as cupredoxin, the copper (I) ion form is stabilised by a constrained His2Cys coordination environment.

    \

    Multicopper oxidases oxidise their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre; dioxygen binds to the trinuclear centre and, following the transfer of four electrons, is reduced to two molecules of water PUBMED:16234932. There are three spectroscopically different copper centres found in multicopper oxidases: type 1 (or blue), type 2 (or normal) and type 3 (or coupled binuclear) PUBMED:2404764, PUBMED:1995346. Multicopper oxidases consist of 2, 3 or 6 of these homologous domains, which also share homology to the cupredoxins azurin and plastocyanin. Structurally, these domains consist of a cupredoxin-like fold, a beta-sandwich consisting of 7 strands in 2 beta-sheets, arranged in a Greek-key beta-barrel PUBMED:11867755. Multicopper oxidases include:

    \

    \

    In addition to the above enzymes there are a number of other proteins that are similar to the multi-copper oxidases in terms of structure and sequence, some of which have lost the ability to bind copper. These include: copper resistance protein A (copA) from a plasmid in Pseudomonas syringae; domain A of (non-copper binding) blood coagulation factors V (Fa V) and VIII (Fa VIII) PUBMED:3052293; yeast FET3 required for ferrous iron uptake PUBMED:8293473; yeast hypothetical protein YFL041w; and the fission yeast homologue SpAC1F7.08.

    \ \

    This entry represents multicopper oxidase type 2 domains.

    \ ' '7562' 'IPR011707' '\

    Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper (II) ion is reduced to copper (I). In blue copper proteins such as cupredoxin, the copper (I) ion form is stabilised by a constrained His2Cys coordination environment.

    \

    Multicopper oxidases oxidise their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre; dioxygen binds to the trinuclear centre and, following the transfer of four electrons, is reduced to two molecules of water PUBMED:16234932. There are three spectroscopically different copper centres found in multicopper oxidases: type 1 (or blue), type 2 (or normal) and type 3 (or coupled binuclear) PUBMED:2404764, PUBMED:1995346. Multicopper oxidases consist of 2, 3 or 6 of these homologous domains, which also share homology to the cupredoxins azurin and plastocyanin. Structurally, these domains consist of a cupredoxin-like fold, a beta-sandwich consisting of 7 strands in 2 beta-sheets, arranged in a Greek-key beta-barrel PUBMED:11867755. Multicopper oxidases include:

    \

    \

    In addition to the above enzymes there are a number of other proteins that are similar to the multi-copper oxidases in terms of structure and sequence, some of which have lost the ability to bind copper. These include: copper resistance protein A (copA) from a plasmid in Pseudomonas syringae; domain A of (non-copper binding) blood coagulation factors V (Fa V) and VIII (Fa VIII) PUBMED:3052293; yeast FET3 required for ferrous iron uptake PUBMED:8293473; yeast hypothetical protein YFL041w; and the fission yeast homologue SpAC1F7.08.

    \ \

    This entry represents multicopper oxidase type 3 (or coupled binuclear) domains.

    \ ' '7563' 'IPR011708' '\

    This is a conserved region found in the DNA polymerase III alpha subunit. DNA polymerase III is a complex, multichain enzyme responsible for most of the replicative synthesis in bacteria. This DNA polymerase also exhibits 3\' to 5\' exonuclease activity. The alpha chain is the DNA polymerase.

    \ ' '7564' 'IPR006527' '\

    This domain occurs in a diverse superfamily of genes in plants. Most examples are found C-terminal to an F-box, a 60 amino acid motif involved in ubiquitination of target proteins to mark them for degradation. Two-hybrid experiments support the idea that most members are interchangeable F-box subunits of SCF E3 complexes PUBMED:12169662. Some members have two copies of this domain.

    \ ' '7565' 'IPR012885' '\

    This domain is found towards the C terminus of proteins that contain an F-box, suggesting that they are effectors linked with ubiquitination.

    \ ' '7566' 'IPR008243' '\

    Chorismate mutase (CM) catalyses the reaction at the branch point of the biosynthetic pathway leading to the three aromatic amino acids, phenylalanine, tryptophan and tyrosine (chorismic acid is the last common intermediate, and CM leads to the L-phenylalanine/L-tyrosine branch). It is part of the shikimate pathway, which is present only in bacteria, fungi and plants.

    \ \

    This entry represents a family of monofunctional (non-fused) chorismate mutases from Gram-positive bacteria (Firmicutes) and cyanobacteria. Trusted members of the family are found in operons with other enzymes of the chorismate pathways, both up- and downstream of CM (Listeria, Bacillus, Oceanobacillus) or are the sole CM in the genome where the other members of the chorismate pathways are found elsewhere in the genome (Nostoc, Thermosynechococcus). They are monofunctional, homotrimeric, nonallosteric enzymes and are not regulated by the end-product aromatic amino acids.

    \ \

    The three types of CM are AroQ class, Prokaryotic type (e.g., amongst others); AroQ class, Eukaryotic type (); and AroH class. They fall into two structural folds (AroQ class and AroH class) which are completely unrelated PUBMED:11528003. The two types of the AroQ structural class (the Escherichia coli CM dimer and the yeast CM monomer) can be structurally superimposed, and the topology of the four-helix bundle forming the active site is conserved PUBMED:11528003.

    For additional information please see PUBMED:8046752, PUBMED:8061004, PUBMED:2105742, PUBMED:8378335, PUBMED:10818343, PUBMED:11450855, PUBMED:9383421.

    \ ' '7567' 'IPR014781' '\

    Anthrax toxin is a plasmid-encoded toxin complex produced by the Gram-positive, spore-forming bacterium Bacillus anthracis. The toxin consists of three non-toxic proteins: the protective antigen (PA), the lethal factor (LF) and the edema factor (EF) PUBMED:14570563. These component proteins self-assemble at the surface of host cell receptors, yielding a series of toxic complexes that can produce shock-like symptoms and death. Anthrax toxin is one of a large group of Bacillus and Clostridium exotoxins referred to as binary toxins, forming independent enzymatic (A moiety) and binding (B moiety) components. The LF and EF proteins are the enzymes (A moiety) that act on cytosolic substrates, while PA is a multi-functional protein (B moiety) that binds to cell surface receptors, mediates the assembly and internalisation of the complexes, and delivers them to the host cell endosome PUBMED:17335404. Once PA is attached to the host receptor PUBMED:17381430, it must then be cleaved by a host cell surface (furin family) protease before it is able to bind EF and LF. The cleavage of the N-terminus of PA enables the C-terminal fragment to self-associate into a ring-shaped heptameric complex (prepore) that can bind LF or EF competitively. The PA-LF/EF complex is then internalised by endocytosis, and delivered to the endosome, where PA forms a pore in the endosomal membrane in order to translocate LF and EF to the cytosol. LF is a Zn-dependent metalloprotease that cleaves and inactivates mitogen-activated protein (MAP) kinases, kills macrophages, and causes death of the host by inhibiting cell proliferation PUBMED:14616089, PUBMED:11700563. EF is a calcium- and calmodulin-dependent adenylyl cyclase that can cause edema (fluid-filled swelling) when associated with PA. EF is not toxic by itself, and is required for the survival of germinated Bacillus spores within macrophages at the early stages of infection. 
EF dramatically elevates the level of host intracellular cAMP, a ubiquitous messenger that integrates many processes of the cell; increases in cAMP can interfere with host intracellular signalling PUBMED:15131111.

    \

    This entry represents the N- and C-terminal domains found in both lethal factor and edema factor proteins of anthrax toxin.

    \ ' '7568' 'IPR012919' '\

    The Caenorhabditis elegans UNC-84 protein is a nuclear envelope protein that is involved in nuclear anchoring and migration during development. The S. pombe Sad1 protein localises at the spindle pole body. UNC-84 and Sad1 share a common C-terminal region that is often termed the SUN (Sad1 and UNC) domain PUBMED:10508607, PUBMED:15082709. In mammals, the SUN domain is present in two proteins, Sun1 and Sun2 PUBMED:10508607. The SUN domain of Sun2 has been demonstrated to be in the periplasm PUBMED:15082709.

    \ ' '7569' 'IPR011713' '\

    Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively PUBMED:10357231). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.

    \

    Reaction of amidotransferase domain:

    \ \ \

    Reactions of FMN-binding domain:

    \ \ \

    This entry includes some LRRs that fail to be detected by the model.

    \ ' '7570' 'IPR012925' '\

    This domain is found at the C-terminus of some MerR family transcription factors and has an alpha-helical globin-like fold PUBMED:12682015. It includes Mta, a central regulator of multidrug resistance in Bacillus subtilis.

    \ ' '7571' 'IPR011696' '\ HaTx1 is a 35 amino acid peptide toxin that was isolated from Chilean tarantula (Grammostola spatulata) venom. It inhibits the drk1 voltage-gated K(+) channel not by blocking the pore, but by altering the energetics of gating PUBMED:10731427.\ ' '7572' 'IPR011665' '\

    The Vaccinia virus has an infection-induced host cell cycle control mechanism. p53 and Rb, which are associated with the RNA polymerase III transcription factor B (TFIIIB) subunits TBP and Brf1, are inactivated PUBMED:17877750, PUBMED:17028095.

    \ \

    TFIIB, Brf1, and Brf2 share related N-terminal zinc ribbon and core domains. TFIIB bridges RNA polymerase II (Pol II) with the promoter-bound pre-initiation complex, whereas Brf1 and Brf2 are involved in the recruitment of Pol III. Brf1 and Brf2 both have a C-terminal extension absent in TFIIB, but their C-terminal extensions are unrelated. In yeast Brf1, the C-terminal extension interacts with the TBP/TATA box complex and contributes to the recruitment of Bdp1 PUBMED:16227591.

    \ \

    It is suggested that the structure of the TBP-DNA complex may be altered upon entry of Brf1 and Bdp1 into the complex. Entry of Brf1 and Bdp1 into the complex imposes a strict sequence preference for the downstream half of the TATA box PUBMED:17028095.

    \ \

    This region covers both the Brf homology II and III regions PUBMED:12660736.

    \ ' '7573' 'IPR002087' '\ Anti-proliferative proteins have been shown to include mammalian and avian protein BTG1 (which appears to be involved in negative regulation\ of cell proliferation) and rat/mouse NGF-inducible protein PC3/TIS21 (BTG2) PUBMED:1373383, PUBMED:8325512, PUBMED:1849653.\ These proteins have from 158 to 363 amino acid residues, are highly similar and include 3 conserved cysteine residues. BTG2 seems to have a\ signal sequence, while the other proteins may lack such a domain. The sequence\ of the N-terminal half of these proteins is well conserved.\ ' '7574' 'IPR009073' '\

    This entry represents the C-terminal oligomerisation domain found in HscB (heat shock cognate protein B), which is also known as HSC20 (20K heat shock cognate protein). HscB acts as a co-chaperone to regulate the ATPase activity and peptide-binding specificity of the molecular chaperone HscA, also known as HSC66 (HSP70 class). HscB proteins contain two domains, an N-terminal J-domain, which is involved in interactions with HscA, connected by a short loop to the C-terminal oligomerisation domain; the two domains make contact through a hydrophobic interface. The core of the oligomerisation domain is thought to bind and target proteins to HscA and consists of an open, three-helical bundle PUBMED:11124030. HscB, along with HscA, has been shown to play a role in the biogenesis of iron-sulphur proteins.

    \ ' '7575' 'IPR012921' '\

    Spen (split end) proteins regulate the expression of key transcriptional effectors in diverse signalling pathways. They are large proteins characterised by N-terminal RNA-binding motifs and a highly conserved C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. The function of the SPOC domain is unknown, but the SPOC domain of the SHARP Spen protein has been implicated in the interaction of SHARP with the SMRT/NcoR corepressor, where SHARP plays an essential role in the repressor complex PUBMED:12897056.

    \

    The SPOC domain is folded into a single compact domain consisting of a beta-barrel with seven strands framed by six alpha helices. A number of deep grooves and clefts in the surface, plus two nonpolar loops, render the SPOC domain well suited to protein-protein interactions; most of the conserved residues occur on the protein surface rather than in the core. Other proteins containing a SPOC domain include Drosophila Split ends, which promotes sclerite development in the head and restricts it in the thorax, and mouse MINT (homologue of SHARP), which is involved in skeletal and neuronal development via its repression of Msx2.

    \ ' '7576' 'IPR011683' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \ This domain is found in family 53 of the glycosyl hydrolase classification PUBMED:12691742. These enzymes are endo-1,4-beta-galactanases. The structure of this domain is known PUBMED:12484750 and has a TIM barrel fold.\ ' '7577' 'IPR011986' '\

    Dioxygenases catalyse the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms. Cleavage of aromatic rings is one of the most important functions of dioxygenases, which play key roles in the degradation of aromatic compounds. The substrates of ring-cleavage dioxygenases can be classified into two groups according to the mode of scission of the aromatic ring. Intradiol enzymes () use a non-haem Fe(III) to cleave the aromatic ring between two hydroxyl groups (ortho-cleavage), whereas extradiol enzymes use a non-haem Fe(II) to cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon (meta-cleavage) PUBMED:10730195, PUBMED:15264822. These two subfamilies differ in sequence, structural fold, iron ligands, and the orientation of second sphere active site amino acid residues. Extradiol dioxygenases are usually homo-multimeric, bind one atom of ferrous ion per subunit and have a subunit size of about 33 kDa. Extradiol dioxygenases can be divided into three classes. Class I and II enzymes () show sequence similarity, with the two-domain class II enzymes having evolved from a class I enzyme through gene duplication. Class III enzymes are different in sequence and structure, but they do share several common active-site characteristics with the class II enzymes, in particular the coordination sphere and the disposition of the putative catalytic base are very similar. Class III enzymes usually have two subunits, designated A () and B ().

    \

    LigAB is a protocatechuate 4,5-dioxygenase () that belongs to the extradiol class III enzyme family. The LigA subunit of this enzyme is multi-helical, containing a compact array of 6 short helices PUBMED:10467151.

    \ \ ' '7578' 'IPR009108' '\

    This entry represents a group of uncharacterised hypothetical proteins from archaea, including the 8.4 kDa protein MTH865 from Methanobacterium thermoautotrophicum. The NMR structure of MTH865 reveals an EF-Hand-like fold consisting of four helices in two hairpins PUBMED:11693569.

    \ ' '7579' 'IPR011682' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Glycoside hydrolase family 38 comprises enzymes with only one known activity: alpha-mannosidase. This domain is found at the C terminus of glycosyl hydrolases from family 38.

    \ ' '7580' 'IPR011679' '\

    ERp29 is a ubiquitously expressed endoplasmic reticulum protein found in mammals PUBMED:11435111. This protein is found associated with an N-terminal thioredoxin-like domain (), which is homologous to the domain of human protein disulphide isomerase (PDI). ERp29 may help mediate the chaperone function of PDI. The C-terminal Erp29 domain has a 5-helical bundle fold. ERp29 is thought to form part of the thyroglobulin folding complex PUBMED:11884402.

    \ ' '7581' 'IPR011681' '\ GcrA, together with CtrA (see and ), form a master cell cycle regulator. These bacterial regulators are involved in controlling the progression and asymmetric polar morphogenesis PUBMED:15087506. During this process, there are temporal and spatial variations in the concentrations of GcrA and CtrA. The variation in concentration produces time and space dependent transcriptional regulation of modular functions that implement cell-cycle processes PUBMED:15087506. More specifically, GcrA acts as an activator of components of the replisome and the segregation machinery PUBMED:15087506.\ ' '7582' 'IPR011664' '\ These protein sequences, found in various bacterial species, are similar to those of Abi proteins, which are involved in bacteriophage resistance mediated by abortive infection in Lactococcus species PUBMED:8534099, PUBMED:7601848. The proteins are thought to have helix-turn-helix motifs, found in many DNA-binding proteins, allowing them to perform their function PUBMED:7601849.\ ' '7583' 'IPR006457' '\

    This domain is found tandemly duplicated in most members of a paralogous family in the archaeon Methanosarcina acetivorans str. C2A. This domain is clearly related to the central region of a family of archaeal S-layer proteins described in .

    \ ' '7584' 'IPR011667' '\ This region is found in a number of hypothetical proteins thought to be expressed by the eukaryote Encephalitozoon cuniculi, an obligate intracellular microsporidial parasite. The proteins are approximately 200 residues long.\ ' '7585' 'IPR011668' '\ This domain is found in archaeal species. It is likely to bind zinc via its four well-conserved cysteine residues.\ ' '7586' 'IPR011669' '\ This entry contains a number of hypothetical bacterial and archaeal proteins. The region is approximately 350 residues long. A member of this family () is thought to associate with another subunit to form an H+-transporting ATPase, but no evidence has been found to support this.\ ' '7587' 'IPR011670' '\ This family includes sequences of largely unknown function but which share a number of features in common. They are expressed by bacterial species, and in many cases these bacteria are known to associate symbiotically with plants. Moreover, the majority are coded for by plasmids, which in many cases are known to confer on the organism the ability to interact symbiotically with leguminous plants. An example of such a plasmid is NGR234, which encodes Y4CF, a protein of unknown function that is a member of this family PUBMED:9163424. Other members of this family are expressed by organisms with a documented genomic similarity to plant symbionts PUBMED:12271122.\ ' '7588' 'IPR011671' '\

    Proteins in this family have been predicted to function as AdoMet-dependent methyltransferases PUBMED:16861910.

    \ ' '7589' 'IPR011672' '\ This is a family of sequences coming from hypothetical proteins found in both bacterial and archaeal species.\ ' '7590' 'IPR011673' '\ This is a family of proteins of unknown function expressed by various bacterial species. Some members of this family (e.g. , ) are thought to be lipoproteins. Another member of this family () is thought to be involved in photosynthesis PUBMED:10976061.\ ' '7591' 'IPR011674' '\ This is a group of sequences from hypothetical archaeal proteins. The region in question is approximately 330 amino acid residues long.\ ' '7592' 'IPR011675' '\ This is a family of hypothetical bacterial and bacteriophage proteins. The region in question is approximately 150 residues long and is highly conserved throughout the family.\ ' '7593' 'IPR011676' '\ The proteins of this entry are mainly hypothetical proteins expressed by Oryza sativa.\ ' '7594' 'IPR011680' '\ This is a family of eukaryotic proteins thought to be involved in axonal outgrowth and fasciculation PUBMED:9096408. The N-terminal regions of these sequences are less conserved than the C-terminal regions, and are highly acidic PUBMED:9096408. The Caenorhabditis elegans homolog, UNC-76 (), may play structural and signalling roles in the control of axonal extension and adhesion (particularly in the presence of adjacent neuronal cells PUBMED:9971736) and these roles have also been postulated for other FEZ family proteins PUBMED:9096408. Certain homologs have been definitively found to interact with the N-terminal variable region (V1) of PKC-zeta, and this interaction causes cytoplasmic translocation of the FEZ family protein in mammalian neuronal cells PUBMED:9971736. The C-terminal region probably participates in the association with the regulatory domain of PKC-zeta PUBMED:9971736. 
The members of this family are predicted to form coiled-coil structures PUBMED:9971736, PUBMED:14697253, which may interact with members of the RhoA family of signalling proteins PUBMED:9971736, but are not thought to contain other characteristic protein motifs PUBMED:14697253. Certain members of this family are expressed almost exclusively in the brain, whereas others (such as FEZ2, ) are expressed in other tissues, and are thought to perform similar but unknown functions in these tissues PUBMED:14697253.\ ' '7595' 'IPR011686' '\ The omega transcriptional repressor regulates expression of genes involved in copy number control and stable maintenance of plasmids. The omega protein belongs to the structural superfamily of MetJ/Arc repressors featuring a ribbon-helix-helix DNA-binding motif with the beta-ribbon located in and recognising the major groove of operator DNA PUBMED:11733997.\ ' '7596' 'IPR011684' '\ This is a group of sequences found exclusively in plants. They are similar to kinase interacting protein 1 (KIP1), which has been found to interact with the kinase domain of PRK1, a receptor-like kinase PUBMED:11500547. This particular region contains two coiled-coils, which are described as motifs involved in protein-protein interactions PUBMED:11500547. It has also been suggested that the coiled-coils of the protein allow it to dimerise in vivo PUBMED:11500547.\ ' '7597' 'IPR011685' '\ This is a group of mainly hypothetical eukaryotic proteins. Putative features found in LETM1, such as a transmembrane domain and a CK2 and PKC phosphorylation site PUBMED:10486213, are relatively conserved throughout the family. Deletion of LETM1 is thought to be involved in the development of Wolf-Hirschhorn syndrome in humans PUBMED:10486213. 
A member of this family, , is known to be expressed in the mitochondria of Drosophila melanogaster PUBMED:10071211, suggesting that this may be a group of mitochondrial proteins.\ ' '7598' 'IPR011687' '\ This entry contains sequences that bear similarity to the glioma tumour suppressor candidate region gene 2 protein (p60) PUBMED:10708517. This protein has been found to interact with herpes simplex type 1 regulatory proteins, but its exact role in the life cycle of the virus is not known PUBMED:10196275.\ ' '7599' 'IPR011688' '\ This is a family of sequences found in both bacteria and bacteriophages. This region is approximately 130 residues long and in some cases is found as part of the PVL (Panton-Valentine leukocidin) group of genes, which encode a member of the leukocidin group of bacterial toxins that kill leukocytes by creation of pores in the cell membrane PUBMED:12044378. PVL appears to be a virulence factor associated with a number of human diseases PUBMED:10524952.\ ' '7600' 'IPR011690' '\ This region is approximately 35 residues long. It is found repeated in a number of putative phosphate starvation-inducible proteins expressed by various bacterial species. PsiF () is known to be an example of such phosphate starvation-inducible proteins PUBMED:2160940.\ ' '7602' 'IPR011694' '\ This entry contains a group of peptides derived from a salivary gland cDNA library of the tick Ixodes scapularis (Black-legged tick) PUBMED:12177149. Also present are peptides from a related tick species, Ixodes ricinus (Sheep tick). They are characterised by a putative signal peptide, indicative of secretion, and conserved cysteine residues.\ ' '7604' 'IPR011677' '\ This is a group of sequences derived from hypothetical eukaryotic proteins. The region in question is approximately 330 residues long and has a cysteine rich N terminus.\ ' '7605' 'IPR011678' '\ These sequences are mainly derived from predicted eukaryotic proteins. 
The region in question lies towards the C terminus of these large proteins and is approximately 300 amino acid residues long.\ ' '7606' 'IPR011689' '\ This is a group of proteins, expressed in the crenarchaeon Pyrobaculum aerophilum, whose members are variable in length and level of conservation. The presence of numerous frameshifts and internal stop codons in multiple alignments is thought to indicate that most family members are no longer functional PUBMED:11792869.\ ' '7607' 'IPR012934' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA PUBMED:14604529.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '7608' 'IPR012900' '\

    This region is found to the N-terminus of , which is a transcription factor domain. It is between 150 and 200 amino acids in length. The N-terminal half is rather rich in proline residues and has been termed the PRD (proline rich domain) PUBMED:11722549, whereas the C-terminal half is more polar and has been called the MFMR (multifunctional mosaic region). It has been suggested that this family is composed of three sub-families called A, B and C PUBMED:8127687, classified according to motif composition. It has been suggested that some of these motifs may be involved in mediating protein-protein interactions PUBMED:8127687. The MFMR region contains a nuclear localisation signal in bZIP opaque and GBF-2 PUBMED:11722549. The MFMR also contains a transregulatory activity in TAF-1. The MFMR in CPRF-2 contains cytoplasmic retention signals PUBMED:11722549.

    \ ' '7609' 'IPR012485' '\

    Centromere protein Cenp-I (also known as Mis6) is an essential centromere connector protein acting during G1-S phase of the cell cycle. Mis6 is thought to be required for recruiting Cenp-A, the centromere-specific histone H3 variant, an important event for centromere function and chromosome segregation during mitosis PUBMED:9230309, PUBMED:10864871.

    \ ' '7610' 'IPR012419' '\

    The members of this family are sequences that are similar to a region of Cas1p protein (). This is an O-acetyltransferase that in Cryptococcus neoformans var. neoformans was shown to be required for O-acetylation of its capsular polysaccharide PUBMED:11703667. The capsule is this organism\'s most obvious virulence factor PUBMED:11703667.

    \ ' '7611' 'IPR012920' '\

    This presumed domain is found at the C-terminus of a family of FtsJ-like methyltransferases. Members of this family are involved in 60S ribosomal biogenesis, for example PUBMED:10556316.

    \ ' '7612' 'IPR012494' '\

    This family represents the Reovirus core protein Mu-2. Mu-2 is a microtubule associated protein and is thought to play a key role in the formation and structural organisation of reovirus inclusion bodies PUBMED:11932414, PUBMED:1566600.

    \ ' '7613' 'IPR012858' '\

    This group of sequences is similar to a region of the dendritic cell-specific transmembrane protein (DC-STAMP, ). This is thought to be a novel receptor protein that shares no identity with other multimembrane-spanning proteins PUBMED:11169400. It is thought to have seven putative transmembrane regions PUBMED:11169400, two of which are found in the region featured in this family. DC-STAMP is also described as having potential N-linked glycosylation sites and a potential phosphorylation site for PKC PUBMED:11169400, but these are not conserved.

    \ ' '7615' 'IPR012427' '\

    This is a family of 14 highly conserved sequences, from hypothetical proteins expressed by both bacterial and archaeal species.

    \ ' '7616' 'IPR012428' '\

    The members of this family are all derived from relatively short hypothetical proteins thought to be expressed by various Nucleopolyhedroviruses.

    \ ' '7617' 'IPR012429' '\

    These sequences are found in hypothetical proteins of unknown function expressed by bacterial and archaeal species. The region in question is approximately 230 residues long.

    \ ' '7618' 'IPR012430' '\

    Sequences making up this family are derived from hypothetical proteins expressed by both prokaryotic and eukaryotic species. The region in question is approximately 250 residues long.

    \ ' '7619' 'IPR012431' '\

    This is a family consisting of sequences from hypothetical proteins of unknown function expressed by certain species of archaea. One member () is thought to be similar to tropomyosin PUBMED:10382966.

    \ ' '7620' 'IPR012432' '\

    This is a group of sequences found in hypothetical proteins predicted to be expressed in a number of bacterial species. The region in question is approximately 150 amino acid residues long.

    \ ' '7621' 'IPR012859' '\

    The sequences making up this family are derived from hypothetical proteins of unknown function expressed by various archaeal species. The region in question is approximately 160 residues long.

    \ ' '7622' 'IPR012433' '\

    This family consists of sequences from hypothetical proteins thought to be expressed by two members of the Xanthomonas genus. The region in question is 125 amino acid residues long.

    \ ' '7623' 'IPR012860' '\

    The members of this family include sequences derived from hypothetical eukaryotic proteins of unknown function. The region in question is approximately 550 residues long.

    \ ' '7624' 'IPR012434' '\

    The members of this family are sequences derived from a group of hypothetical proteins expressed by certain bacterial species. The region concerned is approximately 440 amino acid residues in length.

    \ ' '7625' 'IPR012436' '\

    This family contains sequences derived from a group of hypothetical proteins expressed by Arabidopsis thaliana (Mouse-ear cress). These sequences are highly similar and the region concerned is about 100 residues long.

    \ ' '7626' 'IPR012862' '\

    The members of this family include sequences that are parts of hypothetical proteins expressed by plant species. The region in question is about 170 amino acids long.

    \ ' '7627' 'IPR012437' '\

    This family contains sequences covering an approximately 270 amino acid stretch of a group of hypothetical proteins. These proteins are expressed by archaeal species of the Methanosarcina genus.

    \ ' '7628' 'IPR012438' '\

    This approximately 50-residue region is found in a number of sequences derived from hypothetical plant proteins. This region features a highly basic 5 amino-acid stretch towards its centre.

    \ ' '7629' 'IPR012439' '\

    This family consists of sequences derived from hypothetical eukaryotic proteins. A region approximately 100 residues in length is featured.

    \ ' '7630' 'IPR012441' '\

    The members of this family are all sequences found within hypothetical proteins expressed by various bacterial species. The region concerned is approximately 150 residues long.

    \ ' '7631' 'IPR012866' '\

    This family consists of sequences found in a number of hypothetical plant proteins of unknown function. The region of interest is approximately 160 amino acids in length and contains nine highly conserved cysteine residues, which probably represent a zinc-binding domain.

    \ ' '7632' 'IPR012444' '\

    The sequences making up this family are all derived from hypothetical proteins expressed by Caenorhabditis elegans. The region in question is approximately 160 amino acids long.

    \ ' '7633' 'IPR012891' '\

    This domain is found in proteins carrying other domains known to be involved in intracellular signalling pathways (such as ) indicating that it might also be involved in these pathways. It has 4 highly conserved cysteine residues, suggesting that it can bind zinc ions. Moreover, it is found repeated in some members of this family (such as ); this may indicate that these domains are able to interact with one another, raising the possibility that this domain mediates heterodimerisation.

    \ ' '7634' 'IPR012478' '\

    This family contains sequences bearing similarity to a region of GSG1 (), a protein specifically expressed in testicular germ cells PUBMED:9337410. It is possible that overexpression of the human homologue may be involved in tumourigenesis of human testicular germ cell tumours PUBMED:9337410. The region in question has four highly conserved cysteine residues.

    \ ' '7635' 'IPR012893' '\

    The members of this entry are similar to a region close to the C-terminus of the HipA protein expressed by various bacterial species (for example ). This protein is known to be involved in high-frequency persistence to the lethal effects of inhibition of either DNA or peptidoglycan synthesis PUBMED:1715862. When expressed alone, it is toxic to bacterial cells PUBMED:1715862, but it is usually tightly associated with HipB PUBMED:8021189, and the HipA-HipB complex may be involved in autoregulation of the hip operon. The hip proteins may be involved in cell division control and may interact with cell division genes or their products PUBMED:8021189.

    \ ' '7636' 'IPR012894' '\

    The members of this entry contain a region that is found towards the N-terminus of the HipA protein expressed by various bacterial species (for example ). This protein is known to be involved in high-frequency persistence to the lethal effects of inhibition of either DNA or peptidoglycan synthesis PUBMED:1715862. When expressed alone, it is toxic to bacterial cells PUBMED:1715862, but it is usually tightly associated with HipB PUBMED:8021189, and the HipA-HipB complex may be involved in autoregulation of the hip operon. The hip proteins may be involved in cell division control and may interact with cell division genes or their products PUBMED:8021189.

    \ ' '7637' 'IPR012488' '\

    The region featured in this family is found repeated in a number of plant proteins, some of which are expressed specifically in nodules formed during symbiotic interactions with certain bacterial species. Some of these proteins are also termed glycine-rich proteins (GRPs), due to the presence of a glycine-rich C-terminal region in their structures PUBMED:12236598. Bacterial infection is required for the induction of nodule-specific GRP genes, and it is thought that nodule-specific GRPs may play non-redundant roles required at specific stages of nodule development PUBMED:12236598. Some members of this group of proteins may be cytosolic, whereas others are thought to be membrane-associated PUBMED:9037164.

    \ ' '7638' 'IPR012492' '\

    This family contains sequences that are similar to the C-terminal region of Red protein (). This and related proteins are thought to be localised to the nucleus, and contain a RED repeat which consists of a number of RE and RD sequence elements PUBMED:10216252. The region in question has several conserved NLS sequences PUBMED:10216252. The function of Red protein is unknown, but efficient sequestration to nuclear bodies suggests that its expression may be tightly regulated or that the protein self-aggregates extremely efficiently PUBMED:10216252.

    \ ' '7639' 'IPR012916' '\

    This domain contains sequences that are similar to the N-terminal region of Red protein (). This and related proteins contain a RED repeat which consists of a number of RE and RD sequence elements PUBMED:10216252. The region in question has several conserved NLS sequences and a putative trimeric coiled-coil region PUBMED:10216252, suggesting that these proteins are expressed in the nucleus PUBMED:10216252. The function of Red protein is unknown, but efficient sequestration to nuclear bodies suggests that its expression may be tightly regulated, or that the protein self-aggregates extremely efficiently PUBMED:10216252.

    \ ' '7640' 'IPR012918' '\

    The members of this family are sequences similar to the C-terminal region of RTP801, the protein product of a hypoxia-inducible factor 1 (HIF-1)-responsive gene PUBMED:11884613. Two members of this family expressed by Drosophila melanogaster, Scylla () and Charybde (), are designated as Hox targets PUBMED:11884613. RTP801 is thought to be involved in various cellular processes PUBMED:11884613. Overexpression of the gene caused an apoptosis-resistant phenotype in cycling cells, and apoptosis sensitivity in growth-arrested cells PUBMED:11884613. Moreover, the protein product of the mouse homologue of RTP801 (dig2 ()) is thought to be induced by diverse apoptotic signals, and also by dexamethasone treatment PUBMED:12736248.

    \ ' '7641' 'IPR012496' '\

    These sequences are similar to a region conserved amongst various protein products of the transmembrane channel-like (TMC) gene family, such as Transmembrane channel-like protein 3 () and EVIN2 (); this region is termed the TMC domain PUBMED:12906855. Mutations in these genes are implicated in a number of human conditions, such as deafness and epidermodysplasia verruciformis PUBMED:12906855. TMC proteins are thought to have important cellular roles, and may be modifiers of ion channels or transporters PUBMED:12812529.

    \ ' '7642' 'IPR012495' '\

    The members of this family are similar to a region of the protein product of the bacterial tadE locus (). In various bacterial species, the tad locus is closely linked to flp-like genes, which encode proteins required for the production of pili involved in adherence to surfaces PUBMED:11553455. It is thought that the tad loci encode proteins that act to assemble or export an Flp pilus in various bacteria PUBMED:11553455. All tad loci but TadA have putative transmembrane regions PUBMED:11553455, and in fact the region in question in this family has a high proportion of hydrophobic amino acid residues.

    \ ' '7643' 'IPR012924' '\

    This domain consists of a group of sequences that are similar to the core of TfuA protein (). This protein is involved in the production of trifolitoxin (TFX), a gene-encoded, post-translationally modified peptide antibiotic PUBMED:8763943. The role of TfuA in TFX synthesis is unknown, and it may be involved in other cellular processes PUBMED:8763943.

    \ ' '7644' 'IPR012899' '\

    This five-residue motif is found in a number of bacterial proteins bearing similarity to the protein CpxP (). This is a periplasmic protein that aids in combating extracytoplasmic protein-mediated toxicity, and may also be involved in the response to alkaline pH PUBMED:947. Another member of this family, Spy (), is also a periplasmic protein that may be involved in the response to stress PUBMED:9068658. The homology between CpxP and Spy may indicate that these two proteins are functionally related PUBMED:9473036. The motif is found repeated twice in many members of this entry.

    \ ' '7645' 'IPR012502' '\

    This family contains sequences expressed in eukaryotic organisms bearing high similarity to the conserved region of Drosophila melanogaster wings apart-like protein (WAPL). The D. melanogaster WAPL protein regulates heterochromatin structure PUBMED:10747063. It is required to hold sister chromatids of meiotic heterochromatin together and is implicated in both heterochromatin pairing during female meiosis and the modulation of position-effect variegation (PEV). Although high sequence conservation is limited to a third of the protein sequence, a WAPL homologue has been identified in mammals. Mammalian WAPL may play a significant role in meiosis, as does its Drosophila counterpart. Human WAPL is overexpressed in invasive human cervical cancers and is often associated with cervical carcinogenesis PUBMED:15150110, PUBMED:15620708.

    \ ' '7646' 'IPR012849' '\

    The region is found towards the N-terminus of a number of adaptor proteins that interact with Abl-family tyrosine kinases PUBMED:12011975. More specifically, it is termed the homeo-domain homologous region (HHR), as it is similar to the DNA-binding region of homeo-domain proteins PUBMED:7590236. Other homeo-domain proteins have been implicated in specifying positional information during embryonic development, and in the regulation of the expression of cell-type specific genes PUBMED:7590236. The Abl-interactor proteins are thought to coordinate the cytoplasmic and nuclear functions of the Abl-family kinases, and seem to be involved in cytoskeletal reorganisation, but their precise role remains unclear PUBMED:12011975.

    \ ' '7647' 'IPR012442' '\

    These sequences are derived from a number of hypothetical plant proteins. The region in question is approximately 270 amino acids long. Some members of this family are annotated as yeast pheromone receptor proteins AR781 but no literature was found to support this.

    \ ' '7648' 'IPR012476' '\

    The members of this family are sequences that are similar to the human protein GLE1 (). This protein is localised at the nuclear pore complexes and functions in poly(A)+ RNA export to the cytoplasm PUBMED:9618489.

    \ ' '7649' 'IPR012479' '\

    This family comprises sequences bearing significant similarity to the mouse transcriptional regulator protein HCNGP (). This protein is localised to the nucleus and is thought to be involved in the regulation of beta-2-microglobulin genes.

    \ ' '7650' 'IPR012908' '\

    The sequences found in this family are similar to PGAP1 (). This is an endoplasmic reticulum membrane protein with a catalytic serine-containing motif that is conserved in a number of lipases. PGAP1 functions as a GPI inositol-deacylase; this deacylation is important for the efficient transport of GPI-anchored proteins from the endoplasmic reticulum to the Golgi body.

    \ ' '7651' 'IPR012930' '\

    The members of this family are sequences that are similar to TraC () from Rhizobium etli. The gene encoding this protein is one of a group of genes found on plasmid p42a of Rhizobium etli (strain CFN 42/ATCC 51251) that are thought to be involved in the process of plasmid self-transmission. Mobilisation of plasmid p42a is of importance as it is required for transfer of plasmid p42d, the symbiotic plasmid which carries most of the genes required for nodulation and nitrogen fixation by this symbiotic bacterium. The predicted protein products of p42a are similar to known transfer proteins of Agrobacterium tumefaciens plasmid pTiC58 PUBMED:12591886.

    \ ' '7652' 'IPR012850' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    Alpha-amylase is classified as family 13 of the glycosyl hydrolases and is present in archaea, bacteria, plants and animals. Alpha-amylase is an essential enzyme in alpha-glucan metabolism, acting to catalyse the hydrolysis of alpha-1,4-glucosidic bonds of glycogen, starch and related polysaccharides. Although all alpha-amylases possess the same catalytic function, they can vary with respect to sequence. In general, they are composed of three domains: a TIM barrel containing the active site residues and chloride ion-binding site (domain A), a long loop region inserted between the third beta strand and the alpha-helix of domain A that contains calcium-binding site(s) (domain B), and a C-terminal beta-sheet domain that appears to show some variability in sequence and length between amylases (domain C) PUBMED:11141191. Amylases have at least one conserved calcium-binding site, as calcium is essential for the stability of the enzyme. Chloride binding functions to activate the enzyme, which acts by a two-step mechanism involving a catalytic nucleophile base (usually an Asp) and a catalytic proton donor (usually a Glu) that are responsible for the formation of the beta-linked glycosyl-enzyme intermediate.

    \

    This entry represents the beta-sheet domain that is found in several alpha-amylases, usually at the C-terminus. This domain is organised as a five-stranded anti-parallel beta-sheet PUBMED:9571044, PUBMED:8196040.

    \

    More information about this protein can be found at Protein of the Month: alpha-Amylase.

    \ ' '7653' 'IPR012497' '\

    The members of this family resemble neurotoxin B-IV (), which is a crustacean-selective neurotoxin produced by the marine worm Cerebratulus lacteus. This highly cationic peptide is approximately 55 residues and is arranged to form two antiparallel helices connected by a well-defined loop in a hairpin structure. The branches of the hairpin are linked by four disulphide bonds. Three residues identified as being important for activity, namely Arg-17, -25 and -34, are found on the same face of the molecule, while another residue important for activity, Trp30, is on the opposite side. The protein\'s mode of action is not entirely understood, but it may act on voltage-gated sodium channels, possibly by binding to an as yet uncharacterised site on these proteins. Its site of interaction may also be less specific, for example it may interact with negatively charged membrane lipids PUBMED:9180379.

    \ ' '7654' 'IPR012386' '\

    2\',3\' Cyclic nucleotide phosphodiesterases (CPDases) are enzymes that catalyse at least two distinct steps in the splicing of tRNA introns in eukaryotes. The active site is characterised by two conserved histidine residues PUBMED:12466548. The enzyme has six cysteine residues, four of which are involved in forming two intra-molecular disulphide bridges. One of these bridges is involved in the catalytic activity of the enzyme as it opens when CPDase is semi-reduced PUBMED:11694509.

    \ ' '7655' 'IPR013095' '\

    Type III secretion chaperones are involved in delivering virulence effector proteins from bacterial pathogens directly into eukaryotic cells. The chaperones may prevent aggregation and degradation of their substrates, may target the effector to the secretion apparatus, and may ensure a secretion-competent unfolded conformation of their specific substrate. One member of this family, SigE (), forms homodimers in the crystal. The monomers have a novel fold with an alpha-beta(3)-alpha-beta(2)-alpha topology PUBMED:11685226.

    \ ' '7656' 'IPR012884' '\

    The phage-encoded excisionase protein (Xis, ) is involved in excisive recombination by regulating the assembly of the excisive intasome and by inhibiting viral integration. It adopts an unusual winged-helix structure in which two alpha helices are packed against two extended strands. Also present in the structure is a two-stranded anti-parallel beta-sheet, whose strands are connected by a four-residue wing. During interaction with DNA, helix alpha2 is thought to insert into the major groove, while the wing contacts the adjacent minor groove or phosphodiester backbone. The C-terminal region of Xis is involved in interaction with phage-encoded integrase (Int), and a putative C-terminal alpha helix may fold upon interaction with Int and/or DNA PUBMED:12460578.

    \ ' '7657' 'IPR020600' '\

    This entry represents inosine monophosphate (IMP) cyclohydrolase, found in archaeal species, as well as some bacterial proteins of unknown function.

    \ \

    IMP cyclohydrolase catalyses the cyclisation of 5-formylamidoimidazole-4-carboxamide ribonucleotide to IMP, a reaction which is important in de novo purine biosynthesis in archaeal species PUBMED:11844782. This single domain protein is arranged to form an overall fold that consists of a four-layered alpha-beta-beta-alpha core structure. The two antiparallel beta-sheets pack against each other and are covered by alpha-helices on one face of the molecule. The protein is structurally similar to members of the N-terminal nucleophile (NTN) hydrolase superfamily. A deep pocket was in fact found on the surface of IMP cyclohydrolase in a position equivalent to that of active sites of NTN-hydrolases, but an N-terminal nucleophile could not be found. Therefore, it is thought that this enzyme is structurally but not functionally similar to members of the NTN-hydrolase family PUBMED:12012346.

    \

    In bacteria this step is catalysed by a bifunctional enzyme (purH).

    \ ' '7658' 'IPR012481' '\

    Kanamycin nucleotidyltransferase (KNTase) is involved in conferring resistance to aminoglycoside antibiotics and catalyses the transfer of a nucleoside monophosphate group from a nucleotide to kanamycin. This enzyme is dimeric with each subunit being composed of two domains. The C-terminal domain contains five alpha helices, four of which are organised into an up-and-down alpha helical bundle. Residues found in this domain may contribute to this enzyme\'s active site PUBMED:7577914.

    \ ' '7659' 'IPR012905' '\

    The members of this family are similar to the galactophilic lectin-1 expressed by Pseudomonas aeruginosa (PA-IL, ). Lectins recognising specific carbohydrates found on the surface of host cells are known to be involved in the initiation of infections by this organism. The protein is thought to be organised into an extensive network of beta-sheets, as is the case with many other lectins PUBMED:1429650.

    \ ' '7660' 'IPR012498' '\

    Alpha-A conotoxin PIVA () is the major paralytic toxin found in the venom produced by the piscivorous snail Conus purpurascens. This peptide acts by blocking the acetylcholine-binding site of the nicotinic acetylcholine receptor at the neuromuscular junction PUBMED:7673220. The overall shape of the peptide is described as an "iron" with a highly charged hydrophilic loop of 15S-19R forming the "handle" domain that is exposed to the exterior of the protein. The stability of the conotoxin is primarily governed by three disulphide bonds. A triangular structural motif formed by residues 19R, 12H and 6Y is thought to constitute a "binding core" that is important in binding to the acetylcholine receptor PUBMED:9048550.

    \ ' '7661' 'IPR012911' '\

    Protein phosphatase 2C (PP2C) is involved in regulating cellular responses to stress in various eukaryotes. It consists of two domains: an N-terminal catalytic domain and a C-terminal domain characteristic of mammalian PP2Cs. This domain consists of three antiparallel alpha helices, one of which packs against two corresponding alpha-helices of the N-terminal domain. The C-terminal domain does not seem to play a role in catalysis, but it may provide protein substrate specificity due to the cleft that is created between it and the catalytic domain PUBMED:9003755.

    \ ' '7662' 'IPR013102' '\

    This domain is found at the C-terminal end of the large alpha/beta domain making up various pyrimidine nucleoside phosphorylases PUBMED:9817849, PUBMED:2199449. It has slightly different conformations in different members of this family. For example, in pyrimidine nucleoside phosphorylase (PYNP, ) there is an added three-stranded anti-parallel beta sheet as compared to other members of the family, such as Escherichia coli thymidine phosphorylase (TP, ) PUBMED:9817849. The domain contains an alpha/beta hammerhead fold and residues in this domain seem to be important in formation of the homodimer PUBMED:9817849.

    \ ' '7663' 'IPR012415' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents Cfr10I and Bse634I restriction endonucleases. They exhibit a conserved tetrameric architecture that is of functional importance, wherein two dimers are arranged, back-to-back, with their putative DNA-binding clefts facing opposite directions. These clefts are formed between two monomers that interact, mainly via hydrophobic interactions supported by a few hydrogen bonds, to form a U-shaped dimer. Each monomer is folded to form a compact alpha-beta structure, whose core is made up of a five-stranded mixed beta-sheet. The monomer may be split into separate N-terminal and C-terminal subdomains at a hinge located in helix alpha3 PUBMED:11842098. Both Cfr10I and Bse634I recognise the double-stranded sequence RCCGGY and cleave after the purine R PUBMED:8568865.

    \ ' '7664' 'IPR012854' '\

    Amine oxidases (AO) are enzymes that catalyse the oxidation of a wide range of biogenic amines including many neurotransmitters, histamine and xenobiotic amines. There are two classes of amine oxidases: flavin-containing () and copper-containing (). Copper-containing AO act as a disulphide-linked homodimer. They catalyse the oxidation of primary amines to aldehydes, with the subsequent release of ammonia and hydrogen peroxide, which requires one copper ion per subunit and topaquinone as cofactor PUBMED:8591028: \

    \

    Copper-containing amine oxidases are found in bacteria, fungi, plants and animals. In prokaryotes, the enzyme enables various amine substrates to be used as sources of carbon and nitrogen PUBMED:9048544, PUBMED:9405045. In eukaryotes they have a broader range of functions, including cell differentiation and growth, wound healing, detoxification and cell signalling PUBMED:8805580.

    \

    The copper amine oxidases occur as mushroom-shaped homodimers of 70-95 kDa, each monomer containing a copper ion and a covalently bound redox cofactor, topaquinone (TPQ). TPQ is formed by post-translational modification of a conserved tyrosine residue. The copper ion is coordinated with three histidine residues and two water molecules in a distorted square pyramidal geometry, and has a dual function in catalysis and TPQ biogenesis. The catalytic domain is the largest of the 3-4 domains found in copper amine oxidases, and consists of a beta sandwich of 18 strands in two sheets. The active site is buried and requires a conformational change to allow the substrate access. The two N-terminal domains share a common structural fold, its core consisting of a five-stranded antiparallel beta sheet twisted around an alpha helix. The D1 domains from the two subunits comprise the stalk of the mushroom-shaped dimer, and interact with each other but do not pack tightly against each other PUBMED:8591028, PUBMED:10576737.

    \ \

    This entry represents a domain found at the N-terminal of certain copper amine oxidases, as well as in related proteins such as cell wall hydrolase and N-acetylmuramoyl-L-alanine amidase. This domain consists of a five-stranded antiparallel beta-sheet twisted around an alpha helix PUBMED:8591028, PUBMED:10576737.

    \ ' '7665' 'IPR009109' '\

    Ran GTPase is a ubiquitous protein required for nuclear transport, spindle assembly, nuclear assembly and mitotic cell cycle regulation. RanGTPase activating protein 1 (RanGAP1) is one of several RanGTPase accessory proteins. During interphase, RanGAP1 is located in the cytoplasm, while during mitosis it becomes associated with the kinetochores PUBMED:12852855. Cytoplasmic RanGAP1 is required for RanGTPase-directed nuclear transport. The activity of RanGAP1 requires the accessory protein RanBP1. RanBP1 facilitates RanGAP1 hydrolysis of Ran-GTP, both directly and by promoting the dissociation of Ran-GTP from transport receptors, which would otherwise block RanGAP1-mediated hydrolysis. RanGAP1 is thought to bind to the Switch 1 and Switch 2 regions of RanGTPase. The Switch 2 region can be buried in complexes with karyopherin-beta2, and requires the interaction with RanBP1 to permit RanGAP1 function. RanGAP1 can undergo SUMO (small ubiquitin-like modifier) modification, which targets RanGAP1 to RanBP2/Nup358 in the nuclear pore complex, and is required for association with the nuclear pore complex and for nuclear transport PUBMED:11853669. The enzymes involved in SUMO modification are located on the filaments of the nuclear pore complex.

    \

    The RanGAP1 N-terminal domain is fairly well conserved between vertebrate and fungal proteins, but yeast does not contain the C-terminal domain. The C-terminal domain is SUMO-modified and required for the localisation of RanGAP1 at the nuclear pore complex. The structure of the C-terminal domain is multihelical, consisting of two curved alpha/alpha layers in a right-handed superhelix.

    \ ' '7666' 'IPR012422' '\

    Bacterial cytochrome c oxidase is found bound to the cell membrane, where it is involved in the generation of the transmembrane proton electrochemical gradient. It is composed of four subunits. Subunit IV consists of one transmembrane helix that does not interact directly with the other subunits, but maintains its position by indirect contacts via phospholipid molecules found in the structure. The function of subunit IV is as yet unknown PUBMED:12144789.

    \ ' '7667' 'IPR012425' '\

    This domain is found towards the C-terminal region of various aldolase enzymes. It consists of five alpha-helices, four of which form an antiparallel helical bundle that plugs the C-terminus of the N-terminal TIM barrel domain PUBMED:12764229. The communication domain is thought to play an important role in the heterodimerisation of the enzyme PUBMED:12764229.

    \ \

    Members of this entry heterodimerise with members of to form a bifunctional aldolase-dehydrogenase PUBMED:12764229.

    \ ' '7668' 'IPR012886' '\

    The formiminotransferase (FT) domain of formiminotransferase-cyclodeaminase (FTCD) forms a homodimer, with each protomer being comprised of two subdomains. The formiminotransferase domain has an N-terminal subdomain that is made up of a six-stranded mixed beta-pleated sheet and five alpha helices, which are arranged on the external surface of the beta sheet. This, in turn, faces the beta-sheet of the C-terminal subdomain to form a double beta-sheet layer. The two subdomains are separated by a short linker sequence, which is not thought to be any more flexible than the remainder of the molecule. The substrate is predicted to form a number of contacts with residues found in both the N-terminal and C-terminal subdomains PUBMED:10673422.

    \

    This entry represents the N-terminal subdomain of the formiminotransferase domain.

    \ ' '7670' 'IPR012417' '\

    The sequences featured in this family are found repeated in a number of plant calmodulin-binding proteins (such as , and ), and are thought to constitute the calmodulin-binding domains PUBMED:12825696, PUBMED:11684678. Binding of the proteins to calmodulin depends on the presence of calcium ions PUBMED:12825696, PUBMED:11684678. These proteins are thought to be involved in various processes, such as plant defence responses PUBMED:12825696 and stolonisation or tuberization PUBMED:11684678.

    \ ' '7671' 'IPR008920' '\

    Bacteria regulate membrane fluidity by manipulating the relative levels of saturated and unsaturated fatty acids within the phospholipids of their membrane bilayers. In Escherichia coli, the transcription factor FadR functions as a switch that co-ordinately regulates the machinery required for fatty acid beta-oxidation and the expression of a key enzyme in fatty acid biosynthesis. This single repressor controls the transcription of the whole fad regulon PUBMED:11279025. Binding of FadR to DNA is specifically inhibited by long chain fatty acyl-CoA compounds.

    \

    The crystal structure of FadR reveals a two domain dimeric molecule where the N-terminal winged-helix domain binds DNA (), and the C-terminal domain binds acyl-CoA PUBMED:11279025. The binding of acyl-CoA to the C-terminal domain results in a conformational change that affects the DNA binding affinity of the N-terminal domain PUBMED:11013219.

    \

    FadR is a member of the GntR family of bacterial transcription regulators. The DNA-binding domain is well conserved for this family, whereas the C-terminal effector-binding domain () is more variable, and is consequently used to define the GntR subfamilies PUBMED:11756427. The FadR group is the largest subgroup, and is characterised by an all-helical C-terminal domain composed of 6 to 7 alpha helices PUBMED:11013219. This entry represents the C-terminal domain of FadR.

    \ ' '7672' 'IPR006631' '\

    This domain of unknown function is found primarily in Drosophila melanogaster (Fruit fly) proteins of unknown function.

    \ ' '7673' 'IPR012890' '\

    Sequences found in this family are similar to a region of a human GC-rich sequence DNA-binding factor homologue (). This is thought to be a protein involved in transcriptional regulation due to partial homologies to a transcription repressor and histone-interacting protein PUBMED:11707072.

    \ ' '7674' 'IPR012861' '\

    This family contains many hypothetical bacterial and archaeal proteins. A few members of this family are annotated as being putative transmembrane proteins, and the region in question in fact contains many hydrophobic residues.

    \ ' '7676' 'IPR012863' '\

    The sequences featured in this family are derived from a number of hypothetical prokaryotic proteins. The region in question is approximately 130 amino acids long.

    \ ' '7677' 'IPR012484' '\

    The sequences making up family 7 of the metallothionein superfamily are found repeated in metallothionein proteins expressed by two Tetrahymena species. Metallothioneins are low molecular mass, cysteine-rich metal-binding proteins that are thought to be involved in the regulation of levels of trace metals, and detoxification of these metals when present in excess PUBMED:7813475. Some of the metallothioneins found in this family (for example, ) are known to be induced by cadmium and are thought to be involved in the cellular sequestration of toxic metal ions. The high proportion of cysteine residues allows the metal ions to be bound by the formation of clusters of metal-thiolate complexes PUBMED:7813475. Tetrahymena spp. metallothioneins differ from other eukaryotic metallothioneins mainly in the length of their sequences and in the cysteine-containing motifs they exhibit.

    \ ' '7678' 'IPR012864' '\

    This family contains many eukaryotic hypothetical proteins. The region featured in this family is approximately 120 residues long. Members of this family may belong to the cupin superfamily.

    \ ' '7679' 'IPR012906' '\

    This entry describes the N-terminal region of proteins that are similar to, and include, the product of the paaX gene of Escherichia coli (). PaaX is a transcriptional regulator that is always found in association with operons believed to be involved in the degradation of phenylacetic acid PUBMED:11260461. The gene product has been shown to bind to the promoter sites and repress their transcription PUBMED:10766858.

    \ ' '7680' 'IPR012440' '\

    Archaeal and bacterial hypothetical proteins are found in this family, with the region in question being approximately 40 residues long.

    \ ' '7681' 'IPR012493' '\

    The sequences featured in this family are similar to a region of the human renin receptor () that bears a putative transmembrane spanning segment PUBMED:12045255. The renin receptor is involved in intracellular signal transduction by the activation of the ERK1/ERK2 pathway, and it also serves to increase the efficiency of angiotensinogen cleavage by receptor-bound renin, therefore facilitating angiotensin II generation and action on a cell surface PUBMED:12045255.

    \ ' '7682' 'IPR012926' '\

    A number of members of this family are annotated as being transmembrane proteins induced by tumour necrosis factor alpha, but no literature was found to support this.

    \ ' '7683' 'IPR012865' '\

    The sequences making up this family are derived from various hypothetical phage and prophage proteins. The region in question is approximately 140 amino acids long.

    \ ' '7684' 'IPR012867' '\

    This entry contains hypothetical proteins expressed by either bacterial or archaeal species. Some of these are annotated as being transmembrane proteins, and many contain a high proportion of hydrophobic residues.

    \ ' '7685' 'IPR012443' '\

    Some of the members of this family are hypothetical bacterial and archaeal proteins, but others are annotated as being cation transporters expressed by the archaeon Methanosarcina mazei (Methanosarcina frisia) (, and ).

    \ ' '7686' 'IPR012445' '\

    This family is made up of sequences derived from hypothetical eukaryotic proteins of unknown function.

    \ ' '7687' 'IPR012446' '\

    Sequences found in this family are derived from hypothetical eukaryotic proteins of unknown function. The region in question is approximately 280 residues long.

    \ ' '7688' 'IPR012435' '\

    These sequences are found in hypothetical eukaryotic proteins of unknown function. The region concerned is approximately 280 residues long.

    \ ' '7689' 'IPR013100' '\

    Epoxide hydrolases catalyse the hydrolysis of epoxides to corresponding diols, which is important in detoxification, synthesis of signal molecules, or metabolism. Limonene-1,2-epoxide hydrolase (LEH) differs from many other epoxide hydrolases in its structure and its novel one-step catalytic mechanism. Its main fold consists of a six-stranded mixed beta-sheet, with three N-terminal alpha helices packed to one side to create a pocket that extends into the protein core. A fourth helix lies in such a way that it acts as a rim to this pocket. Although mainly lined by hydrophobic residues, this pocket features a cluster of polar groups that lie at its deepest point and constitute the enzyme\'s active site PUBMED:12773375.

    \ ' '7690' 'IPR013094' '\

    The alpha/beta hydrolase fold PUBMED:1409539 is common to a number of hydrolytic enzymes of widely differing phylogenetic origin and catalytic function. The core of each enzyme is an alpha/beta-sheet (rather than a barrel), containing 8 strands connected by helices PUBMED:1409539. The enzymes are believed to have diverged from a common ancestor, preserving the arrangement of the catalytic residues. All have a catalytic triad, the elements of which are borne on loops, which are the best conserved structural features of the fold. Esterase (EST) from Pseudomonas putida is a member of the alpha/beta hydrolase fold superfamily of enzymes PUBMED:16321951.

    \ \

    In most of the family members the beta-strands are parallel, but some have an inversion of the first strands, which gives them an antiparallel orientation. The catalytic triad residues are presented on loops. One of these is the nucleophile elbow and is the most conserved feature of the fold. Some other members lack one or all of the catalytic residues. Some members are therefore inactive but others are involved in surface recognition. The ESTHER database PUBMED: gathers and annotates all the published information related to gene and protein sequences of this superfamily PUBMED:14681380.

    \

    This entry represents the catalytic domain fold-3 of alpha/beta hydrolase.

    \ ' '7691' 'IPR012421' '\

    This entry represents the C-terminal domain found in the Tropheryma whipplei WisP family of proteins PUBMED:12606174.

    \ ' '7692' 'IPR012503' '\

    This family is found at the N-terminus of the Tropheryma whipplei WisP family proteins PUBMED:12606174.

    \ ' '7693' 'IPR012903' '\

    This domain is found in the cyanobacteria, and the nitrogen-fixing proteobacterium Azotobacter vinelandii and may be involved in nitrogen fixation, but no role has been assigned PUBMED:2644218.

    \ \ ' '7694' 'IPR012424' '\

    Members of this family have been implicated in an unusual form of DNA transfer (conjugation) in Bacteroides PUBMED:11319931. The family has been named CtnDOT_TraJ to avoid confusion with other conjugative transfer systems.

    \ ' '7695' 'IPR012447' '\

    The proteins in this entry have not been characterised.

    \ ' '7696' 'IPR012448' '\

    The proteins in this entry have not been characterised.

    \ ' '7697' 'IPR012868' '\

    The proteins in this entry have not been characterised.

    \ ' '7698' 'IPR012449' '\

    This family consists of proteins from the Pseudomonadaceae.

    \ ' '7699' 'IPR012450' '\

    This protein is found in some prophages found in Lactococcus lactis PUBMED:11160885.

    \ ' '7700' 'IPR012451' '\

    The proteins in this entry have not been characterised.

    \ ' '7701' 'IPR012452' '\

    This domain appears to be restricted to the Bacillales.

    \ ' '7702' 'IPR012453' '\

    This family of small proteins seems to be found in several places in the Coxiella genome.

    \ ' '7703' 'IPR012454' '\

    This family consists of hypothetical bacterial proteins of unknown function.

    \ ' '7704' 'IPR012504' '\

    Members of this protein family are the YabP protein of the bacterial sporulation program, as found in Bacillus subtilis, Clostridium tetani, and other spore-forming members of the Firmicutes. In B. subtilis, a yabP single mutant appears to sporulate and germinate normally PUBMED:11283287, but yabP is in an operon with yabQ (essential for formation of the spore cortex), is near-universal among endospore-forming bacteria, and is found nowhere else. It is likely, therefore, that YabP does have a function in sporulation or germination, one that is either unappreciated or partially redundant with that of another protein.

    \ ' '7705' 'IPR012455' '\

    This protein is found in Lactobacillae prophages.

    \ ' '7706' 'IPR012851' '\

    The Coat F proteins contribute to the Bacillales spore coat. They occur multiple times in the genomes in which they are found. Bacillus subtilis endospore protein coats protect them and may play a role in their germination PUBMED:18723620. Spore coat protein F, on the outer surface of the endospore, is one of a suite of proteins that could be used to differentiate between members of the Bacillus genus PUBMED:14711677.

    \ ' '7707' 'IPR013097' '\

    The function of this domain is unknown, but it is upregulated in response to salt stress in Populus balsamifera (balsam poplar) PUBMED:14704136. It is also found at the C-terminus of a fructose 1,6-bisphosphate aldolase from Hydrogenophilus thermoluteolus () PUBMED:10705449. is found in the pA01 plasmid, which encodes genes for molybdopterin uptake and degradation of plant alkaloid nicotine. The structure of one has been solved () and the domain forms an alpha-beta barrel dimer PUBMED:14872131. Although there is a clear duplication within the domain it is not obviously detectable in the sequence.

    \ ' '7708' 'IPR012456' '\

    The proteins in this entry have not been characterised.

    \ ' '7709' 'IPR012869' '\

    The proteins in this family have not been characterised, but contain a ribbon-helix-helix domain, making them a family of putative repressors.

    \ ' '7710' 'IPR012909' '\

    This domain is found at the N-terminus of the polyhydroxyalkanoate (PHA) synthesis regulators. These regulators have been shown to directly bind DNA and PHA PUBMED:12081972. The invariant nature of this domain compared to the C-terminal domain(s) suggests that it contains the DNA-binding function.

    \ ' '7711' 'IPR008987' '\

    Bacteriophage T4 is a double-stranded, structurally complex virus that infects Escherichia coli. Gene product 9 (gp9) connects the long tail fibres to the baseplate, and triggers baseplate reorganization and tail contraction after virus attachment to the host cell. The gp9 protein forms a homotrimer, with each monomer having three domains: the N-terminal alpha-helical domain forms a triple coiled coil, the middle domain is a mixed, seven-stranded beta sandwich with a unique fold, and the C-terminal domain is an eight-stranded beta-sandwich with similarity to jellyroll viral capsid protein structures PUBMED:10545330. The flexible loops that occur between domains may enable the conformational changes necessary during infection.

    \ ' '7712' 'IPR012888' '\

    Proteins containing this domain are similar to L-fucose isomerase expressed by Escherichia coli (, ). This enzyme corresponds to glucose-6-phosphate isomerase in glycolysis, and converts an aldo-hexose to a ketose to prepare it for aldol cleavage. The enzyme is a hexamer, with each subunit being wedge-shaped and composed of three domains. Both domains 1 and 2 contain central parallel beta-sheets with surrounding alpha helices. Domain 1 demonstrates the beta-alpha-beta-alpha-beta Rossmann fold. The active centre is shared between pairs of subunits related along the molecular three-fold axis, with domains 2 and 3 from one subunit providing most of the substrate-contacting residues, and domain 1 from the adjacent subunit contributing some other residues PUBMED:9367760.

    \ ' '7713' 'IPR012889' '\

    Proteins containing this domain are similar to L-fucose isomerase expressed by Escherichia coli (, ). This enzyme corresponds to glucose-6-phosphate isomerase in glycolysis, and converts an aldo-hexose to a ketose to prepare it for aldol cleavage. The enzyme is a hexamer, with each subunit being wedge-shaped and composed of three domains. Both domains 1 and 2 contain central parallel beta-sheets with surrounding alpha helices. The active centre is shared between pairs of subunits related along the molecular three-fold axis, with domains 2 and 3 from one subunit providing most of the substrate-contacting residues PUBMED:9367760.

    \ ' '7714' 'IPR013096' '\

    This family represents the conserved barrel domain of the cupin superfamily PUBMED:9573603 (cupa is the Latin term for a small barrel).

    \ ' '7715' 'IPR012932' '\

    Vitamin K epoxide reductase (VKOR) recycles reduced vitamin K, which is used subsequently as a co-factor in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. VKORC1 is a member of a large family of predicted enzymes that are present in vertebrates, Drosophila, plants, bacteria and archaea PUBMED:15276181. Four cysteine residues and one residue, which is either serine or threonine, are identified as likely active-site residues PUBMED:15276181. In some plant and bacterial homologues the VKORC1 homologous domain is fused with domains of the thioredoxin family of oxidoreductases PUBMED:15276181.

    \ ' '7716' 'IPR013099' '\

    This entry includes the two membrane helix type ion channels found in bacteria PUBMED:11836519.

    \ ' '7717' 'IPR012413' '\

    The sequences found in this family are similar to the BA14K proteins expressed by Brucella abortus () and by Brucella suis (). BA14K was found to be strongly immunoreactive; it induces both humoral and cellular responses in hosts throughout the infective process PUBMED:9673296.

    \ ' '7718' 'IPR012416' '\

    The members of this family are putative or actual calmodulin binding proteins expressed by various plant species. Some members (for example, ), are known to be involved in the induction of plant defence responses PUBMED:12777041. However, their precise function in this regard is as yet unknown.

    \ ' '7719' 'IPR012852' '\

    Proteins found in this family are similar to the coiled-coil transcriptional coactivator protein expressed by Mus musculus (CoCoA, ). This protein binds to a highly conserved N-terminal domain of p160 coactivators, such as GRIP1 (), and thus enhances transcriptional activation by a number of nuclear receptors. CoCoA has a central coiled-coil region with three leucine zipper motifs, which is required for its interaction with GRIP1 and may regulate the autonomous transcriptional activation activity of the C-terminal region PUBMED:14690606.

    \ ' '7720' 'IPR012458' '\

    The members of this family are hypothetical plant proteins of unknown function. The region featured in this family is approximately 100 amino acids long.

    \ ' '7721' 'IPR012459' '\

    This family contains sequences from a number of hypothetical eukaryotic proteins of unknown function. The region featured is approximately 150 amino acids long.

    \ ' '7722' 'IPR012870' '\

    These sequences are derived from hypothetical plant proteins of unknown function. The region in question is approximately 250 residues long.

    \ ' '7723' 'IPR012460' '\

    Hypothetical archaeal and bacterial proteins make up this family. A few proteins are annotated as being potential metal-binding proteins, and in fact the members of this family have four highly conserved cysteine residues, but no further literature evidence was found in this regard.

    \ ' '7724' 'IPR012871' '\

    The hypothetical proteins found in this family are expressed by Oryza sativa (Rice) and are of unknown function.

    \ ' '7725' 'IPR012461' '\

    This family is composed of sequences derived from hypothetical eukaryotic proteins of unknown function. Some members of this family are annotated as being potential phospholipases but no literature was found to support this.

    \ ' '7726' 'IPR012874' '\

    This family contains hypothetical proteins of unknown function found in Methanosarcina acetivorans and Methanosarcina mazei.

    \ ' '7727' 'IPR012875' '\

    The members of this family are sequences derived from hypothetical eukaryotic and bacterial proteins. The region in question is approximately 60 residues long.

    \ ' '7728' 'IPR012463' '\

    The members of this family are sequences derived from hypothetical plant proteins of unknown function. One member of this family () is annotated as a putative RNA-binding protein, but no evidence was found to support this.

    \ ' '7729' 'IPR012464' '\

    This family contains sequences derived from proteins of unknown function expressed by Drosophila melanogaster and Anopheles gambiae.

    \ ' '7730' 'IPR012474' '\

    This family is composed of plant proteins that are similar to FRIGIDA protein expressed by Arabidopsis thaliana (Mouse-ear cress) (). This protein is probably nuclear and is required for the regulation of flowering time in the late-flowering phenotype. It is known to increase RNA levels of flowering locus C. Allelic variation at the FRIGIDA locus is a major determinant of natural variation in flowering time PUBMED:11030654.

    \ ' '7731' 'IPR012872' '\

    The hypothetical eukaryotic proteins found in this family are of unknown function.

    \ ' '7732' 'IPR012873' '\

    This family is composed of hypothetical bacterial proteins of unknown function.

    \ ' '7733' 'IPR012892' '\

    Sequences found in this entry are derived from a number of bacteriophage and prophage proteins. They are similar to gp58 (), a minor structural protein of Lactobacillus delbrueckii bacteriophage LL-H PUBMED:7828907.

    \ ' '7734' 'IPR012490' '\

    This is a family of proteins expressed by the crenarchaeon Pyrobaculum aerophilum. The members are highly variable in length and level of conservation. The presence of numerous frameshifts and internal stop codons in multiple alignments are thought to indicate that most family members are no longer functional PUBMED:11792869.

    \ ' '7735' 'IPR012423' '\

    The Saccharomyces cerevisiae (Baker\'s yeast) member of this family is part of NuA4, the only essential histone acetyltransferase complex in S. cerevisiae involved in global histone acetylation PUBMED:15353583.

    \ ' '7736' 'IPR012914' '\

    This domain is found in the purine catabolism regulatory protein expressed by Bacillus subtilis (PucR, ). PucR is thought to be a transcriptional regulator of genes involved in the purine degradation pathway, and may contain a LysR-like DNA-binding domain. It is similar to LysR-type regulators in that it represses its own expression PUBMED:11344136. The other members of this family are also putative regulatory proteins.

    \ ' '7737' 'IPR012927' '\

    This domain is present in the N-terminal region of the ShET2 enterotoxin produced by Shigella flexneri () and Escherichia coli (). This protein was found to confer toxigenicity in Ussing chamber assays, and the N-terminal region was found to be important for its enterotoxic effect. It is thought to be a hydrophobic protein that forms inclusion bodies within the bacterial cell, and may be secreted by the Mxi system PUBMED:7591128. Most proteins containing this domain are annotated as putative enterotoxins, but one member () is a regulator of acetyl CoA synthetase, and another two members ( and ) are annotated as ankyrin-like regulatory proteins and contain Ank repeats ().

    \ ' '7738' 'IPR012507' '\

    The sequences featured in this family are similar to two proteins expressed by Lactococcus lactis, YibE () and YibF (). Most of the members of this family are annotated as being putative membrane proteins, and in fact the sequences contain a high proportion of hydrophobic residues.

    \ ' '7739' 'IPR012855' '\

    D-aminoacylase (, ) hydrolyses a wide variety of N-acyl derivatives of neutral D-amino acids, in a zinc-dependent manner. The enzyme is composed of a small beta-barrel domain and a larger catalytic alpha/beta-barrel. The C-terminal region featured in this family forms part of the beta-barrel domain, together with a short N-terminal segment. The beta-strands of both barrels were found to superimpose well. The small beta-barrel domain does not seem to contribute to the substrate-binding site or to be involved in the catalytic process PUBMED:12454005.

    \ ' '7740' 'IPR012457' '\

    The members of this family are hypothetical proteins expressed by Trypanosoma cruzi, a eukaryotic parasite that causes Chagas disease in humans. This region is found as multiple copies per protein.

    \ ' '7741' 'IPR012462' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This entry contains UfSP1 and UfSP2, which are cysteine peptidases required for the processing and activation of Ubiquitin fold modifier 1 (Ufm1, ) and for its release from conjugated cellular proteins. UfSP1 and UfSP2 are 217 aa and 461 aa respectively PUBMED:15071506, PUBMED:17182609. The peptidases belong to MEROPS peptidase family C78, clan CA. The UfSP2 family have an N-terminal extension with one or more zinc finger domains of the C2H2 type (), which have been shown to be involved in protein:protein interaction. UfSP2 is present in most, if not all, multi-cellular organisms including plants, nematodes, flies, and mammals, whereas UfSP1 is not present in plants and nematodes PUBMED:17182609.

    \ ' '7742' 'IPR012876' '\

    The sequences found in this family are all derived from hypothetical plant proteins of unknown function. The region features a number of highly conserved cysteine residues.

    \ ' '7743' 'IPR012883' '\

    ERp29 () is a ubiquitously expressed endoplasmic reticulum protein, and is involved in the processes of protein maturation and protein secretion in this organelle PUBMED:10727933, PUBMED:11435111. The protein exists as a homodimer, with each monomer being composed of two domains. The N-terminal domain featured in this family is organised into a thioredoxin-like fold that resembles the a domain of human protein disulphide isomerase (PDI) PUBMED:11435111. However, this domain lacks the C-X-X-C motif required for the redox function of PDI; it is therefore thought that the function of ERp29 is similar to the chaperone function of PDI PUBMED:11435111. The N-terminal domain is exclusively responsible for the homodimerisation of the protein, without covalent linkages or additional contacts with other domains PUBMED:11435111.

    \ ' '7744' 'IPR012465' '\

    This family is composed of uncharacterised proteins expressed by Methanopyrus kandleri, a hyperthermophilic archaeon.

    \ ' '7745' 'IPR012877' '\

    This region is found in a number of Caenorhabditis elegans and Caenorhabditis briggsae proteins, in one case () as a repeat. In many of the family members, this region is associated with the CHK region described by SMART as being found in zinc finger-C4 and HLH domain-containing kinases. In fact, one member of this family () is annotated as being a member of the nuclear hormone receptor family, and contains regions typical of such proteins (, , and ).

    \ ' '7746' 'IPR012913' '\

    The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II (), which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyses the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing PUBMED:10929008. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum PUBMED:10929008. Mutations in the gene coding for PRKCSH have been found to be involved in the development of autosomal dominant polycystic liver disease (ADPLD), but the precise role the protein has in the pathogenesis of this disease is unknown PUBMED:12529853.

    \ ' '7747' 'IPR012931' '\

    This domain is found in the N-terminal region of the TraG protein () from Escherichia coli. This is a membrane-spanning protein, with three predicted transmembrane segments and two periplasmic regions PUBMED:1348105. The TraG protein is known to be essential for DNA transfer in the process of conjugation, with the N-terminal portion being required for F pilus assembly PUBMED:1348105, PUBMED:7915817. The protein is thought to interact with the periplasmic domain of TraN () to stabilise mating-cell interactions PUBMED:7915817.

    \ ' '7749' 'IPR012418' '\

    The region featured in this family is repeated in spinach cold acclimation protein CAP160 (). CAP160 is induced during periods of drought stress; its precise function is unknown but it has been implicated in the stabilisation of membranes, cytoskeletal elements, and ribosomes. By acting as a compatible solute, it may reduce the toxic effects of cellular solutes that accumulate at high concentration PUBMED:9536054. Other members of this family are also induced by water stress, abscisic acid, and/or low temperature, such as desiccation-responsive protein 29B () and CDet11-24 protein ().

    \ ' '7750' 'IPR012880' '\

    The proteins featured in this family are all hypothetical eukaryotic proteins of unknown function. The region in question is approximately 150 residues long.

    \ ' '7751' 'IPR012467' '\

    The sequences featured in this family are found in hypothetical archaeal and bacterial proteins of unknown function. The region in question is approximately 200 amino acids long.

    \ ' '7752' 'IPR012473' '\

    This family features sequences bearing similarity to the C-terminal portion of the Bacteriophage T4 protein fibritin (). This protein is responsible for attachment of the long tail fibres to the virus particle, and forms the "whiskers", or fibres, on the neck of the virion. The region seen in this family contains an N-terminal coiled-coil portion and the C-terminal globular foldon domain (residues 457-486), which is essential for fibritin trimerisation and folding PUBMED:15033360. This domain consists of a beta-hairpin; three such hairpins come together in a beta-propeller-like arrangement in the trimer, which is stabilised by hydrogen bonds, salt bridges and hydrophobic interactions PUBMED:15033360.

    \ ' '7753' 'IPR012477' '\

    This family features glycosyltransferases belonging to glycosyltransferase family 52 PUBMED:12691742, which have alpha-2,3-sialyltransferase () and alpha-glucosyltransferase () activity. For example, beta-galactoside alpha-2,3-sialyltransferase expressed by Neisseria meningitidis () is a member of this family and is involved in a step of lipooligosaccharide biosynthesis requiring sialic acid transfer; these lipooligosaccharides are thought to be important in the process of pathogenesis PUBMED:8910446.

    \ ' '7754' 'IPR012486' '\

    The sequences featured in this family are similar to a hypothetical protein product of ORF N1221 in the CPT1-SPC98 intergenic region of the yeast genome (). This encodes an acidic polypeptide with several possible transmembrane regions PUBMED:8619318.

    \ ' '7755' 'IPR012489' '\

    This family consists of protein sequences that are similar to the nuclease A inhibitor expressed by bacteria of the genus Anabaena (NuiA, ). This sequence is organised to form an alpha-beta-alpha sandwich fold, which is similar to the PR-1-like fold. NuiA interacts with nuclease A by means of residues located at one end of the molecule, including residues making up the loop between helices III and IV and the loop between strands C and D. The mechanism of inhibition of nuclease A by NuiA is as yet incompletely understood PUBMED:12095254.

    \ ' '7756' 'IPR012915' '\

    The sequences in this family are similar to the reoviral minor core protein lambda 3 (), which functions as an RNA-dependent RNA polymerase within the protein capsid. It is organised into 3 domains. The N- and C-terminal domains create a "cage" which encloses a conserved central catalytic domain within a hollow centre. This catalytic domain is arranged to form finger, palm and thumb subdomains. Unlike other RNA polymerases, such as HIV reverse transcriptase and T7 RNA polymerase, the lambda 3 protein binds template and substrate with only localised rearrangements, and catalytic activity can occur with little structural change. However, the structure of the catalytic complex is similar to that of other polymerase catalytic complexes with known structure PUBMED:12464184.

    \ ' '7757' 'IPR012929' '\

    This domain is found in a number of proteins, including TPR protein () and yeast myosin-like proteins 1 (MLP1, ) and 2 (MLP2, ). These proteins share a number of features; for example, they all have coiled-coil regions and all three are associated with nuclear pores PUBMED:9024684, PUBMED:7798308, PUBMED:10617624. TPR is thought to be a component of nuclear pore complex-attached intranuclear filaments PUBMED:9024684, and is implicated in nuclear protein import PUBMED:7798308. Moreover, its N-terminal region is involved in the activation of oncogenic kinases, possibly by mediating the dimerisation of kinase domains or by targeting these kinases to the nuclear pore complex PUBMED:7798308. MLP1 and MLP2 are involved in the process of telomere length regulation, where they are thought to interact with proteins such as Tel1p and modulate their activity PUBMED:12490156.

    \ ' '7758' 'IPR012933' '\

    The viral, archaeal and bacterial proteins making up this family are similar to the YcfA protein expressed by Escherichia coli (). Most of these proteins are hypothetical proteins of unknown function.

    \ ' '7759' 'IPR012501' '\

    This family contains various proteins that are homologues of the yeast Vps54 protein, such as the rat homologue (), the human homologue (), and the mouse homologue (). In yeast, Vps54 associates with Vps52 and Vps53 proteins to form a trimolecular complex that is involved in protein transport between Golgi, endosomal, and vacuolar compartments PUBMED:12039048. All Vps54 homologues contain a coiled coil region (not found in the region featured in this family) and multiple dileucine motifs PUBMED:12039048.

    \ ' '7760' 'IPR012912' '\

    Members of this family are similar to the protein product of ORF-3 () found on plasmid pRiA4 in the bacterium Agrobacterium rhizogenes. This plasmid is responsible for tumourigenesis at wound sites of plants infected by this bacterium, but the ORF-3 product does not seem to be involved in the pathogenetic process PUBMED:2226811. Other proteins found in this family are annotated as being putative TnpR resolvases (, ), but no further evidence was found to back this. Moreover, another member of this family is described as a probable lexA repressor () and in fact carries a LexA DNA binding domain (), but no references were found to expand on this.

    \ ' '7761' 'IPR012856' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    D-aminopeptidase () is a dimeric enzyme with each monomer being composed of three domains. Domain B is organised to form a beta barrel made up of eight antiparallel beta strands. It is connected to domain A, the catalytic domain, by an eight-residue sequence, and also interacts with both domains A and C via non-covalent bonds. Domain B probably functions in maintaining domain C in a good position to interact with the catalytic domain PUBMED:10986464.

    \ \

    This domain is found in peptidases that belong to MEROPS peptidase family S12 (D-Ala-D-Ala carboxypeptidase B family, clan SE).

    \ ' '7762' 'IPR012853' '\

    The members of this family are all similar to chloramphenicol 3-O phosphotransferase (CPT, ) expressed by Streptomyces venezuelae. Chloramphenicol (Cm) is a metabolite produced by this bacterium that can inhibit ribosomal peptidyl transferase activity and therefore protein production. By transferring a phosphate group to the C-3 hydroxyl group of Cm, CPT inactivates this potentially lethal metabolite PUBMED:11468347, PUBMED:10835366.

    \ ' '7763' 'IPR012857' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    D-aminopeptidase () is a dimeric enzyme with each monomer being composed of three domains. Domain C is organised to form a beta barrel made up of eight antiparallel beta strands. It is connected to domain B by a short linker sequence, and interacts extensively with domain A, the catalytic domain. The gamma loop of domain C forms part of the wall of the catalytic pocket; domain C is in fact thought to confer substrate and inhibitor specificity to the enzyme.

    \ \

    This domain is found in peptidases that belong to MEROPS peptidase family S12 (D-Ala-D-Ala carboxypeptidase B family, clan SE).

    \ ' '7764' 'IPR012466' '\

    NECAP 1 localises to clathrin-coated pits and binds directly to the globular ear domain of the alpha-adaptin subunit (alpha-ear) of the adaptor protein 2 (AP-2) complex. This interaction is mediated by a specific motif, WVQF, that uses a distinct alpha-ear interface relative to known alpha-ear-binding partners. Disruption of this interaction blocks clathrin-mediated endocytosis PUBMED:14555962.

    \ ' '7765' 'IPR012904' '\

    The presence of 8-oxoguanine residues in DNA can give rise to G-C to T-A transversion mutations. This enzyme is found in archaeal, bacterial and eukaryotic species, and is specifically responsible for the process which leads to the removal of 8-oxoguanine residues. It has DNA glycosylase activity () and DNA lyase activity () PUBMED:10706276. The region featured in this family is the N-terminal domain, which is organised into a single copy of a TBP-like fold. The domain contributes residues to the 8-oxoguanine binding pocket PUBMED:11902834.

    \ ' '7766' 'IPR012922' '\

    The sequences featured in this family are similar to a probable integrase () expressed by the SSV1 virus of the archaeon Sulfolobus shibatae. This protein may be necessary for the integration of the virus into the host genome by a process of site-specific recombination PUBMED:1926776.

    \ ' '7767' 'IPR012414' '\

    This family features the antihypertensive and antiviral proteins BDS-I () and BDS-II () expressed by Anemonia sulcata. BDS-I is organised into a triple-stranded antiparallel beta-sheet, with an additional small antiparallel beta-sheet at the N-terminus PUBMED:2566326. Both peptides are known to specifically block the Kv3.4 potassium channel, and thus bring about a decrease in blood pressure PUBMED:9506974. Moreover, they inhibit the cytopathic effects of mouse hepatitis virus strain MHV-A59 on mouse liver cells, by an unknown mechanism PUBMED:2566326.

    \ ' '7768' 'IPR012468' '\

    The members of this family are all hypothetical proteins of unknown function expressed by the eukaryotic parasite Encephalitozoon cuniculi GB-M1. The region in question is approximately 250 amino acids long.

    \ ' '7769' 'IPR012475' '\

    Lectins are involved in many recognition events at the molecular or cellular level. These fungal lectins, such as Aleuria aurantia lectin (AAL, ), specifically recognise fucosylated glycans. AAL is a dimeric protein, with each monomer being organised into a six-bladed beta-propeller fold and a small antiparallel two-stranded beta-sheet. The beta-propeller fold is important in fucose recognition; five binding pockets are found between the propeller blades. The small beta-sheet, on the other hand, is involved in the dimerisation process PUBMED:12732625.

    \ ' '7770' 'IPR012881' '\

    The members of this family are hypothetical eukaryotic proteins of unknown function. The region in question is approximately 100 amino acid residues long.

    \ ' '7771' 'IPR012480' '\

    This family features sequences that are similar to a region of the Flavobacterium heparinum proteins heparinase II () and heparinase III (). The former is known to degrade heparin and heparan sulphate, whereas the latter predominantly degrades heparan sulphate. Both are secreted into the periplasmic space upon induction with heparin PUBMED:8702264.

    \ ' '7772' 'IPR012897' '\

    Potassium channels are the most diverse group of the ion channel family\ PUBMED:1772658, PUBMED:1879548. They are important in shaping the action potential, and in neuronal excitability and plasticity PUBMED:2451788. The potassium channel family is\ composed of several functionally distinct isoforms, which can be broadly\ separated into 2 groups PUBMED:2555158: the practically non-inactivating \'delayed\' group and the rapidly inactivating \'transient\' group.

    \

    These are all highly similar proteins, with only small amino acid\ changes causing the diversity of the voltage-dependent gating mechanism,\ channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or\ other second messengers PUBMED:2448635. In eukaryotic cells, K+ channels\ are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes PUBMED:1373731. In prokaryotic cells, they play a role in the\ maintenance of ionic homeostasis PUBMED:11178249.

    \

    All K+ channels discovered so far possess a core of \ alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has\ been termed the K+ selectivity sequence.\ In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane.\ However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains.\ The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK)\ PUBMED:11178249, PUBMED:. The 2TM domain family comprises inward-rectifying K+ \ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.

    \

    The Kv family can be divided into several subfamilies on the basis of sequence similarity and function. Four of these subfamilies, Kv1 (Shaker), Kv2 (Shab), Kv3 (Shaw) and Kv4 (Shal), consist of pore-forming alpha subunits that associate with different types of beta subunit. Each alpha subunit comprises six hydrophobic TM domains with a P-domain between the fifth and sixth, which partially resides in the membrane. The fourth TM domain has positively charged residues at every third residue and acts as a voltage sensor, which triggers the conformational change that opens the channel pore in response to a displacement in membrane potential PUBMED:10712896. More recently, 4 new electrically-silent alpha subunits have been cloned: Kv5 (KCNF), Kv6 (KCNG), Kv8 and Kv9 (KCNS). These subunits do not themselves possess any functional activity, but appear to form heteromeric channels with Kv2 subunits, and thus modulate Shab channel activity PUBMED:9305895. When highly expressed, they inhibit channel activity, but at lower levels show more specific modulatory actions.

    \

    The first Kv1 sequence (also known as Shaker) was found in Drosophila melanogaster (Fruit fly). Several vertebrate potassium channels with similar amino acid sequences were subsequently found and, together with the D. melanogaster Shaker channel, now constitute the Kv1 family. The family consists of at least 6 genes (Kv1.1, Kv1.2, Kv1.3, Kv1.4, Kv1.5 and Kv1.6) which each play distinct physiological roles. A conserved motif found towards the C-terminus of these channels is required for efficient processing and surface expression PUBMED:11343973. Variations in this motif account for the differences in cell surface expression and localisation between family members. These channels are mostly expressed in the brain, but can also be found in non-excitable cells, such as lymphocytes PUBMED:10798390, PUBMED:.

    \

    This entry features the tandem inactivation domain found at the N-terminus of the Kv1.4 potassium channel. It is composed of two subdomains. Inactivation domain 1 (ID1, residues 1-38) consists of a flexible N-terminus anchored at a 5-turn helix, and is thought to work by occluding the ion pathway, as is the case with a classical ball domain. Inactivation domain 2 (ID2, residues 40-50) is a 2.5 turn helix with a high proportion of hydrophobic residues that probably serves to attach ID1 to the cytoplasmic face of the channel. In this way, it can promote rapid access of ID1 to the receptor site in the open channel. ID1 and ID2 function together to bring about fast inactivation of the Kv1.4 channel, which is important for the role of the channel in short-term plasticity PUBMED:12590144.

    \ ' '7773' 'IPR012901' '\

    This family features sequences that are similar to a region of hypothetical yeast gene product N2227 (). This is thought to be expressed during meiosis and may be involved in the defence response to stressful conditions PUBMED:8771715.

    \ ' '7774' 'IPR012907' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group of serine peptidases belongs to MEROPS peptidase family S11 (D-Ala-D-Ala carboxypeptidase A family, clan SE). The protein fold of the peptidase domain for members of this family resembles that of D-Ala-D-Ala-carboxypeptidase B, the type example for clan SE.

    \ \

    This entry also contains proteins that are annotated as penicillin-binding protein 6; these also belong to MEROPS peptidase family S11. Penicillin-binding protein 6 expressed by Escherichia coli () functions as a D-alanyl-D-alanine carboxypeptidase. It is composed of two domains that are oriented at approximately right angles to each other. The N-terminal domain () is the catalytic domain. The C-terminal domain, this entry, is organised into a sandwich of two anti-parallel beta-sheets, and has a relatively hydrophobic surface as compared to the N-terminal domain. Its precise function is unknown; it may mediate interactions with other cell wall-synthesising enzymes, thus allowing the protein to be recruited to areas of active cell wall synthesis. It may also function as a linker domain that positions the active site in the catalytic domain closer to the peptidoglycan layer, to allow it to interact with cell wall peptides PUBMED:10967102.

    \ ' '7775' 'IPR012878' '\

    The members of this family are sequences derived from hypothetical bacterial and eukaryotic proteins of unknown function. One member of this family is annotated as a possible arabinosidase, but no references were found to back this.

    \ ' '7776' 'IPR012499' '\

    This family includes three peptides secreted by the spider Hadronyche versuta (Blue mountains funnel-web spider) (, , ). These are insect-selective, excitatory neurotoxins that may function by antagonising muscle acetylcholine receptors, or acetylcholine receptor subtypes present in other invertebrate neurons PUBMED:10881200. Janus atracotoxin-Hv1c (J-ACTX-Hv1c, ) is organised into a disulphide-rich globular core (residues 3-19) and a beta-hairpin (residues 20-34). There are 4 disulphide bridges, one of which is a vicinal disulphide bridge; this is known to be unimportant in the maintenance of structure but critical for insecticidal activity PUBMED:10881200.

    \ ' '7777' 'IPR012879' '\

    The members of this family are all hypothetical eukaryotic proteins of unknown function. One member () is described as being an adipocyte-specific protein, but no evidence of this was found.

    \ ' '7778' 'IPR012506' '\

    The members of this family are similar to the hypothetical protein yhhN expressed by Escherichia coli (). Many of the members of this family are annotated as being possible transmembrane proteins, and in fact they all have a high proportion of hydrophobic residues.

    \ ' '7779' 'IPR012487' '\

    The sequences in this family are similar to the Dugbe virus (Dugbe nairovirus) M polyprotein precursor (), which includes glycoproteins G1 and G2. Both are thought to be inserted in the membrane of the Golgi complex of the infected host cell, and G1 is known to have a role in infection of vertebrate hosts PUBMED:1387749.

    \ ' '7780' 'IPR012505' '\

    The members of this family are all hypothetical bacterial proteins of unknown function, and are similar to the YbbR protein expressed by Bacillus subtilis (, ). One member () is annotated as an uncharacterised secreted protein, whereas another member () is described as a hypothetical protein in the 5\' region of the def gene of Thermus thermophilus, which encodes a deformylase PUBMED:7961514, but no further information was found in either case. This region is found repeated up to four times in many members of this family.

    \ ' '7781' 'IPR012472' '\

    This family of fungal proteins is uncharacterised. Each protein contains two copies of this region.

    \ ' '7782' 'IPR013104' '\

    The Clostridium neurotoxin family is composed of tetanus neurotoxin and seven serotypes of botulinum neurotoxin. The structure of the botulinum neurotoxin reveals a four-domain protein: the N-terminal catalytic domain (), the central translocation domain and two receptor-binding domains PUBMED:9783750. This domain is the C-terminal receptor-binding domain, which adopts a modified beta-trefoil fold with a six-stranded beta-barrel and a beta-hairpin triplet capping the domain PUBMED:9783750. The first step in the intoxication process is a binding event between this domain and the pre-synaptic nerve ending PUBMED:9783750.

    \ ' '7783' 'IPR012500' '\

    The Clostridium neurotoxin family is composed of tetanus neurotoxin and seven serotypes of botulinum neurotoxin. The structure of the botulinum neurotoxin reveals a four-domain protein: the N-terminal catalytic domain (), the central translocation domain and two receptor-binding domains PUBMED:9783750. Subsequent to cell surface binding and receptor-mediated endocytosis of the neurotoxin, an acid-induced conformational change in the neurotoxin translocation domain is believed to allow the domain to penetrate the endosome and form a pore, thereby facilitating the passage of the catalytic domain across the membrane into the cytosol PUBMED:9783750. The structure of the translocation domain reveals a pair of helices that are 105 Angstroms long, and is structurally distinct from other pore-forming toxins PUBMED:9783750.

    \ ' '7784' 'IPR012928' '\

    The Clostridium neurotoxin family is composed of tetanus neurotoxin and seven serotypes of botulinum neurotoxin. The structure of the botulinum neurotoxin reveals a four-domain protein: the N-terminal catalytic domain (), the central translocation domain and two receptor-binding domains PUBMED:9783750. This domain is the N-terminal receptor-binding domain, which comprises two seven-stranded beta-sheets sandwiched together to form a jelly roll motif PUBMED:9783750. The role of this domain in receptor binding appears to be indirect.

    \ ' '7785' 'IPR012470' '\

    Family of fungal proteins with unknown function. A member of this family has been found to localise in the mitochondria PUBMED:14562095.

    \ ' '7786' 'IPR012882' '\

    This is a family of uncharacterised fungal proteins.

    \ ' '7787' 'IPR012471' '\

    Family of uncharacterised fungal proteins.

    \ ' '7788' 'IPR012917' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This is a family of mitochondrial ribosomal proteins, which appears to be fungal specific PUBMED:1574929.

    \ ' '7789' 'IPR012469' '\

    A family of uncharacterised fungal proteins.

    \ ' '7790' 'IPR012887' '\

    In the salvage pathway of GDP-L-fucose, free cytosolic fucose is phosphorylated by L-fucokinase to form L-fucose-1-phosphate, which is then further converted to GDP-L-fucose in the reaction catalysed by GDP-L-fucose pyrophosphorylase PUBMED:14686921.

    \ ' '7791' 'IPR012420' '\

    The CBP4 gene in Saccharomyces cerevisiae is essential for the expression and activity of ubiquinol-cytochrome c reductase PUBMED:8063753, PUBMED:8811190. This family appears to be fungal specific.

    \ ' '7792' 'IPR012483' '\

    Mba1 is an inner membrane protein that is part of the mitochondrial protein export machinery PUBMED:16601683, PUBMED:11381092. It binds to the large subunit of mitochondrial ribosomes and cooperates with the C-terminal ribosome-binding domain of Oxa1, which is a central component of the insertion machinery of the inner membrane. In the absence of both Mba1 and the C-terminus of Oxa1, mitochondrial translation products fail to be properly inserted into the inner membrane and serve as substrates of the matrix chaperone Hsp70 PUBMED:8690083. It is proposed that Mba1 functions as a ribosome receptor that cooperates with Oxa1 in the positioning of the ribosome exit site to the insertion machinery of the inner membrane PUBMED:8690083.

    \ ' '7793' 'IPR012923' '\

    Replication fork pausing is required to initiate recombination events. More specifically, Swi1 is required for recombination near the mat1 locus. Swi3 has been found to co-purify with Swi1. Together they define a fork protection complex that coordinates leading- and lagging-strand synthesis and stabilises stalled replication forks PUBMED:15367656. This complex is required for accurate replication, fork protection and replication checkpoint signalling PUBMED:15367656, PUBMED:15371597.

    \ ' '7794' 'IPR012902' '\

    This short motif directs methylation of the conserved phenylalanine residue. It is most often found at the N-terminus of pilins and other proteins involved in secretion, see , , and .

    \ \

    This model describes many (but not all) examples of the N-terminal region of bacterial proteins that resemble type IV pilins at their N-terminus PUBMED:7934814. This domain contains a cleavage site G^FxxxE followed by a hydrophobic stretch. The new N-terminal residue produced after cleavage, usually Phe, is methylated. Separate domains of the prepilin peptidase appear to be responsible\ for cleavage and methylation. Proteins with this N-terminal region include type IV pilins and other components of pilus biogenesis, competence proteins, and type II secretion proteins. Typically several proteins in a single operon have this region.

    \ ' '7795' 'IPR012491' '\

    Rec10 / Red1 is involved in meiotic recombination and chromosome segregation during homologous chromosome formation. This protein localises to the synaptonemal complex in Saccharomyces cerevisiae and the analogous structures (linear elements) in Schizosaccharomyces pombe PUBMED:15226405. This family is currently only found in fungi.

    \ ' '7796' 'IPR012896' '\

    Integrins are the major metazoan receptors for cell adhesion to extracellular matrix proteins and, in vertebrates, also play important roles in certain cell-cell adhesions, make transmembrane connections to the cytoskeleton and activate many intracellular signalling pathways PUBMED:12297042, PUBMED:12361595. The integrin receptors are composed of alpha and beta subunit heterodimers. Each subunit crosses the membrane once, with most of the polypeptide residing in the extracellular space, and has two short cytoplasmic domains. Some members of this family have EGF repeats at the C terminus and also have a vWA domain inserted within the integrin domain at the N terminus.

    \

    Most integrins recognise relatively short peptide motifs, and in general require an acidic amino acid to be present. Ligand specificity depends upon both the alpha and beta subunits PUBMED:12234368. There are at least 18 types of alpha and 8 types of beta subunits recognised in humans PUBMED:14689578. Each alpha subunit tends to associate only with one type of beta subunit, but there are exceptions to this rule PUBMED:2467745. Each association of alpha and beta subunits has its own binding specificity and signalling properties. Many integrins require activation on the cell surface before they can bind ligands. Integrins frequently intercommunicate, and binding at one integrin receptor can activate or inhibit another.

    \

    The structure of unliganded alphaV beta3 showed the molecule to be folded, with the head bent over towards the C termini of the legs which would normally be inserted into the membrane PUBMED:12714499. The head comprises a beta propeller domain at the N terminus of the alphaV subunit and an I/A domain inserted into a loop on the top of the hybrid domain in the beta subunit. The I/A domain consists of a Rossmann fold with a core of parallel beta sheets surrounded by amphipathic alpha helices.

    \ \

    This entry represents the tail domain of the integrin beta subunit. It forms a four-stranded beta-sheet that contains parallel and antiparallel strands and faces an alpha helix found at the N-terminus of this domain PUBMED:11546839. Interactions between the alpha-helix and the beta-sheet are mostly hydrophobic and involve a disulphide bond. The rear of the beta sheet is covered with a long A-B loop.

    \ ' '7797' 'IPR012848' '\

    Most eukaryotic endopeptidases (MEROPS peptidase family A1) are synthesised with signal and propeptides. The animal pepsin-like endopeptidase propeptides form a distinct family of propeptides, which contain a conserved motif approximately 30 residues long. In pepsinogen A, the first 11 residues of the mature pepsin sequence are displaced by residues of the propeptide. The propeptide contains two helices that block the active site cleft; in particular, the conserved Asp11 residue in pepsin hydrogen bonds to a conserved Arg residue in the propeptide. This hydrogen bond stabilises the propeptide conformation and is probably responsible for triggering the conversion of pepsinogen to pepsin under acidic conditions PUBMED:1594574, PUBMED:2056534.

    \ ' '7798' 'IPR012935' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This zinc-finger like domain is distributed throughout the eukaryotic kingdom in NIPA (Nuclear interacting partner of ALK) and other proteins. NIPA is thought to perform an antiapoptotic role in nucleophosmin-anaplastic lymphoma kinase (ALK) mediated signalling events PUBMED:12748172. The domain is often repeated, with the second domain usually containing a large insert (approximately 90 residues) after the first three cysteine residues. The Schizosaccharomyces pombe protein containing this domain () is involved in mRNA export from the nucleus PUBMED:15357289.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '7799' 'IPR001340' '\

    Staphylococcus aureus is a Gram-positive coccus that grows in clusters or pairs, and is the major cause of nosocomial infections due to its multiple antibiotic resistant nature. Patients who are immuno-compromised (e.g., those suffering from third degree burns or chronic illness) are at risk from deep staphylococcal infections, such as osteomyelitis and pneumonia. Most skin infections are also caused by this bacterium PUBMED:9350200.

    \ \

    Many virulence mechanisms are employed by Staphylococci to induce pathogenesis: these can include polysaccharide capsules and exotoxins PUBMED:9350200. Examples of the latter are bi-component toxins, which involve the synergistic combination of an "S" and an "F" component PUBMED:9804914. These undergo conformational changes in their protein structure and form oligomeric pores in the target cell membrane upon recognition of certain host receptors. The main cells targeted are polymorphonuclear cells, monocytes, erythrocytes and macrophages. Examples of this protein family include: leucocidin, gamma-haemolysin and alpha-haemolysin.

    \ \

    Recently, the crystal structure of the S. aureus leucocidin "F" component (LukF) has been determined to 1.9A resolution PUBMED:10048924. This structure, which comprises a central 3-strand beta-sheet, with an N-terminal "latch", clarified the mechanism of virulence in the bi-component toxin. Further work using a different form of the leucocidin (LukF-PV) has suggested that it may be a representative fold for water-soluble transmembrane toxins PUBMED:10368297.

    \ ' '7800' 'IPR013108' '\

    Amidohydrolases are a diverse superfamily of enzymes which catalyse the hydrolysis of amide or amine bonds in a large number of different substrates including urea, cytosine, AMP, formylmethanofuran, etc PUBMED:9144792, PUBMED:11395407. Also included in this superfamily are the phosphotriesterase enzymes, which hydrolyse P-O bonds. Members participate in a large number of processes including nucleotide metabolism, detoxification and neuronal development. They use a variety of divalent metal cofactors for catalysis: for example adenosine deaminase binds a single zinc ion, phosphotriesterase binds two, while urease binds nickel. It has been postulated that since some of these proteins, such as some of those involved in neuronal development, appear to have lost their metal-binding centres, their function may simply be to bind, but not hydrolyse, their target molecules.

    \ \

    This entry represents a subset of amidohydrolase domains that participate in different functions including cytosine degradation, atrazine degradation and other metabolic processes. The structure of the domain from Escherichia coli has been studied, and like other amidohydrolases it forms a classical alpha-beta TIM-barrel fold PUBMED:11812140. The active site is located in the mouth of the enzyme barrel and contains a bound iron ion that coordinates a hydroxyl nucleophile. Substrate binding involves a significant conformational change that sequesters the reaction complex from solvent.

    \ ' '7801' 'IPR012936' '\

    This domain occurs in many hypothetical proteins, and also two partially characterised proteins. One of these proteins, PTX1 , is a homeodomain-containing transcription factor involved in regulating all pituitary hormone genes PUBMED:10067870. This protein is down regulated in prostate carcinoma PUBMED:11445006. The other protein, ERGIC-32 , is involved in protein transport from the ER to the Golgi PUBMED:15308636.

    \ ' '7802' 'IPR012939' '\

    This domain occurs within alpha-1,2-mannosidases, which remove alpha-1,2-linked mannose residues from Man(9)(GlcNAc)(2) by hydrolysis. They are critical for the maturation of N-linked oligosaccharides and ER-associated degradation PUBMED:10026209.

    \ ' '7803' 'IPR004465' '\ Ribonucleotide reductases (RNRs) are enzymes that provide the precursors of DNA synthesis. The three characterised classes of RNRs differ by their metal cofactor and their stable organic radical. Class Ib RNR is encoded in four different genes: nrdH, nrdI, nrdE and nrdF PUBMED:12686643. The exact function of NrdI within the ribonucleotide reductases has not yet been fully characterised.\ ' '7804' 'IPR012947' '\

    The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this SAD domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain PUBMED:10319817.

    \ ' '7805' 'IPR013111' '\

    A sequence about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF)\ has been shown PUBMED:, PUBMED:3282918, PUBMED:6607417, PUBMED:2288911, PUBMED:6334307 to be present, in a more\ or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to\ contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in\ what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in\ the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin\ G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide\ bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet.\ Subdomains between the conserved cysteines vary in length.

    \ \

    This entry contains EGF domains found in a variety of extracellular and membrane proteins.

    \ ' '7806' 'IPR004595' '\

    All proteins in this domain for which functions are known are components of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. It includes the yeast transcription factor Ssl1 (Suppressor of stem-loop protein 1) that is essential for translation initiation and affects UV resistance.

    \ \

    The C-terminal region is essential for transcription activity. This region binds three zinc atoms through two independent domains. The first contains a C4 zinc finger motif, whereas the second is characterised by a CX(2)CX(2-4)FCADCD motif. The solution structure of the second C-terminal domain revealed homology with the regulatory domain of protein kinase C PUBMED:10882739.

    \ ' '7807' 'IPR012941' '\

    Phenol hydroxylase is a homodimer which hydroxylates phenol to catechol, or similar products. The enzyme is comprised of three domains. The first two domains form the active site. The third domain, this domain, is involved in forming the dimerisation interface. The domain adopts a thioredoxin-like fold PUBMED:9634698.

    \ ' '7808' 'IPR013114' '\

    Fatty acid biosynthesis occurs by two distinct pathways: in fungi, mammals and mycobacteria, type I or associative fatty-acid biosynthesis (type I FAS) is accomplished by multifunctional proteins in which distinct domains catalyse specific reactions; in plants and most bacteria, type II or dissociative fatty-acid biosynthesis (type II FAS) is accomplished by distinct enzymes PUBMED:14684903.

    \

    Both FabZ and FabA catalyse the dehydration of beta-hydroxyacyl acyl carrier protein (ACP) to trans 2-enoyl ACP. However, FabZ and FabA display subtle differences in substrate specificities, whereby FabA is most effective on acyl ACPs of 9-11 carbon atoms in length, while FabZ is less specific. Unlike FabA, FabZ does not function as an isomerase and cannot initiate unsaturated fatty acid biosynthesis. However, only FabZ can act during the elongation of unsaturated fatty acid chains.

    \ \ \

    This enzyme domain has a HotDog fold.

    \ ' '7809' 'IPR012577' '\

    Members of this family include many hypothetical proteins. It also includes members of the NIPSNAP family, which have putative roles in vesicular transport PUBMED:9661659. This domain is often found in duplicate.

    \ ' '7810' 'IPR013117' '\

    This domain is found at the C terminus of intimin. Its structure has been solved and shown to have a C-lectin type of structure PUBMED:10835344. Intimin is a bacterial adhesion molecule involved in intimate attachment of enteropathogenic and enterohemorrhagic Escherichia coli to mammalian host cells. Intimin targets the translocated intimin receptor (Tir), which is exported by the bacteria and integrated into the host cell plasma membrane.

    \ ' '7811' 'IPR012944' '\

    This domain occurs in several hypothetical proteins. It also occurs in RagB, , a protein involved in signalling PUBMED:7499430 and SusD, , an outer membrane protein involved in nutrient binding PUBMED:11717282.

    \ ' '7812' 'IPR012598' '\

    This repeat is found in two hypothetical Plasmodium proteins.

    \ ' '7813' 'IPR012564' '\

    Members of this family are viral glycoproteins that form part of an envelope complex PUBMED:9733861.

    \ ' '7814' 'IPR012946' '\

    The X8 domain PUBMED:11115868 contains 6 conserved cysteine residues that presumably form three disulphide bridges. The domain is found in an Olive pollen allergen PUBMED:15004167 as well as at the C terminus of family 17 glycosyl hydrolases PUBMED:11554480. This domain may be involved in carbohydrate binding.

    \ ' '7815' 'IPR012937' '\

    This domain occurs in many hypothetical proteins. It also occurs in some prion-like proteins.

    \ ' '7816' 'IPR012942' '\

    SRR1 proteins are signalling proteins involved in regulating the circadian clock PUBMED:12533513.

    \ ' '7817' 'IPR012945' '\

    This domain is found in tubulin-binding cofactor C (or tubulin-specific chaperone C) (TBCC). TBCC is a folding cofactor that participates in tubulin biogenesis along with the other tubulin folding cofactors A (TBCA), B (TBCB), E (TBCE) and D (TBCD), as well as the GTP-binding protein Arl2 PUBMED:17184771, PUBMED:12225668.

    \ ' '7818' 'IPR012533' '\

    GLE1 is an essential nuclear export factor involved in RNA export.

    \ ' '7819' 'IPR012642' '\

    Proteins containing the Wos2 domain are involved in the regulation of the cell cycle PUBMED:10581266 and are Myb-related transcriptional activators.

    \ ' '7820' 'IPR012943' '\

    Proteins with this domain associate with the spindle body during cell division PUBMED:15004232.

    \ ' '7821' 'IPR012940' '\

    This domain occurs in some putative nucleic acid binding proteins. One of these proteins has been partially characterised PUBMED:15488989 and contains two putative phosphorylation sites and a possible dimerisation / leucine zipper domain.

    \ ' '7822' 'IPR013116' '\

    Acetohydroxy acid isomeroreductase catalyses the conversion of acetohydroxy acids into dihydroxy valerates. This reaction is the second in the synthetic pathway of the essential branched side chain amino acids valine and isoleucine.

    \ ' '7823' 'IPR013027' '\ This entry describes both class I and class II oxidoreductases. \ FAD flavoproteins belonging to the family of pyridine nucleotide-disulphide \ oxidoreductases (glutathione reductase, trypanothione reductase, lipoamide dehydrogenase, \ mercuric reductase, thioredoxin reductase, alkyl hydroperoxide reductase) share sequence \ similarity with a number of other flavoprotein oxidoreductases, in particular with \ ferredoxin-NAD+ reductases involved in oxidative metabolism of a variety of hydrocarbons \ (rubredoxin reductase, putidaredoxin reductase, terpredoxin reductase, ferredoxin-NAD+ \ reductase components of benzene 1,2-dioxygenase, toluene 1,2-dioxygenase, chlorobenzene \ dioxygenase, biphenyl dioxygenase), NADH oxidase and NADH peroxidase PUBMED:2319593, \ PUBMED:1404382, PUBMED:2067578. Comparison of the crystal structures of human glutathione \ reductase and Escherichia coli thioredoxin reductase reveals different locations of their active \ sites, suggesting that the enzymes diverged from an ancestral FAD/NAD(P)H reductase and \ acquired their disulphide reductase activities independently PUBMED:2067578. \

    \ Despite functional similarities, oxidoreductases of this family show no sequence \ similarity with adrenodoxin reductases PUBMED:2924777 and flavoprotein pyridine nucleotide\ cytochrome reductases (FPNCR) PUBMED:1748631. Assuming that disulphide reductase activity \ emerged later, during divergent evolution, the family can be referred to as FAD-dependent \ pyridine nucleotide reductases, FADPNR.

    \

    To date, 3D structures of glutathione reductase PUBMED:3656429, thioredoxin reductase \ PUBMED:2067578, mercuric reductase PUBMED:2067577, lipoamide dehydrogenase PUBMED:1880807, \ trypanothione reductase PUBMED:1924336 and NADH peroxidase PUBMED:1942054 have been solved. \ The enzymes share similar tertiary structures based on a doubly-wound alpha/beta fold, \ but the relative orientations of their FAD- and NAD(P)H-binding domains may vary \ significantly. By contrast with the FPNCR family, the folds of the FAD- and \ NAD(P)H-binding domains are similar, suggesting that the domains evolved by gene \ duplication PUBMED:7411611.\

    \ ' '7824' 'IPR013120' '\

    This family represents the C-terminal NAD-binding region of the male sterility protein from Arabidopsis and Drosophila. A sequence-related jojoba acyl CoA reductase is also included.

    \ ' '7825' 'IPR002587' '\

    1L-myo-Inositol-1-phosphate synthase () catalyzes the conversion of D-glucose 6-phosphate to 1L-myo-inositol-1-phosphate, the first committed step in the production of all inositol-containing compounds, including phospholipids, either directly or by salvage. The enzyme exists in a cytoplasmic form in a wide range of plants, animals, and fungi. It has also been detected in several bacteria and a chloroplast form is observed in algae and higher plants. Inositol phosphates play an important role in signal transduction.

    \

    In Saccharomyces cerevisiae (Baker\'s yeast), the transcriptional regulation of the INO1 gene has been studied in detail PUBMED:7975896 and its expression is sensitive to the availability of phospholipid precursors as well as growth phase. The regulation of the structural gene encoding 1L-myo-inositol-1-phosphate synthase has also been analyzed at the transcriptional level in the aquatic angiosperm, Spirodela polyrrhiza (Giant duckweed) and the halophyte, Mesembryanthemum crystallinum (Common ice plant) PUBMED:9370339.

    \ ' '7826' 'IPR012938' '\

    Proteins containing this domain are thought to be glucose/sorbosone dehydrogenases. The best characterised of these proteins is soluble glucose dehydrogenase () from Acinetobacter calcoaceticus, which oxidises glucose to gluconolactone. The enzyme is a calcium-dependent homodimer which uses PQQ as a cofactor PUBMED:10508152.

    \ ' '7827' 'IPR012991' '\

    Members of this family are components of the type IV secretion system. They mediate intracellular transfer of macromolecules via a mechanism ancestrally related to that of bacterial conjugation machineries.

    \ ' '7828' 'IPR012543' '\

    This family contains many hypothetical proteins.

    \ ' '7829' 'IPR012962' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This group represents archaeal zinc-dependent peptidases, and homologous sequences from bacteria and eukaryotes, belonging to the MEROPS peptidase family M54 (archaelysin, clan MA); though the family is more commonly known as the archaemetzincins. Two human homologues have been characterised and the family is widely distributed in the archaea PUBMED:15972818.

    \ ' '7830' 'IPR006518' '\

    These sequences are full-length and part-length members of the RHS (retrotransposon hot spot) family in Trypanosoma brucei and Trypanosoma cruzi. Members of this family are frequently interrupted by non-LTR retrotransposons inserted at exactly the same relative position.

    \ ' '7831' 'IPR012544' '\

    This family contains many bacterial hypothetical proteins.

    \ ' '7832' 'IPR012536' '\

    This is a family of unique short (US) cytoplasmic glycoproteins which are expressed in cytomegalovirus PUBMED:11992003.

    \ ' '7833' 'IPR012545' '\

    This family contains many hypothetical bacterial proteins.

    \ ' '7834' 'IPR010017' '\

    Methyl transfer from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalysed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol, for regulatory purposes. The various aspects of the role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes including gene regulation and differentiation are well documented.

    \ \

    Three classes of DNA Mtases transfer the methyl group from AdoMet to the target base to form either N-6-methyladenine, or N-4-methylcytosine, or C-5-methylcytosine. In C-5-cytosine Mtases, ten conserved motifs are arranged in the same order PUBMED:8127644. Motif I (a glycine-rich or closely related consensus sequence; FAGxGG in M.HhaI PUBMED:8343957), shared by other AdoMet-Mtases PUBMED:2684970, is part of the cofactor binding site and motif IV (PCQ) is part of the catalytic site. In contrast, sequence comparison among N-6-adenine and N-4-cytosine Mtases indicated two of the conserved segments PUBMED:2690010, although more conserved segments may be present. One of them corresponds to motif I in C-5-cytosine Mtases, and the other is named (D/N/S)PP(Y/F). Crystal structures are known for a number of Mtases PUBMED:7607476, PUBMED:8343957, PUBMED:8127644, PUBMED:7971991. The cofactor binding sites are almost identical and the essential catalytic amino acids coincide. The comparable protein folding and the existence of equivalent amino acids in similar secondary and tertiary positions indicate that many (if not all) AdoMet-Mtases have a common catalytic domain structure. This permits tertiary structure prediction of other DNA, RNA, protein, and small-molecule AdoMet-Mtases from their amino acid sequences PUBMED:7897657.

    \ \

    This entry represents a set of bacterial AdoMet-dependent tRNA (mo5U34)-methyltransferases. These enzymes catalyse the conversion of 5-hydroxyuridine (ho5U) to 5-methoxyuridine (mo5U) at the wobble position (34) of tRNA PUBMED:15383682. The 5-methoxyuridine is subsequently converted to uridine-5-oxyacetic acid, a modified nucleoside that is apparently necessary for the efficient decoding of G-ending Pro, Ala, and Val codons in these organisms PUBMED:17942742.

    \ ' '7835' 'IPR012546' '\

    This family contains many archaeal proteins which have very conserved sequences.

    \ ' '7836' 'IPR012983' '\

    This domain is called PHR as it was originally found in the proteins PAM (), highwire () and RPM (). This domain can be duplicated in the highwire, PAM and RPM sequences. The function of PHR is currently unclear.

    \ ' '7837' 'IPR012963' '\

    This family contains many hypothetical bacterial proteins and two putative membrane proteins ( and ).

    \ ' '7838' 'IPR013109' '\

    This family contains many hypothetical proteins that belong to the cupin superfamily.

    \ ' '7839' 'IPR012641' '\

    Members of this family are polydnaviral proteins that contain a cysteine-rich motif PUBMED:11724552. Some members of this family have multiple copies of this domain.

    \ ' '7840' 'IPR012616' '\

    The TOM13 family of proteins are mitochondrial outer membrane proteins that mediate the assembly of beta-barrel proteins PUBMED:15326197.

    \ ' '7841' 'IPR012596' '\

    Proteins in this family are bacteriophage GP30.3 proteins. Gene 30.3\' encodes a 75 amino acid basic peptide which has a C terminus rich in charged amino acids PUBMED:8088550. Of the pair of T4 overlapping genes, 30.3 and 30.3\', the latter is a smaller gene and is entirely enclosed within the other by one position downstream PUBMED:9272856.\

    \ ' '7842' 'IPR012547' '\

    This family contains many hypothetical bacterial proteins.

    \ ' '7843' 'IPR012964' '\

    This family of proteins contains many bacterial proteins that are encoded by the unbL gene. The function of these proteins is unknown.

    \ ' '7844' 'IPR012062' '\

    Escherichia coli and other enteric bacteria contain two closely related D-tagatose 1,6-bisphosphate (TagBP)-specific aldolases involved in catabolism of galactitol (genes gatY, gatZ) and of N-acetyl-galactosamine and D-galactosamine (genes kbaY, kbaZ, also called agaY, agaZ). The catalytic subunits GatY/KbaY alone are sufficient to show aldolase activity and contain most or all of the residues that have been identified as essential in substrate/product recognition and catalysis for class II aldolases PUBMED:11976750, PUBMED:8955298. However, these aldolases differ from other Class II aldolases (which are homodimeric enzymes) in that they require subunits GatZ/KbaZ for full activity and for good in vivo and in vitro stability. The Z subunits alone do not show any aldolase activity PUBMED:11976750. It should be noted that the previous suggestion of a tagatose 6P-kinase function for AgaZ PUBMED:8932697 and other members of this family turned out to be erroneous PUBMED:10931310, PUBMED:11976750.

    \ ' '7845' 'IPR012548' '\

    This family contains many hypothetical proteins.

    \ ' '7846' 'IPR012597' '\

    This family corresponds to mating-type pheromone proteins. The homobasidiomycetes, or mushroom fungi, have arguably the most complex mating system of all known organisms. Many species possess a mating system known as bifactorial incompatibility, where two unlinked incompatibility loci (the A and B mating-type loci) control the mating type of an individual. Each A mating-type sublocus encodes a pair of divergently transcribed homeodomain transcription factors, while the genes responsible for B mating-type activity encode lipopeptide pheromones and G-protein-coupled pheromone receptors PUBMED:15219565.

    \ ' '7847' 'IPR013122' '\

    This domain contains the cation channel region of PKD1 and PKD2 proteins.

    \ ' '7848' 'IPR012969' '\

    This entry represents bacterial proteins that bind to fibrinogen. Fibrinogen is capable of binding to a wide number of endogenous proteins and cell receptors during haemostasis, including binding to platelets to promote their aggregation PUBMED:15837518. This entry represents the fibrinogen receptor FbsA () from the pathogen Streptococcus agalactiae, which is responsible for causing endocarditis in humans. FbsA is considered an important virulence factor that is capable of binding fibrinogen, through which it elicits platelet aggregation and adherence to the extracellular matrix. This enables the bacteria to invade the pulmonary epithelium, which may be a prerequisite for infection PUBMED:15383464.

    \

    More information about these proteins can be found at Protein of the Month: Fibrinogen.

    \ ' '7849' 'IPR012520' '\

    This family includes antimicrobial peptides secreted from skins of frogs. The secretion of antimicrobial peptides from the skins of frogs plays an important role in the self defence of these frogs. Structural characterization of these peptides showed that they belonged to four known families: the brevinin-1 family, the esculentin-2 family, the ranatuerin-2 family and the temporin family PUBMED:10651828.

    \ ' '7850' 'IPR012549' '\

    Some members of this family are putative bacterial membrane proteins. This domain is found immediately N-terminal to the sulphatase domain in many sulphatases.

    \ ' '7851' 'IPR012550' '\

    This family contains many hypothetical proteins from bacteria and yeast.

    \ ' '7852' 'IPR013113' '\

    Proteins in this entry are siderophore-interacting FAD-binding proteins.

    \

    This entry includes the vibriobactin utilization protein ViuB, which is involved in the removal of iron from iron-vibriobactin complexes, as well as several hypothetical proteins.

    \ ' '7853' 'IPR013112' '\

    This FAD binding domain is associated with ferric reductase NAD binding proteins and the heavy chain of Cytochrome b-245.

    \ ' '7854' 'IPR012521' '\

    This family consists of the major classes of antimicrobial peptides secreted from the skin of frogs that protect the frogs against invading microbes. They are typically between 10 and 50 amino acids long and are derived from proteolytic cleavage of larger precursors. Major classes of peptides such as esculentin, gaegurin, brevinin, rugosin and ranatuerin are included in this family PUBMED:12470734.

    \ ' '7855' 'IPR012523' '\

    This family consists of the ponericin family of antimicrobial peptides isolated from the predatory ant Pachycondyla goeldii (Ponerine ant). The ponericin peptides may adopt an amphipathic alpha-helical structure in polar environments. In the ant colony, these peptides exhibit a defensive role against microbial pathogens arising from prey introduction and/or ingestion PUBMED:11279030.

    \ ' '7856' 'IPR012522' '\

    This family includes antimicrobial peptides isolated from the crude venom of the wolf spider Oxyopes kitabensis (Wolf spider). These peptides, known as oxyopinins, are the largest linear cationic amphipathic peptides chemically characterised and exhibit disrupting activities towards biological membranes PUBMED:11976325.

    \ ' '7857' 'IPR012524' '\

    This family consists of antimicrobial peptides produced by bees. These peptides have strong antimicrobial and some anti-fungal activity and have homology to abaecin, which is the largest proline-rich antimicrobial peptide isolated from the European bumblebee Bombus pascuorum PUBMED:9219367.

    \ ' '7858' 'IPR012512' '\

    The albumin I protein, a hormone-like peptide, stimulates kinase activity upon binding a membrane-bound 43 kDa receptor. The structure of this region reveals a knottin-like fold, comprised of three beta strands PUBMED:12631285.

    \ ' '7859' 'IPR013107' '\

    Acyl-CoA dehydrogenases () are enzymes that catalyse the first step in each cycle of beta-oxidation in mitochondria. Acyl-CoA dehydrogenases PUBMED:3326738, PUBMED:2777793, PUBMED:8034667 catalyze the alpha,beta-dehydrogenation of acyl-CoA thioesters to the corresponding trans 2,3-enoyl CoA-products with concomitant reduction of enzyme-bound FAD. Reoxidation of the flavin involves transfer of electrons to ETF (electron transferring flavoprotein). These enzymes are homodimers containing one molecule of FAD.

    The monomeric enzyme is folded into three domains of approximately equal size. The N-terminal and the C-terminal are mainly alpha-helices packed together, and the middle domain consists of two orthogonal beta-sheets. The flavin ring is buried in the crevice between two alpha-helical domains and the beta-sheet of one subunit, and the adenosine pyrophosphate moiety is stretched into the subunit junction with one formed by two C-terminal domains PUBMED:8356049. The C-terminal domain of Acyl-CoA dehydrogenase is an all-alpha, four helical up-and-down bundle.

    \ ' '7860' 'IPR013115' '\

    ATP phosphoribosyltransferase () is the enzyme that catalyzes the first step in the biosynthesis of histidine in bacteria, fungi and plants as shown below. It is a member of the larger phosphoribosyltransferase superfamily of enzymes which catalyse the condensation of 5-phospho-alpha-D-ribose 1-diphosphate with nitrogenous bases in the presence of divalent metal ions PUBMED:11751055.

    \ \ \ \

    Histidine biosynthesis is an energetically expensive process and ATP phosphoribosyltransferase activity is subject to control at several levels. Transcriptional regulation is based primarily on nutrient conditions and determines the amount of enzyme present in the cell, while feedback inhibition rapidly modulates activity in response to cellular conditions. The enzyme has been shown to be inhibited by 1-(5-phospho-D-ribosyl)-ATP, histidine, ppGpp (a signal associated with adverse environmental conditions) and ADP and AMP (which reflect the overall energy status of the cell). As this pathway of histidine biosynthesis is present only in prokaryotes, plants and fungi, this enzyme is a promising target for the development of novel antimicrobial compounds and herbicides.

    \ \

    This entry represents the C-terminal portion of ATP phosphoribosyltransferase. The enzyme itself exists in equilibrium between an active dimeric form, an inactive hexameric form and higher aggregates PUBMED:14741209, PUBMED:12511575. Interconversion between the various forms is largely reversible and is influenced by the binding of the natural substrates and inhibitors of the enzyme. This domain is not directly involved in catalysis but appears to be responsible for the formation of hexamers induced by the binding of inhibitors to the enzyme, thus regulating activity.

    \ ' '7861' 'IPR013121' '\

    This entry contains ferric reductase NAD binding proteins.

    \ ' '7862' 'IPR012951' '\

    This domain is found in the berberine bridge and berberine bridge-like enzymes, which are involved in the biosynthesis of numerous isoquinoline alkaloids. They catalyse the transformation of the N-methyl group of \ (S)-reticuline into the C-8 berberine bridge carbon of (S)-scoulerine PUBMED:8972604.

    \ ' '7863' 'IPR001250' '\

    Mannose-6-phosphate isomerase or phosphomannose isomerase () (PMI) is the enzyme that catalyses the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes PMI is involved in the synthesis of GDP-mannose, a constituent of N- and O-linked glycans and GPI anchors, and in prokaryotes it participates in a variety of pathways, including capsular polysaccharide biosynthesis and D-mannose metabolism. PMIs belong to the cupin superfamily whose functions range from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria and plants, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals PUBMED:11165500. Three classes of PMI have been defined PUBMED:8307007.

    \

    Type I includes eukaryotic PMI and the enzyme encoded \ by the manA gene in enterobacteria. PMI has a bound zinc ion, which is essential for activity.

    \

    A crystal structure of PMI from Candida albicans shows that the enzyme has three distinct domains PUBMED:8612079. The active site lies in the central domain, contains a single essential zinc atom, and forms a deep, open cavity of suitable dimensions to contain M6P or F6P. The central domain is flanked by a helical domain on one side and a jelly-roll like domain on the other.

    \ ' '7864' 'IPR013123' '\

    Most cellular RNAs undergo a number of post-transcriptional nucleoside modifications. While the biological role of many of these modifications is unknown, some have been shown to be necessary for cell growth or for resistance to antibiotics PUBMED:8266080, PUBMED:9187657. One of the most common modifications is 2\'O-ribose methylation catalysed by the RNA 2\'O-ribose methyltransferases, a large enzyme family that transfers a methyl group from S-adenosyl-L-methionine (AdoMet) to the 2\'-OH group of the backbone ribose PUBMED:9917067.

    \ \

    This entry represents a substrate-binding domain found in a variety of bacterial and mitochondrial RNA 2\'-O ribose methyltransferases. These include the bacterial enzyme RlmB, which specifically methylates the conserved nucleotide guanosine 2251 in 23S RNA, and PET56, which specifically methylates the equivalent guanosine in mitochondrial 21S RNA PUBMED:11698387, PUBMED:8266080. This domain forms a four-stranded mixed beta sheet similar to that found in other RNA binding enzymes PUBMED:12377117. It shows considerable conformational flexibility which is thought to be important for its ability to bind RNA.

    \ ' '7865' 'IPR012990' '\

    COPII (coat protein complex II)-coated vesicles carry proteins from the endoplasmic reticulum (ER) to the Golgi complex PUBMED:11535824. COPII-coated vesicles form on the ER by the stepwise recruitment of three cytosolic components: Sar1-GTP to initiate coat formation, Sec23/24 heterodimer to select SNARE and cargo molecules, and Sec13/31 to induce coat polymerisation and membrane deformation PUBMED:12239560.

    \ \

    Sec23p and Sec24p are structurally related, folding into five distinct domains: a beta-barrel, a zinc-finger (), an alpha/beta trunk domain (), an all-helical region (), and a C-terminal gelsolin-like domain (). This entry describes part of the Sec23/24 beta-barrel domain, which is formed from approximately 180 residues from three segments of the polypeptide. The strands of the barrel are oriented roughly parallel to the membrane such that one end of the barrel forms part of the inner surface of the coat and the other end part of the membrane-distal surface. The barrel is constructed from two opposed sheets: a six-stranded beta sheet facing partly towards the zinc finger domain and partly towards the solvent, and a five-stranded beta sheet facing the helical domain.

    \ ' '7866' 'IPR012615' '\

    This domain has been identified in a number of distantly related species of trematodes. This protein domain is crucial for eggshell synthesis in trematodes (Ebersberger I).

    \ ' '7867' 'IPR013532' '\

    Pro-opiomelanocortin is present in high levels in the pituitary and is processed into 3 major peptide families: adrenocorticotrophin (ACTH); alpha-, beta- and gamma-melanocyte-stimulating hormones (MSH); and beta-endorphin PUBMED:2266117. ACTH regulates the synthesis and release of glucocorticoids and, to some extent, aldosterone in the adrenal cortex. It is synthesised and released in response to corticotrophin-releasing factor at times of stress (e.g. heat, cold, infection), its release leading to increased metabolism. The action of MSH in man is poorly understood, but it may be involved in temperature regulation PUBMED:2266117. Full activity of ACTH resides in the first 20 N-terminal amino acids, the first 13 of which are identical to alpha-MSH PUBMED:2266117, PUBMED:2839146.

    \

    This region corresponds to the conserved YGG motif that is found in a wide variety of opioid neuropeptides such as enkephalin.

    \ ' '7868' 'IPR012525' '\

    This family consists of diapausin-related antimicrobial peptides. Diapause during periods of environmental adversity is an essential part of the life cycle of many organisms, with the molecular basis being different among animals. Diapause-specific peptides provide anti-fungal activity and act as N-type voltage-gated calcium channel blockers PUBMED:14706547.

    \ ' '7869' 'IPR012529' '\

    This family consists of the attractin family of water-borne pheromones. Mate attraction in Aplysia involves a long-distance water-borne signal in the form of the attractin peptide that is released during egg laying. These peptides contain 6 conserved cysteines and are folded into 2 antiparallel helices. The second helix contains the IEECKTS sequence conserved in Aplysia attractins PUBMED:15118100.

    \ ' '7870' 'IPR012621' '\

    This family consists of the TOM7 family of mitochondrial import receptors. TOM7 forms part of the translocase of the outer mitochondrial membrane (TOM) complex and it appears to function as a modulator of the dynamics of the mitochondrial protein transport machinery by promoting the dissociation of subunits of the outer membrane translocase PUBMED:9642296.

    \ ' '7871' 'IPR012574' '\

    This family consists of proteins with similarity to the mitochondrial proteolipids. Mitochondrial proteolipid consists of about 60 amino acid residues and is about 6.8 kDa in size PUBMED:2298292.

    \ ' '7872' 'IPR012575' '\

    This family consists of the MNLL subunits of NADH:ubiquinone oxidoreductase complex PUBMED:15581635. MNLL subunit is one of the many subunits found in the complex and it contains a mitochondrial import sequence. However, the role of MNLL subunit is unclear PUBMED:12644575.

    \

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '7873' 'IPR012595' '\

    This family consists of the PetM family of cytochrome b6f complex subunit IV. The cytochrome b6f complex consists of 7 subunits and contains 2 beta haemes and 1 chlorophyll alpha per cytochrome f. It is highly active in transferring electrons from decylplastoquinol to oxidised plastocyanin PUBMED:7493968.

    \ ' '7874' 'IPR011725' '\

    This entry describes a very small protein, coenzyme PQQ biosynthesis protein A, which is smaller than 25 amino acids in many species. It is proposed to serve as a peptide precursor of coenzyme pyrrolo-quinoline-quinone (PQQ), with Glu and Tyr of a conserved motif Glu-Xxx-Xxx-Xxx-Tyr becoming part of the product PUBMED:9467911.

    \ ' '7875' 'IPR012510' '\

    The repeat has the consensus sequence GDV(K/Q/R)(T/S/G)X(R/K/T) WLFETXPLD. This repeat motif is typically found in the N-terminus of the proteins, with a copy number between 2 and 28 repeats. Direct evidence for binding to and stabilising F-actin has been found in the human protein () PUBMED:16631741. The homologues in mouse and chicken localise in the adherens junction complex of the intercalated disc in cardiac muscle and in the myotendon junction of skeletal muscle. mXin may co-localise with Vinculin, which is known to attach the actin to the cytoplasmic membrane PUBMED:16631741. It has been shown that the amino-terminus of human xin (CMYA1) binds the EVH1 domain of Mena/VASP/EVL, and the carboxy-terminus binds domain 20 of filamin C, a domain unique within the filamin family PUBMED:12203715. This confirms the proposed role of xin repeat containing proteins as F-actin-binding adapter proteins.

    \ ' '7876' 'IPR012551' '\

    This domain is found in a variety of Actinomycetales proteins. All of the proteins containing this domain are hypothetical and probably membrane bound or associated. Currently, the function of this domain is unclear.

    \ ' '7877' 'IPR012535' '\

    Cdc14 is a component of the septation initiation network (SIN) and is required for the localisation and activity of Sid1. Sid1 is a protein kinase that localises asymmetrically to one spindle pole body (SPB) in anaphase and disappears prior to cell separation PUBMED:10775265, PUBMED:11384993.

    \ ' '7878' 'IPR012567' '\

    This family consists of the leader peptides of the ilvGEDA operon. The expression of the ilvGEDA operon of E. coli K-12 is multivalently controlled by the three branched-chain amino acids. Regulation is thought to occur by attenuation of transcription in response to the changing levels of the cognate tRNAs. Transcription of this operon is usually terminated at the end of the leader (regulatory) region PUBMED:3900037.

    \ ' '7879' 'IPR012565' '\

    This family consists of the leader peptide of the histidine (his) operon. The his operon contains all the genes necessary for histidine biosynthesis. The region corresponding to the untranslated 5\'-end of the transcript, named the his leader region, displays the typical features of the T box transcriptional attenuation mechanism which is involved in the regulation of many amino acid biosynthetic operons PUBMED:10094678.

    \ ' '7880' 'IPR012605' '\

    This entry represents the RepA1 leader peptide known as Tap found in IncFII plasmids. The frequency of replication of IncFII plasmid NR1 during the cell division cycle is regulated by the control of the synthesis of the plasmid-specific replication initiation protein (RepA1). When RepA1 is synthesised, it binds to the plasmid replication origin (ori) and effects the assembly of a replication complex composed of host proteins that mediate the replication of the plasmid PUBMED:1447133, PUBMED:1378398. The tap gene encodes a 24-amino acid peptide whose translation is required for the translation of repA.

    \ ' '7881' 'IPR012566' '\

    This family consists of the leader peptides of the ilvB operon. This region encodes a potential leader polypeptide containing 32 amino acids, 12 of which are the regulatory amino acids valine and leucine. A model for the multivalent regulation of this operon by valyl- and leucyl-tRNA is proposed on the basis of the mutually exclusive formation of five strong stem-and-loop structures in the leader mRNA PUBMED:6292893.

    \ ' '7882' 'IPR012618' '\

    The antibiotic tetracycline has a broad spectrum of activity, acting to inhibit bacterial protein synthesis by binding to the 30S ribosomal subunit, which prevents the association of the aminoacyl-tRNA to the ribosomal acceptor A site. Tetracycline binding is reversible, therefore diluting out the antibiotic can reverse its effects. Tetracycline resistance genes are often located on mobile elements, such as plasmids, transposons and/or conjugative transposons, which can sometimes be transferred between bacterial species. In certain cases, tetracycline can enhance the transfer of these elements, thereby promoting resistance amongst a bacterial colony. There are three types of tetracycline resistance: tetracycline efflux, ribosomal protection, and tetracycline modification PUBMED:16887689, PUBMED:15837373:

    \

    \

    \

    \

    The expression of several of these tet genes is controlled by a family of tetracycline transcriptional regulators known as TetR. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity PUBMED:15944459. The TetR proteins identified in over 115 genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response.

    \ \

    This entry represents the tetracycline resistance leader peptide, which can be found in Tet(L) efflux proteins. Tet(L) is a transmembrane protein that can function as a metal-tetracycline/H+ antiporter. Its sequence is preceded by a leader region that contains a 20-amino-acid open reading frame and an appropriately spaced ribosome binding site PUBMED:9988470. Expression of the gene is induced by addition of tetracycline, which is thought to act by binding to ribosomes that translate the tet(L) leader peptide coding sequence. The presence of three inverted repeats, which can form two different conformations of mRNA, suggests that the tetracycline resistance (TcR) region is regulated by a translational attenuation mechanism. A Rho-independent transcriptional terminator structure is present immediately after the translational stop codon of the Tet protein PUBMED:2996983.

    \ ' '7883' 'IPR012558' '\

    This family consists of erythromycin resistance gene leader peptides. These leader peptides are involved in the translational attenuation of erythromycin resistance genes. Interestingly, the consensus sequence of peptides conferring erythromycin resistance is similar to that of the leader peptides, thus indicating that a similar type of interaction between the nascent peptide and antibiotics can occur in both cases PUBMED:11587794.

    \ ' '7884' 'IPR012602' '\

    This family consists of the pyrBI operon leader peptides. The expression of the pyrBI operon, which encodes the subunits of the pyrimidine biosynthetic enzyme aspartate transcarbamylase, is regulated primarily through a UTP-sensitive transcriptional attenuation control mechanism. In this mechanism, the concentration of UTP determines the extent of coupling between transcription and translation within the pyrBI leader region, hence determining the level of rho-independent transcriptional termination at an attenuator preceding the pyrB gene PUBMED:7517939.

    \ ' '7885' 'IPR012620' '\

    This entry defines the apparent leader peptides of tryptophanase operons in Escherichia coli, Vibrio cholerae, Photobacterium profundum, Haemophilus influenzae, and related species. It has been suggested that these peptides act in cis to alter the behaviour of the translating ribosome PUBMED:9045840.

    \

    Tryptophanase, encoded by the tryptophanase (tna) operon, catalyses the degradation of L-tryptophan to indole, pyruvate and ammonia, enabling the bacteria to utilise tryptophan as a source of carbon, nitrogen and energy. The tna operon of Escherichia coli contains two major structural genes, tnaA and tnaB. Preceding tnaA in the tna operon is a 319-nucleotide transcribed regulatory region that contains the coding region for a 24-residue leader peptide, TnaC. The RNA sequence in the vicinity of the tnaC stop codon is rich in cytidylate residues, which are required for efficient Rho-dependent termination in the leader region of the tna operon PUBMED:14563884.

    \ ' '7886' 'IPR012570' '\

    This family consists of the leucine operon leader peptide. The leucine operon is involved in the control of the biosynthesis of leucine. Four adjacent leucine codons within the leucine leader RNA are critically important in transcription attenuation-mediated control of leucine operon expression in bacteria. The leader RNA contains translational start and stop signals, a cluster of four leucine codons and overlapping regions of dyad symmetry that are capable of forming stem-and-loop structures PUBMED:3922957.

    \ ' '7887' 'IPR012638' '\

    This family consists of the tryptophan (trp) leader peptides. Tryptophan accumulation is the principal event resulting in down regulation of transcription of the structural genes of the trp operon. The leader peptide of the trp operon forms mutually exclusive secondary structures that would either result in the termination of transcription of the trp operon when tryptophan is in plentiful supply or vice versa PUBMED:15262409.

    \ ' '7888' 'IPR012639' '\

    This family consists of the tryptophan operon leader peptides. The tryptophan operon is regulated by transcription attenuation in response to changes in the level of tryptophan. The transcript of the leader peptide can adopt alternative mutually-exclusive secondary structures that would either result in termination of transcription of the tryptophan structural genes or in transcription of the entire operon PUBMED:12213655.

    \ ' '7889' 'IPR012559' '\

    This family consists of erythromycin resistance gene leader peptides. These leader peptides are involved in the transcriptional attenuation control of the synthesis of the macrolide-lincosamide-streptogramin B resistance protein. It acts as a transcriptional attenuator, in contrast to other inducible erm genes. The mRNA leader sequence can fold in either of two mutually exclusive conformations, one of which is postulated to form in the absence of induction, and to contain two rho factor-independent terminators PUBMED:1713206.

    \ ' '7890' 'IPR012578' '\

    Proteins containing this domain are components of the nuclear pore complex PUBMED:12791264. One member of this domain is Nucleoporin POM34 () which is thought to have a role in anchoring peripheral Nups into the pore and mediating pore formation PUBMED:12791264.

    \ ' '7891' 'IPR012989' '\

    The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L PUBMED:15498563. Most SEP domains are succeeded closely by a UBX domain PUBMED:15498563.

    \ \

    This domain has a 2-layer beta(3)-alpha(2)-beta fold, and is present in a number of other proteins as well, including FAF1 (Fas-associated factor 1) and undulin 2. Many of these proteins also contain the UBX domain C-terminal to the FAF domain (). This domain is found in many eukaryotic proteins PUBMED:8524870.

    \ ' '7892' 'IPR012976' '\

    This is the central domain in Nop56/SIK1-like proteins PUBMED:15112237.

    \ ' '7893' 'IPR012587' '\

    This short region is found in two copies in p68-like RNA helicases PUBMED:15112237.

    \ ' '7894' 'IPR012586' '\

    This characteristic repeat of proliferating cell nuclear antigen P120 is found in three copies PUBMED:15112237.

    \ ' '7895' 'IPR012982' '\

    This domain is found in poly(ADP-ribose)-synthetases PUBMED:15112237. The function of this domain is unknown.

    \ ' '7896' 'IPR012993' '\

    This domain is characteristic of UVSB PI-3 kinase, MEI-41 and ESR1 PUBMED:15112237.

    \ ' '7897' 'IPR012568' '\

    This family represents the K167/Chmadrin repeat PUBMED:15112237. The function of this repeat is unknown.

    \ ' '7898' 'IPR012588' '\

    Exosomes are nano-compartments that function in the degradation or processing of RNA (including mRNA, rRNA, snRNA and snoRNA) PUBMED:15951817, PUBMED:17174896. Exosomes occur in both archaea and eukaryotes, and have a similar overall structure to each other and to bacterial/organelle PNPases (polynucleotide phosphorylases; ) PUBMED:17084501, consisting of a barrel structure composed of a hexameric ring of PH domains that act as a degradation chamber, and an S1-domain/KH-domain containing cap that binds the RNA substrate (and sometimes accessory proteins) in order to regulate and restrict entry into the degradation chamber PUBMED:16285927. There are two types of exosomes in eukaryotes, cytoplasmic exosomes that are responsible for 3\'-5\' exoribonuclease degradation of mRNAs, and nuclear exosomes that degrade pre-mRNAs (such as nonsense transcripts) and degrade rRNAs, snRNAs and snoRNAs. Unstructured RNA substrates feed in through the pore made by the S1 domains, are degraded by the PH domain ring, and exit as nucleotides via the PH pore at the opposite end of the barrel PUBMED:16713559, PUBMED:17380186.

    \

    There are several accessory proteins that help degrade, unwind or polyadenylate RNA substrates before they enter the exosome. This entry represents the N-terminal domain of Rrp6 (exosome component 10 in humans), a nuclear exosome accessory factor that interacts with the bottom of the hexameric PH-ring opposite the cap. Rrp6 functions as a hydrolytic exonuclease, and is homologous to RNase-D in Escherichia coli.

    \

    More information about these proteins can be found at Protein of the Month: RNA Exosomes.

    \ ' '7899' 'IPR012987' '\

    This presumed domain is found at the N terminus of RNP K-like proteins that also contain KH domains PUBMED:15112237.

    \ ' '7900' 'IPR012960' '\

    This is an N-terminal domain of Dyskerin-like proteins, which is often associated with the TruB N-terminal and PUA domains PUBMED:15112237.

    \ ' '7901' 'IPR012606' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This domain is found at the N terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021 PUBMED:15112237.

    \ ' '7902' 'IPR012542' '\

    The DTCHT region is the C-terminal part of DNA gyrase B / topoisomerase IV / HATPase proteins PUBMED:15112237. This region is composed of sequence of quite low complexity.

    \ ' '7903' 'IPR013843' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes yeast S7 (YS6); archaeal S4e; and mammalian and plant cytoplasmic S4 PUBMED:2124517. Two highly similar isoforms of mammalian S4 exist, one coded by a gene on chromosome Y, and the other on chromosome X. These proteins have 233 to 264 amino acids.

    \ \

    This entry represents the N-terminal region of these proteins.

    \ ' '7904' 'IPR012532' '\

    This is a C-terminal domain in the Bloom\'s syndrome DEAD helicase subfamily PUBMED:15112237. The helicase participates in DNA replication and repair, exhibiting a magnesium-dependent ATP-dependent DNA-helicase activity that unwinds single- and double-stranded DNA in a 3\'-5\' direction.

    \ ' '7905' 'IPR012958' '\

    The CHD N-terminal domain is found in PHD/RING fingers and chromo domain-associated helicases PUBMED:15112237.

    \ ' '7906' 'IPR012957' '\

    The CHDCT2 C-terminal domain is found in PHD/RING fingers and chromo domain-associated CHD-like helicases PUBMED:15112237.

    \ ' '7907' 'IPR012975' '\

    This domain is found C-terminal to 1 or 2 domains PUBMED:15112237 in NONA and PSP1 proteins.

    \ ' '7908' 'IPR012992' '\

    The antibiotic tetracycline has a broad spectrum of activity, acting to inhibit bacterial protein synthesis by binding to the 30S ribosomal subunit, which prevents the association of the aminoacyl-tRNA to the ribosomal acceptor A site. Tetracycline binding is reversible, therefore diluting out the antibiotic can reverse its effects. Tetracycline resistance genes are often located on mobile elements, such as plasmids, transposons and/or conjugative transposons, which can sometimes be transferred between bacterial species. In certain cases, tetracycline can enhance the transfer of these elements, thereby promoting resistance amongst a bacterial colony. There are three types of tetracycline resistance: tetracycline efflux, ribosomal protection, and tetracycline modification PUBMED:16887689, PUBMED:15837373:

    \

    \

    \

    \

    The expression of several of these tet genes is controlled by a family of tetracycline transcriptional regulators known as TetR. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity PUBMED:15944459. The TetR proteins identified in over 115 genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response.

    \ \

    This entry represents the tetracycline resistance leader peptide, which can be found in Tet(M) ribosomal protection proteins. A short open reading frame corresponding to a 28 amino acid peptide, which contains a number of inverted repeat sequences, was found immediately upstream of tet(M). Transcriptional analysis has found that expression of tet(M) resulted from an extension of a small transcript representing the upstream leader region into the resistance determinant. Therefore, this leader sequence is responsible for transcriptional attenuation and thus regulation of the transcription of tet(M) PUBMED:1323953.

    \ ' '7909' 'IPR012537' '\

    This family consists of chloramphenicol (Cm) resistance gene leader peptides. Inducible resistance to Cm in both Gram-positive and Gram-negative bacteria is controlled by translation attenuation. In translation attenuation, the ribosome-binding-site (RBS) for the resistance determinant is sequestered in a secondary structure domain within the mRNA. Preceding the secondary structure is a short, translated ORF termed the leader. Ribosome stalling in the leader causes the destabilisation of the downstream secondary structure, allowing initiation of translation of the Cm resistance gene PUBMED:8955642.

    \ ' '7910' 'IPR012986' '\

    This family consists of the PsaX family of photosystem I (PSI) protein subunits. PSI is a large multi-subunit pigment protein complex embedded in the thylakoid membranes of green plants and cyanobacteria. PsaX is one of the 12 protein subunits found in PSI and these subunits are arranged as monomers or trimers within the membrane as shown by the structure of the trimeric complex from Synechococcus elongatus PUBMED:14556907.

    \ ' '7911' 'IPR012988' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This presumed domain is found at the N terminus of Ribosomal L30 proteins and has been termed RL30NT or NUC018 PUBMED:15112237.

    \ ' '7912' 'IPR012996' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a putative zinc-binding domain (CHHC motif) in RNP H and F. The domain is often associated with .

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '7913' 'IPR012604' '\

    This region is found in RBM1-like RNA binding hnRNPs PUBMED:15112237.

    \ ' '7914' 'IPR012591' '\

    Pre-mRNA-processing-splicing factor 8 is a central component of the spliceosome, which may play a role in aligning the pre-mRNA 5\'- and 3\'-exons for ligation. It interacts with U5 snRNA, and with pre-mRNA 5\'-splice sites in B spliceosomes and 3\'-splice sites in C spliceosomes. It is part of the U5 snRNP complex, and of U5.4/6 and U5.U4atac/U6atac snRNP complexes in U2- and U12-dependent spliceosomes, respectively. It is also found in an mRNA splicing-dependent exon junction complex (EJC) with SRRM1 where it interacts with U5 snRNP proteins SNRP116 and WDR57/SPF38 PUBMED:15840809, PUBMED:9774689.

    \ ' '7915' 'IPR012592' '\

    The PROCN domain is the central domain in pre-mRNA splicing factors of PRO8 family PUBMED:15112237.

    \ ' '7916' 'IPR012984' '\

    The PROCT domain is the C-terminal domain in pre-mRNA splicing factors of PRO8 family PUBMED:15112237.

    \ ' '7917' 'IPR012556' '\

    This family consists of the entericidin antidote/toxin peptides. The entericidin locus is activated in stationary phase under high osmolarity conditions by rho-S and simultaneously repressed by the osmoregulatory EnvZ/OmpR signal transduction pathway. The entericidin locus encodes tandem paralogous genes (ecnAB) and directs the synthesis of two small cell-envelope lipoproteins which can maintain plasmids in bacterial populations by means of post-segregational killing PUBMED:9677290.

    \ ' '7918' 'IPR012622' '\

    The Ergtoxin (ErgTx) family is a class of peptides from scorpion venom that specifically block ERG (ether-a-go-go-related gene) K+ channels of the nerve, heart and endocrine cells PUBMED:11023354, PUBMED:12650941, PUBMED:12459475.

    \

    Peptides of the ErgTx family have from 42 to 47 amino acid residues cross-linked by four disulphide bridges. The four disulphide bridges have been assigned as C1-C4, C2-C6, C3-C7 and C5-C8 (see the schematic representation below) PUBMED:11023354. ErgTxs consist of a triple-stranded beta-sheet and an alpha-helix, as is typical of K+ channel scorpion toxins. There is a large hydrophobic patch on the surface of the toxin, surrounding a central lysine residue located near the beta-hairpin loop between the second and third strands of the beta-sheet. It has been postulated that this hydrophobic patch is likely to form part of the binding surface of the toxin PUBMED:12650941. Peptides of the ErgTx family possess a Knottin scaffold (see http://knottin.cbs.cnrs.fr).

    \ \

    Some proteins known to belong to the ErgTx family are listed below:

    \ \ ' '7919' 'IPR012623' '\

    This family consists of members of the conotoxin O-superfamily. The O-superfamily of conotoxins consists of 3 groups of Conus peptides that belong to the same structural group. These 3 groups differ in their pharmacological properties: the omega-conotoxins, which inhibit calcium channels; the delta-conotoxins, which slow down the inactivation rate of voltage-sensitive sodium channels; and the muO-conotoxins, which block the voltage-sensitive sodium currents PUBMED:7622492.

    \ ' '7920' 'IPR012624' '\

    This family consists of the I-superfamily of conotoxins. This is a new class of peptides in the venom of some Conus species. These toxins are characterised by four disulphide bridges and inhibit or modify ion channels of nerve cells. The I-superfamily conotoxins are found in five or six major clades of cone snails and could possibly be found in many more species PUBMED:15450929.

    \ ' '7921' 'IPR012625' '\

    This family consists of the huwentoxin-II (HWTX-II) family of toxins secreted by spiders. These toxins are found in venom secreted by the bird spider Selenocosmia huwena Wang. HWTX-II adopts a novel scaffold different from the ICK motif that is found in other huwentoxins. HWTX-II consists of 37 amino acid residues including six cysteines involved in three disulphide bridges PUBMED:15066414.

    \ ' '7922' 'IPR012557' '\

    Heat-stable toxin 1 of entero-aggregative Escherichia coli (EAST1) is a small toxin. It is not, however, solely associated with entero-aggregative E. coli but also with many other diarrhoeic E. coli families. Some studies have established the role of EAST1 in some human outbreaks of diarrhoea. Isolates from farm animals have been shown to carry the astA gene coding for EAST1. However, the relation between the presence of EAST1 and disease is not conclusive PUBMED:16336921.

    \ ' '7923' 'IPR012626' '\

    This family consists of insecticidal peptides isolated from the venom of the spiders Aptostichus schlingeri (Trap-door spider) and Calisoga sp. Nine insecticidal peptides were isolated from the venom of A. schlingeri, and seven of these toxins cause flaccid paralysis in insect larvae within 10 min of injection. However, all nine peptides were lethal within 24 hours PUBMED:1440641.

    \ ' '7924' 'IPR012627' '\

    This family consists of Magi peptide toxins (Magi 1, 2 and 5) isolated from the venom of Hexathelidae spider. These insecticidal peptide toxins bind to sodium channels and induce flaccid paralysis when injected into lepidopteran larvae. However, these peptides are not toxic to mice when injected intracranially at 20 pmol/g.

    \ ' '7925' 'IPR012628' '\

    This family consists of toxic peptides (Magi 5) found in the venom of the Hexathelidae spider. Magi 5 is the first spider toxin with binding affinity to site 4 of a mammalian sodium channel and the toxin has an insecticidal effect on larvae, causing paralysis when injected into the larvae.

    \ ' '7926' 'IPR012629' '\

    This family consists of conotoxins isolated from the venom of cone snail Conus tulipa and Conus geographus. Conotoxin TVIIA, isolated from Conus tulipa displays little sequence homology with other well-characterised pharmacological classes of peptides, but displays similarity with conotoxin GS, a peptide from Conus geographus. Both these peptides block skeletal muscle sodium channels and also share several biochemical features and represent a distinct subgroup of the four-loop conotoxins PUBMED:10903496.

    \ ' '7927' 'IPR012630' '\

    This family consists of the hefutoxins that are found in the venom of the scorpion Heterometrus fulvipes (Indian black scorpion). These toxins, kappa-hefutoxin1 and kappa-hefutoxin2, exhibit no homology to any known toxins. The hefutoxins are potassium channel toxins PUBMED:12034709.

    \ ' '7928' 'IPR012534' '\

    This family consists of the bombolitin peptides that are found in the venom of the bumblebee Megabombus pennsylvanicus (American common bumblebee). Bombolitins are structurally and functionally very similar. They lyse erythrocytes and liposomes, release histamine from rat peritoneal mast cells, and stimulate phospholipase A2 from different sources PUBMED:2578459.

    \ ' '7929' 'IPR012631' '\

    This family consists of the T-superfamily of conotoxins. Eight different T-superfamily peptides from five Conus species were identified. These peptides share a consensus signal sequence, and a conserved arrangement of cysteine residues. T-superfamily peptides were found expressed in venom ducts of all major feeding types of Conus, suggesting that the T-superfamily is a large and diverse group of peptides, widely distributed in the 500 different Conus species PUBMED:10521453.

    \ ' '7930' 'IPR012509' '\

    This entry occurs within the Anemonia sulcata toxin III (ATX III) neurotoxin family. ATX III is a neurotoxin that is produced by sea anemone; it adopts a compact structure containing four reverse turns and two other chain reversals, but no regular alpha-helix or beta-sheet. A hydrophobic patch found on the surface of the peptide may constitute part of the sodium channel binding surface PUBMED:7727358.

    \ ' '7931' 'IPR012632' '\

    Toxins of the scorpion calcine family bind directly to ryanodine receptors (RyRs), intracellular channel targets of the endoplasmic reticulum, and induce long lasting channel openings in a mode of smaller conductance. They have the ability to translocate into cells by crossing the plasma membrane PUBMED:10075681, PUBMED:10713267, PUBMED:15653689.

    \

    Toxins of the scorpion calcine family are highly basic 33-amino acid peptides that present three disulphide bridges (C1-C4, C2-C5, and C3-C6) and fold along a knottin or inhibitor cystine knot motif (http://knottin.cbs.cnrs.fr) PUBMED:10075681, PUBMED:10713267, PUBMED:15653689. Their three dimensional structure consists of a compact disulphide-bonded core from which emerge loops and the N-terminus. The main element of regular secondary structure is a double-stranded antiparallel beta-sheet. A third peripheral extended strand is almost perpendicular to the double-stranded antiparallel beta-sheet PUBMED:10713267, PUBMED:10861934. Scorpion calcines mimic the activating segment of the dihydropyridine receptor II-III loop, which interacts with a region of the ryanodine receptor PUBMED:10075681, PUBMED:10713267, PUBMED:12429019.

    \ \

    This family includes:

    \ \ ' '7932' 'IPR012967' '\

    This domain is found at the N terminus of a variety of plant O-methyltransferases. It has been shown to mediate dimerisation of these proteins PUBMED:11224575.

    \ ' '7933' 'IPR012965' '\

    This is a fungal domain of unknown function, though the yeast protein MSB1, which contains this domain, is thought to play a role in bud formation PUBMED:1996092.

    \ ' '7934' 'IPR012526' '\

    This family consists of antimicrobial peptides secreted by scorpions. Novel antimicrobial peptides have been isolated from scorpions, namely the opistoporin PUBMED:12354111 and the pandinin PUBMED:11563967. These peptides form essentially helical structures and demonstrate high antimicrobial activity against Gram-negative and Gram-positive bacteria respectively.

    \ ' '7935' 'IPR012527' '\

    This family consists of the uperin family of antimicrobial peptides. Uperin is a wide-spectrum antibiotic peptide isolated from the Australian toadlet, Uperoleia mjobergii. Being only 17 amino acid residues long, it is smaller than most other wide-spectrum antibiotic peptides isolated from amphibians. Uperin adopts a well-defined amphipathic alpha-helix with distinct hydrophilic and hydrophobic faces PUBMED:10461748.

    \ ' '7936' 'IPR012528' '\

    This family consists of the ponericin L family of antimicrobial peptides that are isolated from the venom of the predatory ant Pachycondyla goeldii (Ponerine ant). Ponericin L family shares similarities with dermaseptins. Ponericin L may adopt an amphipathic alpha-helical structure in polar environments and these peptides exhibit a defensive role against microbial pathogens arising from prey introduction and/or ingestion PUBMED:11279030.

    \ ' '7937' 'IPR012513' '\

    This family consists of the metchnikowin family of antimicrobial peptides from Drosophila. Metchnikowin is a proline-rich peptide whose expression is immune-inducible. Induction of metchnikowin gene expression can be mediated either by the TOLL pathway or by the imd gene product. The metchnikowin peptide is unique among the Drosophila antimicrobial peptides in that it is active against both bacteria and fungi PUBMED:9600835.

    \ ' '7938' 'IPR012514' '\

    This family consists of the formaecin family of antimicrobial peptides isolated from the bulldog ant Myrmecia gulosa in response to bacterial infection. Formaecins are inducible peptide antibiotics that are active against growing Escherichia coli but inactive against other Gram-negative and Gram-positive bacteria. Formaecin peptides are 16 amino acids long, are rich in proline and have N-acetylgalactosamine O-linked to a conserved threonine PUBMED:9497332.

    \ ' '7939' 'IPR012515' '\

    This family consists of the pleurocidin family of antimicrobial peptides. Pleurocidins are found in the skin mucous secretions of the winter flounder (Pleuronectes americanus) and these peptides exhibit antimicrobial activity against Escherichia coli. Pleurocidin is predicted to assume an amphipathic alpha-helical conformation similar to other linear antimicrobial peptides and may play a role in innate host defence PUBMED:9115266.

    \ ' '7940' 'IPR012516' '\

    This family consists of the halocidin family of antimicrobial peptides. Halocidins are isolated from the haemocytes of the tunicate, Halocynthia aurantium (Sea peach). They are dimeric in structure, formed via a disulphide linkage between cysteines of two different-sized monomers. Halocidins have been shown to have strong antimicrobial activities against a wide variety of pathogenic bacteria and could be ideal candidates as peptide antibiotics against multidrug-resistant bacteria PUBMED:12067731.

    \ ' '7941' 'IPR012517' '\

    This family consists of lactocin 705 which is a bacteriocin produced by Lactobacillus casei CRL 705. Lactocin 705 is a class IIb bacteriocin, whose activity depends upon the complementation of two peptides (705-alpha and 705-beta) of 33 amino acid residues each. Lactocin 705 is active against several Gram-positive bacteria, including food-borne pathogens and is a good candidate to be used for biopreservation of fermented meats PUBMED:10754241.

    \ ' '7942' 'IPR012518' '\

    This family consists of the ocellatin family of antimicrobial peptides. Ocellatins are produced from the electrically stimulated skin secretions of the South American frog, Leptodactylus ocellatus (Argus frog). The family consists of three structurally related peptides, ocellatin 1, ocellatin 2 and ocellatin 3. These peptides present haemolytic activity against human erythrocytes and are also active against Escherichia coli PUBMED:15648972.

    \ ' '7943' 'IPR012593' '\

    This family consists of the PEA-VEAacid neuropeptide family. These neuropeptides are isolated from the abdominal perisympathetic organs of the American cockroach. These peptides are found together with Pea-YLS-amide and Pea-SKNacid, giving a unique neuropeptide pattern in abdominal perisympathetic organs. The functions of these neuropeptides are unknown PUBMED:10676456.

    \ ' '7944' 'IPR012508' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    A-ATPases (or A1A0-ATPase) are found exclusively in Archaea and display a close resemblance in structure and subunit composition with V-ATPases, although their function in both ATP synthesis and ATP hydrolysis is closer to that of F-ATPases PUBMED:10340845. A-ATPases are composed of two linked complexes: the A1 complex consisting of seven subunits contains the catalytic core that synthesizes/hydrolyses ATP, while the A0 complex consisting of at least two subunits forms the membrane-spanning pore PUBMED:8702544. The rotary motor in A-ATPases is composed of only two subunits, the stator subunit I and the rotor subunit C PUBMED:15168615. A-ATPases may have arisen as an adaptation to the different cellular needs and the more extreme environmental conditions faced by Archaeal species.

    \

    The epsilon subunit is the smallest (7 kDa) of those found in the A1 complex. Unlike the A, B and C subunits, the epsilon subunit does not have a homologous counterpart in F- or V-ATPases PUBMED:2147683.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '7945' 'IPR012538' '\

    This family consists of the cytochrome c oxidase subunit IIa family. The ba3-type cytochrome c oxidase from Thermus thermophilus is known as a two subunit enzyme. From its crystal structure, it was discovered that an additional transmembrane helix, subunit IIa, spans the membrane. This subunit consists of 34 residues forming one helix across the membrane. The presence of this subunit seems to be important for the function of cytochrome c oxidases PUBMED:11152118.

    \ ' '7946' 'IPR012589' '\

    This family consists of small proteolipids associated with the plasma membrane H+ ATPase. Two proteolipids (PMP1 and PMP2) are associated with the ATPase. Both genes are similarly expressed in the wild-type strain of yeast, and no modification of the level of transcription of one PMP gene is detected in a strain deleted for the other. Though both proteolipids show similarity with other small proteolipids associated with other cation-transporting ATPases, their functions remain unclear PUBMED:8063750.

    \ ' '7947' 'IPR012633' '\

    This family consists of the SFI family of spider toxins. This family of toxins might share structural, evolutionary and functional relationships with other small, highly structurally constrained spider neurotoxins. These toxins are highly selective agonists/antagonists of different voltage-dependent calcium channels and are extremely valuable reagents in the analysis of neuromuscular function.

    \ ' '7948' 'IPR012634' '\

    This family consists of PhTx insecticidal neurotoxins that are found in the venom of Phoneutria nigriventer (Brazilian armed spider). The venom of P. nigriventer contains numerous neurotoxic polypeptides of 30-140 amino acids, which exert a range of biological effects. While some of these neurotoxins are lethal to mice after intracerebroventricular injection, others are extremely toxic to insects of the orders Diptera and Dictyoptera but have much weaker toxic effects on mice PUBMED:10978749.

    \ ' '7949' 'IPR012325' '\

    Assassin bugs (Arthropoda:Insecta:Hemiptera:Reduviidae), sometimes known as conenoses or kissing bugs, are one of the largest and most morphologically diverse families of true bugs, feeding on crickets, caterpillars and other insects. Some assassin bug species are bloodsucking parasites of mammals, including humans. They are commonly found throughout most of the world, and their size varies from a few millimetres to as much as 3 or 4 centimetres. The toxic saliva of the predatory assassin bugs contains a complex mixture of small and large peptides for diverse uses such as immobilizing and pre-digesting their prey, and defence against competitors and predators. Assassin bug toxins are small peptides with disulphide connectivity that target ion channels. They show some homology to the omega-conotoxins, calcium channel blockers from marine cone snails, and belong to the four-loop cysteine scaffold structural class PUBMED:11423127, PUBMED:11669615.

    \ \

    One of these small proteins, Ptu1, reversibly blocks N-type calcium channels but is less specific for L- or P/Q-type calcium channels PUBMED:11423127. Ptu1 is 34 amino acid residues long and is cross-linked by three disulphide bridges. Ptu1 contains a beta-sheet region made of two antiparallel beta-strands and consists of a compact disulphide-bonded core from which four loops emerge, as well as the N- and C-termini PUBMED:11669615. Some assassin bug toxins are listed below:

    \ \

    \ ' '7950' 'IPR012571' '\

    Proteins in this family are yeast mitochondrial inner membrane proteins MDM31 and MDM32. These proteins are required for the maintenance of mitochondrial morphology, and the stability of mitochondrial DNA PUBMED:15631992.

    \ ' '7951' 'IPR012635' '\

    This family consists of acidic alpha-KTx short chain scorpion toxins. These toxins, named parabutoxins, block voltage-gated K+ channels and have extremely low pI values. Furthermore, they lack the crucial pore-plugging lysine. In addition, the second important residue of the dyad, the hydrophobic residue (Phe or Tyr), is also missing PUBMED:14561751.

    \ ' '7952' 'IPR012636' '\

    This family consists of the tamulustoxins, which are found in the venom of Mesobuthus tamulus (Eastern Indian scorpion) (Buthus tamulus). Tamulustoxin shares no similarity with other scorpion venom toxins, although the positions of its six cysteine residues suggest that it shares the same structural scaffold. Tamulustoxin acts as a potassium channel blocker PUBMED:11361010.

    \ ' '7953' 'IPR012637' '\

    This family consists of the lethal peptides (waglerins) that are found in the venom of Trimeresurus wagleri (Wagler\'s pit viper) (Tropidolaemus wagleri). Waglerins are 22-24 residue lethal peptides and are competitive antagonists of the muscle nicotinic receptor (nAChR). Waglerin-1 possesses a distinctive selectivity for the alpha-epsilon interface binding site of the mouse nAChR PUBMED:8533138.

    \ ' '7954' 'IPR012576' '\

    This family consists of the B12 subunit of NADH:ubiquinone oxidoreductase proteins. The function of this subunit is unclear PUBMED:9425316.

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by their reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49 kDa subunits of complex I PUBMED:18982432.

    \ \ ' '7955' 'IPR013110' '\

    The DOT1 domain regulates gene expression by methylating histone H3 PUBMED:15292170. H3 methylation by DOT1 has been shown to be required for the DNA damage checkpoint in yeast PUBMED:15632126.

    \ ' '7956' 'IPR012970' '\

    This family consists of a group of secreted bacterial lyase enzymes capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen.

    \ ' '7957' 'IPR013118' '\

    Long-chain mannitol dehydrogenases are a group of secondary alcohol dehydrogenases that differ from other alcohol or polyol dehydrogenases in that they do not utilise Zn(2+) or other metal cofactors and do not contain a conserved catalytic tyrosine residue. The proteins in this family that have been studied are monomeric enzymes of ~54 kDa and include:\

    \ These enzymes are mostly found in bacteria, though they are also present in some fungal species.

    \ \

    This entry represents the C-terminal substrate-binding domain of long-chain mannitol dehydrogenases. This domain is primarily alpha-helical in nature, being composed of eleven helices and a small beta hairpin PUBMED:12196534. Most of the residues implicated in substrate binding are located within this region, and a conserved lysine residue is thought to act as a proton acceptor during catalysis.

    \ ' '7958' 'IPR012600' '\

    This motif is found at the N-terminal end of some peptidases that belong to MEROPS peptidase family C25. Little is known about its function.

    \ ' '7959' 'IPR012599' '\

    This domain is found at the N terminus of cathepsin B and cathepsin B-like peptidases that belong to MEROPS peptidase subfamily C1A. Cathepsin B enzymes are lysosomal cysteine proteinases belonging to the papain superfamily and are unique in their ability to act as both endo- and exopeptidases. They are synthesized as inactive zymogens. Activation of the peptidases occurs with the removal of the propeptide PUBMED:7890671, PUBMED:8740363.

    \ ' '7961' 'IPR012950' '\

    This family consists of the alpha and beta enterocins and lactococcin G peptides. These peptides have some antimicrobial properties; they inhibit the growth of Enterococcus spp. and a few other Gram-positive bacteria. These peptides act as pore-forming toxins that create cell membrane channels through a barrel-stave mechanism and thus produce an ionic imbalance in the cell. This family of antimicrobial peptides belongs to the class II group of bacteriocins PUBMED:10742203.

    \ ' '7962' 'IPR012519' '\

    This family consists of the type A lantibiotic peptides. Both Pep5 and epicidin-280 are ribosomally-synthesised antimicrobial peptides produced by Gram-positive bacteria that are characterised by the presence of lanthionine and/or methyllanthionine residues. The lantibiotics family has a highly specific activity against multidrug-resistant bacteria and has potential to be utilised in a wide range of medical applications PUBMED:2253617, PUBMED:9726851.

    \ ' '7963' 'IPR012553' '\

    This family consists of the defensin-like peptides (DLPs) isolated from platypus venom. These DLPs show a three-dimensional fold similar to that of beta-defensin-12 and the sodium-channel neurotoxin Shl. However, the side chains known to be functionally important to beta-defensin-12 and Shl are not conserved in DLPs. This suggests a different biological function. Consistent with this contention, DLPs have been shown to possess no anti-microbial properties and have no observable activity on rat dorsal-root-ganglion sodium-channel currents PUBMED:10417345.

    \ ' '7964' 'IPR012511' '\

    This family consists of the S-adenosyl-L-methionine decarboxylase (AdoMetDC) leader peptides. AdoMetDC is a key regulatory enzyme in the biosynthesis of polyamines. All expressed plant AdoMetDC mRNA 5\' leader sequences contain a highly conserved pair of upstream ORFs (uORFs) that overlap by one base. Sequences of the small uORFs are highly conserved between monocot, dicot and gymnosperm AdoMetDC mRNA species, suggesting a translational regulatory mechanism PUBMED:11139406.

    \ ' '7965' 'IPR012585' '\

    This family consists of the anticodon nuclease activator proteins. Pre-existing host tRNAs are reprocessed during Bacteriophage T4 infection of certain Escherichia coli strains. In this pathway, tRNA(Lys) is cleaved 5\' to the wobble base by the anticodon nuclease and is later restored in polynucleotide kinase and RNA ligase reactions PUBMED:3280805.

    \ ' '7966' 'IPR012995' '\

    This family consists of the CIII family of regulatory proteins. The lambda CIII protein has 54 amino acids and it forms an amphipathic helix within its amino acid sequence. Lambda CIII stabilises the lambda CII protein and the host sigma factor 32, responsible for transcribing genes of the heat shock regulon PUBMED:8990286.

    \ ' '7967' 'IPR012555' '\

    This family consists of the major transforming proteins (E5) of the bovine papilloma virus (BPV). The equine sarcoid is one of the most common dermatological lesions in equids. It is a benign, locally invasive dermal fibroblastic lesion, and studies have shown an association of the lesions with BPV. E5 is a short hydrophobic membrane protein localising to the Golgi apparatus and other intracellular membranes. It binds to and constitutively activates the platelet-derived growth factor-beta receptor in transformed cells. This stimulation activates a receptor signalling cascade which results in an intracellular growth stimulatory signal PUBMED:12951274.

    \ ' '7968' 'IPR012607' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This family consists of the 30S ribosomal subunit S22 polypeptides. This polypeptide is 47 amino acids in length and has a molecular weight of about 5 kDa. S22 is a stationary-phase-specific ribosomal protein and is assembled into the ribosomal particles in the stationary phase. This protein, along with other stationary-phase-specific ribosomal proteins, results in compositional changes of ribosomes during the stationary phase. The significance of this change is not yet clear PUBMED:11168583.

    \ ' '7969' 'IPR012552' '\

    This family consists of the DVL family of proteins. The DEVIL (DVL) gene was identified in a gain-of-function genetic screen for genes that influence fruit development in Arabidopsis. DVL is a small protein, and overexpression of the protein results in pleiotropic phenotypes featuring shortened stature, rounder rosette leaves, clustered inflorescences, shortened pedicels, and siliques with pronged tips. The DVL family is a novel class of small polypeptides, and the overexpression phenotypes suggest that these polypeptides may have a role in plant development PUBMED:14871303.

    \ ' '7970' 'IPR012608' '\

    This family consists of Sex Peptides (SP) that are found in Drosophila. On mating, a Drosophila female decreases her remating rate and increases her egg-laying rate due, in part, to the transfer of SP from the male to the female. SP are found in seminal fluids transferred from the male to the female during mating. The male seminal fluid proteins are referred to as accessory gland proteins (Acps). SP is one of the most interesting Acps and plays an important role in reproduction PUBMED:12913117.

    \ ' '7971' 'IPR012640' '\

    This family consists of the homologues of the VirB proteins of type IV secretion systems (T4SS). Conjugal transfer across the cell envelope of Gram-negative bacteria is mediated by a supramolecular structure termed mating pair formation (Mpf) complex. Collectively, secretion pathways ancestrally related to bacterial conjugation systems are now known as T4SS. T4SS are involved in the delivery of effector molecules to eukaryotic target cells; each of these systems exports distinct DNA or protein substrates to effect a myriad of changes in host cell physiology during infection PUBMED:11309113.

    \ ' '7972' 'IPR012539' '\

    This family consists of the cuticle proteins from Cancer pagurus (Rock crab) and Homarus americanus (American lobster). These proteins are isolated from the calcified regions of the crustacean exoskeleton, and they contain two copies of an 18 residue sequence motif, which thus far has been found only in crustacean calcified exoskeletons PUBMED:10425740.

    \ ' '7973' 'IPR012610' '\

    This family consists of the small acid-soluble spore proteins (SASP) of the H type (sspH). SspH are unique to spores of Bacillus subtilis and are expressed only in the forespore compartment during sporulation of this organism. The sspH genes are monocistronic and are recognised by the forespore-specific sigma factor for RNA polymerase, sigma-G. The specific role of this protein is unclear, but it is thought to play a role in sporulation under conditions different from those of the common laboratory tests of spore properties PUBMED:10333516.

    \ ' '7974' 'IPR012948' '\

    This domain is the central domain of AARP2 (asparagine and aspartate rich protein 2). It is weakly similar to the GTP-binding domain of elongation factor TU PUBMED:15112237. PfAARP2 is an antigen from Plasmodium falciparum of 150 kDa, which is encoded by a unique gene on chromosome 1 PUBMED:9247928. The central region of Pfaarp2 contains blocks of repetitions encoding asparagine and aspartate residues.

    \ ' '7975' 'IPR012956' '\

    This N-terminal domain is found in CARG-binding factor A-like proteins PUBMED:15112237.

    \ ' '7976' 'IPR012959' '\

    This C-terminal domain is found in Penguin-like proteins and is associated with Pumilio like repeats PUBMED:15112237.

    \ ' '7977' 'IPR012953' '\

    WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase PUBMED:11814058, PUBMED:10322433. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.

    \

    This N-terminal domain is found in BOP1-like WD40 proteins. Bop1 is a nucleolar protein involved in rRNA processing, thereby controlling the cell cycle PUBMED:16362343. It is required for the maturation of the 25S and 5.8S ribosomal RNAs. It may serve as an essential factor in ribosome formation that coordinates processing of the spacer regions in pre-rRNA. The Pes1-Bop1 complex has several components (BOP1, GRWD1, PES1, ORC6L, and RPL3) and is involved in ribosome biogenesis and altered chromosome segregation. The overexpression of BOP1 increases the percentage of multipolar spindles in human cells. Deregulation of the BOP1 pathway may contribute to colorectal tumourigenesis in humans PUBMED:16804918. Elevated levels of Bop1 induce Bop1/WDR12 and Bop1/Pes1 subcomplexes, and the assembly and integrity of the PeBoW complex are highly sensitive to changes in Bop1 protein levels PUBMED:17353269.

    \ \

    Nop7p-Erb1p-Ytm1p, found in yeast, is potentially the homologous complex of Pes1-Bop1-WDR12, as it is involved in the control of ribosome biogenesis and S phase entry. The integrity of the PeBoW complex is required for ribosome biogenesis and cell proliferation in mammalian cells PUBMED:16043514. In Giardia, the species-specific cytoskeleton protein, beta-giardin, interacts with Bop1 PUBMED:16362343.\

    \ ' '7978' 'IPR012954' '\

    This C-terminal domain is found in BAP28-like nucleolar proteins PUBMED:15112237. The bap28 mutation leads to abnormalities in the brain, starting at midsomitogenesis stages. Mutant zebrafish embryos display excessive apoptosis, especially in the central nervous system (CNS), that results in death. The mutation affects a gene that encodes a large protein with high similarity to the uncharacterised human protein BAP28 and lower similarity to yeast Utp10. Utp10 is a component of a nucleolar U3 small nucleolar RNA-containing RNP complex that is required for transcription of ribosomal DNA and for processing of 18S rRNA. Zebrafish Bap28 is also required for rRNA transcription and processing, with a major effect on 18S rRNA maturation. Bap28 is therefore required for cell survival in the CNS through its role in rRNA synthesis and processing PUBMED:16531401.

    \ ' '7979' 'IPR012541' '\

    This C-terminal domain is found in the Dbp10p subfamily of hypothetical RNA helicases PUBMED:15112237.

    \ ' '7980' 'IPR012961' '\

    This C-terminal domain is found in DOB1/SK12/helY-like DEAD box helicases PUBMED:15112237.

    \ ' '7981' 'IPR012952' '\

    WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase PUBMED:11814058, PUBMED:10322433. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.

    \

    This C-terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins PUBMED:15112237.

    \ ' '7982' 'IPR012561' '\

    This is central domain B in proteins of the Ferlin family PUBMED:15112237.

    \ ' '7983' 'IPR012968' '\

    This domain is present in proteins of the Ferlin family. It is often located between two C2 domains PUBMED:15112237.

    \ ' '7984' 'IPR012562' '\

    This is the C-terminal domain found in the RNA helicase II / Gu protein family PUBMED:15112237.

    \ ' '7985' 'IPR012971' '\

    This N-terminal domain is found in a subfamily of hypothetical nucleolar GTP-binding proteins similar to human NGP1 PUBMED:15112237.

    \ ' '7986' 'IPR012972' '\

    This domain is located N-terminal to WD40 repeats. It is found in the microtubule-associated protein PUBMED:15112237.

    \ ' '7987' 'IPR012973' '\

    This C-terminal domain is found in the NOG subfamily of nucleolar GTP-binding proteins PUBMED:15112237.

    \ ' '7988' 'IPR012974' '\

    This N-terminal domain is found in RNA-binding proteins of the NOP5 family PUBMED:15112237.

    \ ' '7989' 'IPR012579' '\

    This C-terminal domain is found in a novel family of hypothetical nucleolar proteins PUBMED:15112237.

    \ ' '7990' 'IPR012977' '\

    This N-terminal domain is found in a novel nucleolar protein family defined by NUC130/133 PUBMED:15112237.

    \ ' '7991' 'IPR012580' '\

    This small domain is found in a novel nucleolar family PUBMED:15112237.

    \ ' '7993' 'IPR012978' '\

    This is the central domain of a novel family of hypothetical nucleolar proteins PUBMED:15112237.

    \ ' '7995' 'IPR012582' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \

    This is domain B in the catalytic subunit of DNA-dependent protein kinases.

    \ ' '7996' 'IPR012617' '\

    This C-terminal domain is found in traube proteins PUBMED:15112237.

    \ ' '7997' 'IPR012560' '\

    This is central domain A in proteins of the Ferlin family PUBMED:15112237.

    \ ' '7998' 'IPR012980' '\

    This domain is found in a novel family of nucleolar proteins PUBMED:15112237.

    \ ' '7999' 'IPR012583' '\

    This N-terminal domain is found in hypothetical nucleolar proteins with NUC202 tandem repeat PUBMED:15112237.

    \ ' '8000' 'IPR012584' '\

    This domain is found in a novel family of nucleolar proteins PUBMED:15112237.

    \ ' '8001' 'IPR012603' '\

    This domain is found N-terminal to the ARID/BRIGHT domain in DNA-binding proteins of the Retinoblastoma-binding protein 1 family PUBMED:15112237.

    \ ' '8002' 'IPR012590' '\

    This domain is found in POP1-like nucleolar proteins PUBMED:15112237.

    \ ' '8003' 'IPR012572' '\

    This domain is required for cell cycle arrest induced by spindle assembly checkpoint (SPC) activation. It is also involved in the nuclear accumulation and kinetochore targeting of proteins Bub1p, Bub3p and Mad3p PUBMED:15525673.

    \ ' '8004' 'IPR012955' '\

    This domain is the C-terminal region of the CASP family of proteins. These are Golgi membrane proteins which are thought to have a role in vesicle transport PUBMED:12429822.

    \ ' '8005' 'IPR012994' '\

    This family contains a set of membrane proteins, typically 33 amino acids long. The family has no known function, but the protein is found in the operon CydAB in Escherichia coli. Members have a consensus motif (MWYFXW), which is rich in aromatic residues. The protein forms a single membrane-spanning helix. This family seems to be restricted to proteobacteria PUBMED:9068659.

    \ ' '8006' 'IPR012966' '\

    This is a conserved domain in the anillin family of proteins, which are involved in cell division PUBMED:12668659. In Schizosaccharomyces pombe (Fission yeast), anillin (Mid2) is involved in septin ring organisation and cell separation PUBMED:12668659, PUBMED:12654901. The domain is found adjacent to a \'pleckstrin homology\' (PH) domain. The PH domain occurs in a wide range of proteins involved in intracellular signalling or as constituents of the cytoskeleton.

    \ ' '8007' 'IPR012613' '\

    This family consists of the small acid-soluble spore proteins (SASP) of the O type (sspO). SspO (originally cotK) are unique to the spores of Bacillus subtilis and are expressed only in the forespore compartment of sporulating cells of this organism. sspO is the first gene in a likely operon with sspP, and transcription of this gene is primarily by RNA polymerase with the forespore-specific sigma factor, sigma-G. A mutation deleting sspO causes the loss of SspO from the forespore but has no discernible effect on sporulation, spore properties or spore germination PUBMED:10806362.

    \ ' '8008' 'IPR012611' '\

    This family consists of the small acid-soluble spore proteins (SASP) belonging to the K type (sspK). SspK are unique to the spores of Bacillus subtilis and are expressed only in the forespore compartment of sporulating cells of this organism. The sspK gene is monocistronic and transcription is primarily by the RNA polymerase with the forespore-specific sigma factor, sigma-G. A mutation deleting sspK results in loss of SspK from the spore but has no discernible effect on sporulation, spore properties or spore germination PUBMED:10806362.

    \ ' '8009' 'IPR012612' '\

    This family consists of the small acid-soluble spore protein (SASP) of the N type (sspN). SspN is a 48-residue protein that is expressed only in the forespore compartment of sporulating Bacillus subtilis. The sspN gene is recognised equally by both sigma-G and sigma-F. The role of SspN is still not well defined PUBMED:10333516.

    \ ' '8010' 'IPR012563' '\

    This family consists of the GnsA/GnsB family. GnsA and GnsB are multicopy suppressors of the secG null mutation. These proteins participate in the synthesis of phospholipids, suggesting a functional relationship between SecG and membrane phospholipids. Overexpression of gnsA and gnsB causes a remarkable increase in the unsaturated fatty acid content. However, the gnsA-gnsB double null mutant exhibits no effect. Both proteins are predicted to possess a helix-turn-helix structure PUBMED:11544213.

    \ ' '8011' 'IPR012614' '\

    This family consists of the small acid-soluble spore proteins (SASP) of the P type (sspP). sspP is expressed only in the forespore compartment of the sporulating cell. sspP is also expressed under sigma-G control from the same promoter as sspO. Mutations deleting sspP cause no discernible effect on sporulation, spore properties or spore germination PUBMED:10806362.

    \ ' '8012' 'IPR012530' '\

    This family consists of the B melanoma antigen (BAGE) peptides. The BAGE gene encodes a human tumour antigen that is recognised by a cytolytic T lymphocyte. BAGE genes are expressed in melanomas, bladder and lung carcinomas and in a few tumours of other histological types PUBMED:12461691.

    \ ' '8013' 'IPR012554' '\

    This family consists of the DegQ (formerly sacQ) regulatory peptides. The DegQ family of peptides controls the rates of synthesis of a class of both secreted and intracellular degradative enzymes in Bacillus subtilis. DegQ is 46 amino acids long and activates the synthesis of degradative enzymes. The expression of this peptide was shown to be subject to both catabolite repression and DegS-DegU-mediated control, thus allowing an increase in the rate of synthesis of DegQ under conditions of nitrogen starvation PUBMED:1688843.

    \ ' '8014' 'IPR012594' '\

    This family consists of the pedibin and Hym-346 signalling peptides. These two peptides have been isolated from Hydra attenuata (Hydra) (Hydra vulgaris) and Hydra magnipapillata (Hydra). Experiments have indicated that both cause a reduction in the positional value gradient, the principal patterning process governing the maintenance of form in the adult hydra. The peptides cause an increase in the rate of foot regeneration following bisection of the body column. Thus both play important signalling roles in patterning processes in cnidaria and may be important in more complex metazoans PUBMED:9876180.

    \ ' '8015' 'IPR012609' '\

    This family consists of the stage V sporulation (SpoV) proteins of Bacillus subtilis, which include SpoVM. SpoVM is a small, 26 residue-long protein that is produced in the mother cell chamber of the sporangium during the process of sporulation in B. subtilis. SpoVM forms an amphipathic alpha-helix and is recruited to the polar septum shortly after the sporangium undergoes asymmetric division. The function of SpoVM depends on proper subcellular localisation PUBMED:12562810.

    \ ' '8016' 'IPR012540' '\

    This family consists of cuticle protein 7 isoforms that are isolated from the carapace cuticle of a juvenile horseshoe crab, Limulus polyphemus. There are 3 isoforms of cuticle protein 7. The 3 isoforms are N-terminally blocked but could be deblocked by treatment with pyroglutaminase, showing that the N-terminal residue is a pyroglutamine residue PUBMED:12628379.

    \ ' '8017' 'IPR012531' '\

    This family consists of the BB1 proteins. BB1 is a growth regulating protein that is expressed by multiple tissues in humans, including the lung. BB1 has been shown to function in cell growth-related processes of foetal and early postnatal lung PUBMED:11033765.

    \ ' '8018' 'IPR012643' '\

    This family consists of the wound-inducible basic proteins from plants. The metabolic activities of plants are dramatically altered upon mechanical injury or pathogen attack. A large number of proteins accumulate at wound or infection sites, such as the wound-inducible basic proteins. These proteins are small (47 amino acids in length), have no signal peptides, and are hydrophilic and basic PUBMED:8310075.

    \ ' '8019' 'IPR012619' '\

    This family consists of myoactive tetradecapeptides that are isolated from the gut of the earthworms Eisenia foetida (Common brandling worm) and Pheretima vittata (Earthworm). These peptides were termed ETP and PTP respectively. Both peptides showed a potent excitatory action on spontaneous contractions of the anterior gut. These peptides show similarity to molluscan tetradecapeptides and arthropodan tridecapeptides PUBMED:8532604.

    \ ' '8020' 'IPR012601' '\

    This family consists of the spermatozoal protamines. Spermatozoal protamines play an important role in remodelling of the sperm chromatin during mammalian spermiogenesis. Nuclear elongation and chromatin condensation are concomitant with modifications in the basic protein complement associated with DNA. Somatic histones are initially replaced by testis-specific histone variants, then by transitional proteins, and ultimately by protamines PUBMED:12672123.

    \ ' '8021' 'IPR012573' '\

    This family consists of meleagrin and cygnin basic peptides that are isolated from turkey and black swan respectively. Both peptides are low in molecular weight and contain three disulphide bonds with high concentrations of aromatic residues. These peptides show similarity to transferrins and probably play some vital role in avian eggs but the exact function is still unknown PUBMED:2760022.

    \ ' '8022' 'IPR012981' '\

    This domain is involved in pre-rRNA processing PUBMED:15670595. It has been shown to be required either for nucleolar retention or correct assembly of the box C/D snoRNP in Saccharomyces cerevisiae PUBMED:15670595.

    \ ' '8023' 'IPR012569' '\

    Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively PUBMED:10357231). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.

    \

    Reaction of amidotransferase domain:

    \ \ \

    Reactions of FMN-binding domain:

    \ \ \

    These are small, all beta strand domains, structurally described for the protein Internalin (InlA) and related proteins InlB, InlE, InlH from the pathogenic bacterium Listeria monocytogenes. Their function appears to be mainly structural: they are fused to the C-terminal end of leucine-rich repeats (LRR), significantly stabilising the LRR, and forming a common rigid entity with the LRR. They are themselves not involved in protein-protein interactions but help to present the adjacent LRR domain for this purpose. These domains belong to the family of Ig-like domains in that they consist of two sandwiched beta sheets that follow the classical connectivity of Ig domains. The beta strands in one of the sheets are, however, much smaller than in most standard Ig-like domains, making this domain somewhat of an outlier PUBMED:11575932, PUBMED:12526809, PUBMED:15003459.

    \ ' '8024' 'IPR012985' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 PUBMED:15509782 and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad PUBMED:15509782.

    \ ' '8025' 'IPR013175' '\

    This is a family of conserved fungal proteins of unknown function.

    \ ' '8026' 'IPR013172' '\

    Drosophila immune-induced molecules (DIMs) are short proteins induced during the immune response of Drosophila. This family includes DIMs 1 to 4 that have masses below 5 kDa PUBMED:9736738.

    \ ' '8027' 'IPR013265' '\

    This entry contains putative genes, of 129 bp, from the Trichothecene gene cluster of Fusarium sporotrichioides and Gibberella zeae (Fusarium graminearum) that encode a predicted protein of 43 amino acids whose function is unknown PUBMED:12080147, PUBMED:11352533.

    \ ' '8028' 'IPR013269' '\

    This entry contains Orf UL2 of Human cytomegalovirus (HHV-5) (Human herpesvirus 5), which is a short protein of unknown function PUBMED:12533697.

    \ ' '8029' 'IPR013267' '\

    Most isolated ORF2 of TT virus (TTV) encode a 49 amino acid protein (pORF2a) because of an in-frame stop codon. ORF2s isolated from G1 TTV encode a 202 amino acid protein (pORF2ab) PUBMED:10963344.

    \ ' '8030' 'IPR013146' '\

    This entry includes thymopoietins, short proteins of 49 amino acids isolated from bovine spleen cells PUBMED:7306506. Thymopoietins (TMPOs) are a group of ubiquitously expressed nuclear proteins. They are suggested to play an important role in nuclear envelope organisation and cell cycle control PUBMED:10430029.

    \ \

    Thymopoietins are characterised by the LEM (LAP2, emerin, MAN1) domain, a globular module of approximately 40 amino acids that is mostly found in the nucleoplasmic portions of metazoan inner nuclear membrane proteins. The LEM domain has been shown to mediate binding to BAF (barrier-to-autointegration factor) and BAF-DNA complexes. BAF dimers bind to double-stranded DNA non-specifically and thereby bridge DNA molecules to form a large, discrete nucleoprotein complex PUBMED:14618255, PUBMED:10671519.

    \ \

    The solution structure of the LEM domain reveals that it is composed of a three-residue N-terminal helical turn and two large parallel alpha helices interacting through a set of conserved hydrophobic amino acids. The two helices, which are connected by a long loop, are oriented at an angle of ~45 degrees PUBMED:11435115.

    \ ' '8031' 'IPR013184' '\

    This is a family of short conserved proteins of 37 amino acids, described in Lactococcus phage c2 and in related phage. The function of these proteins is unknown.

    \ ' '8032' 'IPR013232' '\

    Gene 1.1 in Bacteriophage T7 encodes a 42 amino acid protein that is rich in basic amino acids, suggesting that it interacts with nucleic acids PUBMED:6254001. Many homologues are present in different T7 and T3-like bacteriophage.

    \ ' '8033' 'IPR013161' '\

    The short protein BssC (57 amino acids) has been described as the gamma-subunit of benzylsuccinate synthase from Thauera aromatica strain K172 PUBMED:9632263. TutF has been identified and described as highly similar to BssC in T. aromatica strain T1 PUBMED:10698784.

    \ ' '8034' 'IPR013218' '\

    The Mtw1 kinetochore complex contains at least four essential components including Mtw1, DSN1, NNF1 and NSL1. All proteins exhibit genetic and two-hybrid interactions and all stably associate in solution. The function of the complex is unclear, though it is involved in chromosome segregation PUBMED:15502821, PUBMED:12455957.

    \ ' '8035' 'IPR013239' '\

    This is a family of fungal proteins. RPA14 is one of the final two subunits of Saccharomyces cerevisiae (Baker\'s yeast) RNA polymerase I and is proposed to play a role in the recruitment of pol I to the promoter PUBMED:15647272.

    \ ' '8036' 'IPR013270' '\

    This family represents the CD47 leukocyte antigen V-set like Ig domain PUBMED:12124426, PUBMED:8794870.

    \ ' '8037' 'IPR013162' '\

    The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulphide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains: alpha, delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and three (in alpha, delta and gamma) or four (in epsilon and mu) constant domains (CH1 to CH4). Ig molecules are highly modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. The domains in Ig and Ig-like molecules are grouped into four types: V-set (variable; ), C1-set (constant-1; ), C2-set (constant-2; ) and I-set (intermediate; ) PUBMED:9417933. Structural studies have shown that these domains share a common core Greek-key beta-sandwich structure, with the types differing in the number of strands in the beta-sheets as well as in their sequence patterns PUBMED:15327963, PUBMED:11377196.

    \

    Immunoglobulin-like domains that are related in both sequence and structure can be found in several diverse protein families. Ig-like domains are involved in a variety of functions, including cell-cell recognition, cell-surface receptors, muscle structure and the immune system PUBMED:10698639.

    \ \

    This entry represents the C2-set type domains found in the T-cell antigen CD80, as well as in related proteins. CD80 (B7-1) is a glycoprotein expressed on antigen-presenting cells PUBMED:10661405. The shared ligands on CD80 and CD86 (B7-2) deliver the co-stimulatory signal through CD28 and CTLA-4 on T-cells, where CD28 augments the T-cell response and CTLA-4 attenuates it PUBMED:11279502.

    \ ' '8038' 'IPR013223' '\

    This domain includes the N-terminal OB domain found in ribonuclease B proteins in one or two copies.

    \ ' '8039' 'IPR013185' '\

    This entry represents the N-terminal domain of homologues of elongation factor P, which probably are translation initiation factors.

    \ \ \ \ \ \ ' '8040' 'IPR013240' '\

    This is a family of proteins conserved from yeasts to human. Subunit A34.5 of RNA polymerase I is a non-essential subunit which is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during the process of transcription PUBMED:9121426.

    \ ' '8041' 'IPR013246' '\

    The Sgf11 family is a SAGA complex subunit in Saccharomyces cerevisiae (Baker\'s yeast). The SAGA complex is a multisubunit protein complex involved in transcriptional regulation. SAGA combines proteins involved in interactions with DNA-bound activators and TATA-binding protein (TBP), as well as enzymes for histone acetylation and deubiquitylation PUBMED:15657441.

    \ ' '8042' 'IPR013158' '\

    This domain is found at the N terminus of the Apolipoprotein B mRNA editing enzyme. Apobec-1 catalyzes C to U editing of apolipoprotein B (apoB) mRNA in the mammalian intestine.

    The N-terminal domain of APOBEC-1 like proteins is the catalytic domain, while the C-terminal domain is a pseudocatalytic domain. More specifically, the catalytic domain is a zinc-dependent deaminase domain and is essential for cytidine deamination. APOBEC-3 like members contain two copies of this domain. This family also includes the functionally homologous activation induced deaminase, which is essential for the development of antibody diversity in B lymphocytes. RNA editing by APOBEC-1 requires homodimerisation and this complex interacts with RNA binding proteins to form the editosome PUBMED:12683974 (and references therein).

    \ ' '8043' 'IPR013171' '\

    This region contains the zinc-binding domain of cytidine and deoxycytidylate deaminase.

    \

    Cytidine deaminase () (cytidine aminohydrolase) catalyzes the hydrolysis of cytidine into uridine and ammonia while deoxycytidylate deaminase () (dCMP deaminase) hydrolyzes dCMP into dUMP. Both enzymes are known to bind zinc and to require it for their catalytic activity PUBMED:1567863, PUBMED:8428902. These two enzymes do not share any sequence similarity with the exception of a region that contains three conserved histidine and cysteine residues which are thought to be involved in the binding of the catalytic zinc ion.

    \ ' '8044' 'IPR013208' '\

    Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The structure is an eight-stranded beta barrel.

    \ ' '8045' 'IPR013177' '\

    This domain is found at the C-terminal end of mitochondrial proteins of unknown function.

    \ ' '8046' 'IPR013178' '\

    Histone acetylation is required in many cellular processes including transcription, DNA repair, and chromatin assembly. RTT109 is required for H3K56 acetylation and loss of Rtt109 results in the loss of H3K56 acetylation, both on bulk histone and on chromatin PUBMED:17272723. RTT109 and H3K56 acetylation appear to correlate with actively transcribed genes and associate with the elongating form of Pol II in yeast PUBMED:17272723.

    \ ' '8048' 'IPR013180' '\

    This domain is found in eukaryotic proteins. A human nuclear protein with this domain () is thought to have a role in apoptosis PUBMED:12659813.

    \ ' '8049' 'IPR013176' '\

    The function of this fungal family of proteins is unknown.

    \ ' '8050' 'IPR013166' '\

    [Citrate (pro-3S)-lyase] ligase (), also known as citrate lyase ligase, is responsible for acetylation of the prosthetic group (2-(5\'\'-phosphoribosyl)-3\'-dephosphocoenzyme-A) of the gamma subunit of citrate lyase. It converts the inactive thiol form of the enzyme to the active form. In Clostridium sphenoides, citrate lyase ligase actively degrades citrate. In Clostridium sporosphaeroides and Lactococcus lactis, however, the enzyme is under stringent regulatory control. The enzyme\'s activity in anaerobic bacteria is modulated by phosphorylation and dephosphorylation PUBMED:3935436.

    \

    The proteins in this entry represent the C-terminal domain of citrate lyase ligase.

    \ ' '8051' 'IPR013262' '\

    The TOM13 family of proteins are mitochondrial outer membrane proteins that mediate the assembly of beta-barrel proteins PUBMED:15326197.

    \ ' '8052' 'IPR001034' '\

    The deoR-type HTH domain is a DNA-binding, helix-turn-helix (HTH) domain of\ about 50-60 amino acids present in transcription regulators of the deoR family, involved in sugar catabolism. This family of prokaryotic regulators is named after the Escherichia coli protein DeoR, a repressor of the deo operon, which encodes nucleotide and deoxyribonucleotide catabolic enzymes. DeoR also negatively regulates the expression of nupG and tsx, a nucleoside-specific transport protein and a channel-forming protein, respectively.

    \ \

    DeoR-like transcription repressors occur in diverse bacteria as regulators of\ sugar and nucleoside metabolic systems. The effector molecules for deoR-like\ regulators are generally phosphorylated intermediates of the relevant\ metabolic pathway. The DNA-binding deoR-type HTH domain occurs usually in the\ N-terminal part. The C-terminal part can contain an effector-binding domain\ and/or an oligomerisation domain. DeoR occurs as an octamer, whilst glpR and\ agaR are tetramers. Several operators may be bound simultaneously, which could\ facilitate DNA looping PUBMED:1731335, PUBMED:14731281.

    \ ' '8053' 'IPR013197' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \

    This family consists of several DNA-directed RNA polymerase III polypeptides which are related to the Saccharomyces cerevisiae (Baker\'s yeast) RPC82 protein. RNA polymerase C (III) promotes the transcription of tRNA and 5S RNA genes. In S. cerevisiae, the enzyme is composed of 15 subunits, ranging from 10 kDa to about 160 kDa PUBMED:1406632. This region is probably a DNA-binding helix-turn-helix.

    \ ' '8054' 'IPR013198' '\

    This family consists of the C-terminal helix-turn-helix domain found in several bacterial GTP-sensing transcriptional pleiotropic repressor CodY proteins. CodY has been found to repress the dipeptide transport operon (dpp) of Bacillus subtilis in nutrient-rich conditions PUBMED:7783641. The CodY protein also has a repressor effect on many genes in Lactococcus lactis during growth in milk PUBMED:11401725.

    \ ' '8055' 'IPR013225' '\

    This family contains proteins that are similar to the product of the paaX gene of Escherichia coli (). This protein is involved in the regulation of expression of a group of proteins known to participate in the metabolism of phenylacetic acid PUBMED:10766858.

    \ ' '8056' 'IPR013181' '\

    This is a group of rice proteins of unknown function. They may have a role in ATPase activation.

    \ ' '8057' 'IPR013156' '\

    Pseudins are a subfamily of the FSAP family (Frog Secreted Active Peptides) extracted from the skin of the paradoxical frog Pseudis paradoxa (Paradoxical frog). The pseudins belong to the class of cationic, amphipathic-helical antimicrobial peptides PUBMED:11689009.

    \ ' '8058' 'IPR013182' '\

    This domain is found in different combinations with cortical patch components EF hand, SH3 and ENTH and is therefore likely to be involved in cytoskeletal processes. This family contains many hypothetical proteins.

    \ ' '8059' 'IPR013183' '\

    This is a family of fungal proteins of unknown function.

    \ ' '8060' 'IPR013241' '\

    This family of fungal proteins forms a subunit of RNase P, the ribonucleoprotein enzyme that cleaves the leader sequence of precursor tRNAs to generate mature tRNAs. The structure of Pop3 has been assigned the L7Ae/L30e fold PUBMED:15613537. This RNA-binding fold is also present in human RNase P subunit Rpp38, raising the possibility that Pop3p and Rpp38 are functional homologues.

    \ ' '8061' 'IPR013248' '\

    Proteins in this family are membrane-localised chaperones that are required for correct plasma membrane localisation of amino acid permeases (AAPs) PUBMED:15623581. Shr3 prevents AAPs from aggregating and assists in their correct folding. In the absence of Shr3, AAPs are retained in the ER.

    \ ' '8062' 'IPR013168' '\

    This domain was originally found in the C-terminal moiety of the Cp-7 lysin (lysozyme, ) encoded by Bacteriophage Cp-7. It is assumed that this domain represents a cell wall binding motif although no direct evidence has been obtained so far to support this.

    \ ' '8063' 'IPR013260' '\

    Proteins in this entry are involved in cell cycle progression and pre-mRNA splicing PUBMED:12384582, PUBMED:11102353.

    \ ' '8064' 'IPR013258' '\

    This domain is associated with the N terminus of striatin. Striatin is an intracellular protein which has a caveolin-binding motif, a coiled-coil structure, a calmodulin-binding site, and a WD () repeat domain PUBMED:10748158. It acts as a scaffold protein PUBMED:15569929 and is involved in signalling pathways PUBMED:10748158, PUBMED:12610732.

    \ ' '8066' 'IPR013255' '\

    This is a family of chromosome segregation proteins. It contains Spc25, which is a conserved eukaryotic kinetochore protein involved in cell division. In fungi the Spc25 protein is a subunit of the Nuf2-Ndc80 complex PUBMED:15023545 and in vertebrates it forms part of the Ndc80 complex PUBMED:14738735. The family also contains Csm1 which in Saccharomyces cerevisiae is part of the monopolin complex. This protein has been shown to promote mono-orientation during meiosis and also plays a mitotic role in DNA replication PUBMED:15728720.

    \ ' '8067' 'IPR013209' '\

    This domain is found in Saccharomyces cerevisiae (Baker\'s yeast) protein SMP2, proteins with an N-terminal lipin domain () and phosphatidylinositol transfer proteins PUBMED:8437575. SMP2 is involved in plasmid maintenance and respiration PUBMED:12376568. Lipin proteins are involved in adipose tissue development and insulin resistance PUBMED:11792863.

    \ ' '8068' 'IPR013257' '\

    The SRI (Set2 Rpb1 interacting) domain mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation PUBMED:15798214.

    \ ' '8069' 'IPR013228' '\

    This domain is found C-terminal to the PE () and PPE () domains. The secondary structure of this domain is predicted to be a mixture of alpha helices and beta strands PUBMED:12711809.

    \ ' '8070' 'IPR006597' '\

    Sel1-like repeats are tetratricopeptide repeat sequences originally identified in a Caenorhabditis elegans receptor molecule which is a key negative regulator of the Notch pathway PUBMED:8722778. Mammalian homologues have since been identified although these mainly pancreatic proteins have yet to have a function assigned.

    \ ' '8071' 'IPR013247' '\ SH3 (src Homology-3) domains are small protein modules containing \ approximately 50 amino acid residues PUBMED:15335710, PUBMED:11256992. They are found in a \ great variety of intracellular or\ membrane-associated proteins PUBMED:1639195, PUBMED:14731533, PUBMED:7531822 for example, in a variety of\ proteins with enzymatic activity, in adaptor\ proteins that lack catalytic sequences and in cytoskeletal\ proteins, such as fodrin and yeast actin binding protein ABP-1. \

    The SH3 domain has a characteristic fold which consists of five or six beta-strands arranged as two tightly packed anti-parallel beta sheets. The linker\ regions may contain short helices PUBMED:. The surface of the SH3 domain bears a flat, hydrophobic ligand-binding pocket which consists of three shallow grooves defined by conserved aromatic residues in which the ligand adopts an extended left-handed helical arrangement. The ligand binds with low affinity but this may be enhanced by multiple interactions.\ The region bound by the SH3 domain is in all cases proline-rich and contains PXXP as a core-conserved binding motif. The function of SH3 domains is not well understood, but they may mediate many diverse processes such as increasing local concentration of proteins, altering their subcellular location and mediating the assembly of large multiprotein complexes PUBMED:7953536.

    \

    A homologue of the SH3 domain has been found in a number of different bacterial proteins including glycyl-glycine endopeptidase, bacteriocin and some hypothetical proteins.

    \ ' '8072' 'IPR013154' '\

    This is the catalytic domain of alcohol dehydrogenases (). Many of them contain an inserted zinc binding domain. This domain has a GroES-like structure; a name derived from the superfamily of proteins with a GroES fold. Proteins with a GroES fold structure have a highly conserved hydrophobic core and a glycyl-aspartate dipeptide which is thought to maintain the fold PUBMED:10556240, PUBMED:8804825.

    \ ' '8073' 'IPR013216' '\

    Methyl transfer from the ubiquitous S-adenosyl-L-methionine (SAM) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalyzed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol, for regulatory purposes. The various aspects of the role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes, including gene regulation and differentiation, are well documented.

    \ \

    This entry represents a methyltransferase domain found in a large variety of SAM-dependent methyltransferases including, but not limited to:\

    \ \ Structural studies show that this domain forms the Rossmann-like alpha-beta fold typical of SAM-dependent methyltransferases PUBMED:12737817, PUBMED:14999102, PUBMED:12429089.

    \ \ ' '8074' 'IPR013217' '\

    Methyl transfer from the ubiquitous donor S-adenosyl-L-methionine (SAM) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalyzed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol, for regulatory purposes. The various aspects of the role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes, including gene regulation and differentiation, are well documented.

    \ \

    This entry represents a methyltransferase domain found in a large variety of SAM-dependent methyltransferases including, but not limited to:\

    \ \ Structural studies show that this domain forms the Rossmann-like alpha-beta fold typical of SAM-dependent methyltransferases PUBMED:11746687, PUBMED:8810903, PUBMED:15340920.

    \ ' '8075' 'IPR013256' '\

    This entry includes the Saccharomyces cerevisiae (Baker\'s yeast) protein SPT2 which is a chromatin protein involved in transcriptional regulation PUBMED:15563464.

    \ \

    These proteins show conservation of several domains across numerous species, including a cluster of positively charged amino acids. This cluster probably functions in the binding properties of the proteins PUBMED:15563464. Sin1p/Spt2p probably modulates the local chromatin structure by binding two strands of double-stranded DNA at their crossover point.

    \ \

    Sin1p/Spt2p has sequence similarity to HMG1 and serves as a negative transcriptional regulator of a small family of genes that are activated by the SWI/SNF chromatin-remodelling complex. It is also involved in maintaining the integrity of chromatin during transcription elongation. Sin1p/Spt2 is required for, and is directly involved in, the efficient recruitment of the mRNA cleavage/polyadenylation complex PUBMED:16788068. Spt2 is also involved in regulating levels of histone H3 over transcribed regions PUBMED:16449659.

    \ ' '8076' 'IPR013189' '\

    This domain corresponds to the C-terminal domain of glycosyl hydrolase family 32. It forms a beta sandwich module PUBMED:14973124.

    \ ' '8077' 'IPR013221' '\

    The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria is comprised of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages:

    \

    \

    Stage two involves four key Mur ligase enzymes: MurC () PUBMED:17139082, MurD () PUBMED:17427948, MurE () PUBMED:16595662 and MurF () PUBMED:16322581. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDPMurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales PUBMED:16934839.

    \ \

    This entry represents the C-terminal domain from all four stage 2 Mur enzymes: UDP-N-acetylmuramate-L-alanine ligase (MurC), UDP-N-acetylmuramoylalanine-D-glutamate ligase (MurD), UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate and cyanophycin synthetase that catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin) PUBMED:9652408.

    \ ' '8078' 'IPR013201' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    This entry represents a peptidase inhibitor domain, which belongs to MEROPS peptidase inhibitor family I29. The domain is also found at the N-terminus of a variety of peptidase precursors that belong to MEROPS peptidase subfamily C1A; these include cathepsin L, papain, and procaricain () PUBMED:8939744. It forms an alpha-helical domain that runs through the substrate-binding site, preventing access. Removal of this region by proteolytic cleavage results in activation of the enzyme. This domain is also found, in one or more copies, in a variety of cysteine peptidase inhibitors such as salarin PUBMED:14505823.

    \ \ \ ' '8079' 'IPR013186' '\

    The soybean early nodulin 40 (ENOD40) mRNA contains two short overlapping ORFs; in vitro translation yields two peptides of 12 and 24 amino acids PUBMED:11842184. The ENOD40 genes have been implicated in organogenesis, such as induction of the cortical cell divisions that lead to initiation of nodule primordia, and in developing lateral roots and embryonic tissues. This supports the hypothesis that ENOD40 has a role in lateral organ development PUBMED:12114565.

    \ ' '8080' 'IPR013266' '\

    PdT-3 or Tryptophyllin-3 peptide is a subfamily of the family Tryptophyllin and of the superfamily FSAP (Frog Skin Active Peptide). It was originally identified in skin extracts of Neotropical leaf frogs, Phyllomedusa sp. Members of this subfamily have an average length of 13 amino acids. The pharmacological activity of the tryptophyllins remains to be established PUBMED:14687697, but it seems that these peptides act on liver protein synthesis and body weight PUBMED:3831963.

    \ ' '8081' 'IPR013213' '\

    Mastoparans are a family of tetradecapeptides from wasp venom that have been shown to directly activate GTP-binding regulatory proteins. These peptides show selectivity among G proteins: they strongly activate Go and Gi but not Gs or Gt. The peptides of this family are composed of 14 amino acids but they can assume different structures PUBMED:9537994.

    \ ' '8082' 'IPR013254' '\

    The sperm-activating peptides (SAPs) are isolated from egg-conditioned media (egg jelly) of sea urchins. SAPs have several effects on sea urchin spermatozoa: they stimulate sperm respiration and motility through intracellular alkalinization, and cause transient elevation of cAMP, cGMP and Ca2+ levels in sperm cells PUBMED:1756858, PUBMED:2059627.

    \ ' '8083' 'IPR013214' '\

    Mastoparan (MP) peptides I, II and III are extracted from the venom gland of Protopolybia exigua (Neotropical social wasp). They are tetradecapeptides presenting from seven to ten hydrophobic amino acid residues and from two to four lysine residues in their primary sequences. These peptides cause the degranulation of mast cells. Protopolybia-MP-I also causes haemolysis of erythrocytes.

    \ ' '8084' 'IPR013203' '\

    This family contains leader peptides involved in the regulation of the glutaminase subunit (small subunit) of arginine-specific carbamoyl phosphate synthetase. In Neurospora crassa it is a small upstream ORF of 24 codons located upstream of the arg-2 locus PUBMED:2141606. In yeast it is the leader peptide of the CPA1 gene. The 5\' region of CPA1 mRNA contains a 25 codon upstream open reading frame. The leader peptide, the product of the upstream open reading frame, plays an essential, negative role in the specific repression of CPA1 by arginine PUBMED:3555844.

    \ ' '8085' 'IPR013204' '\

    These short proteins are leader peptides (15-19 amino acids) of erm genes that code for resistance determinants in Staphylococcus aureus PUBMED:2985541.

    \ ' '8086' 'IPR011720' '\

    This family consists of examples of the threonine biosynthesis (thr) operon leader peptide, also called the thr operon attenuator. The small gene for this peptide is often missed in genome annotation. It should be looked for in genomes of the proteobacteria, immediately upstream of genes for threonine biosynthesis, typically aspartokinase I/homoserine dehydrogenase, homoserine kinase, and threonine synthase. Transcription of the rest of the Thr operon is attenuated (mostly turned off) unless the ribosome pauses during a stretch of the leader sequence rich in both Ile (made from Thr) and in Thr itself because of the scarcity of those amino acids at the time. The leader peptide itself, once made, may have no role other than to be degraded. Similar systems exist for some other amino acid biosynthetic operons, such as Trp.

    \ ' '8087' 'IPR013205' '\

    The tryptophan operon regulatory region of Citrobacter freundii (leader transcript) encodes a 14-residue peptide containing characteristic tandem tryptophan residues. It is about 10 nucleotides shorter than those of Escherichia coli and Salmonella typhimurium PUBMED:6749821.

    \ ' '8088' 'IPR013157' '\

    This family of antibacterial peptides is secreted from the granular dorsal glands of Litoria aurea (Green and golden bell frog), Litoria raniformis (Southern bell frog), Litoria citropa (Australian blue mountains tree frog) and frogs from genus Uperoleia. They are a part of the FSAP peptide family. Amongst the more active of these are aurein 1.2, aurein 2.2 and aurein 3.1, as well as caerin 1.1, maculatin 1.1 and uperin 3.6 PUBMED:10951191. Citropin 1.1, citropin 1.2, citropin 1.3 and a minor peptide are wide-spectrum antibacterial peptides PUBMED:10504394.

    \ ' '8089' 'IPR013259' '\

    The sulfakinin (SK) family of neuropeptides has only been identified in crustaceans and insects. For most species there is the potential for producing two sulfakinin peptides, one of which has a short sulfakinin sequence. The function of the sulfakinins is difficult to assess. In Periplaneta americana (American cockroach), various forms of the endogenous sulfakinins have been shown to be active on the hindgut, and also on the heart. In Calliphora vomitoria (Blue blowfly) the peptides act as neurotransmitters or neuromodulators, linking the brain with all thoracic and abdominal ganglia.

    \ ' '8090' 'IPR013271' '\

    This family contains neuropeptides isolated from ganglia of Achatina fulica (Giant African snail). Each peptide has a Trp residue at both the N- and C-termini. Purified WWamide-1, -2 and -3 showed an inhibitory effect on the phasic contractions of the anterior byssus retractor muscle (ABRM) PUBMED:8495720.

    \ ' '8091' 'IPR013231' '\

    Periviscerokinin neuropeptides are found in the abdominal perisympathetic organs of insects. They mediate visceral muscle contractile activity (myotropic activity). CAPA peptides, which belong to the periviscerokinin and pyrokinin peptide families, have potential medical importance. This is due to their myotropic effects on, for example, heart muscles and to their occurrence in the Ixodoidea (ticks), which are important vectors in the transmission of many animal diseases PUBMED:18495123. These peptides also have a strong diuretic or anti-diuretic effect, suggesting they have significant medical implications PUBMED:16952053.

    \ ' '8092' 'IPR013202' '\

    These neuropeptides are the first members of the insect kinin family isolated from the American cockroach. Their occurrence in the retrocerebral complex suggests a physiological role as neurohormones. The C-terminal sequence Phe-X-Ser-Trp-Gly-NH2 characterises the peptides as members of the insect kinin family. Data suggest a possible involvement of insect kinins in water balance through osmoregulation. These peptides have lengths ranging from 6 to 14 amino acids PUBMED:9350979.

    \ ' '8093' 'IPR013165' '\

    A total of 20 peptides of the allatostatin superfamily were isolated from Carcinus maenas (Common shore crab) (Green crab). They are named carcinustatin 1 to 20 and their length ranges from 5 to 27 amino acids. This family includes carcinustatin 8, 9, 15 and 16 PUBMED:9461295.

    \ ' '8094' 'IPR013206' '\

    These peptides are designated Leucophaea maderae (Madeira cockroach) tachykinin-related peptides (Lem TRPs). Some were isolated from the midgut of L. maderae, whereas others appear to be brain specific. The Lem TRPs of the brain are myotropic and induce increases in the amplitude and frequency of spontaneous contractions and tonus of hindgut muscle in L. maderae PUBMED:9114447. They were also isolated from brain-corpora cardiaca-corpora allata-suboesophageal ganglion extracts of Locusta migratoria (Migratory locust). They stimulate visceral muscle contractions of the oviduct and the foregut of L. migratoria PUBMED:2132575.

    \ ' '8095' 'IPR013210' '\

    Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively PUBMED:10357231). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.

    \

    Reaction of amidotransferase domain:

    \ \ \

    Reactions of FMN-binding domain:

    \ \ \

    This domain is often found at the N-terminus of tandem leucine rich repeats.

    \ ' '8096' 'IPR013155' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases PUBMED:.

    \

    This domain is found in valyl, leucyl and isoleucyl tRNA synthetases. It binds to the anticodon of the tRNA.

    \ ' '8097' 'IPR013272' '\

    This domain is found at the C terminus in proteins of the YL1 family PUBMED:7702631. These proteins have been shown to be DNA-binding and may be transcription factors PUBMED:7702631. This domain is also found in proteins that do not belong to the YL1 family.

    \ ' '8098' 'IPR013164' '\

    Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and modulate a wide variety of processes including cell polarisation and migration PUBMED:2197976, PUBMED:,PUBMED:14570569. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins.

    \

    Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and an N-terminal cytoplasmic domain PUBMED:11736639. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and are essential for cadherin adhesive function and for protection against protease digestion.

    \

    This entry represents a cadherin domain that is usually found at the N-terminus of cadherin proteins.

    \ ' '8099' 'IPR013215' '\

    Cobalamin-independent methionine synthase, MetE, catalyses the synthesis of the amino acid methionine by the transfer of a methyl group from methyltetrahydrofolate to homocysteine PUBMED:15326182. The N-terminal and C-terminal domains of MetE together define a catalytic cleft in the enzyme. The N-terminal domain is thought to bind the substrate, in particular, the negatively charged polyglutamate chain. The N-terminal domain is also thought to stabilise a loop from the C-terminal domain.

    \ ' '8100' 'IPR013187' '\

    This domain occurs in a diverse superfamily of genes in plants. Most examples are found C-terminal to an F-box (), a 60 amino acid motif involved in ubiquitination of target proteins to mark them for degradation. Two-hybrid experiments support the idea that most members are interchangeable F-box subunits of SCF E3 complexes PUBMED:12169662. Some members have two copies of this domain.

    \ ' '8101' 'IPR013163' '\ Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source PUBMED:11084361. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions PUBMED:11292341.\

    This entry is composed of the type 2 Cache domain.

    \ ' '8102' 'IPR013236' '\

    Mga is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions PUBMED:11952907. This region corresponds to the PRD-like region.

    \ ' '8103' 'IPR013137' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a zinc finger motif found in transcription factor IIB (TFIIB). In eukaryotes the initiation of transcription of protein-encoding genes by the polymerase II complex (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box). At least seven different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, and -IIH PUBMED:1633439.

    \

    TFIIB and TFIID are responsible for promoter recognition and interaction with Pol II; together with Pol II, they form a minimal initiation complex capable of transcription under certain conditions. The TATA box of a Pol II promoter is bound in the initiation complex by the TBP subunit of TFIID, which bends the DNA around the C-terminal domain of TFIIB, whereas the N-terminal zinc finger of TFIIB interacts with Pol II PUBMED:8516312, PUBMED:8504927.

    \

    The TFIIB zinc finger adopts a zinc ribbon fold characterised by two beta-hairpins forming two structurally similar zinc-binding sub-sites PUBMED:8564536. The zinc finger contacts the Rpb1 subunit of Pol II through its dock domain, a conserved region of about 70 amino acids located close to the polymerase active site PUBMED:15024075. In the Pol II complex this surface is located near the RNA exit groove. Interestingly, this sequence is best conserved in the three polymerases that utilise a TFIIB-like general transcription factor (Pol II, Pol III, and archaeal RNA polymerase) but not in Pol I PUBMED:15024075.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '8104' 'IPR013263' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (; topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    \

    This entry represents the C-terminal zinc-ribbon-like domain found in bacterial topoisomerase I (type IA) enzymes. Escherichia coli topoisomerase I proteins contain five copies of a zinc-ribbon-like domain at their C-terminus, two of which have lost their cysteine residues and are therefore probably not able to bind zinc PUBMED:10873443. This domain is still considered to be a member of the zinc-ribbon superfamily despite not being able to bind zinc.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase PUBMED:.

    \ ' '8105' 'IPR013237' '\

    This entry represents a zinc binding domain found in the N-terminal region of the bacteriophage P4 alpha protein. P4 is a multifunctional protein with origin recognition, helicase and primase activities PUBMED:8253092.

    \ ' '8106' 'IPR013987' '\

    The PhnA protein family includes the uncharacterised Escherichia coli protein PhnA and its homologues. The E. coli phnA gene is part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage PUBMED:2155230. The protein is not related to the characterised phosphonoacetate hydrolase designated PhnA PUBMED:9300819.

    \ \

    This entry represents the N-terminal domain of PhnA, which is predicted to form a zinc-ribbon.

    \ ' '8107' 'IPR013264' '\

    This is the N-terminal, catalytic core domain of DNA primases. DNA primase () is a nucleotidyltransferase which synthesizes the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork. It can also prime the leading strand and has been implicated in cell division PUBMED:8294018.

    \ ' '8108' 'IPR013227' '\

    PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions PUBMED:10561497. These domains contain a hairpin-loop-like structure, similar to knottins, but the pattern of disulphide bonds differs.

    \ ' '8109' 'IPR006583' '\

    The PAN-3 or CW domain is associated with a number of Caenorhabditis elegans hypothetical proteins.

    \

    PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions PUBMED:10561497. These domains contain a hairpin-loop-like structure, similar to knottins, but the pattern of disulphide bonds differs.

    \ ' '8110' 'IPR013173' '\

    Eubacterial DnaG primases interact with several factors to form the replisome. One of these factors is DnaB, a helicase. This domain has been demonstrated to be responsible for the interaction between DnaG and DnaB PUBMED:8308039. This domain has a multi-helical structure that forms an orthogonal bundle PUBMED:15649896.

    \ ' '8111' 'IPR013196' '\

    Winged helix DNA-binding proteins share a related winged helix-turn-helix DNA-binding motif, where the "wings", or loops, are small beta-sheets. The winged helix motif consists of two wings (W1, W2), three alpha helices (H1, H2, H3) and three beta-sheets (S1, S2, S3) arranged in the order H1-S1-H2-H3-S2-W1-S3-W2 PUBMED:10679470. The DNA-recognition helix makes sequence-specific DNA contacts with the major groove of DNA, while the wings make different DNA contacts, often with the minor groove or the backbone of DNA. Several winged-helix proteins display an exposed patch of hydrophobic residues thought to mediate protein-protein interactions.

    \

    This entry represents a subset of the winged helix domain superfamily which is predominantly found in bacterial proteins, though there are also some archaeal and eukaryotic examples. This domain is commonly found in the biotin (vitamin H) repressor protein BirA which regulates transcription of the biotin operon PUBMED:1409631. It is also found in other proteins including regulators of amino acid biosynthesis such as LysM PUBMED:12042311, and regulators of carbohydrate metabolism such as LicR and FrvR PUBMED:10438772, PUBMED:8019415.

    \ ' '8112' 'IPR013199' '\

    Mga is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions PUBMED:11952907.

    \ ' '8113' 'IPR013249' '\

    The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes.

    With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into sub-regions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase; and sub-region 4.2, which seems to harbor a DNA-binding \'helix-turn-helix\' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors PUBMED:3092189, PUBMED:1597408. \

    \

    Region 4 of sigma-70-like sigma factors is involved in binding to the -35 promoter element via a helix-turn-helix motif PUBMED:11931761.

    \ ' '8114' 'IPR013200' '\

    The haloacid dehalogenase (HAD) superfamily includes phosphatases, phosphonatases, P-type ATPases, beta-phosphoglucomutases, phosphomannomutases, and dehalogenases, which are involved in a variety of cellular processes ranging from amino acid biosynthesis to detoxification PUBMED:7966317. This HAD domain is found in several distinct enzymes including:\ \

    \

    \ \ ' '8115' 'IPR001191' '\ Geminiviruses are characterised by a genome of circular single-stranded DNA encapsidated in twinned (geminate) quasi-isometric particles, from which the group derives its name PUBMED:16453696. Most geminiviruses can be divided into two subgroups on the basis of host range and/or insect vector: i.e. those that infect dicotyledonous plants and are transmitted \ by the same whitefly species, and those that infect monocotyledonous plants and are transmitted by different leafhopper vectors. The whitefly-transmitted African cassava mosaic virus, Tomato golden mosaic virus (TGMV) and Bean golden mosaic virus (BGMV) possess a bipartite genome. By contrast, only a single DNA component has been identified for the leafhopper-transmitted Maize streak virus (MSV) and Wheat dwarf virus (WDV) PUBMED:6526009, PUBMED:2829117. Beet curly top virus (BCTV) and Tobacco yellow \ dwarf virus belong to a third possible subgroup. Like MSV and WDV, BCTV is transmitted by a specific leafhopper species, yet like the whitefly-transmitted geminiviruses it has a host range confined to dicotyledonous plants.\ \

    Sequence comparison of the whitefly-transmitted Squash leaf curl virus (SqLCV) and Tomato yellow leaf curl virus (TYLCV) with the genomic components of TGMV and BGMV reveals a close evolutionary relationship PUBMED:1840676, PUBMED:1984668, PUBMED:1926771. Amino acid sequence alignments of Potato yellow mosaic virus (PYMV) \ proteins with those encoded by other geminiviruses show that PYMV is closely related to geminiviruses isolated from the New World, especially in the putative \ coat protein gene regions PUBMED:1926771. Comparison of MSV DNA-encoded proteins with those of other geminiviruses infecting monocotyledonous plants, including Panicum streak virus PUBMED:1588314 and Miscanthus streak virus (MiSV) PUBMED:1919519, reveals high levels of similarity.

    \ ' '8116' 'IPR013242' '\

    This region defines single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases.

    \ ' '8117' 'IPR013174' '\

    This family corresponds to subunit 3 of dolichol-phosphate mannosyltransferase, an enzyme which generates mannosyl donors for glycosylphosphatidylinositols, N-glycan and protein O- and C-mannosylation. DPM3 is an integral membrane protein and plays a role in stabilising the dolichol-phosphate mannosyl transferase complex PUBMED:10835346.

    \ ' '8118' 'IPR013252' '\

    Spc24 is a component of the evolutionarily conserved kinetochore-associated Ndc80 complex and is involved in chromosome segregation PUBMED:11266451.

    \ ' '8119' 'IPR013251' '\

    Spc19 is a component of the DASH complex. The DASH complex associates with the spindle pole body and is important for spindle and kinetochore integrity during cell division PUBMED:11799062, PUBMED:11782438.

    \ ' '8120' 'IPR013234' '\

    This domain is found on phosphatidylinositol N-acetylglucosaminyltransferase proteins. These proteins are involved in GPI anchor biosynthesis and are associated with the disease paroxysmal nocturnal haemoglobinuria PUBMED:12488505.

    \ ' '8121' 'IPR013188' '\

    Matrix protein (M1) of Influenza virus is a bifunctional membrane/RNA-binding protein that mediates the encapsidation of RNA-nucleoprotein cores into the membrane envelope. M1 must therefore bind both membrane and RNA simultaneously. M1 comprises two domains connected by a linker sequence. The C-terminal domain contains alpha-helical structure and appears to be involved in growth and virulence of the virus PUBMED:15892972, PUBMED:12590584.

    \ ' '8122' 'IPR013195' '\

    This entry represents a short region found at the N-terminus of some viral capsid (HBcAg) proteins from various strains of Hepatitis B virus (HBV), which is a major human pathogen. The conservation of four Cys residues suggests that this region acts as a zinc binding domain.

    \

    Hepatitis B virus is composed of an outer envelope of host-derived lipid containing the surface proteins, and an inner protein capsid that contains genomic DNA. The capsid is composed of a single polypeptide, HBcAg, also known as the core antigen. The capsid has a 5-helical fold, where two long helices form a hairpin that dimerises into a 4-helical bundle PUBMED:10394365; this fold is unusual for icosahedral viruses. The monomer fold is stabilised by a hydrophobic core that is highly conserved among human viral variants. The capsid is assembled from dimers via interactions involving a highly conserved arginine-rich region near the C terminus. This viral capsid acts as a core antigen, the major immunodominant region lying at the tips of the alpha-helical hairpins that form spikes on the capsid surface.

    \ \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '8123' 'IPR013230' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This entry represents the C-terminal domain of zinc D-Ala-D-Ala carboxypeptidases from Streptomyces species and non-peptidase homologues that belong to MEROPS peptidase family M15 (subfamily M15A, clan MD) PUBMED:15044722.

    \ ' '8124' 'IPR013238' '\

    Rpc25 is a strongly conserved subunit of RNA polymerase III and has homology to Rpa43 in RNA polymerase I, Rpb7 in RNA polymerase II and the archaeal RpoE subunit. Rpc25 is required for transcription initiation and is not essential for the elongating properties of RNA polymerase III PUBMED:15612920.

    \ ' '8125' 'IPR013219' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry represents a mitochondrial ribosomal subunit annotated as S27 in yeast and S33 in humans PUBMED:11278769, PUBMED:11344316. It is a small, 106-residue protein. The evolutionary history of the mitoribosomal proteome, which is encoded by a diverse subset of eukaryotic genomes, reveals an ancestral ribosome of alpha-proteobacterial descent that more than doubled its protein content in most eukaryotic lineages. Several new MRPs have originated via duplication of existing MRPs as well as by recruitment from outside of the mitoribosomal proteome PUBMED:17604309.

    \ ' '8126' 'IPR013261' '\

    TIM21 interacts with the outer mitochondrial TOM complex and promotes the insertion of proteins into the inner mitochondrial membrane PUBMED:15797382.

    \ ' '8127' 'IPR013194' '\

    This domain is found on transcriptional regulators. It forms interactions with histone deacetylases PUBMED:12773392.

    \ ' '8129' 'IPR013268' '\

    This family of proteins is associated with U3 snoRNA PUBMED:12068309. U3 snoRNA is required for nucleolar processing of pre-18S ribosomal RNA.

    \ ' '8130' 'IPR013153' '\

    This entry is found at the N-terminus of PrkA proteins, which are bacterial and archaeal serine kinases approximately 630 residues in length. PrkA possesses the A-motif of nucleotide-binding proteins and exhibits distant homology to eukaryotic protein kinases PUBMED:8626065. Note that many of these are hypothetical.

    \ ' '8131' 'IPR013159' '\

    This entry represents the C-terminal domain of bacterial DnaA proteins PUBMED:8110826, PUBMED:1779750, PUBMED:2558436 that play an important role in initiating and regulating chromosomal replication. DnaA is an ATP- and DNA-binding protein. It binds specifically to 9 bp nucleotide repeats known as dnaA boxes which are found in the chromosome origin of replication (oriC).

    \

    DnaA is a protein of about 50 kDa that contains two conserved regions: the first is located in the N-terminal half and corresponds to the ATP-binding domain, the second is located in the C-terminal half and could be involved in DNA-binding. The protein may also bind the RNA polymerase beta subunit, the dnaB and dnaZ proteins, and the groE gene products (chaperonins) PUBMED:2172087.

    \ ' '8132' 'IPR013192' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a zinc finger motif found in the non-structural 5a protein (NS5a) in Hepatitis C virus. The molecular function of NS5a is uncertain, but it is phosphorylated when expressed in mammalian cells. It is thought to interact with the dsRNA-dependent (interferon-inducible) kinase PKR PUBMED:9710605, PUBMED:9143277. This region corresponds to the N-terminal zinc binding domain (1a) PUBMED:15902263.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '8133' 'IPR013193' '\

    The molecular function of the non-structural 5a protein (NS5a) is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the dsRNA-dependent (interferon-inducible) kinase PKR PUBMED:9710605, PUBMED:9143277. This region corresponds to the 1b domain PUBMED:15902263.

    \ ' '8134' 'IPR015965' '\

    This entry represents a phosphodiesterase domain found in fungal tRNA ligases PUBMED:12933796. Please see the following relevant references: PUBMED:12466548, PUBMED:1922054.

    \ ' '8135' 'IPR015966' '\

    This entry represents a kinase domain found in fungal tRNA ligases PUBMED:12933796. Please see the following relevant references: PUBMED:12466548, PUBMED:1922054.

    \ ' '8137' 'IPR013222' '\

    This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins.

    \ ' '8138' 'IPR013191' '\

    This domain is the putative catalytic domain of glycosyl hydrolase family 98 proteins.

    \ ' '8139' 'IPR013190' '\

    This putative domain is found at the C-terminus of glycosyl hydrolase family 98 proteins. This domain is not expected to form part of the catalytic activity.

    \ ' '8140' 'IPR013229' '\

    This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands.

    \ ' '8141' 'IPR013211' '\

    This repeat is found in bacterial and archaeal cell surface proteins, many of which are hypothetical. The secondary structure corresponding to this repeat is predicted to comprise 4 beta-strands, which may associate to form a beta-propeller. The repeat copy number varies from 2 to 14. This repeat is sometimes found with the PKD domain.

    \ ' '8142' 'IPR013207' '\

    This 54 amino acid repeat is found in many hypothetical proteins. Several hypothetical proteins from Corynebacterium glutamicum (Brevibacterium flavum) and Corynebacterium efficiens, along with PS1 protein, contain this repeat region. The N-terminal region of PS1 contains an esterase domain which transfers corynomycolic acid. The C-terminal region consists of 4 tandem LGFP repeats. It is hypothesised that the PS1 proteins in Corynebacterium, when associated with the cell wall, may be anchored via the LGFP tandem repeats that may be important for maintaining cell wall integrity. Deletion of the protein results in a 10-fold increase in the cell volume of the organism, implying that the protein is involved in cell shape formation PUBMED:12740729. The secondary structure of each repeat is predicted to comprise two beta-strands and one alpha-helix.

    \ ' '8143' 'IPR013212' '\

    Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of BUB1 and MAD3 to CDC20p PUBMED:10704439.

    \ ' '8144' 'IPR013170' '\

    The cwf21 family is involved in mRNA splicing. It has been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe (Fission yeast) PUBMED:11884590.

    \ ' '8145' 'IPR013243' '\

    This domain is found in the protein Sgf73/Sca7, which is a component of the multihistone acetyltransferase complexes SAGA and SLIK PUBMED:15932941. This domain is also found in Ataxin-7, a human protein which, in its polyglutamine-expanded pathological form, is responsible for the neurodegenerative disease spinocerebellar ataxia 7 (SCA7) PUBMED:15932941.

    \ ' '8146' 'IPR013244' '\

    Sec39 is involved in the secretory pathway. In Saccharomyces cerevisiae (Baker\'s yeast) it has been shown to localise to the endoplasmic reticulum and nuclear membrane PUBMED:15942868.

    \ ' '8147' 'IPR013169' '\

    The cwf18 family is involved in mRNA splicing. It has been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe (Fission yeast) PUBMED:11884590.

    \ ' '8148' 'IPR013226' '\

    Pal1 is a membrane associated protein that is involved in the maintenance of cylindrical cellular morphology. It localises to sites of active growth. Pal1 physically interacts and displays overlapping localisation with the Huntingtin-interacting-protein (Hip1)-related protein Sla2p/End4p PUBMED:15975911.

    \ ' '8149' 'IPR013253' '\

    This entry consists of cell division proteins which are required for kinetochore-spindle association PUBMED:15371542.

    \ ' '8150' 'IPR013167' '\

    This region is found in yeast oligomeric Golgi complex component 4, which is involved in ER-to-Golgi and intra-Golgi transport PUBMED:12006647.

    \ ' '8152' 'IPR013233' '\

    Mammalian PIG-X and yeast PBN1 are essential components of glycosylphosphatidylinositol-mannosyltransferase I. These enzymes are involved in the transfer of sugar molecules.

    \ ' '8153' 'IPR013235' '\

    This domain is specific to the PPP5 subfamily of serine/threonine phosphatases.

    \ ' '8155' 'IPR013534' '\

    This region represents the catalytic domain of glycogen (or starch) synthases that use ADP-glucose, rather than UDP-glucose as in animals, as the glucose donor. This enzyme is found in bacteria and plants. Whether the name given is glycogen synthase or starch synthase depends on context, and therefore on substrate.

    \ ' '8156' 'IPR013535' '\

    The PUL (PLAP, Ufd3p and Lub1p) domain is a novel alpha-helical Ub-associated domain.

    \ ' '8157' 'IPR013536' '\

    This is a predicted metallopeptidase domain called WLM (Wss1p-like metalloproteases). These are linked to the Ub-system by virtue of fusions with the Ub-binding PUG (PUB), Ub-like, and Little Finger domains. More specifically, genetic evidence implicates the WLM family in de-SUMOylation PUBMED:15483401.

    \ ' '8158' 'IPR013537' '\

    This region is found in various eukaryotic acetyl-CoA carboxylases, N-terminal to the catalytic domain. This enzyme is involved in the synthesis of long-chain fatty acids, as it catalyses the rate-limiting step in this process.

    \ ' '8159' 'IPR013538' '\

    This family includes eukaryotic, prokaryotic and archaeal proteins that bear similarity to a C-terminal region of human activator of 90 kDa heat shock protein ATPase homologue 1 (AHSA1/p38). This protein is known to interact with the middle domain of Hsp90, and stimulate its ATPase activity PUBMED:12604615. It is probably a general upregulator of Hsp90 function, particularly contributing to its efficiency in conditions of increased stress PUBMED:12504007. p38 is also known to interact with the cytoplasmic domain of the VSV G protein, and may thus be involved in protein transport PUBMED:11554768. It has also been reported as being underexpressed in Down\'s syndrome. This region is found repeated in two members of this family.

    \ ' '8160' 'IPR013539' '\

    This domain is found at the C terminus of adenylosuccinate lyase (ASL; PurB in Escherichia coli). It has been identified in bacteria, eukaryotes and archaea and is found together with the lyase domain. ASL catalyses the cleavage of succinylaminoimidazole carboxamide ribotide to aminoimidazole carboxamide ribotide and fumarate and the cleavage of adenylosuccinate to adenylate and fumarate PUBMED:8530047.

    \ ' '8161' 'IPR013540' '\

    This domain is found in a number of bacterial chitinases and similar viral proteins. It is organised into a fibronectin III module domain-like fold, comprising only beta strands. Its function is not known, but it may be involved in interaction with the enzyme substrate, chitin PUBMED:7704527, PUBMED:9377712. It is separated by a hinge region from the catalytic domain; this hinge region is probably mobile, allowing the N-terminal domain to have different relative positions in solution PUBMED:7704527.

    \ ' '8163' 'IPR013542' '\

    This domain of unknown function occurs in iron-sulphur cluster-binding proteins together with the 4Fe-4S binding domain.

    \ ' '8164' 'IPR013543' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \

    This domain is found at the C terminus of the calcium/calmodulin-dependent protein kinases II (CaMKII). These proteins also have a Ser/Thr protein kinase domain at their N terminus PUBMED:12603201. The function of the CaMKII association domain is the assembly of the single proteins into large (8 to 14 subunits) multimers PUBMED:14993460.

    \ ' '8165' 'IPR013544' '\

    This domain is found at the C terminus of many eukaryotic and one bacterial sequence. Many of its members are annotated as being putative L1 retrotransposons or LINE-1 reverse transcriptase homologues. The region in question is found repeated in some family members.

    \ ' '8166' 'IPR013545' '\

    The general secretion pathway, or type II pullulanase-like machinery, is responsible for the transport of proteins from the periplasm across the outer membrane in Gram-negative bacteria PUBMED:15491357, PUBMED:1588814. This entry includes protein G involved in this pathway. The PulG protein is thought to be anchored in the inner membrane with its C terminus directed towards the periplasm PUBMED:2129543. Together with other members of the secretion machinery, it is thought to assemble into a pilus-like structure that may function as a dynamic mechanism to push secreted proteins out of the cell. The polypeptide is organised into a long N-terminal alpha-helix followed by a loop region that separates it from a C-terminal anti-parallel beta-sheet PUBMED:15491357.

    \ ' '8167' 'IPR013546' '\

    This entry represents a region found in a family of nucleotide transferases that includes bifunctional uridylyl-removing enzymes/uridylyltransferases (UR/UTases, GlnD) and glutamine-synthetase adenylyltransferases (GlnE). In many members of this family, the region is C-terminal to a nucleotidyltransferase domain, and N-terminal to an HD domain and two ACT domains PUBMED:8412694.

    \ \

    Bifunctional uridylyl-removing enzymes/uridylyltransferases are responsible for the modification of the regulatory protein PII, or GlnB, thereby acting as the sensory component of the nitrogen regulation (ntr) system PUBMED:11810255. The ntr system modulates nitrogen metabolism in response to the prevailing nitrogen source and the requirements of the cell. During nitrogen fixation, ammonia and 2-oxoglutarate can be used to produce glutamate. In response to nitrogen limitation, these transferases catalyse the uridylylation of the PII protein, which in turn stimulates deadenylylation of glutamine synthetase (GlnA), leading to the activation of glutamate synthetase and to the stimulation of NtrC-dependent promoters PUBMED:12384297. Uridylylated PII can act together with NtrB and NtrC to increase transcription of genes in the sigma54 regulon, which include glnA and other nitrogen-level controlled genes PUBMED:10931314. Under high concentrations of fixed nitrogen, PII is de-uridylylated leading to the inactivation of the glutamate synthetase pathway and switching off NtrC-dependent promoters PUBMED:11065377. It has also been suggested that the product of the glnD gene is involved in other physiological functions such as control of iron metabolism in certain species PUBMED:10931314.

    \ \

    Glutamine-synthetase adenylyltransferase is an adenylyl transferase comprised of an adenylylating domain and a deadenylylating domain which modulate glutamine synthetase (GS) activity, where GS plays an important role in nitrogen assimilation PUBMED:18469098.

    \ ' '8168' 'IPR013547' '\

    The members found in this entry are eukaryotic proteins, and include all three isoforms of the prolyl 4-hydroxylase alpha subunit. This enzyme is important in the post-translational modification of collagen, as it catalyses the formation of 4-hydroxyproline. In vertebrates, the complete enzyme is an alpha2-beta2 tetramer; the beta-subunit is identical to protein disulphide isomerase PUBMED:7753822, PUBMED:14500733, PUBMED:2552442, PUBMED:11850189. The function of the N-terminal region featured in this family is not known.

    \ ' '8169' 'IPR013548' '\

    This domain is found at the C terminus of various plexins. Plexins are receptors for semaphorins, and plexin signalling is important in pathfinding and patterning of both neurons and developing blood vessels PUBMED:15239959, PUBMED:11959816. The cytoplasmic region, which has been called a SEX domain PUBMED:8570614, is involved in downstream signalling pathways through interaction with proteins such as Rac1, RhoD, Rnd1 and other plexins PUBMED:12559962.

    \ ' '8170' 'IPR013549' '\

    This domain of unknown function appears towards the C terminus of proteins of the NAD-dependent epimerase/dehydratase family in bacteria, eukaryotes and archaea. Many of the proteins in which it is found are involved in cell-division inhibition.

    \ ' '8171' 'IPR013550' '\

    This domain describes the C-terminal region of various bacterial haemolysins and leukotoxins, which belong to the RTX family of toxins. These are produced by various Gram-negative bacteria, such as Escherichia coli and Actinobacillus pleuropneumoniae. RTX toxins may interact with lipopolysaccharide (LPS) to functionally impair and eventually kill leukocytes PUBMED:8800842. This region is found in association with the RTX N-terminal domain and multiple haemolysin-type calcium-binding repeats.

    \ ' '8172' 'IPR013551' '\

    This domain of unknown function is found at the C terminus of bacterial proteins, many of which are hypothetical and include proteins of the YicC family.

    \ ' '8173' 'IPR013552' '\

    This domain is found near the N terminus of fibronectin-binding proteins in Streptococcus where it functions as a signal sequence PUBMED:15516573.

    \ ' '8175' 'IPR013554' '\

    This domain is found at the N terminus of bacterial ribonucleoside-diphosphate reductases (ribonucleotide reductases, RNRs) which catalyse the formation of deoxyribonucleotides PUBMED:8052308. It occurs together with the RNR all-alpha domain and the RNR barrel domain.

    \ ' '8176' 'IPR013555' '\

    Transient receptor potential (Trp) and related proteins are thought to be\ Ca2+ ion channel subunits that mediate capacitative Ca2+ entry in response \ to a range of external and internal cell stimuli. Such Ca2+ entry is thought \ to be an essential component of cellular responses to many hormones and\ growth factors, and acts to replenish intracellular Ca2+ stores that have\ been emptied through the action of inositol triphosphate (IP3) and other \ agents. In non-excitable cells, i.e. those that lack voltage-gated Ca2+\ channels, such as hepatocytes, this mode of Ca2+ entry is thought to be an\ important step in generating the oscillations of intracellular Ca2+\ concentration that characterise their response to stimulatory agents PUBMED:8646775.\ \ Studies on the visual transduction system in Drosophila led to the molecular\ cloning of Trp and the cDNA of a related protein, Trp-like, which show\ similarity to voltage gated Ca2+ channels in the regions known as S3 through\ S6, including the S5-S6 linker that forms the ion-selective channel pore PUBMED:9368034.\ This provided evidence that Trp and/or related proteins might form mammalian\ capacitative Ca2+ entry channels.

    \ \

    A number of Trp and Trp-like channel gene isoforms have now been cloned, \ including several mammalian homologues. The Trp family is thought to encode\ at least 20 Ca2+-permeable channel proteins. Hydropathy analysis suggests \ that they share a common transmembrane (TM) topology. Each family member is \ predicted to possess 6 TM domains with intracellular N- and C-termini, which\ is similar to the core structure of the pore-forming subunits of the voltage-gated Na+ and Ca2+ channels. By analogy with these proteins, which have \ 4 linked domains of 6 TM segments, it is likely that Trp channels are\ homo- or heterotetramers of 4 single subunits PUBMED:11389472.\ The Trp family can be divided on the basis of sequence similarity into 3\ subfamilies: short (S), long (L) and osm-like (O) Trp channels. The \ STrp subfamily includes Drosophila Trp and Trpl-like, and the mammalian \ homologues TrpC1-7. Channels of the STrpC subfamily are activated following\ receptor-mediated stimulation of different isoforms of phospholipase C PUBMED:10717675.

    \

    This domain is found in Trp proteins, generally located C-terminal to the ankyrin repeats.

    \ ' '8177' 'IPR013556' '\

    This domain is found in bacterial flagellar M-ring (FliF) proteins together with the YscJ/FliF domain.

    \ ' '8178' 'IPR013557' '\

    In Escherichia coli the two proteins AntA and AntB share 62% amino acid identity near their N termini. AntA appears to be encoded by a truncated and divergent copy of AntB. The two proteins are homologous to putative antirepressors found in numerous bacteriophages, such as the hypothetical antirepressor protein encoded by the gene LO142 of bacteriophage 933W.

    \ ' '8179' 'IPR013558' '\

    This region tends to appear at the N terminus of proteins also containing DNA-binding HMG (high mobility group) boxes and appears to bind the armadillo repeat of CTNNB1 (beta-catenin), forming a stable complex. Signalling by Wnt through TCF/LEF is involved in developmental patterning, induction of neural tissues, cell fate decisions and stem cell differentiation PUBMED:15765502. Isoforms of HMG T-cell factors lacking the N-terminal CTNNB1-binding domain cannot fulfil their role as transcriptional activators in T-cell differentiation PUBMED:10080941, PUBMED:9783587.

    \ ' '8180' 'IPR013559' '\

    This domain is found in various hypothetical bacterial proteins that are similar to the Escherichia coli protein YheO. Their function is unknown, but a few members are annotated as being HTH-containing proteins and putative DNA-binding proteins.

    \ ' '8181' 'IPR013560' '\

    This domain of unknown function is found in bacteria and archaea and is homologous to the hypothetical protein ybgA from Escherichia coli.

    \ ' '8182' 'IPR013561' '\

    This domain of unknown function has so far only been found at the C terminus of archaeal proteins, including several transcriptional regulators of the ArsR family.

    \ ' '8183' 'IPR013562' '\

    This domain of unknown function is found towards the N terminus of putative ATPases.

    \ ' '8184' 'IPR013563' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-loop (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \

    This entry features a region found towards the C terminus of oligopeptide ABC transporter ATP binding proteins, immediately following the ATP-binding domain (). All characterised members appear to be involved in the transport of oligopeptides or dipeptides. Some are important for sporulation or antibiotic resistance. Some dipeptide transporters also act on the haem precursor delta-aminolevulinic acid.

    \ ' '8185' 'IPR013564' '\

    This domain of unknown function is found at the C terminus of bacterial proteins which include UDP-N-acetylmuramyl tripeptide synthase and the related Mur ligase.

    \ ' '8186' 'IPR013565' '\

    This domain of unknown function is found in fatty acid synthase beta subunits together with the MaoC-like domain () and the Acyltransferase domain () PUBMED:9693066. The domain has been identified in fungi and bacteria.

    \ ' '8187' 'IPR013566' '\

    This region typically appears on the C terminus of EF hands in GTP-binding proteins such as Arht/Rhot (may be involved in mitochondrial homeostasis and apoptosis PUBMED:15218247). The EF hand associated region is found in yeast, vertebrates and plants.

    \ ' '8188' 'IPR013567' '\

    This region predominantly appears near EF-hands () in GTP-binding proteins. It is found in all three eukaryotic kingdoms.

    \ ' '8189' 'IPR013568' '\

    This domain is found in IL17 receptors (IL17Rs, e.g. ) and SEF proteins (e.g. ). The latter are feedback inhibitors of FGF signalling and are also thought to be receptors. Due to its similarity to the TIR domain (), the SEFIR region is thought to be involved in homotypic interactions with other SEFIR/TIR-domain-containing proteins. Thus, SEFs and IL17Rs may be involved in TOLL/IL1R-like signalling pathways PUBMED:12765832.

    \ ' '8190' 'IPR013569' '\

    This domain is found together with the viral coat protein domain () in coat/capsid proteins of the plant infecting Carlavirus. It is required for genome encapsidation by forming ribonucleoprotein complexes along with TGB1 helicase and viral RNA. The N- and the C-terminus of this coat protein can be exposed on the surface of the virus particle. The central core sequence may be important in maintaining correct tertiary structure of the coat protein and/or play a role in the interaction with the viral RNA. Coat proteins are often used to distinguish between Carlavirus isolates.

    \ \

    In the coat protein amino acid sequences of definitive and tentative species of carlaviruses, there is a region of seven amino acids (GLGVPTE) that is conserved PUBMED:18192032. The complete coat protein (CP) sequences of 29 Indian Chrysanthemum virus B (CVB) isolates were highly heterogeneous, sharing nucleotide sequence identities of 74-98% PUBMED:17006596, PUBMED:16279199.

    \ ' '8191' 'IPR013570' '\

    This entry represents the C-terminal domain found in the hypothetical transcriptional regulator YsiA () from Bacillus subtilis, which is a member of the TetR (tetracycline resistance) transcriptional regulator family of proteins. The C-terminal domains of YsiA and TetR share a multi-helical, interlocking structure.

    \ ' '8192' 'IPR013571' '\

    This entry represents the C-terminal domain found in the multidrug-binding transcription regulator QacR () from Staphylococcus aureus, which is a member of the TetR (tetracycline-resistance) transcriptional regulator family of proteins. QacR is able to bind various environmental agents, which include a number of cationic lipophilic compounds, and thus regulate the transcription of QacA (), a multidrug efflux pump PUBMED:9660841. The C-terminal region of QacR contains a multifaceted, expansive drug-binding pocket, which is composed of several separate, but linked, binding sites PUBMED:11739955. The C-terminal domains of QacR and TetR share a multi-helical, interlocking structure.

    \ ' '8193' 'IPR013572' '\

    This entry is named after the various transcriptional regulatory proteins that it contains, including MtrR (), AcrR (), ArpR (), TtgR () and SmeT (). These are members of the TetR (tetracycline resistance) family of transcriptional repressors, that are involved in the control of expression of multidrug resistance proteins PUBMED:1720861, PUBMED:11160799, PUBMED:12384340.

    \ ' '8194' 'IPR013573' '\

    This entry represents the C-terminal domain found in the hypothetical transcriptional regulators RutR and YcdC () from Escherichia coli. Both of these proteins are members of the TetR (tetracycline resistance) transcriptional regulator family of proteins. RutR negatively controls the transcription of the rut operon involved in pyrimidine utilization. The C-terminal domains of RutR, YsiA and TetR share a multi-helical, interlocking structure. These proteins also contain helix-turn-helix (HTH) DNA-binding domains.

    \ ' '8195' 'IPR013574' '\

    This domain is found in glucan-binding protein C (GbpC) and in the V-region of surface protein antigen; both these proteins belong to the Spa family of Streptococcal proteins PUBMED:9009329. This domain consists of a beta-supersandwich of 18 beta-strands in two sheets.

    \

    There are at least four types of glucan-binding proteins (Gbp) in Streptococcus mutans, GbpA, GbpB, GbpC and GbpD. These proteins promote the adhesion of Streptococcal bacteria to teeth and are associated with dental caries PUBMED:17241168. GbpC is a cell-wall anchoring protein that plays an important role in sucrose-dependent adhesion by binding to soluble glucan synthesised by glucosyltransferase D (GTFD) PUBMED:16390340.

    \

    Spa antigens I/II are multi-functional proteins expressed at the cell wall surface of oral Streptococci, where they function as adhesins. Antigens I/II recognise a wide range of ligands. They exert an immunomodulatory effect on human cells and are important in inflammatory disorders, such as dental caries. These proteins can be divided into seven regions: signal peptide, N-terminal, A-region (alanine-rich), V-region (variable domain), P-region (proline-rich), C-terminal domain, and a cell wall anchor motif. The V-region is the central domain and exhibits the greatest variability in sequence, and is responsible for binding monocyte receptors, its binding stimulating the release of TNF-alpha from the monocytes. The crystal structure of the V-region revealed a lectin-like fold that displays a putative preformed carbohydrate-binding site stabilised by a metal ion PUBMED:12054777.

    \ ' '8196' 'IPR013575' '\

    Most of the sequences in this alignment come from bacterial translation initiation factors (IF-2, also ), but the domain is also found in the eukaryotic translation initiation factor 4 gamma in yeast and in a hypothetical Euglenozoa protein of unknown function.

    \ ' '8197' 'IPR013576' '\

    Insulin-like Growth Factor Binding Proteins (IGFBP) are a group of vertebrate secreted proteins, which bind to IGF-I and IGF-II with high affinity and modulate the biological actions of IGFs. The IGFBP family has six distinct subgroups, IGFBP-1 through 6, based on conservation of gene (intron-exon) organisation, structural similarity, and binding affinity for IGFs. Across species, IGFBP-5 exhibits the most sequence conservation, while IGFBP-6 exhibits the least sequence conservation. The IGFBPs contain inhibitor domain homologues, which are related to MEROPS protease inhibitor family I31 (equistatin, clan IX).

    \ \

    All IGFBPs share a common domain architecture (:). While the N-terminal (, IGF binding protein domain), and the C-terminal (, thyroglobulin type-1 repeat) domains are conserved across vertebrate species, the mid-region is highly variable with respect to protease cleavage sites and phosphorylation and glycosylation sites. IGFBPs contain 16-18 conserved cysteines located in the N-terminal and the C-terminal regions, which form 8-9 disulphide bonds PUBMED:11874691.

    As demonstrated for human IGFBP-5, the N terminus is the primary binding site for IGF. This region, comprised of Val49, Tyr50, Pro62 and Lys68-Leu75, forms a hydrophobic patch on the surface of the protein PUBMED:9822601. The C terminus is also required for high affinity IGF binding, as well as for binding to the extracellular matrix PUBMED:9725901 and for nuclear translocation PUBMED:7519375, PUBMED:9660801 of IGFBP-3 and -5.

    IGFBPs are unusually pleiotropic molecules. Like other binding proteins, IGFBP can prolong the half-life of IGFs via high affinity binding of the ligands. In addition to functioning as simple carrier proteins, serum IGFBPs also serve to regulate the endocrine and paracrine/autocrine actions of IGF by modulating the IGF available to bind to signalling IGF-I receptors PUBMED:12379487, PUBMED:12379489. Furthermore, IGFBPs can function as growth modulators independent of IGFs. For example, IGFBP-5 stimulates markers of bone formation in osteoblasts lacking functional IGFs PUBMED:11874691. The binding of IGFBP to its putative receptor on the cell membrane may stimulate the signalling pathway independent of an IGF receptor, to mediate the effects of IGFBPs in certain target cell types. IGFBP-1 and -2, but not other IGFBPs, contain a C-terminal Arg-Gly-Asp integrin-binding motif. Thus, IGFBP-1 can also stimulate cell migration of CHO and human trophoblast cells through an action mediated by alpha 5 beta 1 integrin PUBMED:7504269. Finally, IGFBPs transported into the nucleus (via the nuclear localisation signal) may also exert IGF-independent effects by transcriptional activation of genes.

    \ \

    The insulin family of proteins PUBMED:6107857 groups a number of active peptides which are evolutionarily related, including insulin; relaxin; insulin-like growth factors I and II PUBMED:2197088; mammalian Leydig cell-specific insulin-like peptide (gene INSL3) PUBMED:8253799 and early placenta insulin-like peptide (ELIP) (gene INSL4) PUBMED:8666396; insect prothoracicotropic hormone (bombyxin) PUBMED:; locust insulin-related peptide (LIRP) PUBMED:1688797; molluscan insulin-related peptides 1 to 5 (MIP) PUBMED:1868853; and Caenorhabditis elegans insulin-like peptides PUBMED:9548970. Structurally, all these peptides consist of two polypeptide chains (A and B) linked by two disulphide bonds. They all share a conserved arrangement of four cysteines in their A chain. The first of these cysteines is linked by a disulphide bond to the third one and the second and fourth cysteines are linked by interchain disulphide bonds to cysteines in the B chain.

    \ \

    Insulin is involved in the regulation of normal glucose homeostasis, as well as other specific physiological functions PUBMED:6243748. It is synthesised as a prepropeptide from which an endoplasmic reticulum-targeting sequence is cleaved to yield proinsulin. Proinsulin contains regions A and B separated by an intervening connecting region C. The connecting region is cleaved, liberating the active protein, which contains the A and B chains, held together by 2 disulphide bonds PUBMED:503234.

    \

    This domain is the C-terminal domain of insulin-like growth factor II proteins (IGF-2, also see ) in vertebrates and seems to represent the E-peptide PUBMED:8215015, PUBMED:12324491.

    \ ' '8198' 'IPR013577' '\

    This domain is found in lethal giant larvae homologue 2 (LLGL2) proteins and syntaxin-binding proteins like tomosyn PUBMED:14767561. It has been identified in eukaryotes and tends to be found together with WD repeats ().

    \ ' '8199' 'IPR013578' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.
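
    The difference between the loose HEXXH motif and the more stringent abXHEbbHbc definition can be sketched as two pattern scans. The residue classes used for "b" (uncharged) and "c" (hydrophobic) are assumptions chosen for illustration, since the text does not enumerate them, and "a" is left unrestricted because the text only says it is most often valine or threonine. The function name and toy sequence are hypothetical.

```python
import re

# Assumed residue classes (not defined in the entry text):
UNCHARGED = "ACFGHILMNPQSTVWY"   # every standard residue except D, E, K, R
HYDROPHOBIC = "ACFILMVWY"

# Loose motif His-Glu-X-X-His; the lookahead allows overlapping hits.
LOOSE = re.compile("(?=(HE[A-Z]{2}H))")

# Stringent motif a-b-X-H-E-b-b-H-b-c, with "a" left as any residue.
STRICT = re.compile(
    "(?=([A-Z][{u}][A-Z]HE[{u}][{u}]H[{u}][{h}]))".format(u=UNCHARGED, h=HYDROPHOBIC)
)


def find_motifs(sequence, pattern):
    """Return (1-based start, matched text) for every hit of pattern."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real metalloprotease.
TOY_SEQUENCE = "MKTVAAHELTHAVGG"
print(find_motifs(TOY_SEQUENCE, LOOSE))
print(find_motifs(TOY_SEQUENCE, STRICT))
```

    A sequence can match the loose pattern without matching the stringent one, which is why the stringent form is the more useful filter for candidate metalloproteases.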

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This domain appears in eukaryotes as well as bacteria and tends to be found near the C terminus of metalloproteases and related sequences belonging to MEROPS peptidase family M16 (subfamily M16C, clan ME). These include: eupitrilysin, falcilysin, PreP peptidase, CYM1 peptidase and subfamily M16C non-peptidase homologues.

    \ ' '8200' 'IPR013579' '\

    This domain represents a conserved region of eukaryotic Fas-activated serine/threonine (FAST) kinases () that contains several conserved leucine residues. FAST kinase is rapidly activated during Fas-mediated apoptosis, when it phosphorylates TIA-1, a nuclear RNA-binding protein that has been implicated as an effector of apoptosis PUBMED:7544399. Note that many family members are hypothetical proteins. This subdomain is often found associated with the FAST kinase-like protein, subdomain 2.

    \ ' '8201' 'IPR013580' '\

    This domain is found in bacteria and plant chloroplast proteins. It often appears at the C-terminal of nitrogenase component 1 type oxidoreductases () and sometimes independently in bacterial proteins such as the proto-chlorophyllide reductase 57 kDa subunit of the cyanobacterium Synechocystis.

    \ ' '8202' 'IPR013581' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attacking water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \

    This domain is found on the C terminus of ABC-2 type transporter domains (). It seems to be associated with the plant pleiotropic drug resistance (PDR) protein family of ABC transporters. As in yeast, plant PDR ABC transporters may also play a role in the transport of antifungal agents PUBMED:12430018 (see also ). The PDR family is characterised by a configuration in which the ABC domain is nearer the N terminus of the protein than the transmembrane domain PUBMED:12430018.

    \ ' '8203' 'IPR013582' '\

    This domain is found in phospholipases and viral envelope proteins between Phospholipase D (PLD) active site motifs (). PLD is associated with Golgi membranes and alters their lipid content by the conversion of phospholipids into phosphatidic acid, which is thought to be involved in the regulation of lipid movement. This might explain the prevalence of this domain in viral envelope proteins PUBMED:9140189.

    \ ' '8204' 'IPR013583' '\

    This domain is found at the C terminus of phosphoribosyltransferases and phosphoribosyltransferase-like proteins. It contains putative transmembrane regions. It often appears together with calcium-ion dependent C2 domains ().

    \ ' '8205' 'IPR013584' '\

    The ~60-residue RAP (an acronym for RNA-binding domain abundant in Apicomplexans) domain is found in various proteins in eukaryotes. It is particularly abundant in apicomplexans and might mediate a range of cellular functions through its potential interactions with RNA PUBMED:15501674.

    \ \

    The RAP domain consists of multiple blocks of charged and aromatic residues and is predicted to be composed of alpha helical and beta strand structures. Two predicted loop regions that are dominated by glycine and tryptophan residues are found before and after the central beta sheet PUBMED:15501674. Some proteins known to contain a RAP domain are listed below:

    \ \ ' '8206' 'IPR013585' '\

    The structure of protocadherins is similar to that of classic cadherins (), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated PUBMED:8508762.

    \ ' '8207' 'IPR013586' '\

    Intracellular proteins, including short-lived proteins such as cyclin, Mos, Myc, p53, NF-kappaB, and IkappaB, are degraded by the ubiquitin-proteasome system. The 26S proteasome is a self-compartmentalising protease responsible for the regulated degradation of intracellular proteins in eukaryotes PUBMED:15571806, PUBMED:15890341. This giant intracellular protease is formed by several subunits arranged into two 19S polar caps, where protein recognition and ATP-dependent unfolding occur, flanking a 20S central barrel-shaped structure with an inner proteolytic chamber. This overall structure is highly conserved among eukaryotes and is essential for cell viability. Proteins targeted to the 26S proteasome are conjugated with a polyubiquitin chain by an enzymatic cascade before delivery to the 26S proteasome for degradation into oligopeptides.

    \ \

    The 19S component is divided into a "base" subunit containing six ATPases (Rpt proteins) and two non-ATPases (Rpn1, Rpn2), and a "lid" subunit composed of eight stoichiometric proteins (Rpn3, Rpn5, Rpn6, Rpn7, Rpn8, Rpn9, Rpn11, Rpn12) PUBMED:9741626. Additional non-essential and species specific proteins may also be present. The 19S unit performs several essential functions including binding the specific protein substrates, unfolding them, cleaving the attached ubiquitin chains, opening the 20S subunit, and driving the unfolded polypeptide into the proteolytic chamber for degradation. The 26S proteasome and 19S regulator are of medical interest due to their involvement in burn rehabilitation PUBMED:16566573.

    \ \

    This eukaryotic domain is found at the C terminus of 26S proteasome regulatory subunits such as the non-ATPase Rpn3 subunit which is essential for proteasomal function PUBMED:10490625. It occurs together with the PCI/PINT domain ().

    \ ' '8208' 'IPR013587' '\

    The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure PUBMED:12633990.

    \ ' '8209' 'IPR013588' '\

    This domain is found in the MAP2/Tau family of proteins which includes MAP2, MAP4, Tau, and their homologues. All isoforms contain a conserved C-terminal domain containing tubulin-binding repeats (), and an N-terminal projection domain of varying size. This domain has a net negative charge and exerts a long-range repulsive force. This provides a mechanism that can regulate microtubule spacing which might facilitate efficient organelle transport PUBMED:15642108, PUBMED:11576531.

    \ ' '8210' 'IPR011528' '\

    The nuclease-related domain (NERD) is found in a broad range of bacterial, as well as single archaeal and plant proteins. Most NERD-containing proteins have a single domain, sometimes with additional (predicted) transmembrane helices. In a few instances, proteins containing NERD domains have additional domains (mostly involved in DNA processing), such as the HRDC, the UvrD/REP helicase, the DNA-binding C4 zinc finger, or the serine/threonine and tyrosine protein kinases. In all cases in which a NERD domain is present in multidomain proteins, it is found at the N terminus. The NERD domain is predicted to function in DNA processing, and may have a nuclease function PUBMED:15055202.

    \ \ ' '8211' 'IPR013589' '\

    This region is found towards the N terminus of various archaeal and bacterial hypothetical proteins. Some of these are annotated as being transglutaminase-like proteins, and in fact contain a transglutaminase-like superfamily domain ().

    \ ' '8213' 'IPR013591' '\

    This is a short region, approximately 35 residues in length, that is found near the C terminus in a number of plant proteins and is repeated up to three times in some members. These proteins are annotated as being involved in disease resistance and in the regulation of chromosome condensation. They contain domains with varied functions, such as TIR () and FYVE ().

    \ ' '8215' 'IPR013592' '\

    This region is found in various leucine zipper transcription factors of the Maf family. These are implicated in the regulation of insulin gene expression PUBMED:12368292, in erythroid differentiation PUBMED:8620536, and in differentiation of the neuroretina PUBMED:11416124.

    \ ' '8216' 'IPR013593' '\

    This domain represents the N-terminal peptide of pro-opiomelanocortin (NPP). It is thought to represent an important pituitary peptide, given its high yield from pituitary glands, and exhibits a potent in vitro aldosterone-stimulating activity PUBMED:6945581.

    \ ' '8217' 'IPR013594' '\

    Dynein heavy chains interact with other heavy chains to form dimers, and with intermediate chain-light chain complexes to form a basal cargo binding unit PUBMED:10862709. The region featured in this family includes the sequences implicated in mediating these interactions PUBMED:10336435. It is thought to be flexible and not to adopt a rigid conformation PUBMED:10862709.

    \ ' '8218' 'IPR013595' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.
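
    The relationship between linear triad order and clan can be sketched as a small lookup. The mapping (HDS to PA, DHS to SB, SDH to SC) comes from the paragraph above; the function names are hypothetical, and the residue positions in the usage lines are illustrative inputs using commonly quoted numbering for chymotrypsin and subtilisin, not data from this entry. A matching order is only suggestive of a clan and is not a classification on its own.

```python
# Linear order of the catalytic residues -> clan, as listed in the text.
TRIAD_ORDER_TO_CLAN = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(positions):
    """Return the residue letters sorted by their position in the sequence.

    positions maps a one-letter residue code ("S", "D", "H") to its
    sequence position.
    """
    return "".join(sorted(positions, key=positions.get))


def suggest_clan(positions):
    """Return the clan whose triad order matches, or None if none does."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))


# Illustrative inputs (commonly quoted numbering, assumed here):
print(suggest_clan({"H": 57, "D": 102, "S": 195}))   # chymotrypsin-like order
print(suggest_clan({"D": 32, "H": 64, "S": 221}))    # subtilisin-like order
```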

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This is a C-terminal domain associated with putative hydrolases and bacterial peptidases that belong to MEROPS peptidase family S33 (clan SC). They are related to tripeptidyl-peptidase B from Streptomyces lividans (). A member of this family () is thought to be involved in the C-terminal processing of propionicin F, a bacteriocin characterised from Propionibacterium freudenreichii PUBMED:15574930.

    \ ' '8219' 'IPR013596' '\

    This region is found in F-box () and other domain containing plant proteins; it is repeated in two family members. Its precise function is unknown, but it is thought to be associated with nuclear processes PUBMED:11779830. In fact, several family members are annotated as being similar to transcription factors.

    \ ' '8220' 'IPR013597' '\

    This region is found mainly in various bacterial and archaeal species, but a few members of this family are expressed by fungal and chlamydomonal species. It has been implicated in the binding of intron RNA during reverse transcription and splicing PUBMED:11959575.

    \ ' '8221' 'IPR013598' '\

    The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments.

    \

    This domain is found close to the N-terminus of yeast exportin 1 (Xpo1, Crm1, ), as well as adjacent to the N-terminal domain of importin-beta (). Exportin 1 is a nuclear export receptor that translocates proteins out of the nucleus; it interacts with leucine-rich nuclear export signal (NES) sequences in proteins to be transported, as well as with RanGTP PUBMED:9323132, PUBMED:9323123. Importin-beta is a nuclear import receptor that translocates proteins into the nucleus; it interacts with RanGTP and importin-alpha, the latter binding with the nuclear localisation signal (NLS) sequences in proteins to be transported PUBMED:17170104.

    \

    More information about these proteins can be found at Protein of the Month: Importins PUBMED:.

    \ \ ' '8222' 'IPR013599' '\

    This family comprises sequences that are similar to human TRAM1 (). This is a transmembrane protein of the endoplasmic reticulum, thought to be involved in the membrane transfer of secretory proteins PUBMED:1315422. The region featured in this family is found N-terminal to the longevity-assurance protein region ().

    \ ' '8223' 'IPR013600' '\

    The sequences making up this entry are annotated as, or are similar to, Ly49 receptors (e.g. ). These are type II transmembrane receptors expressed by mouse natural killer (NK) cells. They are classified as being activating (e.g. Ly49D and H) or inhibitory (e.g. Ly49A and G), depending on their effect on NK cell function PUBMED:15607796. They are members of the C-type lectin receptor superfamily PUBMED:10925254, and in fact in many family members this region is found immediately N-terminal to a lectin C-type domain ().

    \ ' '8224' 'IPR013601' '\

    This domain is found in proteins that are described as 3-ketoacyl-CoA synthases, type III polyketide synthases, fatty acid elongases and fatty acid condensing enzymes, and are found in both prokaryotic and eukaryotic (mainly plant) species. The region contains the active site residues, as well as motifs involved in substrate binding PUBMED:12139488.

    \ ' '8225' 'IPR013602' '\

    Dyneins are described as motor proteins of eukaryotic cells, as they can convert energy derived from the hydrolysis of ATP to force and movement along cytoskeletal polymers, such as microtubules. Dyneins generally contain one to three heavy chains, where each heavy chain consists of a C-terminal globular head, a flexible microtubule-binding stalk, and a flexible N-terminal tail known as the cargo-binding domain PUBMED:15661525. The two categories of dyneins are the axonemal dyneins, which produce the bending motions that propagate along cilia and flagella, and the cytosolic dyneins, which drive a variety of fundamental cellular processes including nuclear migration, organization of the mitotic spindle, chromosome separation during mitosis, and the positioning and function of many intracellular organelles. Cytoplasmic dyneins contain several accessory subunits ranging from light to intermediate chains.

    \ \

    This entry represents a region found C-terminal to the dynein heavy chain N-terminal region 1 in many members of this family. No functions seem to have been attributed specifically to this region.

    \ ' '8226' 'IPR013603' '\

    This region is found in the C terminus of a number of archaeal transcriptional regulators. It is thought to function as a metal-sensing regulatory module PUBMED:12713899.

    \ ' '8227' 'IPR013604' '\

    This family includes a number of gustatory and odorant receptors mainly from insect species such as Anopheles gambiae (African malaria mosquito) and Drosophila melanogaster (Fruit fly). They are classified as G-protein-coupled receptors (GPCRs), or seven-transmembrane receptors. They show high sequence divergence, consistent with an ancient origin for the family PUBMED:14608037, PUBMED:12364795.

    \ ' '8228' 'IPR013605' '\

    The Tx1 family lethal spider neurotoxin induces excitatory symptoms in mice PUBMED:8340362.

    \ ' '8229' 'IPR013606' '\

    The IMD (IRSp53 and MIM (missing in metastases) homology) domain is a BAR-like domain of approximately 250 amino acids found at the N terminus of the insulin receptor tyrosine kinase substrate p53 (IRSp53) and in the evolutionarily related IRSp53/MIM family. In IRSp53, a ubiquitous regulator of the actin cytoskeleton, the IMD domain acts as a conserved F-actin bundling domain involved in filopodium formation. Filopodium-inducing IMD activity is regulated by Cdc42 and Rac1 (Rho-family GTPases) and is SH3-independent PUBMED:14752106, PUBMED:17430976, PUBMED:15635447. The IRSp53/MIM family is a novel F-actin bundling protein family that includes invertebrate relatives:

    \

    \

    The vertebrate IRSp53/MIM family is divided into two major groups: the IRSp53 subfamily and the MIM/ABBA subfamily. The putative invertebrate homologues are positioned between them. The IRSp53 subfamily members contain an SH3 domain, and the MIM/ABBA subfamily proteins contain a WH2 (WASP-homology 2) domain. The vertebrate SH3-containing subfamily is further divided into three groups according to the presence or absence of the WWB and the half-CRIB motif. The IMD domain can bind to and bundle actin filaments, bind to membranes and interact with the small GTPase Rac PUBMED:14752106, PUBMED:17497115.

    \

    The IMD domain folds as a coiled coil of three extended alpha-helices and a shorter C-terminal helix. Helix 4 packs tightly against the other three helices, and thus represents an integral part of the domain. The fold of the IMD domain closely resembles that of the BAR (Bin-Amphiphysin-RVS) domain, a functional module serving both as a sensor and inducer of membrane curvature PUBMED:15635447. The WH2 domain performs a scaffolding function PUBMED:17292833.

    \ ' '8230' 'IPR013607' '\

    This is the N-terminal region of the Parvovirus VP1 coat protein PUBMED:9927584. Also see Parvovirus coat protein VP2.

    \ ' '8231' 'IPR013608' '\

    This domain is found at the N terminus of proteins containing von Willebrand factor type A (VWA) and Cache domains. It has been found in vertebrates, Drosophila melanogaster (Fruit fly) and Caenorhabditis elegans but has not yet been identified in other eukaryotes. It is probably involved in the function of some voltage-dependent calcium channel subunits PUBMED:11487633.

    \ ' '8232' 'IPR013609' '\

    This domain is found at the N terminus of prophage tail fibre proteins.

    \ ' '8233' 'IPR013610' '\

    This region is found in a number of bacterial hypothetical proteins. Some members are annotated as being similar to replication primases, and in fact this region is often found together with the Toprim domain.

    \ ' '8234' 'IPR013611' '\

    The TOBE domain PUBMED:10829230 (Transport-associated OB) always occurs as a dimer, as the C-terminal strand of each domain is supplied by the partner. It is probably involved in the recognition of small ligands such as molybdenum and sulphate. It is found in ABC transporters immediately after the ATPase domain. A strong RPE motif is found at the presumed N terminus of the domain.

    \ ' '8235' 'IPR013612' '\

    Amino acid permeases are integral membrane proteins involved in the transport of amino acids into the cell. A number of such proteins have been found to be evolutionarily related PUBMED:3146645, PUBMED:2687114, PUBMED:8382989. These proteins appear to contain up to 12 transmembrane segments. The best conserved region in this family is located in the second transmembrane segment. This domain is found to the N terminus of the amino acid permease domain in metazoan Na-K-Cl cotransporters.

    \ ' '8236' 'IPR013613' '\

    This domain is found at the N terminus of P74 occlusion-derived virus (ODV) envelope proteins which are required for oral infectivity. The envelope proteins are found in baculoviruses which are insect pathogens. The C terminus of P74 is anchored to the membrane whereas the N terminus is exposed to the virion surface. Furthermore, P74 is unusual for a virus envelope protein as it lacks an N-terminal localisation signal sequence PUBMED:15914841.

    \ ' '8237' 'IPR013614' '\

    This domain is found at the N terminus of non-structural viral polyproteins of the family Caliciviridae.

    \ ' '8238' 'IPR013615' '\

    This domain is found at the C terminus of proteins of the CbbQ/NirQ/NorQ family, which play a role in the post-translational activation of Rubisco PUBMED:10548510. It is also found in the Thauera aromatica TutH protein, which is similar to the CbbQ/NirQ/NorQ family PUBMED:10698784, as well as in putative chaperones. The ATPase domain associated with various cellular activities (AAA) is found in the same bacterial and archaeal proteins as the domain described here.

    \ ' '8239' 'IPR013616' '\

    This is the N-terminal domain of Chitin synthase.

    \ ' '8240' 'IPR013617' '\

    This viral domain is found between the exonuclease domain of the DNA polymerase family B and the domain, connecting the two.

    \ ' '8241' 'IPR013618' '\

    This domain of unknown function is found in various hypothetical metazoan proteins.

    \ ' '8242' 'IPR013619' '\

    This domain of unknown function is found at the N terminus of bacterial and viral hypothetical proteins.

    \ ' '8243' 'IPR013620' '\

    This bacterial domain is found at the C terminus of exodeoxyribonuclease I/Exonuclease I, which is a single-strand specific DNA nuclease affecting recombination and expression pathways. The exonuclease I protein in Escherichia coli is associated with DNA deoxyribophosphodiesterase (dRPase) PUBMED:1329027.

    \ ' '8244' 'IPR013621' '\

    This metazoan domain is found to the N terminus of in voltage- and cyclic nucleotide-gated K/Na ion channels.

    \ ' '8245' 'IPR013622' '\

    This domain is found to the C terminus of Calcineurin-like phosphoesterase domains in cAMP phosphodiesterases and the homologous Icc proteins PUBMED:9721275.

    \ \

    YfcE, from Escherichia coli, belongs to a conserved protein family within the calcineurin-like phosphoesterase superfamily that is widely distributed in bacteria and archaea PUBMED:17586769. The B-subunits of replicative DNA polymerases also belong to the superfamily of calcineurin-like phosphoesterases and are conserved from Archaea to humans. The high similarity of B-subunit sequences implies a common fold; however, the key catalytic and metal binding residues of the phosphoesterase domain are disrupted in the eukaryotic beta-subunits.

    \ \

    The C-terminal part of the protein contains both the calcineurin-like phosphoesterase domain and the OB-fold, which are sufficient for its nuclease activity PUBMED:15979340, PUBMED:15121900. Calcineurin is also the conserved target of the immunosuppressants cyclosporin A and FK506 PUBMED:10899116.

    \ ' '8246' 'IPR013623' '\

    This domain is found in plant proteins such as respiratory burst NADPH oxidase proteins which produce reactive oxygen species as a defence mechanism. It tends to occur to the N terminus of an EF-hand, which suggests a direct regulatory effect of Ca2+ on the activity of the NADPH oxidase in plants PUBMED:9628030.

    \ ' '8247' 'IPR013624' '\

    This domain is found in bacterial non-ribosomal peptide synthetases (NRPS). NRPS are megaenzymes organised as iterative modules, one for each amino acid to be built into the peptide product PUBMED:11342140. NRPS modules are involved in epothilone biosynthesis (EpoB), myxothiazol biosynthesis (MtaC and MtaD), and other functions PUBMED:15595851. The NRPS domain tends to be found together with the condensation domain and the phosphopantetheine binding domain.

    \ ' '8248' 'IPR013625' '\

    The phosphotyrosine-binding domain (PTB, also phosphotyrosine-interaction or PI domain) of tensin tends to be found at the C terminus of a protein. Tensin is a multi-domain protein that binds to actin filaments and functions as a focal-adhesion molecule (focal adhesions are regions of plasma membrane through which cells attach to the extracellular matrix). Human tensin has actin-binding sites, an SH2 domain and a region similar to the tumour suppressor PTEN PUBMED:11023826. The PTB domain interacts with the cytoplasmic tails of beta integrin by binding to an NPXY motif PUBMED:14592531.

    \ ' '8249' 'IPR013626' '\

    This domain is found in bacterial and plant proteins to the C terminus of a Rieske 2Fe-2S domain. One of the proteins the domain is found in is Pheophorbide a oxygenase (PaO) which seems to be a key regulator of chlorophyll catabolism. Arabidopsis PaO (AtPaO) is a Rieske-type 2Fe-2S enzyme that is identical to Arabidopsis accelerated cell death 1 and homologous to lethal leaf spot 1 (LLS1) of maize PUBMED:14657372, in which the domain described here is also found.

    \ ' '8250' 'IPR013627' '\

    This is the eukaryotic DNA polymerase alpha subunit B N-terminal domain which is involved in complex formation PUBMED:8223465.

    \ ' '8252' 'IPR013629' '\

    This domain is found to the C terminus of C4 type zinc fingers in metazoan steroid/thyroid hormone receptors. Proteins in this family include the chicken ovalbumin upstream promoter transcription factor (COUP-TF, also known as NR2F) which functions as a transcriptional regulator PUBMED:1820218 and plays a major role in the development of the nervous system PUBMED:11784326.

    \ ' '8253' 'IPR013630' '\

    This domain is found at the N terminus of bacterial methyltransferases.

    \ ' '8255' 'IPR013632' '\

    This domain is found at the C terminus of the DNA repair and recombination protein Rad51. It is critical for DNA binding PUBMED:15908697. Rad51 is a homologue of the bacterial RecA protein. Rad51 and RecA share a core ATPase domain.

    \ ' '8256' 'IPR013633' '\

    This domain is found in eukaryotic proteins of unknown function.

    \ ' '8258' 'IPR013635' '\

    ICE2 is a fungal ER protein which has been shown to play an important role in forming/maintaining the cortical ER PUBMED:15585575. It has also been identified as a protein which is necessary for nuclear inner membrane targeting PUBMED:15911569.

    \ ' '8259' 'IPR013636' '\

    This is a eukaryotic domain of unknown function.

    \ ' '8260' 'IPR012706' '\

    This entry represents a region of about 79 amino acids found tandemly repeated up to fourteen times within the proteins that contain it. The repeats lack cysteines and are highly conserved, even at the DNA level, within and between proteins PUBMED:8702550. Proteins containing these repeats include the Rib and alpha surface antigens of group B Streptococcus, Esp of Enterococcus faecalis (Streptococcus faecalis), and related proteins of Lactobacillus. Most members of this protein family also have the cell wall anchor motif, LPXTG, shared by many staphylococcal and streptococcal surface antigens. These repeats are thought to define protective epitopes and may play a role in generating phenotypic and genotypic variation PUBMED:1438195.

    \ ' '8261' 'IPR013637' '\

    This domain shows similarity to the central region of PLU-1. This is a nuclear protein that may have a role in DNA-binding and transcription, and is closely associated with the malignant phenotype of breast cancer PUBMED:10336460. This region is found in various other Jumonji/ARID domain-containing proteins.

    \ ' '8262' 'IPR013638' '\

    The region described in this entry is found towards the N terminus of various eukaryotic fork head/HNF-3-related transcription factors (which contain the domain). These proteins play key roles in embryogenesis, maintenance of differentiated cell states, and tumorigenesis PUBMED:8817449.

    \ ' '8264' 'IPR013640' '\

    This is a family of fungal proteins of unknown function.

    \ ' '8265' 'IPR013641' '\

    This is a family of chromatin associated proteins which interact with the Elongator complex, a component of the elongating form of RNA polymerase II PUBMED:15772087. The Elongator complex has histone acetyltransferase activity.

    \ ' '8266' 'IPR013642' '\

    The CLCA family of calcium-activated chloride channels has been identified in many epithelial and endothelial cell types as well as in smooth muscle cells PUBMED:11896056 and has four or five putative transmembrane regions. In addition to their role as chloride channels, some CLCA proteins function as adhesion molecules and may also have roles as tumour suppressors PUBMED:15284223. The domain described here is found at the N terminus of CLCAs.

    \ ' '8267' 'IPR013643' '\

    Bovine calicivirus is a positive-stranded ssRNA virus that causes gastroenteritis PUBMED:1840711. The calicivirus genome contains two open reading frames, ORF1 and ORF2 PUBMED:8892921, PUBMED:8642693. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses PUBMED:8892921, PUBMED:1551442. ORF2 encodes a structural protein PUBMED:8892921. This signature finds ORF2, the structural coat protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs.

    \ \

    Rabbit hemorrhagic disease virus (RHDV), which causes a highly contagious disease of wild and domestic rabbits, belongs to the family Caliciviridae PUBMED:16733562. The capsid protein self-assembles to form an icosahedral capsid with T=3 symmetry. It is about 38 nm in diameter and consists of 180 capsid proteins. The capsid encapsulates the genomic RNA and VP2 proteins and attaches the virion to target cells by binding histo-blood group antigens present on gastroduodenal epithelial cells. The Shell domain (S domain) contains elements essential for the formation of the icosahedron. The Protruding domain (P domain) is divided into sub-domains P1 and P2. A hypervariable region in P2 is thought to play an important role in receptor binding and immune reactivity.

    \ \

    This is the calicivirus coat protein C-terminal region.

    \ ' '8268' 'IPR013644' '\ 1-deoxy-D-xylulose 5-phosphate reductoisomerase synthesises 2-C-methyl-D-erythritol 4-phosphate from 1-deoxy-D-xylulose 5-phosphate in a single step by intramolecular rearrangement and reduction and is responsible for terpenoid biosynthesis in some organisms. In Arabidopsis thaliana, 1-deoxy-D-xylulose 5-phosphate reductoisomerase is the first committed enzyme of the non-mevalonate pathway for isoprenoid biosynthesis. The enzyme requires Mn2+, Co2+ or Mg2+ for activity, with the first being most effective.\

    This domain is found to the C terminus of domains in bacterial and plant 1-deoxy-D-xylulose 5-phosphate reductoisomerases.

    \ ' '8269' 'IPR013645' '\

    This domain is found at the C terminus of bacterial glucosyltransferase and galactosyltransferase proteins.

    \ ' '8270' 'IPR013646' '\

    This domain is found at the C terminus of in archaeal and eukaryotic GTP-binding proteins.

    \ ' '8271' 'IPR013647' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This domain is found towards the N-terminus of metallopeptidases belonging to MEROPS peptidase subfamily M3B (oligopeptidase F, clan MA). An example protein is Lactococcus lactis PepF PUBMED:7798200. The function of this N-terminal domain is unknown.

    \ ' '8272' 'IPR013648' '\

    This domain is found in polyproteins of the viral Potyviridae taxon.

    \ ' '8273' 'IPR013649' '\

    This domain is found in integrin alpha and integrin alpha precursors to the C terminus of a number of repeats and to the N terminus of the cytoplasmic region.

    \ ' '8274' 'IPR013650' '\

    The ATP-grasp superfamily currently includes 17 groups of enzymes, catalyzing ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule. They contribute predominantly to macromolecular synthesis. ATP hydrolysis is used to activate a substrate. For example, DD-ligase transfers phosphate from ATP to D-alanine in the first step of catalysis. In the second step, the resulting acylphosphate is attacked by a second D-alanine to produce a DD dipeptide following phosphate elimination PUBMED:7939684.

    \ \

    The ATP-grasp domain contains three conserved motifs, corresponding to the phosphate binding loop and the Mg(2+) binding site PUBMED:8804825. The fold is characterised by two alpha-beta subdomains that grasp the ATP molecule between them. Each subdomain provides a variable loop that forms a part of the active site, completed by regions of other domains not conserved between the various ATP-grasp enzymes PUBMED:7862655.

    \ \

    The ATP-grasp domain represented by this entry is found primarily in succinyl-CoA synthetases.

    \ ' '8275' 'IPR013651' '\

    This ATP-grasp domain is found in the ribosomal S6 modification enzyme RimK PUBMED:9416615. It has an unusual nucleotide-binding fold referred to as palmate, or ATP-grasp fold. This domain is found in a number of enzymes of known structure as well as in urea amidolyase, tubulin-tyrosine ligase, and three enzymes of purine biosynthesis.

    \ ' '8276' 'IPR013652' '\

    This entry represents mammalian-specific glycine N-acyltransferase (also called aralkyl acyl-CoA:amino acid N-acyltransferase). Mitochondrial acyltransferases catalyse the transfer of an acyl group from acyl-CoA to the N-terminus of glycine to produce N-acylglycine. These enzymes can conjugate a multitude of substrates to form a variety of N-acylglycines. The CoA derivatives of a number of aliphatic and aromatic acids, but not phenylacetyl-CoA or (indol-3-yl)acetyl-CoA, can act as donors PUBMED:10630424, PUBMED:8660675.

    \ ' '8277' 'IPR013653' '\

    Proteins in this entry have a conserved region similar to the C-terminal region of the Drosophila melanogaster (Fruit fly) hypothetical protein FR47. This protein has been found to consist of two N-acyltransferase-like domains swapped with the C-terminal strands.

    \ ' '8278' 'IPR013654' '\

    The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs PUBMED:15009198. The PAS fold appears in archaea, eubacteria and eukarya.

    \ ' '8279' 'IPR013655' '\

    The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs PUBMED:15009198. The PAS fold appears in archaea, eubacteria and eukarya. The PAS domain contains a sensory box, or S-box domain that occupies the central portion of the PAS domain but is more widely distributed. It is often tandemly repeated. Known prosthetic groups bound in the S-box domain include haem in the oxygen sensor FixL PUBMED:16681374, FAD in the redox potential sensor NifL PUBMED:16417511, and a 4-hydroxycinnamyl chromophore in photoactive yellow protein PUBMED:14979724. Proteins containing the domain often contain other regulatory domains such as response regulator or sensor histidine kinase domains. Other S-box proteins include phytochromes and the aryl hydrocarbon receptor nuclear translocator.

    \ \

    This domain has been found in the gene product of the madA gene of the filamentous zygomycete fungus Phycomyces blakesleeanus. It has been shown that madA encodes a blue-light photoreceptor for phototropism and other light responses. The gene is involved in the phototropic responses associated with sporangiophore growth; sporangiophores exhibit phototropism by bending toward near-UV and blue wavelengths and away from far-UV wavelengths in a manner that is physiologically similar to plant phototropic responses PUBMED:16537433.

    \ ' '8280' 'IPR013656' '\

    The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs PUBMED:15009198. The PAS fold appears in archaea, eubacteria and eukarya.

    \ ' '8281' 'IPR013657' '\

    This family includes transporters with a specificity for UDP-N-acetylglucosamine PUBMED:11432728.

    \ ' '8282' 'IPR013658' '\

    This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. These proteins include various enzymes, such as senescence marker protein 30 (SMP-30), gluconolactonase and luciferin-regenerating enzyme (LRE). SMP-30 is known to hydrolyse diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 and LRE.

    \ ' '8283' 'IPR013659' '\

    This domain is found to the N terminus of the Adenosine/AMP deaminase domain in metazoan proteins such as the Cat eye syndrome critical region protein 1 and its homologues.

    \ ' '8284' 'IPR013660' '\

    This domain is found in viral DNA polymerases to the N terminus of DNA polymerase family B exonuclease domains.

    \ ' '8285' 'IPR013661' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This domain is found in microbial collagenase metalloproteases to the N terminus of . Proteins containing this domain belong to MEROPS peptidase family M9, subfamilies M9A and M9B (microbial collagenase, clan MA(E)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH PUBMED:7674922.

    \ \

    Microbial collagenases have been identified from bacteria of both the Vibrio and Clostridium genera. Collagenase is used during bacterial attack to degrade the collagen barrier of the host during invasion. Vibrio bacteria are non-pathogenic, and are sometimes used in hospitals to remove dead tissue from burns and ulcers. Clostridium histolyticum is a pathogen that causes gas gangrene; nevertheless, the isolated collagenase has been used to treat bed sores. Collagen cleavage occurs at Xaa+Gly bonds in Vibrio bacteria and at Yaa+Gly bonds in Clostridium collagenases.

    \ \

    Analysis of the primary structure of the gene product from Clostridium perfringens has revealed that the enzyme is produced with a stretch of 86 residues that contain a putative signal sequence PUBMED:8282691. Within this stretch is found PLGP, an amino acid sequence typical of collagenase substrates. This sequence may thus be implicated in self-processing of the collagenase PUBMED:8282691.

    \ ' '8286' 'IPR013662' '\

    This eukaryotic domain is found in ryanodine receptors (RyR) and inositol 1,4,5-trisphosphate receptors (IP3R) which together form a superfamily of homotetrameric ligand-gated intracellular Ca2+ channels PUBMED:14516409. There seems to be no known function for this domain PUBMED:10664581. Also see the IP3-binding domain.

    \ ' '8287' 'IPR013663' '\

    This domain is found in bacterial proteins of the SWF/SNF/SWI helicase family to the N terminus of the SNF2 family N-terminal domain and together with the Helicase conserved C-terminal domain. The function of the domain is not clear PUBMED:9025290.

    \ ' '8288' 'IPR013664' '\

    This domain is found to the C terminus of the viral methyltransferase domain.

    \ ' '8289' 'IPR013665' '\

    This is a domain of fungal spindle pole body proteins that play a role in spindle body duplication. They contain binding sites for calmodulin-like proteins called centrins PUBMED:14504268 which are present in microtubule-organising centres.

    \ ' '8290' 'IPR013666' '\

    This domain describes a pleckstrin homology (PH)-like region found in several plant proteins of unknown function.

    \ ' '8291' 'IPR001162' '\

    During the process of Escherichia coli nucleotide excision repair, DNA damage recognition and processing are achieved by the action of the uvrA, uvrB, and uvrC gene products PUBMED:12034838. The UvrC proteins contain 4 conserved regions: a central region which interacts with UvrB (Uvr domain), a helix-hairpin-helix (HhH) domain important for 5-prime incision of damaged DNA, and the homology regions 1 and 2 of unknown function. UvrC homology region 2 is specific for UvrC proteins, whereas UvrC homology region 1 is also shared by a few other nucleases.

    \

    Proteins that contain the UvrC homology region 1 are listed below:

    \ \

    \

    \

    \ ' '8292' 'IPR013667' '\ SH3 (src Homology-3) domains are small protein modules containing approximately 50 amino acid residues PUBMED:15335710, PUBMED:11256992. They are found in a great variety of intracellular or membrane-associated proteins PUBMED:1639195, PUBMED:14731533, PUBMED:7531822, for example in a variety of proteins with enzymatic activity, in adaptor proteins that lack catalytic sequences and in cytoskeletal proteins, such as fodrin and yeast actin binding protein ABP-1. \

    The SH3 domain has a characteristic fold which consists of five or six beta-strands arranged as two tightly packed anti-parallel beta sheets. The linker regions may contain short helices. The surface of the SH3 domain bears a flat, hydrophobic ligand-binding pocket which consists of three shallow grooves defined by conserved aromatic residues in which the ligand adopts an extended left-handed helical arrangement. The ligand binds with low affinity but this may be enhanced by multiple interactions. The region bound by the SH3 domain is in all cases proline-rich and contains PXXP as a core-conserved binding motif. The function of SH3 domains is not well understood but they may mediate many diverse processes such as increasing local concentration of proteins, altering their subcellular location and mediating the assembly of large multiprotein complexes PUBMED:7953536.

    \

    The SH3 domain has been found in a number of different bacterial proteins including glycyl-glycine endopeptidase, bacteriocin and some hypothetical proteins.

    \ ' '8293' 'IPR013669' '\

    This domain is found to the C terminus of the domain in Carmoviruses. The coat protein of the Turnip crinkle virus (TCV; Tombusviridae) is a suppressor of RNA silencing and is required for cell to cell movement in its host PUBMED:18533829. The plant cellular trafficking machinery could hijack functional viral proteins to permit cell-to-cell movement of RNA silencing PUBMED:18515824. The 3\'-proximal coat protein is coded by ORF4 PUBMED:17657600.

    \ ' '8294' 'IPR013670' '\

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    Type I restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. Type I enzymes have three different subunits - M (modification), S (specificity) and R (restriction) - that form multifunctional enzymes with restriction (), methylase () and ATPase activities PUBMED:15121719, PUBMED:12595133. The S subunit is required for both restriction and modification and is responsible for recognition of the DNA sequence specific for the system. The M subunit is necessary for modification, and the R subunit is required for restriction. These enzymes use S-Adenosyl-L-methionine (AdoMet) as the methyl group donor in the methylation reaction, and have a requirement for ATP. They recognise asymmetric DNA sequences split into two domains of specific sequence, one 3-4 bp long and another 4-5 bp long, separated by a nonspecific spacer 6-8 bp in length. Cleavage occurs a considerable distance from the recognition sites, rarely less than 400 bp away and up to 7000 bp away. Adenosyl residues are methylated, one on each strand of the recognition sequence. These enzymes are widespread in eubacteria and archaea. In enteric bacteria they have been subdivided into four families: types IA, IB, IC and ID.

    \

    Type III restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. Type III enzymes are hetero-oligomeric, multifunctional proteins composed of two subunits, Res and Mod. The Mod subunit recognises the DNA sequence specific for the system and is a modification methyltransferase; as such it is functionally equivalent to the M and S subunits of type I restriction endonuclease. Res is required for restriction, although it has no enzymatic activity on its own. Type III enzymes recognise short 5-6 bp long asymmetric DNA sequences and cleave 25-27 bp downstream to leave short, single-stranded 5\' protrusions. They require the presence of two inversely oriented unmethylated recognition sites for restriction to occur. These enzymes methylate only one strand of the DNA, at the N-6 position of adenosyl residues, so newly replicated DNA will have only one strand methylated, which is sufficient to protect against restriction. Type III enzymes belong to the beta-subfamily of N6 adenine methyltransferases, containing the nine motifs that characterise this family, including motif I, the AdoMet binding pocket (FXGXG), and motif IV, the catalytic region (S/D/N (PP) Y/F) PUBMED:15121719, PUBMED:12595133.

    \

    This entry represents the C-terminal domain found in both the R subunit of type I enzymes and the Res subunit of type III enzymes. The type I enzyme represented is EcoEI, which recognises 5\'-GAGN(7)ATGC-3\'; the R protein (HsdR) is required for both nuclease and ATPase activity PUBMED:8412658, PUBMED:10449767.

    \ ' '8295' 'IPR013671' '\

    This domain is found in replication initiator (Rep) associated proteins such as AC5 in the Geminivirus/Begomovirus.

    \ ' '8296' 'IPR013672' '\

    This domain is found towards the C terminus of Herpesvirus thymidine kinases.

    \ ' '8297' 'IPR013673' '\

    Inwardly-rectifying potassium channels (Kir) are the principal class of two-TM domain potassium channels. They are characterised by the property of inward-rectification, which is described as the ability to allow large inward currents and smaller outward currents. Inwardly rectifying potassium channels (Kir) are responsible for regulating diverse processes including: cellular excitability, vascular tone, heart rate, renal salt flow, and insulin release PUBMED:10102275. To date, around twenty members of this superfamily have been cloned, which can be grouped into six families by sequence similarity, and these are designated Kir1.x-6.x PUBMED:7580148, PUBMED:10449331.

    \

    Cloned Kir channel cDNAs encode proteins of ~370-500 residues; both N- and C-termini are thought to be cytoplasmic, and the N-terminus lacks a signal sequence. Kir channel alpha subunits possess only 2TM domains linked with a P-domain. Thus, Kir channels share similarity with the fifth and sixth domains, and the P-domain, of the other families. It is thought that four Kir subunits assemble to form a tetrameric channel complex, which may be hetero- or homomeric PUBMED:10102275.

    \

    Potassium channels are the most diverse group of the ion channel family PUBMED:1772658, PUBMED:1879548. They are important in shaping the action potential, and in neuronal excitability and plasticity PUBMED:2451788. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups PUBMED:2555158: the practically non-inactivating \'delayed\' group and the rapidly inactivating \'transient\' group.

    \

    These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers PUBMED:2448635. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes PUBMED:1373731. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis PUBMED:11178249.

    \

    All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK) PUBMED:11178249, PUBMED:. The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.

    \

    This metazoan domain is found to the N terminus of the domain in Inward rectifier potassium channels (KIR2 or IRK2).

    \ ' '8298' 'IPR013674' '\

    This domain is found in RNA-dependent RNA polymerase P1-P2 fusion/replicase proteins in plant Luteoviruses.

    \ ' '8299' 'IPR013675' '\

    This domain is found to the N terminus of the methyltransferase small domain () in bacterial proteins PUBMED:9873033.

    \ ' '8300' 'IPR013676' '\

    This viral domain is found to the C terminus of Poxvirus nucleoside triphosphatase phosphohydrolase I (NPH I) PUBMED:1850911 together with the helicase conserved C-terminal domain ().

    \ ' '8301' 'IPR013677' '\

    Bacteria of the Clostridium genus produce protein neurotoxins, which are complexes consisting of neurotoxin (NT), haemagglutinin (HA), nontoxic nonhaemagglutinin (NTNH), and RNA PUBMED:11233171, PUBMED:11595633. The domain described here is found at the C terminus of the NTNH component.

    \ ' '8302' 'IPR013678' '\

    This domain is found to the N terminus of the ribonucleotide reductase barrel domain (). It occurs in bacterial class II ribonucleotide reductase proteins, which depend upon coenzyme B12 (deoxyadenosylcobalamin) PUBMED:11832503.

    \ ' '8303' 'IPR013679' '\

    This is the sucrose-6-phosphate phosphohydrolase (S6PP or SPP) C-terminal domain PUBMED:11050182 as found in plant sucrose phosphatases. These enzymes irreversibly catalyse the last step in sucrose synthesis, following the formation of sucrose-6-phosphate via sucrose-phosphate synthase (SPS).

    \ ' '8304' 'IPR013680' '\

    Ca2+ ions are unique in that they not only carry charge but they are also the most widely used of diffusible second messengers. Voltage-dependent Ca2+ channels (VDCC) are a family of molecules that allow cells to couple electrical activity to intracellular Ca2+ signalling. The opening and closing of these channels by depolarizing stimuli, such as action potentials, allows Ca2+ ions to enter neurons down a steep electrochemical gradient, producing transient intracellular Ca2+ signals. Many of the processes that occur in neurons, including transmitter release, gene transcription and metabolism are controlled by Ca2+ influx occurring simultaneously at different cellular locales. The pore is formed by the alpha-1 subunit which incorporates the conduction pore, the voltage sensor and gating apparatus, and the known sites of channel regulation by second messengers, drugs, and toxins PUBMED:14657414. The activity of this pore is modulated by 4 tightly-coupled subunits: an intracellular beta subunit; a transmembrane gamma subunit; and a disulphide-linked complex of alpha-2 and delta subunits, which are proteolytically cleaved from the same gene product. Properties of the protein including gating voltage-dependence, G protein modulation and kinase susceptibility can be influenced by these subunits.

    \ \

    Voltage-gated calcium channels are classified as T, L, N, P, Q and R, and are distinguished by their sensitivity to pharmacological blockers, single-channel conductance kinetics, and voltage-dependence. On the basis of their voltage activation properties, the voltage-gated calcium channel classes can be further divided into two broad groups: the low (T-type) and high (L, N, P, Q and R-type) threshold-activated channels PUBMED:.

    \

    This eukaryotic domain has been found in the neuronal voltage-dependent calcium channel (VGCC) alpha 2a, 2c, and 2d subunits. It is also found in other calcium channel alpha-2/delta subunits to the N terminus of a Cache domain ().

    \ ' '8305' 'IPR013681' '\

    This domain is found in the myelin transcription factor 1 (MYT1) of chordates. MYT1 contains C2HC zinc finger domains () and is expressed in developing neurons of the central nervous system PUBMED:9373037 where it is involved in the selection of neuronal precursor cells PUBMED:8980226.

    \ ' '8306' 'IPR013668' '\

    This domain is found at the amino terminus of Ribonuclease R and a number of presumed transcriptional regulatory proteins from archaea.

    \ ' '8307' 'IPR013682' '\

    This domain is found in Baculoviridae including the nucleopolyhedrovirus at the N terminus of the viral capsid protein 91 (VP91) PUBMED:11602755.

    \ ' '8308' 'IPR013683' '\

    This domain is found at the N terminus of the viral protein D10 (VD10) and the related MutT motif proteins PUBMED:9847390. The VD10 protein is probably essential for virus replication PUBMED:2177083 and is often found to the N terminus of a domain.

    \ ' '8309' 'IPR013684' '\

    Mitochondrial Rho proteins (Miro-1 and Miro-2) are atypical Rho GTPases. They have a unique domain organisation, with tandem GTP-binding domains and two EF hand domains () that may bind calcium. They are also larger than classical small GTPases. It has been proposed that they are involved in mitochondrial homeostasis and apoptosis PUBMED:12482879.

    \ ' '8310' 'IPR013685' '\

    FtsQ/DivIB bacterial division proteins () contain an N-terminal POTRA domain (for polypeptide-transport-associated domain). This is found in different types of proteins, usually associated with a transmembrane beta-barrel. FtsQ/DivIB may have chaperone-like roles, which has also been postulated for the POTRA domain in other contexts PUBMED:14559180.

    \ ' '8311' 'IPR013686' '\

    The POTRA domain (for polypeptide-transport-associated domain) is found towards the N terminus of ShlB family proteins (). ShlB is important in the secretion and activation of the haemolysin ShlA. It has been postulated that the POTRA domain has a chaperone-like function over ShlA; it may fold back into the C-terminal beta-barrel channel PUBMED:14559180.

    \ ' '8312' 'IPR013687' '\

    The members of this family are disaggregatases and several hypothetical proteins of the archaeal genus Methanosarcina. Disaggregatases cause aggregates to separate into single cells PUBMED:2082820 and contain parallel beta-helix repeats.

    \ ' '8313' 'IPR013688' '\

    This repeat is found in a number of Streptococcus proteins including some hypothetical proteins and Bsp. Bsp is a protein of group B Streptococcus (GBS) which might control cell morphology PUBMED:12368458.

    \ ' '8314' 'IPR013689' '\

    This domain is found near the C terminus of bacterial ATP-dependent helicases such as HrpB.

    \ ' '8315' 'IPR013690' '\

    This bacterial domain is found to the N terminus of the -like ATP binding domain in proteins which are putative transposase subunits PUBMED:7698671.

    \ ' '8316' 'IPR013691' '\

    This domain is found in bacterial C-methyltransferase proteins, often together with other methyltransferase domains such as or .

    \ ' '8317' 'IPR013692' '\

    This domain is found to the C terminus of the domain in bacterial polysaccharide biosynthesis enzymes including the capsule protein CapD PUBMED:7961465 and several putative epimerases/dehydratases.

    \ ' '8318' 'IPR013693' '\

    This domain is found in the stage II sporulation protein SpoIID. SpoIID is necessary for membrane migration as well as for some of the earlier steps in engulfment during bacterial endospore formation PUBMED:12502745. The domain is also found in amidase enhancer proteins. Amidases, like SpoIID, are cell wall hydrolases PUBMED:10961456.

    \ ' '8319' 'IPR013694' '\

    Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition PUBMED:14744536. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N terminus of a von Willebrand factor type A domain () in ITI heavy chains (ITIHs) and their precursors.

    \ ' '8320' 'IPR013695' '\

    This domain is found together with the eukaryotic protein kinase domain in plant wall-associated kinases (WAKs) and related proteins. WAKs are serine-threonine kinases which might be involved in signalling to the cytoplasm and are required for cell expansion PUBMED:11544019.

    \ ' '8321' 'IPR013696' '\

    This domain of unknown function is found in many hypothetical proteins and predicted DNA-binding proteins such as transcription-associated proteins. It is found in bacteria and archaea.

    \ ' '8322' 'IPR013697' '\

    This domain is found on the epsilon catalytic subunit of DNA polymerase. It is found C-terminal to and .

    \ ' '8323' 'IPR013698' '\

    This domain is found in squalene epoxidase (SE) and related proteins which are found in taxonomically diverse groups of eukaryotes and also in bacteria. SE was first cloned from Saccharomyces cerevisiae (Baker\'s yeast) where it was named ERG1. It contains a putative FAD binding site and is a key enzyme in the sterol biosynthetic pathway PUBMED:9161422. Putative transmembrane regions are found towards the protein\'s C terminus.

    \ ' '8324' 'IPR013699' '\

    The signal recognition particle (SRP) is a multimeric protein which, along with its cognate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5\' and 3\' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    \

    This entry represents the RNA binding domain of the SRP72 subunit. This domain is responsible for the binding of SRP72 to the 7S SRP RNA PUBMED:15588816.

    \ ' '8325' 'IPR013700' '\ Aflatoxins belong to a family of decaketides that are produced as secondary metabolites by Aspergillus flavus and Aspergillus parasiticus PUBMED:8074521. The aflatoxin biosynthetic pathway involves several enzymatic steps that appear to be regulated by the aflR genes in A. flavus and A. parasiticus. AflR encodes a protein that contains a cysteine-rich motif. Several fungal transcriptional activator proteins contain this motif, which binds DNA in a zinc-dependent fashion (occurs in fungal transcriptional regulatory proteins) PUBMED:1557122, PUBMED:2107541.\

    This domain is found in the aflatoxin regulatory protein (AflR) which is involved in the regulation of the biosynthesis of aflatoxin in the fungal genus Aspergillus PUBMED:9758790. It occurs together with the fungal Zn(2)-Cys(6) binuclear cluster domain ().

    \ ' '8326' 'IPR013701' '\

    This domain is found in ATP-dependent helicases as well as a number of hypothetical proteins together with the helicase conserved C-terminal domain () and the domain.

    \ ' '8327' 'IPR013702' '\

    The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggests that FIST domains bind small ligands, such as amino acids PUBMED:17855421.

    \ ' '8328' 'IPR013703' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This domain is found to the N terminus of bacterial signal peptidases that belong to the MEROPS peptidase family S49 (protease IV family, clan SK) (see also ) PUBMED:15611110, PUBMED:15205439.

    \ ' '8329' 'IPR013704' '\

    This domain tends to occur to the N terminus of the domain in hypothetical bacterial proteins.

    \ ' '8330' 'IPR013705' '\

    This domain is found to the C terminus of a methyltransferase domain () in fungal and plant sterol methyltransferases PUBMED:9746350.

    \ ' '8331' 'IPR013706' '\

    The cyclic nucleotide phosphodiesterases (PDE) comprise a group of enzymes that degrade the phosphodiester bond in the second messenger molecules cAMP and cGMP. They are divided into 11 families. They regulate the localisation, duration and amplitude of cyclic nucleotide signalling within subcellular domains. PDEs are therefore important for signal transduction.

    \ \

    PDE enzymes are often targets for pharmacological inhibition due to their unique tissue distribution, structural properties, and functional properties. Inhibitors include: Roflumilast for chronic obstructive pulmonary disease and asthma PUBMED:18447606, Sildenafil for erectile dysfunction PUBMED:18367027 and Cilostazol for peripheral arterial occlusive disease PUBMED:18436153, amongst others.

    \ \

    Retinal 3\',5\'-cGMP phosphodiesterase is located in photoreceptor outer segments PUBMED:: it is light activated, playing a pivotal role in signal transduction. In rod cells, PDE is oligomeric, comprising an alpha-, a beta- and 2 gamma-subunits, while in cones, PDE is a homodimer of alpha chains, which are associated with several smaller subunits. Both rod and cone PDEs catalyse the hydrolysis of cAMP or cGMP to the corresponding nucleoside 5\' monophosphates, both enzymes also binding cGMP with high affinity. The cGMP-binding sites are located in the N-terminal half of the protein sequence, while the catalytic core resides in the C-terminal portion.

    \ \

    This domain is found to the N terminus of the calcium/calmodulin-dependent 3\',5\'-cyclic nucleotide phosphodiesterase domain ().

    \ ' '8332' 'IPR013707' '\

    Tombusviruses, which replicate in a wide range of plant hosts, do so with the help of viral replicase proteins, including the overlapping p33 and p92 proteins, which contain the domain described here PUBMED:15936051.

    \ ' '8333' 'IPR013708' '\

    This domain is the substrate-binding domain of shikimate dehydrogenase PUBMED:15735308. Shikimate dehydrogenase catalyses the fourth step of the mycobacterial shikimate pathway, which results in the biosynthesis of chorismate. Chorismate is a precursor of aromatic amino acids, naphthoquinones, menaquinones and mycobactins PUBMED:18260104, PUBMED:12637497. This pathway is an important target for antibacterial agents, especially against Mycobacterium tuberculosis, since it does not occur in mammals.

    \ ' '8334' 'IPR013709' '\

    This is the C-terminal regulatory (R) domain of alpha-isopropylmalate synthase, which catalyses the first committed step in the leucine biosynthetic pathway PUBMED:15159544. This domain is an internally duplicated structure with a novel fold PUBMED:15159544. It comprises two similar units that are arranged such that the two alpha-helices pack together in the centre, crossing at an angle of 34 degrees, sandwiched between the two three-stranded, antiparallel beta-sheets. The overall domain is thus constructed as a beta-alpha-beta three-layer sandwich PUBMED:15159544.

    \ ' '8335' 'IPR013710' '\

    This domain is found at the N terminus of tetrahydrodipicolinate N-succinyltransferase (DapD) which catalyses the acylation of L-2-amino-6-oxopimelate to 2-N-succinyl-6-oxopimelate in the meso-diaminopimelate/lysine biosynthetic pathway of bacteria, blue-green algae, and plants PUBMED:11910040. The N-terminal domain as defined here contains three alpha-helices and two twisted hairpin loops PUBMED:9012664.

    \ ' '8336' 'IPR013711' '\

    This domain lies to the C terminus of Runx-related transcription factors and homologous proteins (AML, CBF-alpha, PEBP2). Its function might be to interact with functional cofactors PUBMED:15713794.

    \ ' '8337' 'IPR013712' '\

    This is a family of coatomer-interacting proteins which are involved in Golgi to ER retrograde transport PUBMED:15958492, PUBMED:11493604.

    \ ' '8338' 'IPR013713' '\

    The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments.

    \

    This domain is found in exportin Cse1 (also known as importin-alpha re-exporter). Exportin Cse1 mediates nuclear transport of importin-alpha back into the cytosol, where importin-alpha functions as a transporter of proteins carrying nuclear localisation signals (NLS) from the cytoplasm into the nucleus PUBMED:15602554, PUBMED:15866177, PUBMED:17170104. This domain contains HEAT repeats.

    \


    \ \ ' '8339' 'IPR013714' '\

    Proteins in this family co-localise with COPI vesicle coat proteins PUBMED:14562095.

    \ ' '8340' 'IPR013715' '\

    This is a fungal domain of unknown function.

    \ ' '8341' 'IPR013716' '\

    Adenylate cyclase catalyses the conversion of ATP to 3\',5\'-cyclic AMP (cAMP) and pyrophosphate. It plays an essential role in the regulation of cellular metabolism by catalysing the synthesis of a second messenger, cAMP. G protein-mediated signalling is implicated in yeast and fungal cAMP pathways. The cAMP-PKA pathway consists of an extracellular ligand-sensitive G protein-coupled receptor, a G protein signal transmitter, and the effector adenylate cyclase. The product of adenylate cyclase, cAMP, acts as an intracellular second messenger PUBMED:16924114.

    \ \

    GTP-bound RAS2 is required to elicit magnesium-dependent adenylyl cyclase activity in Saccharomyces cerevisiae. In Schizosaccharomyces pombe, however, the cyclase is probably not regulated by RAS proteins, but is activated by git1.

    \ \

    In S. pombe, Gpa2 Galpha binds an N-terminal domain of adenylate cyclase, comprising a moderately conserved sequence, which is within a region that is poorly related to other fungal adenylate cyclases. Adenylate cyclase is directly activated by a fungal G protein, which suggests a distinct activation mechanism from that of mammals PUBMED:15831585.

    \ \

    This fungal domain interacts with the alpha subunit of heterotrimeric G proteins PUBMED:15831585.

    \ ' '8342' 'IPR013717' '\

    PIG-P (phosphatidylinositol N-acetylglucosaminyltransferase subunit P) is an enzyme involved in GPI anchor biosynthesis PUBMED:10944123.

    \ ' '8343' 'IPR013718' '\

    COQ9 is an enzyme that is required for the biosynthesis of coenzyme Q PUBMED:16027161. It may either catalyse a reaction in the coenzyme Q biosynthetic pathway or have a regulatory role.

    \ ' '8344' 'IPR013719' '\

    This is a domain of unknown function that is associated with a number of different protein families. It is found in Rtt106p, which is a histone chaperone involved in heterochromatin-mediated silencing PUBMED:16157874. It is also found in genes annotated as transcription factors/regulators.

    \ \

    This domain is the C-terminal domain of yeast Spt16p, which is a subunit of the heterodimeric yeast FACT complex (Spt16p-Pob3p) PUBMED:15987999. In addition, Spt16p and its relatives in this entry are described as non-peptidase homologues belonging to the MEROPS peptidase family M24. The FACT complex facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilising and then reassembling nucleosome structure PUBMED:12524332, PUBMED:12934006.

    \ \ \ ' '8345' 'IPR013720' '\

    The LisH motif is found in a large number of eukaryotic proteins from metazoa, fungi and plants that have a wide range of functions. The recently solved structure of the LisH domain in the N-terminal region of LIS1 showed it to be a novel dimerisation motif, and indicated that other structural elements are likely to play an important role in dimerisation PUBMED:15274919, PUBMED:16445939, PUBMED:16258276.

    \ \

    The LisH (lis homology) domain mediates protein dimerisation and tetramerisation. The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex PUBMED:16051270.

    \ ' '8346' 'IPR013721' '\

    STAG domain proteins are subunits of the cohesin complex, a protein complex required for sister chromatid cohesion in eukaryotes. The STAG domain is present in Schizosaccharomyces pombe (Fission yeast) mitotic cohesin Psc3, and the meiosis-specific cohesin Rec11. Many organisms express a meiosis-specific STAG protein; for example, mice and humans have a meiosis-specific variant called STAG3, although budding yeast does not have a meiosis-specific version PUBMED:16043696.

    \ ' '8347' 'IPR003605' '\

    Transforming growth factor beta (TGF-beta) is a member of a large family of secreted growth factors of central importance in eukaryotic development and homeostasis. Members of this family, which includes the activins, inhibins and bone morphogenetic proteins (BMPs), bind to receptors that consist of two transmembrane serine/threonine (Ser/Thr) kinases called the type I and type II receptors. Type II activates Type I upon formation of the ligand-receptor complex by multiply phosphorylating the GS domain, a short (~30 residues), highly conserved regulatory sequence just N-terminal to the kinase domain on the cytoplasmic side of the receptor. The GS domain is found only in the type I receptor family and is named for the TTSGSGSG sequence at its core. At least three, and perhaps four to five, of the serines and threonines in the GS domain must be phosphorylated to fully activate TbetaR-1 PUBMED:11583628.

    \ \

    The GS domain forms a helix-loop-helix structure in which the sites of activating phosphorylation are situated in a loop known as the GS loop. One key role for phosphorylation is to block the adoption of an inactivating configuration by the GS domain PUBMED:10025408.

    \ ' '8348' 'IPR006586' '\

    An ADAM is a transmembrane protein that contains a disintegrin and metalloprotease domain (MEROPS peptidase family M12B). All members of the ADAM family display a common domain organisation: a pro-domain, the metalloprotease, disintegrin, cysteine-rich, epidermal growth factor-like and transmembrane domains, and a C-terminal cytoplasmic tail. They possess four potential functions: proteolysis, cell adhesion, cell fusion, and cell signalling. ADAMs are membrane-anchored proteases that proteolytically modify the cell surface and extracellular matrix (ECM) in order to alter cell behaviour. They are responsible for the proteolytic cleavage of transmembrane proteins and release of their extracellular domain PUBMED:11193153, PUBMED:12514095.

    \ \

    The ADAM cysteine-rich domain is not found in plant, archaeal, bacterial or viral proteins. The cysteine-rich domain complements the binding capacity of the disintegrin domain, and perhaps imparts specificity to disintegrin domain-mediated interactions. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein\'s metalloprotease activity PUBMED:12460986.

    \ ' '8349' 'IPR013723' '\

    AXH is a protein-protein and RNA binding motif found in Ataxin-1 (ATX1) PUBMED:14583607. ATX1 is responsible for the autosomal-dominant neurodegenerative disorder spinocerebellar ataxia type 1 (SCA1) in humans. The AXH module has also been identified in the apparently unrelated transcription factor HBP1, which is thought to be involved in the architectural regulation of chromatin and in specific gene expression PUBMED:12965213.

    \ ' '8350' 'IPR013724' '\

    GIT proteins are signalling integrators with GTPase-activating function which may be involved in the organisation of the cytoskeletal matrix assembled at active zones (CAZ). The function of the CAZ might be to define sites of neurotransmitter release. Mutations in the Spa2 homology domain (SHD) of GIT1 described here interfere with the association of GIT1 with Piccolo, beta-PIX, and focal adhesion kinase PUBMED:12473661.

    \ ' '8351' 'IPR013725' '\

    This is the C-terminal domain of replication factor C, RFC1. RFC complexes hydrolyse ATP and load sliding clamps such as PCNA (proliferating cell nuclear antigen) onto double-stranded DNA. RFC1 is essential for RFC function in vivo PUBMED:16040599, PUBMED:9092549.

    \ ' '8352' 'IPR013726' '\

    This is a family of fungal proteins of unknown function.

    \ ' '8353' 'IPR013727' '\

    This domain is found towards the N terminus of bacterial two-component sensor kinases.

    \ ' '8354' 'IPR013728' '\

    This domain of unknown function is found in a number of Bacteroidetes proteins including acylhydrolases.

    \ ' '8355' 'IPR013729' '\

    This domain is found in the multiprotein bridging factor 1 (MBF1), which forms a heterodimer with MBF2. MBF1 has been shown to make direct contact with the TATA-box binding protein (TBP) and to interact with Ftz-F1, stabilising the Ftz-F1-DNA complex PUBMED:9207077. The domain is also found in the endothelial differentiation-related factor (EDF-1). Human EDF-1 is involved in the repression of endothelial differentiation, interacts with CaM and is phosphorylated by PKC PUBMED:11587857. The domain is found in proteins from a wide range of eukaryotes, including metazoans, fungi and plants. A helix-turn-helix motif () is found C-terminal to it.

    \ ' '8356' 'IPR013730' '\

    This is a family of fungal proteins that are involved in rRNA processing PUBMED:12837249. In a localisation study they were found to localise to the nucleus and nucleolus PUBMED:14562095.

    \ ' '8357' 'IPR013731' '\

    This domain is found in the Haemophilus influenzae opacity-associated protein (OapA). It is required for efficient nasopharyngeal mucosal colonisation, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation PUBMED:8559074, PUBMED:8830271. This motif occurs at the N terminus of these proteins. It contains a conserved histidine followed by a run of hydrophobic residues.

    \ \

    Many of the proteins in this entry are unassigned peptidases belonging to MEROPS peptidase family M23B.

    \ ' '8358' 'IPR013732' '\

    This entry represents the N-terminal non-catalytic domain of protein-arginine deiminase. This domain has a cupredoxin-like fold.

    \ ' '8359' 'IPR013733' '\

    This entry represents the central non-catalytic domain of protein-arginine deiminase. This domain has an immunoglobulin-like fold.

    \ ' '8360' 'IPR013734' '\

    This is a family of retinoblastoma related proteins.

    \ ' '8361' 'IPR013735' '\

    This entry represents the N-terminal RNA polymerase-binding domain of bacterial transcription factors such as NusA (N-utilising substance A). NusA is involved in transcriptional pausing, termination and anti-termination. NusA from Thermotoga maritima contains an N-terminal domain and three RNA-binding domains (one S1 domain and two KH domains). The N-terminal domain consists of a bifurcated coiled beta-sheet within an alpha/beta(3)/alpha/beta/alpha fold, which can be divided into two subdomains: a globular head and a helical body. The globular head subdomain may interact with RNA polymerase, while the helical body displays a similar structure to that of the helical domain in sigma70 PUBMED:14621988.

    \ ' '8362' 'IPR013736' '\

    This domain is found at the C-terminus of cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). The domain, which is a beta sandwich, is also found in serine peptidases belonging to MEROPS peptidase family S15: Xaa-Pro dipeptidyl-peptidases. Members of this entry that are not characterised as peptidases show extensive low-level similarity to the Xaa-Pro dipeptidyl-peptidases.

    \ ' '8363' 'IPR013737' '\

    This domain is found in bacterial rhamnosidase A and B enzymes and is probably involved in substrate recognition.

    \ ' '8364' 'IPR013738' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This is the non-catalytic domain B of beta-galactosidase enzymes belonging to the glycosyl hydrolase 42 family. This domain is related to glutamine amidotransferase enzymes, but the catalytic residues are replaced by non-functional amino acids. This domain is involved in trimerisation.

    \ ' '8365' 'IPR013739' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779, PUBMED:. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This domain is found at the C terminus of beta-galactosidase enzymes that belong to the glycosyl hydrolase 42 family PUBMED:12215416.

    \ ' '8367' 'IPR013741' '\

    This entry contains several KorB transcriptional repressor proteins. The korB gene is a major regulatory element in the replication and maintenance of broad host-range plasmid RK2. It negatively controls the replication gene trfA, the host-lethal determinants kilA and kilB, and the korA-korB operon PUBMED:3430606. This domain includes the DNA-binding HTH motif PUBMED:15170177.

    \ ' '8368' 'IPR013742' '\

    This is a family of plant transcription factors.

    \ ' '8369' 'IPR013743' '\

    NBP1 is a nuclear protein which has been shown in Saccharomyces cerevisiae (Baker\'s yeast) to be essential for the G2/M transition of the cell cycle.

    \ ' '8370' 'IPR013744' '\

    This is a plant and fungal family of unknown function. This family contains many hypothetical proteins, a number of which are classified as unassigned peptidases belonging to MEROPS peptidase family S9 (prolyl oligopeptidase family).

    \ ' '8371' 'IPR013745' '\

    HbrB is involved in hyphal growth and polarity PUBMED:14998529.

    \ ' '8372' 'IPR013746' '\

    Synonym(s): 3-hydroxy-3-methylglutaryl-coenzyme A synthase, HMG-CoA synthase.

    \ \

    Hydroxymethylglutaryl-CoA synthase () catalyses the condensation of acetyl-CoA with acetoacetyl-CoA to produce HMG-CoA and CoA, the second reaction in the mevalonate-dependent isoprenoid biosynthesis pathway. HMG-CoA synthase contains an important catalytic cysteine residue that acts as a nucleophile in the first step of the reaction: the acetylation of the enzyme by acetyl-CoA (its first substrate) to produce an acetyl-enzyme thioester, releasing the reduced coenzyme A. The subsequent nucleophilic attack on acetoacetyl-CoA (its second substrate) leads to the formation of HMG-CoA PUBMED:15498869.

    \

    HMG-CoA synthase occurs in eukaryotes, archaea and certain bacteria PUBMED:15546978. In vertebrates, there are two isozymes located in different subcellular compartments: a cytosolic form that is the starting point of the mevalonate pathway (leading to cholesterol and other sterolic and isoprenoid compounds), and a mitochondrial form responsible for ketone body biosynthesis. HMG-CoA synthase is also found in other eukaryotes such as insects, plants and fungi PUBMED:16640729. In bacteria, isoprenoid precursors are generally synthesised via an alternative, non-mevalonate pathway; however, a number of Gram-positive pathogens utilise a mevalonate pathway involving HMG-CoA synthase that is parallel to that found in eukaryotes PUBMED:17128980, PUBMED:16245942.

    \ \

    This entry represents the C-terminal domain of HMG-CoA synthase enzymes from both eukaryotes and prokaryotes.

    \ ' '8373' 'IPR013747' '\

    This domain is found in 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III, the enzyme responsible for initiating the chain of reactions of the fatty acid synthase in plants and bacteria.

    \ ' '8374' 'IPR013748' '\

    Replication factor C (RFC) is a heteropentameric AAA+ protein complex that loads the DNA polymerase processivity clamp PCNA (Proliferating Cell Nuclear Antigen) onto DNA using ATP to drive the reaction PUBMED:16980295. PCNA functions at multiple levels in directing DNA metabolic pathways PUBMED:15210332. When bound to DNA, PCNA organises various proteins involved in DNA replication, DNA repair, DNA modification, and chromatin modelling. Unbound PCNA promotes the localisation of replication factors.

    \

    Replication factor C consists of five subunits in a spiral arrangement: Rfc1, Rfc2, Rfc3, Rfc4, and Rfc5. Rfc1 and Rfc2 load the PCNA sliding clamp onto DNA, while Rfc3 binds ATP and also acts as a checkpoint sensor. The RFC complex contains four ATP sites (sites A, B, C, and D) located at subunit interfaces. In each ATP site, an arginine residue from one subunit is located near the gamma-phosphate of ATP bound in the adjacent subunit. These arginine residues act as "arginine fingers" that can potentially perform two functions: sensing that ATP is bound and catalyzing ATP hydrolysis PUBMED:16980295.

    \

    This entry represents the core domain found in all five RFC subunits.

    \ ' '8375' 'IPR013749' '\

    This enzyme is part of the thiamine pyrophosphate (TPP) synthesis pathway. TPP is an essential cofactor for many enzymes PUBMED:9244280.

    \ ' '8376' 'IPR013750' '\

    This domain is found in homoserine kinases (), galactokinases () and mevalonate kinases (). These kinases make up the GHMP kinase superfamily of ATP-dependent enzymes PUBMED:8382990. These enzymes are involved in the biosynthesis of isoprenes and amino acids as well as in carbohydrate metabolism. The C-terminal domain of homoserine kinase has a central alpha-beta plait fold and an insertion of four helices, which, together with the N-terminal fold, create a novel nucleotide binding fold PUBMED:11188689.

    \ ' '8377' 'IPR013751' '\

    Fatty acid synthesis (FAS) is a vital aspect of cellular physiology which can occur by two distinct pathways. The FAS I pathway, which generally only produces palmitate, is found in eukaryotes and is performed either by a single polypeptide which contains all the reaction centres needed to form a fatty acid, or by two polypeptides which interact to form a multifunctional complex. The FAS II pathway, which is capable of producing many different fatty acids, is found in mitochondria, bacteria, plants and parasites, and is performed by many distinct proteins, each of which catalyses a single step within the pathway. The large diversity of products generated by this pathway is possible because the acyl carrier protein (ACP) intermediates are diffusible entities that can be diverted into other biosynthetic pathways PUBMED:15952903.

    \ \

    3-Oxoacyl-[acyl carrier protein (ACP)] synthase III catalyses the first condensation step within the FAS II pathway, using acetyl-CoA as the primer and malonyl-ACP as the acceptor. The oxoacyl-ACP formed by this reaction subsequently enters the elongation cycle, where the acyl chain is progressively lengthened by the combined activities of several enzymes.

    \ \

    The enzymes studied so far are homodimers, where each monomer consists of two domains (N-terminal and C-terminal) which are similar in structure, but not in sequence PUBMED:11243824, PUBMED:12429097. This entry represents a conserved region within the N-terminal domain.

    \ \ \ ' '8378' 'IPR013752' '\

    This is the C-terminal domain of 2-dehydropantoate 2-reductases, also known as ketopantoate reductases. The reaction catalysed by this enzyme is: (R)-pantoate + NADP(+) = 2-dehydropantoate + NADPH. ApbA catalyses the NADPH-dependent reduction of ketopantoic acid to pantoic acid in the alternative pyrimidine biosynthetic (APB) pathway PUBMED:9721324. ApbA and PanE are allelic PUBMED:9721324. ApbA, the ketopantoate reductase enzyme, is required for the synthesis of thiamine via the APB biosynthetic pathway PUBMED:9488683.

    \ ' '8379' 'IPR013857' '\

    This protein is associated with mitochondrial complex I intermediate-associated protein 30 (CIA30) in human and mouse. It is also present in Schizosaccharomyces pombe (Fission yeast), which does not contain the NADH dehydrogenase component of complex I, or many of the other essential subunits. It is therefore possible that it is not directly involved in oxidative phosphorylation PUBMED:11935339, PUBMED:1518044.

    \ ' '8380' 'IPR013858' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Proteins in this entry are metalloendopeptidases belonging to the MEROPS peptidase family M10 (subfamily M10B, clan MA). They include serralysin, epralysin, mirabilysin, aeruginolysin and other related peptidases. The peptidase unit is found at the N terminus, while the domain represented by this entry is found at the C terminus and forms a corkscrew. It is thought to be important for secretion of the protein through the bacterial cell wall. Proteins in this entry contain a calcium ion binding domain.

    \ ' '8381' 'IPR013859' '\

    This is a fungal protein of unknown function.

    \ ' '8382' 'IPR013860' '\

    This entry indicates fungal proteins of unknown function.

    \ ' '8383' 'IPR013861' '\

    This entry is found in eukaryotic integral membrane proteins. , a Saccharomyces cerevisiae (Baker\'s yeast) protein, has been shown to localise to COP II vesicles PUBMED:14562095.

    \ ' '8384' 'IPR013862' '\

    This entry indicates Golgi proteins of unknown function.

    \ ' '8385' 'IPR013863' '\

    This entry represents fungal and plant proteins and contains many hypothetical proteins. Vid27p is a cytoplasmic protein of unknown function that possibly regulates import of fructose-1,6-bisphosphatase (FBPase) into Vacuolar Import and Degradation (Vid) vesicles and is not essential for proteasome-dependent degradation of FBPase PUBMED:10029995, PUBMED:12686616.

    \ ' '8387' 'IPR013865' '\

    This is a eukaryotic protein of unknown function.

    \ ' '8389' 'IPR013866' '\

    Sphingolipids are important membrane signalling molecules involved in many different cellular functions in eukaryotes. Sphingolipid delta 4-desaturase catalyses the formation of (E)-sphing-4-enine PUBMED:11937514. Some proteins in this entry have bifunctional delta 4-desaturase/C-4-hydroxylase activity. Delta 4-desaturated sphingolipids may play a role in early signalling required for entry into meiotic and spermatid differentiation pathways during Drosophila spermatogenesis PUBMED:11937514. This small protein associates with FA_desaturase and appears to be specific to sphingolipid delta 4-desaturase.

    \ ' '8390' 'IPR013867' '\

    Telomeres function to shield chromosome ends from degradation and end-to-end fusions, as well as preventing the activation of DNA damage checkpoints. Telomeric repeat binding factor (TRF) proteins TRF1 and TRF2 are major components of vertebrate telomeres required for regulation of telomere stability. TRF1 and TRF2 bind to telomeric DNA as homodimers. Dimerisation involves the TRF homology (TRFH) subdomain contained within the dimerisation domain. The TRFH subdomain is important not only for dimerisation, but for DNA binding, telomere localisation, and interactions with other telomeric proteins. The dimerisation domains of TRF1 and TRF2 show the same multi-helical structure, arranged in a solenoid conformation similar to TPR repeats, which can be divided into an alpha-alpha superhelix and a long alpha hairpin PUBMED:11545737.

    \ \ \

    The two related human TRF proteins hTRF1 and hTRF2 form homodimers and bind directly to telomeric TTAGGG repeats via the myb DNA binding domain at the carboxy terminus PUBMED:15316005. TRF1 is implicated in telomere length regulation and TRF2 in telomere protection PUBMED:15316005. Other telomere complex associated proteins are recruited through their interaction with either TRF1 or TRF2. The fission yeast protein Taz1p (telomere-associated in Schizosaccharomyces pombe (Fission yeast)) has similarity to both hTRF1 and hTRF2 and may perform the dual functions of TRF1 and TRF2 at fission yeast telomeres PUBMED:9034194.

    \ ' '8391' 'IPR013868' '\

    In Schizosaccharomyces pombe (Fission yeast), Cut8 is a nuclear envelope protein that physically interacts with and tethers the 26S proteasome in the nucleus, resulting in the nuclear accumulation of proteasomes PUBMED:16096059. Cut8 is a proteasome substrate, and amino-terminal residues 1-72 are polyubiquitinated and function as a degron tag. Ubiquitination of the amino terminus is essential to the function of Cut8. Lysine residues in the amino-terminal 72 amino acids of Cut8 are required for physical interaction with the proteasome. In fission yeast the function of Cut8 has been demonstrated to be regulated by the ubiquitin-conjugating (Rhp6/Ubc2/Rad6) and ubiquitin-ligating (Ubr1) enzymes. Cut8 homologs have been identified in Drosophila melanogaster (Fruit fly), Anopheles gambiae (African malaria mosquito) and Dictyostelium discoideum (Slime mold).

    \ ' '8392' 'IPR013869' '\

    This entry represents proteins that are about 150 amino acids in length and have no known function.

    \ ' '8393' 'IPR013870' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry includes yeast MRPL37, a mitochondrial ribosomal protein PUBMED:15543521.

    \ ' '8394' 'IPR013871' '\

    This entry is found on Crisp proteins which contain and has been termed the Crisp domain. It is found in the mammalian reproductive tract and the venom of reptiles, and has been shown to regulate ryanodine receptor Ca2+ signalling PUBMED:16339766. It contains 10 conserved cysteines which are all involved in disulphide bonds and is structurally related to the ion channel inhibitor toxins BgK and ShK PUBMED:16339766.

    \ ' '8395' 'IPR013872' '\

    The binding of this protein by regulatory proteins regulates p53 transcription activation. This entry is comprised of a single amphipathic alpha helix and contains a highly conserved motif PUBMED:8875929, PUBMED:16159876.

    \ ' '8396' 'IPR013873' '\

    Cdc37 is a protein required for the activity of numerous eukaryotic protein kinases. This entry corresponds to the C-terminal domain whose function is unclear. It is found C-terminal to the Hsp90 chaperone (heat shock protein 90) binding domain and the N-terminal kinase binding domain of Cdc37 PUBMED:16098195.

    \ ' '8397' 'IPR013874' '\

    Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This entry corresponds to the Hsp90 chaperone (heat shock protein 90) binding domain of Cdc37 PUBMED:16098195. It is found between the N-terminal Cdc37 domain, which is predominantly involved in kinase binding, and the C-terminal domain of Cdc37, whose function is unclear.

    \ ' '8398' 'IPR013875' '\

    The presequence translocase-associated motor (PAM) drives the completion of preprotein translocation into the mitochondrial matrix. The Pam17 subunit is required for formation of a stable complex between cochaperones Pam16 and Pam18 and promotes the association of Pam16-Pam18 with the presequence translocase PUBMED:16107694. Mitochondria lacking Pam17 are selectively impaired in the import of matrix proteins PUBMED:16107694.

    \ ' '8399' 'IPR013876' '\

    The N-terminal region of the TFIIH basal transcription factor complex p62 subunit (BTF2-p62) forms an interaction with the 3\' endonuclease XPG, which is essential for activity. The 3\' endonuclease XPG is a major component of the nucleotide excision repair machinery. The structure of the N-terminal region reveals that it adopts a pleckstrin homology (PH) fold PUBMED:15195146, PUBMED:15909982.

    \ ' '8400' 'IPR013877' '\

    This entry represents fungal proteins of unknown function.

    \ ' '8401' 'IPR013878' '\

    Mo25-like proteins are involved in both polarised growth and cytokinesis. In fission yeast Mo25 is localised alternately to the spindle pole body and to the site of cell division in a cell cycle dependent manner PUBMED:16325501, PUBMED:16096637.

    \ ' '8402' 'IPR013879' '\

    This entry represents conserved fungal proteins of unknown function.

    \ ' '8403' 'IPR013880' '\

    In yeast, Yos1 is a subunit of the Yip1p-Yif1p complex and is required for transport between the endoplasmic reticulum and the Golgi complex. Yos1 appears to be conserved in eukaryotes PUBMED:15659647.

    \ ' '8404' 'IPR013881' '\

    Pre-mRNA processing factor 3 (PRP3) is a U4/U6-associated splicing factor. The human PRP3 has been implicated in autosomal retinitis pigmentosa PUBMED:11773002.

    \ ' '8405' 'IPR013882' '\

    Sae2 is a protein involved in repairing meiotic and mitotic double-strand breaks in DNA. It has been shown to negatively regulate DNA damage checkpoint signalling PUBMED:16374511, PUBMED:16162495. SAE2 is homologous to the CtIP proteins in mammals.

    \ ' '8406' 'IPR013883' '\

    This is a group of proteins of unknown function. is known to interact with RNA polymerase II and deletion of this protein results in hypersensitivity to the K1 killer toxin PUBMED:12663529.

    \ ' '8407' 'IPR013884' '\

    This protein is found in fungi and has no known function.

    \ ' '8408' 'IPR013885' '\

    This entry consists of eukaryotic proteins of unknown function, including many hypothetical proteins.

    \ ' '8409' 'IPR013886' '\

    PI31 is a cellular regulator of proteasome formation and of proteasome-mediated antigen processing PUBMED:12374861.

    \ ' '8410' 'IPR013887' '\

    This entry represents a conserved region found in hypothetical proteins from fungi, mycetozoa and entamoebidae.

    \ ' '8411' 'IPR013888' '\

    Ribonuclease P (RNase P) generates mature tRNA molecules by cleaving their 5\' ends. Rpm2 is a protein subunit of the yeast mitochondrial RNase P. It has the ability to act as a transcriptional activator in the nucleus, where it plays a role in defining the steady-state levels of mRNAs for some nucleus-encoded mitochondrial components. Rpm2p is also involved in maturation of Rpm1 and in translation of mitochondrial mRNAs PUBMED:16024791, PUBMED:8668158, PUBMED:11404323.

    \ ' '8412' 'IPR013889' '\

    The KAR9 protein in Saccharomyces cerevisiae (Baker\'s yeast) is a cytoskeletal protein required for karyogamy, correct positioning of the mitotic spindle and for orientation of cytoplasmic microtubules PUBMED:9442113. KAR9 localises at the shmoo tip in mating cells and at the tip of the growing bud in anaphase PUBMED:9442113.

    \ ' '8413' 'IPR013890' '\

    The N-terminal region of the Tup protein has been shown to interact with the Ssn6 transcriptional co-repressor PUBMED:12234489.

    \ ' '8415' 'IPR013892' '\

    Cmc1 is a metallo-chaperone like protein which is known to localise to the inner mitochondrial membrane in Saccharomyces cerevisiae. It is essential for full expression of cytochrome c oxidase and respiration PUBMED:18443040. Cmc1 contains two Cx9C motifs and is able to bind copper(I). Cmc1 is thought to play a role in mitochondrial copper trafficking and transfer to cytochrome c oxidase.

    \ ' '8416' 'IPR013893' '\

    The tRNA processing enzyme ribonuclease P (RNase P) consists of an RNA molecule and at least eight protein subunits. Subunits hpop1, Rpp21, Rpp29, Rpp30, Rpp38, and Rpp40 (this entry) are involved in extensive, but weak, protein-protein interactions in the holoenzyme complex PUBMED:11158571.

    \ ' '8417' 'IPR013894' '\

    This domain is present in eukaryotic proteins of unknown function, and is sometimes found N-terminal to ubiquitin-binding and nucleic acid-binding domains.

    \ ' '8418' 'IPR013895' '\

    RSC is an ATP-dependent chromatin remodelling complex found in yeast. The RSC components Rsc7/Npl6 and Rsc14/Ldb7 interact physically and/or functionally with Rsc3, Rsc30, and Htl1 to form a module important for a broad range of RSC functions PUBMED:16204215.

    \ ' '8419' 'IPR013896' '\

    This is a UBA (ubiquitin associated) protein PUBMED:12672455. Ubiquitin is involved in intracellular proteolysis.

    \ ' '8420' 'IPR013897' '\

    This entry is found in fungal proteins with unknown function.

    \ ' '8421' 'IPR013898' '\

    The functions of these proteins are unknown. They are rather dissimilar except for a single strongly conserved motif (PDLRFEQ).

    \ ' '8422' 'IPR013899' '\

    This domain is almost always found adjacent to .

    \ ' '8423' 'IPR013900' '\

    This entry includes Schizosaccharomyces pombe (Fission yeast) Spd1. Spd1p inhibits fission yeast RNR activity by interacting with Cdc22p PUBMED:16317005.

    \ ' '8424' 'IPR013901' '\

    This protein of unknown function is currently only found in fungi.

    \ ' '8425' 'IPR013902' '\

    The function of these proteins is unknown.

    \ ' '8426' 'IPR013903' '\

    The proteins in this entry appear to be specific to Schizosaccharomyces pombe (Fission yeast).

    \ ' '8427' 'IPR013904' '\

    This entry represents the N-terminal region of RXT2-like proteins. In Saccharomyces cerevisiae (Baker\'s yeast), RXT2 has been demonstrated to be involved in conjugation with cellular fusion (mating) and invasive growth PUBMED:10628851. A high-throughput localisation study has localised RXT2 to the nucleus PUBMED:14562095.

    \ ' '8428' 'IPR013905' '\

    The Lethal giant larvae (Lgl) tumour suppressor protein is conserved from yeast to mammals. The Lgl protein functions in cell polarity, at least in part, by regulating SNARE-mediated membrane delivery events at the cell surface PUBMED:15964280. The N-terminal half of Lgl members contains WD40 repeats (see ), while the C-terminal half appears specific to the protein PUBMED:15964280.

    \ ' '8429' 'IPR013906' '\

    This entry represents protein subunits of the eukaryotic translation initiation factor 3 (eIF3). In yeast the subunit is called Hcr1. The Saccharomyces cerevisiae (Baker\'s yeast) protein has been shown to be required for processing of 20S pre-rRNA and to bind to 18S rRNA and eIF3 subunits Rpg1p and Prt1p PUBMED:11387228.

    \ ' '8430' 'IPR013907' '\

    Repression of gene transcription is mediated by histone deacetylase-containing repressor-co-repressor complexes, which are recruited to promoters of target genes via interactions with sequence-specific transcription factors. The co-repressor complex contains a core of at least seven proteins PUBMED:15451426. This entry represents the conserved region found in Sds3, Dep1 and BRMS1-homologue p40 proteins.

    \ ' '8431' 'IPR013908' '\

    This C-terminal region of the DNA damage repair protein Nbs1 has been shown to be necessary for the binding of Mre11 and Tel1 PUBMED:15964794.

    \ ' '8432' 'IPR013909' '\

    This domain is found C-terminal to a zinc-finger like domain . The Schizosaccharomyces pombe protein containing this domain () is involved in mRNA export from the nucleus PUBMED:15357289.

    \ ' '8433' 'IPR013910' '\

    The transcription factor Pap1 regulates antioxidant-gene transcription in response to H2O2 PUBMED:15956211. This region is cysteine rich. Alkylation of cysteine residues following treatment with a cysteine alkylating agent can mask the accessibility of the nuclear exporter Crm1, triggering nuclear accumulation and Pap1 dependent transcriptional expression PUBMED:12100563.

    \ ' '8434' 'IPR013911' '\

    The Saccharomyces cerevisiae (Baker\'s yeast) Mgr1 protein has been shown to be required for mitochondrial viability in yeast lacking mitochondrial DNA. It is a mitochondrial inner membrane protein, which interacts with Yme1 and is a new subunit of the i-AAA protease complex PUBMED:16267274, PUBMED:8861950.

    \ ' '8435' 'IPR013912' '\

    Cyclase-associated proteins (CAPs) are highly conserved actin-binding proteins present in a wide range of organisms including yeast, fly, plants, and mammals. CAPs are multifunctional proteins that contain several structural domains. CAP is involved in species-specific signalling pathways PUBMED:11919151, PUBMED:17635992, PUBMED:10658207, PUBMED:12351838. In Drosophila, CAP functions in Hedgehog-mediated eye development and in establishing oocyte polarity. In Dictyostelium (slime mold), CAP is involved in microfilament reorganisation near the plasma membrane in a PIP2-regulated manner and is required to perpetuate the cAMP relay signal to organise fruitbody formation. In plants, CAP is involved in plant signalling pathways required for co-ordinated organ expansion. In yeast, CAP is involved in adenylate cyclase activation, as well as in vesicle trafficking and endocytosis. In both yeast and mammals, CAPs appear to be involved in recycling G-actin monomers from ADF/cofilins for subsequent rounds of filament assembly PUBMED:17376963, PUBMED:15004221. In mammals, there are two different CAPs (CAP1 and CAP2) that share 64% amino acid identity.

    \

    All CAPs appear to contain a C-terminal actin-binding domain that regulates actin remodelling in response to cellular signals and is required for normal cellular morphology, cell division, growth and locomotion in eukaryotes. CAP directly regulates actin filament dynamics and has been implicated in a number of complex developmental and morphological processes, including mRNA localisation and the establishment of cell polarity. Actin exists both as globular (G) (monomeric) actin subunits and assembled into filamentous (F) actin. In cells, actin cycles between these two forms. Proteins that bind F-actin often regulate F-actin assembly and its interaction with other proteins, while proteins that interact with G-actin often control the availability of unpolymerised actin. CAPs bind G-actin.

    \

    In addition to actin-binding, CAPs can have additional roles, and may act as bifunctional proteins. In Saccharomyces cerevisiae (Baker\'s yeast), CAP is a component of the adenylyl cyclase complex (Cyr1p) that serves as an effector of Ras during normal cell signalling. S. cerevisiae CAP functions to expose adenylate cyclase binding sites to Ras, thereby enabling adenylate cyclase to be activated by Ras regulatory signals. In Schizosaccharomyces pombe (Fission yeast), CAP is also required for adenylate cyclase activity, but not through the Ras pathway. In both organisms, the N-terminal domain is responsible for adenylate cyclase activation, but the S. cerevisiae and S. pombe N-termini cannot complement one another. Yeast CAPs are unique among the CAP family of proteins, because they are the only ones to directly interact with and activate adenylate cyclase PUBMED:10594005. S. cerevisiae CAP has four major domains. In addition to the N-terminal adenylate cyclase-interacting domain, and the C-terminal actin-binding domain, it possesses two other domains: a proline-rich domain that interacts with Src homology 3 (SH3) domains of specific proteins, and a domain that is responsible for CAP oligomerisation to form multimeric complexes (although oligomerisation appears to involve the N- and C-terminal domains as well). The proline-rich domain interacts with profilin, a protein that catalyses nucleotide exchange on G-actin monomers and promotes addition to barbed ends of filamentous F-actin PUBMED:17376963. Since CAP can bind profilin via a proline-rich domain, and G-actin via a C-terminal domain, it has been suggested that a ternary G-actin/CAP/profilin complex could be formed.

    \ \

    This entry represents the C-terminal domain of CAP proteins, which is responsible for G-actin-binding. This domain has a superhelical structure, where the superhelix turns are made of two beta-strands each PUBMED:15311924.

    \ ' '8436' 'IPR013913' '\

    This entry contains both the nucleoporin Nup153 from human and Nup153 from fission yeast. These have been demonstrated to be functionally equivalent PUBMED:15659641.

    \ ' '8437' 'IPR013914' '\

    In Saccharomyces cerevisiae (Baker\'s yeast), Rad9 is a key adaptor protein in DNA damage checkpoint pathways. DNA damage induces Rad9 phosphorylation, and Rad53 specifically associates with this region of Rad9, when phosphorylated, via the Rad53 domain PUBMED:10518219. There is no clear higher eukaryotic ortholog to Rad9.

    \ ' '8438' 'IPR013915' '\

    This region is found specifically in PRP19-like proteins. The region represented by this entry covers the sequence implicated in self-interaction and a coiled-coil motif PUBMED:16332694. PRP19-like proteins form an oligomer that is necessary for spliceosome assembly PUBMED:16332694.

    \ ' '8439' 'IPR013740' '\

    This redoxin domain includes peroxiredoxin, thioredoxin and glutaredoxin proteins. Peroxiredoxins (Prxs) constitute a family of thiol peroxidases that reduce hydrogen peroxide, peroxynitrite, and hydroperoxides using a strictly conserved cysteine PUBMED:15697201. Chloroplast thioredoxin systems in plants regulate the enzymes involved in photosynthetic carbon assimilation PUBMED:18047840. It is thought that redoxins have a large role to play in anti-oxidant defence. Cadmium-sensitive proteins are also regulated via thioredoxin and glutaredoxin thiol redox systems PUBMED:17103236.

    \ ' '8441' 'IPR013917' '\

    The proteins in this entry appear to be important in wyosine base formation in a subset of phenylalanine-specific tRNAs. It has been proposed that they participate in converting tRNA(Phe)-m(1)G(37) to tRNA(Phe)-yW PUBMED:16162496.

    \ ' '8442' 'IPR013918' '\

    Fes1 is a cytosolic homologue of Sls1, an ER protein which has nucleotide exchange factor activity. Fes1 in yeast has been shown to bind to the molecular chaperone Hsp70 and has adenyl-nucleotide exchange factor activity PUBMED:12052876.

    \ ' '8443' 'IPR013919' '\

    Pex16 is a peripheral protein located at the matrix face of the peroxisomal membrane PUBMED:9182661.

    \ ' '8444' 'IPR013920' '\

    This is a fungal protein of unknown function.

    \ ' '8445' 'IPR013921' '\

    Proteins in this entry represent subunit Med20 of the Mediator complex and are related to the TATA-binding protein (TBP). TBP is a highly conserved RNA polymerase II general transcription factor that binds to the core promoter and initiates assembly of the preinitiation complex. Human TRF has been shown to associate with an RNA polymerase II-SRB complex PUBMED:9933582.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '8446' 'IPR013922' '\

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles PUBMED:12910258, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    \

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi\'s sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus PUBMED:11056549.

    \ \

    This entry includes many different cyclin proteins. Members include the G1/S-specific cyclin pas1 PUBMED:, and the phosphate system cyclin PHO80/PHO85 PUBMED:.

    \ ' '8447' 'IPR013923' '\ Macroautophagy is a bulk degradation process induced by starvation in eukaryotic cells. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. No molecule involved in autophagy has yet been identified in higher eukaryotes PUBMED:9852036. The pre-autophagosomal structure contains at least five Apg proteins: Apg1p, Apg2p, Apg5p, Aut7p/Apg8p and Apg16p. It is found in the vacuole PUBMED:11689437. The C-terminal glycine of Apg12p is conjugated to a lysine residue of Apg5p via an isopeptide bond. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. Autophagy protein 16 (Apg16) has been shown to bind to Apg5 and is required for the function of the Apg12p-Apg5p conjugate PUBMED:10406794. Autophagy protein 5 (Apg5) is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway PUBMED:10712513.\

    This entry represents autophagy protein 16 (Apg16), which is required for the function of the Apg12p-Apg5p conjugate.

    \ ' '8448' 'IPR013924' '\

    Whereas bacterial and archaeal RNases H2 are active as single polypeptides, the Saccharomyces cerevisiae (Baker\'s yeast) homologue, Rnh2Ap, when expressed in Escherichia coli, fails to produce an active RNase H2. For RNase H2 activity three proteins are required [Rnh2Ap (Rnh201p), Ydr279p (Rnh202p) and Ylr154p (Rnh203p)]. Deletion of any one of the proteins or mutations in the catalytic site in Rnh2A leads to loss of RNase H2 activity PUBMED:14734815. RNase H2 is an endonuclease that specifically degrades the RNA of RNA:DNA hybrids. It participates in DNA replication, possibly by mediating the removal of lagging-strand Okazaki fragment RNA primers.

    \

    This entry represents the non-catalytic C subunit of RNase H2, which in S. cerevisiae is (Ylr154p/Rnh203p).

    \ ' '8449' 'IPR013925' '\

    has been shown to interact with the outer plaque of the spindle pole body PUBMED:14515169.

    \ ' '8450' 'IPR013926' '\

    CGI-121 has been shown to bind to the p53-related protein kinase (PRPK) PUBMED:12659830. PRPK is a novel protein kinase which binds to and induces phosphorylation of the tumour suppressor protein p53.

    \ ' '8451' 'IPR013927' '\

    Opi1 is a leucine zipper-containing yeast transcription factor that negatively regulates phospholipid biosynthesis PUBMED:. It represses the expression of several genes containing the UAS(INO) cis-acting element, and its activity is mediated by phosphorylations catalysed by protein kinase A, protein kinase C and casein kinase II PUBMED:16407309.

    \ ' '8452' 'IPR013928' '\

    The C terminus of the plasma membrane Nha1 antiporter plays an important role in the immediate cell response to hypo-osmotic shock which prevents an excessive loss of ions and water PUBMED:16402204. This protein is found C-terminal to .

    \ ' '8453' 'IPR013929' '\

    Inhibition of RNA polymerase II-associated protein 1 (RPAP1) synthesis in Saccharomyces cerevisiae (Baker\'s yeast) results in changes in global gene expression that are similar to those caused by the loss of the RNAPII subunit Rpb11 PUBMED:15282305. This entry represents the C-terminal region that contains the motif GLHHH. This region is conserved from yeast to humans.

    \ ' '8454' 'IPR013930' '\

    Inhibition of RNA polymerase II-associated protein 1 (RPAP1) synthesis in Saccharomyces cerevisiae (Baker\'s yeast) results in changes in global gene expression that are similar to those caused by the loss of the RNAPII subunit Rpb11 PUBMED:15282305. This entry represents the N-terminal region of RPAP-1 that is conserved from yeast to humans.

    \ ' '8455' 'IPR013931' '\

    This entry represents oxidative stress survival proteins, such as Svf1. The protein Svf1 is required for yeast survival under conditions of oxidative stress, including cold stress PUBMED:16034825. Cells deficient in Svf1 have increased levels of reactive oxygen species (ROS) under certain conditions.

    \ \ \ ' '8456' 'IPR013932' '\

    TIP120 (also known as cullin-associated and neddylation-dissociated protein 1) is a TATA binding protein interacting protein that enhances transcription PUBMED:10567521.

    \ ' '8457' 'IPR013933' '\

    This entry contains subunits of the chromatin remodelling complexes. Saccharomyces cerevisiae (Baker\'s yeast) and its paralogue have been identified as subunits of the RSC chromatin remodelling complex, and SWI/SNF chromatin remodelling complex respectively PUBMED:16204215.

    \ ' '8458' 'IPR013934' '\

    A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome PUBMED:12068309, PUBMED:15590835. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties:

    \ \

    There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA and rRNA binding proteins that are involved in the initiation of pre18S rRNA transcription, initially binding to rDNA and then associating with the 5\' end of the nascent pre18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble PUBMED:15489292. From electron microscopy the SSU processome may correspond to the terminal knobs visualized at the 5\' ends of nascent 18S rRNA.

    \

    Utp13 is a nucleolar protein and component of the small subunit (SSU) processome containing the U3 snoRNA that is involved in processing of pre-18S rRNA PUBMED:12068309.

    \ \

    Utp13 is also a component of the Pwp2 complex that forms part of a stable particle subunit independent of the U3 small nucleolar ribonucleoprotein that is essential for the initial assembly steps of the 90S pre-ribosome PUBMED:15231838. Components of the Pwp2 complex are:\ Utp1 (Pwp2), Utp6, Utp12 (Dip2), Utp13, Utp18, and Utp21. The relationship between the Pwp2 complex and the t-Utps complex PUBMED:15489292 that also associates with the 5\' end of nascent pre-18S rRNA is unclear.

    \ ' '8459' 'IPR013935' '\

    This region is found at the N terminus of Saccharomyces cerevisiae (Baker\'s yeast) Trs120 protein (). Trs120 is a subunit of the multiprotein complex TRAPP (transport particle protein), which functions in ER to Golgi traffic PUBMED:10727015.

    \ ' '8460' 'IPR013936' '\

    This region is found in proteins related to Plasmodium falciparum chloroquine resistance transporter (CRT).

    \ ' '8461' 'IPR013937' '\

    This region is found at the C terminus of proteins belonging to the nexin family. It is found on proteins which also contain .

    \ ' '8462' 'IPR013938' '\

    The cyclic nucleotide phosphodiesterases (PDE) comprise a group of enzymes that degrade the phosphodiester bond in the second messenger molecules cAMP and cGMP. They are divided into 11 families. They regulate the localisation, duration and amplitude of cyclic nucleotide signalling within subcellular domains. PDEs are therefore important for signal transduction.

    \ \

    PDE enzymes are often targets for pharmacological inhibition due to their unique tissue distribution, structural properties, and functional properties. Inhibitors include: Roflumilast for chronic obstructive pulmonary disease and asthma PUBMED:18447606, Sildenafil for erectile dysfunction PUBMED:18367027 and Cilostazol for peripheral arterial occlusive disease PUBMED:18436153, amongst others.

    \ \

    Retinal 3\',5\'-cGMP phosphodiesterase is located in photoreceptor outer segments PUBMED:: it is light activated, playing a pivotal role in signal transduction. In rod cells, PDE is oligomeric, comprising an alpha-, a beta- and 2 gamma-subunits, while in cones, PDE is a homodimer of alpha chains, which are associated with several smaller subunits. Both rod and cone PDEs catalyse the hydrolysis of cAMP or cGMP to the corresponding nucleoside 5\' monophosphates, both enzymes also binding\ cGMP with high affinity. The cGMP-binding sites are located in the\ N-terminal half of the protein sequence, while the catalytic core \ resides in the C-terminal portion.

    \ \

    This region is found at the N terminus of members of the PDE8 phosphodiesterase family PUBMED:9784418. Phosphodiesterase 8 (PDE8) regulates chemotaxis of activated lymphocytes PUBMED:16696947.

    \ ' '8463' 'IPR013939' '\

    This region, together with the C-terminal zinc finger () is essential for the mitotic and kinase activation functions of Dfp1/Him1 PUBMED:11402029.

    \ ' '8464' 'IPR013940' '\

    SPO22 is a meiosis-specific protein with similarity to phospholipase A2, involved in completion of nuclear divisions during meiosis; induced early in meiosis PUBMED:11101837. It is also involved in sporulation PUBMED:16314568.

    \ ' '8465' 'IPR013941' '\

    This region of the Zds1 protein is critical for sporulation and has also been shown to suppress the calcium sensitivity of Zds1 deletions PUBMED:16322512.

    \ ' '8466' 'IPR013942' '\

    This entry represents the Med19 subunit of the Mediator complex in fungi.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '8467' 'IPR013943' '\

    Pet127 has been implicated in mitochondrial RNA stability and/or processing and is localised to the mitochondrial membrane PUBMED:9111353.

    \ ' '8468' 'IPR013944' '\

    This is the C terminus of putative oxidoreductases.

    \ ' '8469' 'IPR013945' '\

    Pkr1 has been identified as an ER protein of unknown function.

    \ ' '8470' 'IPR013946' '\

    NCA2 (Nuclear Control of ATPase) is one of the two nuclear genes involved in the control of mitochondrial expression of subunits 6 and 8 of the Fo-F1 ATP synthase in Saccharomyces cerevisiae (Baker\'s yeast). Mutations in either NCA2 or NCA3 () dramatically lower the level of the co-transcript encoding subunits 6 and 8 PUBMED:7723016, PUBMED:7586026.

    \ ' '8471' 'IPR013947' '\

    Saccharomyces cerevisiae (Baker\'s yeast) RGR1 mediator complex subunit affects chromatin structure, transcriptional regulation of diverse genes, and sporulation. It is required for glucose repression, HO repression, RME1 repression and sporulation PUBMED:7851756, PUBMED:7635307. This subunit is also found in higher eukaryotes and MED14 is the agreed unified nomenclature for this subunit PUBMED:15175151.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '8472' 'IPR013948' '\

    The SLD3 DNA replication regulator is required for loading and maintenance of Cdc45 on chromatin during DNA replication PUBMED:12006645.

    \ ' '8473' 'IPR013949' '\

    This entry represents U3 nucleolar RNA-associated proteins which are involved in nucleolar processing of pre-18S ribosomal RNA PUBMED:12068309.

    \ ' '8474' 'IPR013950' '\

    Mis14 is a kinetochore protein which is known to be recruited to kinetochores independently of CENP-A PUBMED:15369671.

    \ ' '8475' 'IPR013951' '\

    Rxt3 has been shown in yeast to be required for histone deacetylation PUBMED:16314178.

    \ ' '8476' 'IPR013952' '\

    This is a fungal protein of unknown function. One of the proteins has been localised to the mitochondria PUBMED:14576278.

    \ ' '8477' 'IPR013953' '\

    Proteins in this entry are subunits of the FACT complex; the FACT complex is a stable heterodimer in Saccharomyces cerevisiae (Baker\'s yeast) comprising Spt16p and Pob3p. The complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin PUBMED:15987999; it also facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilising and then reassembling nucleosome structure PUBMED:12524332, PUBMED:12934006.

    \ \

    The proteins in this entry are non-peptidase homologues belonging to MEROPS peptidase family M24 (clan MG).

    \ ' '8478' 'IPR013954' '\

    Polynucleotide kinase 3\'-phosphatases play a role in the repair of single breaks in DNA induced by DNA-damaging agents such as gamma radiation and camptothecin PUBMED:11729194.

    \ ' '8479' 'IPR013955' '\

    This entry is found at the C terminus of replication factor A. Replication factor A (RPA) binds single-stranded DNA and is involved in replication, repair and recombination of DNA PUBMED:10713540.

    \ ' '8480' 'IPR013956' '\

    BRE1 is an E3 ubiquitin ligase that has been shown to act as a transcriptional activator through direct activator interactions PUBMED:16337599.

    \ ' '8481' 'IPR013957' '\

    This entry represents eukaryotic proteins of unknown function. Some of the proteins are putative nucleic acid binding proteins.

    \ ' '8482' 'IPR013958' '\

    The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis PUBMED:11799062. In Saccharomyces cerevisiae (Baker\'s yeast) DASH forms both rings and spiral structures on microtubules in vitro PUBMED:15640796, PUBMED:15664196. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules PUBMED:15632076. Throughout the cell cycle Dad1 remains bound to kinetochores and its association is dependent on Mis6 and Mal2 PUBMED:16079915.

    \ ' '8483' 'IPR013959' '\

    The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis PUBMED:11799062. In Saccharomyces cerevisiae (Baker\'s yeast) DASH forms both rings and spiral structures on microtubules in vitro PUBMED:15640796, PUBMED:15664196.

    \ ' '8484' 'IPR013960' '\

    The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis PUBMED:11799062. In Saccharomyces cerevisiae (Baker\'s yeast) DASH forms both rings and spiral structures on microtubules in vitro PUBMED:15640796, PUBMED:15664196.

    \ ' '8485' 'IPR013961' '\

    RAI1 is homologous to Caenorhabditis elegans DOM-3 and human DOM3Z and binds to a nuclear exoribonuclease PUBMED:10805743. It is required for 5.8S rRNA processing PUBMED:10805743.

    \ ' '8486' 'IPR013962' '\

    The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis PUBMED:11799062. In Saccharomyces cerevisiae (Baker\'s yeast) DASH forms both rings and spiral structures on microtubules in vitro PUBMED:15640796, PUBMED:15664196. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules PUBMED:15632076.

    \ ' '8487' 'IPR013963' '\

    The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis PUBMED:11799062. In Saccharomyces cerevisiae (Baker\'s yeast) DASH forms both rings and spiral structures on microtubules in vitro PUBMED:15640796, PUBMED:15664196.

    \ ' '8488' 'IPR013964' '\

    The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis PUBMED:11799062. In Saccharomyces cerevisiae (Baker\'s yeast) DASH forms both rings and spiral structures on microtubules in vitro PUBMED:15640796, PUBMED:15664196. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules PUBMED:15632076.

    \ ' '8489' 'IPR013965' '\

    The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis PUBMED:11799062. In Saccharomyces cerevisiae (Baker\'s yeast) DASH forms both rings and spiral structures on microtubules in vitro PUBMED:15640796, PUBMED:15664196.

    \ ' '8490' 'IPR013966' '\

    The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis PUBMED:11799062. In Saccharomyces cerevisiae (Baker\'s yeast) DASH forms both rings and spiral structures on microtubules in vitro PUBMED:15640796, PUBMED:15664196. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules PUBMED:15632076.

    \ ' '8491' 'IPR013967' '\

    This is the N-terminal region of the DNA repair protein Rad54 PUBMED:14551247.

    \ ' '8492' 'IPR013968' '\

    This domain is found in bacterial polyketide synthases that catalyse the first step in the reductive modification of the beta-carbonyl centres in the growing polyketide chain. It uses NADPH to reduce the keto group to a hydroxy group.

    \ ' '8493' 'IPR013969' '\

    Alg14 is involved in dolichol-linked oligosaccharide biosynthesis and anchors the catalytic subunit Alg13 to the ER membrane PUBMED:16100110.

    \ ' '8494' 'IPR013970' '\

    Replication factor A is involved in eukaryotic DNA replication, recombination and repair.

    \ ' '8495' 'IPR013979' '\

    This entry contains eukaryotic translation initiation factors.

    \ ' '8496' 'IPR013971' '\

    HalX is a protein of unknown function, previously mis-annotated as a HoxA-like transcriptional regulator.

    \ ' '8497' 'IPR013972' '\

    YcbB is a DNA-binding protein PUBMED:15995196.

    \ ' '8498' 'IPR013973' '\

    This entry is a member of the Alkaline phosphatase clan.

    \ ' '8499' 'IPR013974' '\

    This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins.

    \ ' '8500' 'IPR013975' '\

    This entry includes an N-terminal helix-turn-helix domain.

    \ ' '8501' 'IPR013976' '\

    This domain is found in a superfamily of enzymes with a predicted or known phosphohydrolase activity. These members appear to be involved in nucleic acid metabolism and signal transduction, or possibly other functions, and are restricted to bacteria, primarily the proteobacteria. The fact that all the highly conserved residues in the HD superfamily are histidines or aspartates suggests that coordination of divalent cations is essential for the activity of these proteins PUBMED:9868367.

    \ ' '8502' 'IPR013977' '\

    This entry represents glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase.

    \ ' '8503' 'IPR013978' '\

    The MEKHLA domain shares similarity with the PAS domain and is found at the 3\' end of plant HD-ZIP III homeobox genes, and in bacterial proteins.

    \ ' '8504' 'IPR010981' '\

    The SinR repressor is part of a group of Sin (sporulation inhibition) proteins in Bacillus subtilis that regulate the commitment to sporulation in response to extreme adversity PUBMED:9799632. SinR is a tetrameric repressor protein that binds to the promoters of genes essential for entry into sporulation and prevents their transcription. This repression is overcome through the activity of SinI, which disrupts the SinR tetramer through the formation of a SinI-SinR heterodimer, thereby allowing sporulation to proceed. The SinR structure consists of two domains: a dimerisation domain stabilised by a hydrophobic core, and a DNA-binding domain that is identical to domains of the bacteriophage 434 CI and Cro proteins that regulate prophage induction. The dimerisation domain is a four-helical bundle formed from two helices from the C-terminal residues of SinR and two helices from the central residues of SinI. These regions in SinR and SinI are similar in both structure and sequence. The interaction of SinR monomers to form tetramers is weaker than between SinR and SinI, since SinI can effectively disrupt SinR tetramers.

    \

    This entry represents the dimerisation domain in both SinI and SinR proteins.

    \ ' '8505' 'IPR014786' '\

    The anaphase-promoting complex (APC) is a multi-subunit E3 protein ubiquitin ligase that is responsible for the metaphase to anaphase transition and the exit from mitosis. Anaphase is initiated when the APC triggers the destruction of securin, thereby allowing the protease, separase, to disrupt sister-chromatid cohesion. Securin ubiquitination by the APC is inhibited by cyclin-dependent kinase 1 (Cdk1)-dependent phosphorylation PUBMED:18552837.

    \ \

    Forkhead Box M1 (FoxM1), which is a transcription factor that is over-expressed in many cancers, is degraded in late mitosis and early G1 phase by the APC/cyclosome (APC/C) E3 ubiquitin ligase PUBMED:18573889. The APC/C targets mitotic cyclins for destruction in mitosis and G1 phase and is then inactivated at S phase. It thereby generates alternating states of high and low cyclin-Cdk activity, which is required for the alternation of mitosis and DNA replication PUBMED:18559889.

    \ \

    APC from Schizosaccharomyces pombe and Saccharomyces cerevisiae was previously thought to have 11 subunits, but more sensitive techniques have identified 13 subunits in both yeasts PUBMED:12477395.

    \ \

    APC2 is an E3 ubiquitin ligase which is part of the SCF family of ubiquitin ligases. Ubiquitin ligases catalyse the transfer of ubiquitin from the ubiquitin-conjugating enzyme (E2) to the substrate protein.

    \ ' '8506' 'IPR014787' '\

    The phosphoserine phosphatase RsbU acts as a positive regulator of the general stress-response factor of Gram-positive organisms, sigma-B. RsbU dephosphorylates RsbV in response to environmental stress conveyed from the rsbXST module. The phosphatase activity of RsbU is stimulated during the stress response by associating with the RsbT kinase. This association leads to the induction of sigma-B activity. The N-terminal domain forms a helix-swapped dimer that is otherwise similar to the KaiA domain dimer. Deletions in the N-terminal domain are deleterious to the activity of RsbU. The C-terminal domain of RsbU is similar to the catalytic domains of PP2C-type phosphatases PUBMED:15263010.

    \ ' '8507' 'IPR014788' '\

    Cholinesterase enzymes are members of the broader alpha/beta hydrolase family and can be divided into two distinct groups: those that catalyse the hydrolysis of acetylcholine to choline and acetate (acetylcholinesterases )\ \ and those that catalyse the conversion of other acylcholines to a choline and a weak acid (cholinesterases )\ \

    \ \

    Acetylcholinesterase also acts on a variety of acetic esters and catalyses transacetylations. It is the most intensively studied of the cholinesterase enzymes due to its key physiological role in the turnover of the neurotransmitter acetylcholine PUBMED:15907917. This enzyme is found in, or attached to, cellular or basement membranes of presynaptic cholinergic neurons and postsynaptic cholinoceptive cells within the neuromuscular junction. Signal transmission at the neuromuscular junction involves the release of acetylcholine, its interaction with the acetylcholine receptor and hydrolysis, all occurring in a period of a few milliseconds. Rapid hydrolysis of the newly released acetylcholine is vital in order to prevent continuous firing of the nerve impulses PUBMED:8161450. Consistent with its role in this process, acetylcholinesterase has an unusually high turnover number, ensuring that acetylcholine is broken down quickly. There is evidence to suggest that acetylcholinesterase has additional important roles including involvement in neuronal adhesion, the formation of Alzheimer fibrils, and neurite growth PUBMED:8890157, PUBMED:8608006, PUBMED:11169626.

    \ \ \ \

    The 3D structures of acetylcholinesterase and a cholinesterase have been determined PUBMED:1678899, PUBMED:12869558. These proteins share the 3-layer alpha-beta-alpha sandwich fold common to members of the alpha/beta hydrolase family. Surprisingly, given the high turnover number of acetylcholinesterase, the active site of these enzymes is located at the bottom of a deep and narrow cleft, named the active-site gorge.

    \

    The acetylcholinesterase tetramerisation domain is found at the C terminus and forms a left-handed superhelix.

    \ ' '8508' 'IPR014789' '\

    This domain corresponds to the RNA binding domain of Poly(A)-specific ribonuclease (PARN).

    \ ' '8509' 'IPR014790' '\

    MutL and MutS are key components of the DNA repair machinery that corrects replication errors PUBMED:8811176. MutS recognises mispaired or unpaired bases in a DNA duplex and, in the presence of ATP, recruits MutL to form a DNA signalling complex for repair. The N-terminal region of MutL contains the ATPase domain and the C-terminal region is involved in dimerisation PUBMED:15470502.

    \ ' '8510' 'IPR014791' '\

    The bacteriophage baseplate controls host cell recognition, attachment, tail sheath contraction and viral DNA ejection. The baseplate is a multi-subunit assembly at the distal end of the tail, which is composed of long and short tail fibres PUBMED:12923574. The tail region is responsible for attachment to the host bacteria during infection: long tail fibres enable host receptor recognition, while irreversible attachment is via short tail fibres. Recognition and attachment induce a conformational transition of the baseplate from a hexagonal to a star-shaped structure. In viruses such as Bacteriophage T4, GP11 acts as a structural protein to connect the short tail fibres to the baseplate, while GP9 connects the baseplate with the long tail fibres. Both GP9 and GP11 are trimers. Each GP11 monomer consists of three domains, which are entwined together in the trimer: the N-terminal domains of the three monomers form a central, trimeric, parallel coiled coil surrounded by the entwined middle finger domains; the C-terminal domains appear to be responsible for trimerisation PUBMED:10966799.

    \ ' '8511' 'IPR014792' '\

    RsbR is a regulator of the RNA polymerase sigma factor subunit sigma(B). The structure of the N-terminal domain belongs to the globin fold superfamily PUBMED:16301540.

    \ ' '8512' 'IPR014793' '\

    The structure of the dissimilatory sulphite reductase D (DsrD) protein has shown it to contain a winged-helix motif similar to those found in DNA binding proteins PUBMED:12962631. The structure suggests a possible role for DsrD in the transcription or translation of genes whose products catalyse dissimilatory sulphite reduction.

    \ ' '8513' 'IPR014794' '\

    This is a family of uncharacterised proteins. The structure of the ywmB protein from Bacillus subtilis has shown it to adopt an alpha/beta fold.

    \ ' '8514' 'IPR014795' '\

    This is a family of uncharacterised proteins. The structure of one of the hypothetical proteins in this family has been solved; it forms a helical structure that may interact with DNA.

    \ ' '8515' 'IPR014796' '\

    This is a family of uncharacterised proteins. The structure of a hypothetical protein from Pseudomonas aeruginosa has shown it to adopt an alpha/beta fold.

    \ ' '8516' 'IPR014797' '\

    This is a family of uncharacterised proteins. The structure of a murine hypothetical protein from RIKEN cDNA has shown it to adopt a mainly beta barrel structure with an alpha hairpin.

    \ ' '8517' 'IPR014798' '\

    The structure of an Ocr protein from bacteriophage T7 has shown that this protein mimics the size and shape of a bent DNA molecule PUBMED:11804597. Ocr has also been shown to be an inhibitor of the complex type I DNA restriction enzymes PUBMED:11804597.

    \ ' '8518' 'IPR012314' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The ADAMTSs (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) are a family of zinc-dependent metalloproteinases that play important roles in a variety of normal and pathological conditions. These enzymes show a complex domain organization including signal sequence, propeptide, metalloproteinase domain, disintegrin-like domain, central TS-1 motif, cysteine-rich region, and a variable number of TS-like repeats at the C-terminal region. The GON domain is an approximately 200-residue module, whose presence is the hallmark of a subfamily of structurally and evolutionarily related ADAMTSs, called GON-ADAMTSs. The GON domain is characterised by the presence of several conserved cysteine residues and is likely to be globular PUBMED:12562771, PUBMED:12514189.

    \

    Some proteins known to contain a GON domain are listed below:\

    \

    \

    Proteins containing the GON domain belong to MEROPS peptidase subfamily M12B (adamalysin, clan MA).

    \ ' '8519' 'IPR010909' '\

    The PLAC (protease and lacunin) domain is a six-cysteine region of about 40 residues that is present at or near the C-terminal of various enzymes and matrix proteins, including: mammalian PACE4 (paired basic amino acid cleaving enzyme 4), mammalian PCSK5 (proprotein convertase subtilisin/kexin type 5), mammalian metalloproteinases ADAMTS-2, -3, -10, -12, -14, -16, -17, and -19, and Manduca sexta matrix protein lacunin PUBMED:11867212. The PLAC domain is often associated with other domains, such as the thrombospondin type I repeat (TSP1), the Kunitz proteinase inhibitor domain, the Ig-like domain, the WAP domain, the subtilase domain, or the ADAM-type metalloprotease domain.

    \ ' '8520' 'IPR014799' '\

    Cell shape changes require the coordination of actin and microtubule cytoskeletons. The Shroom family is a small group of related proteins that are defined by sequence similarity and in most cases by some link to the actin cytoskeleton. The Shroom (Shrm) protein family is found only in animals. Proteins of this family are predicted to be utilised in multiple morphogenic and developmental processes across animal phyla to regulate cell shape or intracellular architecture in an actin- and myosin-dependent manner PUBMED:16684770. While the founding member of the Shrm family is Shrm1 (formerly Apx), it appears that this protein is found only in Xenopus PUBMED:17009331. In mice and humans, the Shrm family of proteins consists of:\

    \

    This protein family is based on the conservation of a specific arrangement of an N-terminal PDZ domain, a centrally positioned sequence motif termed ASD1 (Apx/Shrm Domain 1) and a C-terminal motif termed ASD2 PUBMED:16684770, PUBMED:17009331, PUBMED:10589677. Shrm2 and Shrm3 contain all three domains, while Shrm4 contains the PDZ and ASD2 domains, but lacks a discernible ASD1 element. To date, the ASD1 and ASD2 elements have only been found in Shrm-related proteins and do not appear in combination with other conserved domains. ASD1 is required for targeting actin, while ASD2 is capable of eliciting an actomyosin-based constriction event PUBMED:16684770, PUBMED:17009331. \ ASD2 is the most highly conserved sequence element shared by Shrm1, Shrm2, Shrm3, and Shrm4. It possesses a well-conserved series of leucine residues that exhibit spacing consistent with that of a leucine zipper motif PUBMED:16684770.

    \ \

    Shroom2 is both necessary and sufficient to govern the localization of pigment granules at the apical surface of epithelial cells. Shroom2 is a central regulator of RPE pigmentation. Despite their diverse biological roles, Shroom family proteins share a common activity. Since the locus encoding human SHROOM2 lies within the critical region for two distinct forms of ocular albinism, it is possible that SHROOM2 mutations may contribute to human visual system disorders PUBMED:16987870.

    \ ' '8521' 'IPR014800' '\

    Cell shape changes require the coordination of actin and microtubule cytoskeletons. The Shroom family is a small group of related proteins that are defined by sequence similarity and in most cases by some link to the actin cytoskeleton. The Shroom (Shrm) protein family is found only in animals. Proteins of this family are predicted to be utilised in multiple morphogenic and developmental processes across animal phyla to regulate cell shape or intracellular architecture in an actin- and myosin-dependent manner PUBMED:16684770. While the founding member of the Shrm family is Shrm1 (formerly Apx), it appears that this protein is found only in Xenopus PUBMED:17009331. In mice and humans, the Shrm family of proteins consists of:\

    \

    This protein family is based on the conservation of a specific arrangement of an N-terminal PDZ domain, a centrally positioned sequence motif termed ASD1 (Apx/Shrm Domain 1) and a C-terminal motif termed ASD2 PUBMED:16684770, PUBMED:17009331, PUBMED:10589677. Shrm2 and Shrm3 contain all three domains, while Shrm4 contains the PDZ and ASD2 domains, but lacks a discernible ASD1 element. To date, the ASD1 and ASD2 elements have only been found in Shrm-related proteins and do not appear in combination with other conserved domains. ASD1 is required for targeting actin, while ASD2 is capable of eliciting an actomyosin-based constriction event PUBMED:16684770, PUBMED:17009331. \ ASD2 is the most highly conserved sequence element shared by Shrm1, Shrm2, Shrm3, and Shrm4. It possesses a well-conserved series of leucine residues that exhibit spacing consistent with that of a leucine zipper motif PUBMED:16684770.

    \ \

    This region is found in the actin binding protein Shroom. ASD1 has been implicated directly in F-actin binding.

    \ ' '8522' 'IPR014801' '\

    This entry represents the Med5 subunit of the Mediator complex in fungi. Deletion of the MED5 gene leads to increased transcription of nuclear genes encoding components of the oxidative phosphorylation machinery, and decreased transcription of mitochondrial genes encoding components of the same machinery PUBMED:16230344.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '8523' 'IPR014802' '\

    This family corresponds to the GET complex subunit GET2. The GET complex is involved in the retrieval of ER resident proteins from the Golgi PUBMED:16269340.

    \ ' '8524' 'IPR014803' '\

    Nse5 is a non-essential nuclear protein that is critical for chromosome segregation in fission yeast PUBMED:16478984. Nse5 forms a dimer with Nse6 and facilitates DNA repair as part of the Smc5-Smc6 holocomplex.

    \ ' '8525' 'IPR014804' '\

    Pet20 is a mitochondrial protein which is thought to play a role in the correct assembly/maintenance of mitochondrial components PUBMED:16491469.

    \ ' '8526' 'IPR014805' '\

    SKG6/Axl2 are membrane proteins that show polarised intracellular localisation PUBMED:17460121, PUBMED:16314687. SKG6_Tmem is the highly conserved transmembrane alpha-helical domain of SKG6 and Axl2 proteins PUBMED:17460121, PUBMED:16816427. The full-length fungal protein has a negative regulatory function in cytokinesis PUBMED:16816427.

    \ ' '8527' 'IPR014806' '\

    Ubiquitin-like (UBL) post-translational modifiers are covalently linked to most, if not all, target proteins through an enzymatic cascade analogous to ubiquitylation, consisting of E1 (activating), E2 (conjugating), and E3 (ligating) enzymes. Ubiquitin-fold modifier 1 (Ufm1), a ubiquitin-like protein, is activated by a novel E1-like enzyme, Uba5, by forming a high-energy thioester bond. Activated Ufm1 is then transferred to its cognate E2-like enzyme, Ufc1, in a similar thioester linkage. This family represents the E2-like enzyme PUBMED:15071506.

    \ ' '8528' 'IPR014807' '\

    This is a fungal family of uncharacterised proteins.

    \ ' '8529' 'IPR014808' '\

    Dna2 is a DNA replication factor with single-stranded DNA-dependent ATPase, ATP-dependent nuclease (5\'-flap endonuclease) and helicase activities. It is required for Okazaki fragment processing and is involved in DNA repair pathways PUBMED:10880469.

    \ ' '8531' 'IPR014810' '\

    This domain is found in eukaryotic nucleolar proteins that are involved in pre-rRNA processing PUBMED:16762320.

    \ ' '8532' 'IPR014811' '\

    This region is found in argonaute PUBMED:16216572 proteins and often co-occurs with and .

    \ ' '8533' 'IPR014812' '\

    The VFT tethering complex (also known as GARP complex, Golgi associated retrograde protein complex, Vps53 tethering complex) is a conserved eukaryotic docking complex which is involved in recycling of proteins from endosomes to the late Golgi. Vps51 (also known as Vps67) is a subunit of VFT and interacts with the SNARE Tlg1 PUBMED:12377769.

    \ ' '8534' 'IPR014813' '\

    Grn1 (yeast) and GNL3L (human) are putative GTPases which are required for growth and play a role in processing of nucleolar pre-rRNA PUBMED:16251348. This family contains a potential nuclear localisation signal.

    \ ' '8535' 'IPR014814' '\

    Fibrinogen is a protein involved in platelet aggregation and is essential for the coagulation of blood. This domain forms part of the central coiled-coil region of the protein, which is formed from two sets of three non-identical chains (alpha, beta and gamma).

    \ ' '8536' 'IPR014815' '\

    This domain corresponds to the alpha helical C-terminal domain of phospholipase C beta.

    \ ' '8537' 'IPR014816' '\

    GCD14 is a subunit of the tRNA methyltransferase complex and is required for 1-methyladenosine modification and maturation of initiator methionyl-tRNA PUBMED:9851972.

    \ ' '8538' 'IPR014817' '\

    HIV protein p6 contains two late-budding domains (L domains) which are short sequence motifs essential for viral particle release. p6 interacts with the endosomal sorting complex and represents a docking site for several cellular and binding factors PUBMED:16234236. The PTAP motif interacts with the cellular budding factor TSG101 PUBMED:16234236. This domain is also found in some chimpanzee immunodeficiency virus (SIV-cpz) proteins.

    \ ' '8539' 'IPR014818' '\

    This domain is found in D5 proteins of DNA viruses and bacteriophage P4 DNA primase.

    \ ' '8540' 'IPR014819' '\

    This alpha helical domain is found at the C-terminal of primases.

    \ ' '8541' 'IPR014820' '\

    This alpha helical domain is found at the C-terminal of primases.

    \ ' '8542' 'IPR014821' '\

    This entry corresponds to the ligand-binding region of the inositol 1,4,5-trisphosphate receptor, and the N-terminal region of the ryanodine receptor. Both receptors are involved in Ca2+ release. They can couple to the activation of neurotransmitter-gated receptors and voltage-gated Ca2+ channels on the plasma membrane, thus allowing the endoplasmic reticulum to discriminate between different types of neuronal activity PUBMED:15664189.

    \ ' '8543' 'IPR014822' '\

    Nsp9 is a single-stranded RNA-binding viral protein likely to be involved in RNA synthesis PUBMED:15007178. The structure comprises a single beta barrel PUBMED:12925794.

    \ ' '8544' 'IPR010990' '\

    Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II PUBMED:3346229. It is widely distributed, being found in mammals, Drosophila, yeast and in the archaebacterium Sulfolobus acidocaldarius PUBMED:8502569. S-II proteins have a relatively conserved C-terminal region but a variable N-terminal region, and some members of this family are expressed in a tissue-specific manner PUBMED:1917889, PUBMED:8566795.

    \

    TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III PUBMED:12914699. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

    \

    This entry represents the conserved N-terminal domain found in the transcription elongation factors TFIIS, elongin A and CRSP70 PUBMED:10811649. The N-terminal domain in these transcription factors is conserved from yeast to man, and has a 4-helical bundle fold with a left-handed twist within a left-handed superhelix. Elongin A is a mammalian transcription elongation factor that forms the active subunit of the Elongin complex, which stimulates the rate of elongation by RNA polymerase II by suppressing the transient pausing of the polymerase at many sites along the DNA template PUBMED:17112477. CRSP70 is an essential subunit of the CRSP complex, which is required for the activity of the enhancer-binding protein Sp1 PUBMED:9989412.

    \ ' '8545' 'IPR014824' '\

    Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S] PUBMED:16221578. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.

    \

    The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins PUBMED:16211402, PUBMED:16843540. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly PUBMED:15937904.

    \

    The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA PUBMED:17350000. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA PUBMED:15278785, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.

    \

    In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins PUBMED:11498000. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen PUBMED:8875867.

    \ \

    This domain is found at the N terminus of NifU (from the NIF system) and NifU-related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters, functioning as scaffolds PUBMED:12886008, PUBMED:14993221.

    \ ' '8546' 'IPR014825' '\

    These proteins are predicted to be DNA alkylation repair enzymes. The structure of a hypothetical protein shows it to adopt a supercoiled alpha-helical structure.

    \ ' '8547' 'IPR014826' '\

    This family consists of formaldehyde-activating enzyme, or the corresponding domain of longer, bifunctional proteins. It links formaldehyde to the C1 carrier tetrahydromethanopterin (H4MPT), an analog of tetrahydrofolate, and is common among species with H4MPT PUBMED:11073907. The ribulose monophosphate (RuMP) pathway, which removes the toxic metabolite formaldehyde by assimilation, runs in the opposite direction in some species to produce ribulose 5-phosphate for nucleotide biosynthesis, leaving formaldehyde as an additional metabolite. In these species, formaldehyde activating enzyme may occur as a fusion protein with D-arabino 3-hexulose 6-phosphate formaldehyde lyase from the RuMP pathway.

    \ ' '8548' 'IPR014827' '\

    This family of viral proteases is similar to the papain protease and is required for proteolytic processing of the replicase polyprotein. The structure of this protein has shown that it adopts a fold similar to that of de-ubiquitinating enzymes PUBMED:16581910.

    \ ' '8549' 'IPR014828' '\

    Nsp7 (non-structural protein 7) has been implicated in viral RNA replication and is predominantly alpha helical in structure PUBMED:16188992. It forms a hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure PUBMED:16228002. The dimensions of the central channel and positive electrostatic properties of the cylinder imply that it confers processivity on RNA-dependent RNA polymerase PUBMED:16228002.

    \ ' '8550' 'IPR014829' '\

    Viral Nsp8 (non-structural protein 8) forms a hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure PUBMED:16228002. The dimensions of the central channel and positive electrostatic properties of the cylinder imply that it confers processivity on RNA-dependent RNA polymerase PUBMED:16228002.

    \ ' '8551' 'IPR014830' '\

    Glycolipid transfer protein (GLTP) is a cytosolic protein that catalyses the intermembrane transfer of glycolipids such as glycosphingolipids, glyceroglycolipids, and possibly glucosylceramides, but not of phospholipids. GLTP has a multi-helical structure consisting of two layers of orthogonally packed helices PUBMED:15504043, PUBMED:16309699.

    \ ' '8552' 'IPR012816' '\

    This entry describes a sequence region that occurs in at least three different polypeptide contexts. It is found fused to GTP cyclohydrolase II, the RibA of riboflavin biosynthesis (), as in Vibrio vulnificus. It is found fused to riboflavin biosynthesis protein RibD () in rice and Arabidopsis. It occurs as a standalone protein in a number of bacterial species in varied contexts, including single gene operons and bacteriophage genomes. The member from Escherichia coli currently is named YbiA. The function(s) of members of this family is unknown.

    \ ' '8553' 'IPR014831' '\

    This protein corresponds to the stalk segment of haemagglutinin in influenza C virus. It forms a coiled coil structure PUBMED:9817207.

    \ ' '8554' 'IPR014832' '\

    The Tn7 transposase is composed of proteins TnsA and TnsB. DNA breakage at the 5\'-end of the transposon is carried out by TnsA, and breakage and joining at the 3\'-end is carried out by TnsB. The C-terminal domain of TnsA binds DNA.

    \ ' '8555' 'IPR014833' '\

    The Tn7 transposase is composed of proteins TnsA and TnsB. DNA breakage at the 5\'-end of the transposon is carried out by TnsA, and breakage and joining at the 3\'-end is carried out by TnsB. The N-terminal domain of TnsA is catalytic.

    \ ' '8556' 'IPR014834' '\

    Gag p15 is a viral membrane-binding matrix protein which is alpha helical in structure.

    \ ' '8557' 'IPR014835' '\

    Adeno-associated virus (AAV) Replication (Rep) protein is essential for viral replication and integration. The catalytic domain has DNA binding and endonuclease activity.

    \ ' '8558' 'IPR014836' '\

    Integrins are the major metazoan receptors for cell adhesion to extracellular matrix proteins and, in vertebrates, also play important roles in certain cell-cell adhesions, make transmembrane connections to the cytoskeleton and activate many intracellular signalling pathways PUBMED:12297042, PUBMED:12361595. The integrin receptors are composed of alpha and beta subunit heterodimers. Each subunit crosses the membrane once, with most of the polypeptide residing in the extracellular space, and has two short cytoplasmic domains. Some members of this family have EGF repeats at the C terminus and also have a vWA domain inserted within the integrin domain at the N terminus.

    \

    Most integrins recognise relatively short peptide motifs, and in general require an acidic amino acid to be present. Ligand specificity depends upon both the alpha and beta subunits PUBMED:12234368. There are at least 18 types of alpha and 8 types of beta subunits recognised in humans PUBMED:14689578. Each alpha subunit tends to associate only with one type of beta subunit, but there are exceptions to this rule PUBMED:2467745. Each association of alpha and beta subunits has its own binding specificity and signalling properties. Many integrins require activation on the cell surface before they can bind ligands. Integrins frequently intercommunicate, and binding at one integrin receptor can activate or inhibit another.

    \

    The structure of unliganded alphaV beta3 showed the molecule to be folded, with the head bent over towards the C termini of the legs which would normally be inserted into the membrane PUBMED:12714499. The head comprises a beta propeller domain at the N terminus of the alphaV subunit and an I/A domain inserted into a loop on the top of the hybrid domain in the beta subunit. The I/A domain consists of a Rossmann fold with a core of parallel beta sheets surrounded by amphipathic alpha helices.

    \ \

    This entry represents the cytoplasmic domain of integrin beta subunits.

    \ ' '8559' 'IPR014837' '\

    EF hands are helix-loop-helix binding motifs involved in the regulation of many cellular processes. EF hands usually bind to Ca2+ ions, which cause a major conformational change that allows the protein to interact with its designated targets. This protein corresponds to an EF hand which has partially or entirely lost its calcium-binding properties. The calcium insensitive EF hand is still able to mediate protein-protein recognition PUBMED:11573089.

    \ ' '8560' 'IPR014838' '\

    This protein is found in positive-strand RNA viruses. The 3A protein is a critical component of the poliovirus replication complex, and is also an inhibitor of host cell ER to Golgi transport.

    \ ' '8561' 'IPR014839' '\

    CRT10 is a transcriptional regulator of ribonucleotide reductase (RNR) genes PUBMED:16600900. RNR catalyses the rate limiting step in dNTP synthesis. Mutations in CRT10 have been shown to enhance hydroxyurea resistance PUBMED:16600900.

    \ ' '8562' 'IPR014840' '\

    HPC2 is required for cell-cycle regulation of histone transcription PUBMED:1406694. It regulates transcription of the histone genes during the S-phase of the cell cycle by repressing transcription at other cell cycle stages. HPC2 mutants display synthetic interactions with the FACT complex, which allows RNA Pol II to elongate through nucleosomes PUBMED:12524332.

    \ ' '8563' 'IPR014841' '\

    Rad33 is involved in nucleotide excision repair (NER). NER is the main pathway for repairing DNA lesions induced by UV. Cells deleted for RAD33 display intermediate UV sensitivity that is epistatic with NER PUBMED:16595192.

    \ ' '8564' 'IPR014842' '\

    AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2.

    \ ' '8565' 'IPR014843' '\

    HIM1 (high induction of mutagenesis protein 1) plays a role in the control of spontaneous and induced mutagenesis PUBMED:15885712. It is thought to participate in the control of processing of mutational intermediates appearing during error-prone bypass of DNA damage.

    \ ' '8566' 'IPR014844' '\

    PalH (also known as RIM21) is a transmembrane protein required for proteolytic cleavage of Rim101/PacC transcription factors which are activated by C-terminal proteolytic processing. Rim101/PacC family proteins play a key role in pH-dependent responses and PalH has been implicated as a pH sensor PUBMED:16099830.

    \ ' '8567' 'IPR014845' '\

    This protein is found in a range of bacteria. It is usually less than 100 amino acids in length. The function of the protein is unknown. It may belong to the dimeric alpha/beta barrel superfamily.

    \ ' '8568' 'IPR014846' '\

    This family is annotated as pyruvate formate-lyase activating enzyme () in UniProt. It is not clear where this annotation comes from.

    \ ' '8569' 'IPR014847' '\

    This region is found adjacent to Band 4.1 / FERM domains () in a subset of FERM-containing proteins. The region has been hypothesised to play a role in regulatory adaptation, based on similarity to other protein kinase substrates PUBMED:16626485.

    \ ' '8570' 'IPR014848' '\

    Rgp1 forms a heterodimer with Ric1 () which associates with Golgi membranes and functions as a guanyl-nucleotide exchange factor PUBMED:10990452.

    \ ' '8571' 'IPR014849' '\

    In Saccharomyces cerevisiae, Gon7 is a member of the KEOPS protein complex, which is proposed to be involved in transcription and in promoting telomere uncapping and telomere elongation PUBMED:16564010.

    \ ' '8573' 'IPR014851' '\

    AAA ATPases (ATPases Associated with diverse cellular Activities) form a large protein family and play a number of roles in the cell including cell-cycle regulation, protein proteolysis and disaggregation, organelle biogenesis and intracellular transport. Some of them function as molecular chaperones, subunits of proteolytic complexes or independent proteases (FtsH, Lon). They also act as DNA helicases and transcription factors PUBMED:17201069.

    \ \

    AAA ATPases belong to the AAA+ superfamily of ring-shaped P-loop NTPases, which act via the energy-dependent unfolding of macromolecules PUBMED:15037233, PUBMED:16828312. There are six major clades of AAA domains (proteasome subunits, metalloproteases, domains D1 and D2 of ATPases with two AAA domains, the MSP1/katanin/spastin group and BCS1 and its homologues), as well as a number of deeply branching minor clades PUBMED:15037233.

    \ \

    They assemble into oligomeric assemblies (often hexamers) that form a ring-shaped structure with a central pore. These proteins produce a molecular motor that couples ATP binding and hydrolysis to changes in conformational states that act upon a target substrate, either translocating or remodelling it PUBMED:16919475.

    \ \ \

    They are found in all living organisms and share the common feature of the presence of a highly conserved AAA domain called the AAA module. This domain is responsible for ATP binding and hydrolysis. It contains 200-250 residues, including two classical motifs, Walker A (GX4GKT) and Walker B (HyDE) PUBMED:17201069.

    \ \

    This protein is found at the N terminus of the mitochondrial BCS1 subfamily, belonging to the AAA ATPase family.

    \ \

    At2g21640 and BCS1 are both highly stress responsive genes which encode mitochondrial proteins. The promoter of BCS1 was not responsive to H2O2 or rotenone, but highly responsive to salicylic acid (SA). The SA dependent pathway represented by BCS1 is one of at least three distinctive pathways to regulate mitochondrial stress response at a transcriptional level PUBMED:18567827. The BCS1 product is a mitochondrial protein required for the assembly of respiratory complex III PUBMED:17328740.

    \ \

    BCS1, a component of the inner membrane of mitochondria, belongs to the group of proteins with internal, noncleavable import signals. It has a transmembrane domain (amino acid residues 51 to 68), a presequence type helix (residues 69 to 83), and an import auxiliary region (residues 84 to 126) PUBMED:12640110.

    \ ' '8574' 'IPR014852' '\

    The members of this entry are currently uncharacterised. They are around 170 amino acids in length.

    \ ' '8575' 'IPR014853' '\

    The proteins in this entry contain a domain rich in positionally conserved cysteine residues. Most proteins contain 7 or 8 cysteine residues. The domain is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing and .

    \ ' '8576' 'IPR014854' '\

    Nse4 is a component of the Smc5/6 DNA repair complex. It forms interactions with Smc5 and Nse1 PUBMED:15331764.

    \ ' '8577' 'IPR014855' '\

    NOZZLE is a transcription factor that plays a role in patterning the proximal-distal and adaxial-abaxial axes PUBMED:12074197, PUBMED:12183381.

    \ ' '8578' 'IPR014856' '\

    These proteins are functionally uncharacterised and are about 200 amino acids in length.

    \ ' '8579' 'IPR014857' '\

    This is a zinc finger domain that is related to the C3HC4 RING finger domain ().

    \ ' '8580' 'IPR014858' '\

    This entry represents a putative uncharacterised protein of length around 200 amino acids.

    \ ' '8581' 'IPR014859' '\

    This entry represents a putative uncharacterised protein found in phage-related conserved hypothetical protein from Bordetella.

    \ ' '8582' 'IPR014860' '\

    These proteins are functionally uncharacterised and range in length from 150 to 210 amino acids.

    \ ' '8583' 'IPR014861' '\

    The proteins in this group are likely to be lipoproteins. CNP1 (cryptic neisserial protein) has been expressed in Escherichia coli and shown to be localised in the periplasm PUBMED:1541538.

    \ ' '8584' 'IPR014862' '\

    Relaxases are DNA strand transferases which function during conjugative cell-to-cell DNA transfer. TrwC binds to the origin of transfer (oriT) and melts the double helix.

    \ ' '8585' 'IPR014863' '\

    Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer PUBMED:15261670. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins PUBMED:14690497. For example, the coatomer COP1 (coat protein complex 1) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi PUBMED:11208122. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and for budding from Golgi membranes PUBMED:17041781. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta\', gamma, delta, epsilon and zeta subunits.

    \

    This entry represents the C-terminal appendage domain of the gamma subunit of coatomer complexes. The appendage domain of the gamma coatomer subunit has a similar overall structural fold to the appendage domain of clathrin adaptors, and can also share the same motif-based cargo recognition and accessory factor recruitment mechanisms. The coatomer gamma subunit appendage domain contains a protein-protein interaction site and a second proposed binding site that interacts with the alpha, beta, epsilon COPI subcomplex PUBMED:14690497.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin PUBMED:.

    \ ' '8586' 'IPR014864' '\

    NikR is a transcription factor that regulates nickel uptake. It consists of two dimeric DNA binding domains separated by a tetrameric regulatory domain that binds nickel. This protein corresponds to the C-terminal regulatory domain which contains four nickel binding sites at the tetramer interface PUBMED:12970756.

    \ ' '8588' 'IPR011722' '\

    This entry describes the small protein from Escherichia coli YccV and its homologs in other Proteobacteria. YccV is now described as a hemimethylated DNA binding protein PUBMED:12700277. The model entry describes a domain in longer eukaryotic proteins.

    \ ' '8589' 'IPR014866' '\

    This protein is adjacent to YfkA in Bacillus subtilis. In other bacterial species, it is fused to this protein. As YfkA contains a Radical SAM domain, this suggests that this protein interacts with Radical SAM domains.

    \ ' '8590' 'IPR014867' '\

    Members of this group include the Bacillus subtilis spore coat protein H (CotH). Assembly of CotH requires both CotE and GerE and is required for the correct assembly of both inner and outer layers of the coat. CotH appears to be a structural component of the coat, being localised at the interface of the 2 coat layers PUBMED:17114257, PUBMED:10198031, PUBMED:14762006.

    \ ' '8591' 'IPR014868' '\

    Cadherins are a group of proteins that mediate calcium dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This protein corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions PUBMED:15130472.

    \ ' '8592' 'IPR014869' '\

    This protein is probably misannotated as a glycosyl transferase 8 family member. It is found at the C-terminus of proteins such as that also contain the glycosyl transferase domain at the N-terminus.

    \ ' '8593' 'IPR014870' '\

    This protein is found at the C-terminus of a glutaminase protein from fungi PUBMED:10952006. It is also found as a single domain protein in Bacteroides thetaiotaomicron.

    \ ' '8594' 'IPR014871' '\

    2-Deoxyuridine 5-triphosphate nucleotidohydrolase (dUTPase) catalyses the hydrolysis of dUTP to dUMP and pyrophosphate (). Members of this family have a novel all-alpha fold and are unrelated to the all-beta fold found in dUTPases of the majority of organisms. This family contains both dUTPases and homologues of dUTPase, including the dCTPase of phage T4.

    \ ' '8595' 'IPR014872' '\

    This protein is found in the capsid proteins of Cripaviruses, which are positive-stranded ssRNA viruses such as Cricket paralysis virus (CRPV). It forms an all beta sheet structure PUBMED:10426956.

    \ ' '8596' 'IPR014873' '\

    Ca2+ ions are unique in that they not only carry charge but they are also the most widely used of diffusible second messengers. Voltage-dependent Ca2+ channels (VDCC) are a family of molecules that allow cells to couple electrical activity to intracellular Ca2+ signalling. The opening and closing of these channels by depolarizing stimuli, such as action potentials, allows Ca2+ ions to enter neurons down a steep electrochemical gradient, producing transient intracellular Ca2+ signals. Many of the processes that occur in neurons, including transmitter release, gene transcription and metabolism are controlled by Ca2+ influx occurring simultaneously at different cellular locales. The pore is formed by the alpha-1 subunit which incorporates the conduction pore, the voltage sensor and gating apparatus, and the known sites of channel regulation by second messengers, drugs, and toxins PUBMED:14657414. The activity of this pore is modulated by 4 tightly-coupled subunits: an intracellular beta subunit; a transmembrane gamma subunit; and a disulphide-linked complex of alpha-2 and delta subunits, which are proteolytically cleaved from the same gene product. Properties of the protein including gating voltage-dependence, G protein modulation and kinase susceptibility can be influenced by these subunits.

    \ \

    Voltage-gated calcium channels are classified as T, L, N, P, Q and R, and are distinguished by their sensitivity to pharmacological blocks, single-channel conductance kinetics, and voltage-dependence. On the basis of their voltage activation properties, the voltage-gated calcium channel classes can be further divided into two broad groups: the low (T-type) and high (L, N, P, Q and R-type) threshold-activated channels PUBMED:.

    \

    The voltage-gated calcium channel alpha 1 subunit contains an IQ domain, named for its isoleucine-glutamine (IQ) motif, which interacts with hydrophobic pockets of Ca2+/calmodulin PUBMED:16299511. The interaction regulates two self-regulatory calcium-dependent feedback mechanisms: calcium-dependent inactivation (CDI) and calcium-dependent facilitation (CDF).

    \ ' '8597' 'IPR014874' '\

    Staphylococcus aureus secretes a cofactor called coagulase. Coagulase is an extracellular protein that forms a complex with human prothrombin, and activates it without the usual proteolytic cleavages. The resulting complex directly initiates blood clotting.

    \ ' '8598' 'IPR014875' '\

    Mor (Middle operon regulator) is a sequence specific DNA binding protein. It mediates transcription activation through its interactions with the C-terminal domains of the alpha and sigma subunits of bacterial RNA polymerase. The N-terminal region of Mor is the dimerisation region, and the C-terminal contains a helix-turn-helix motif which binds DNA.

    \ ' '8599' 'IPR014876' '\

    DEK is a chromatin associated protein that is linked with cancers and autoimmune disease. This domain is found at the C-terminal of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients PUBMED:7504406. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family PUBMED:15238633. This domain is also found in chitin synthase proteins like , and in protein phosphatases such as .

    \ ' '8600' 'IPR014877' '\

    CRM1 (also known as Exportin1) mediates the nuclear export of proteins bearing a leucine-rich nuclear export signal (NES). CRM1 forms a complex with the NES containing protein and the small GTPase Ran. This region forms an alpha helical structure formed by six helical hairpin motifs that are structurally similar to the HEAT repeat, but share little sequence similarity to the HEAT repeat PUBMED:15574331.

    \ ' '8601' 'IPR014878' '\

    This protein forms a beta barrel structure. It is sometimes found on proteins containing a THAP () domain.

    \ ' '8602' 'IPR014879' '\

    The response regulator Spo0A is composed of a phosphoacceptor domain and a transcription activation domain. This domain corresponds to the transcription activation domain and forms an alpha helical structure comprising 6 alpha helices. The structure contains a helix-turn-helix and binds DNA PUBMED:11069648, PUBMED:12176382.

    \ ' '8603' 'IPR014880' '\

    SoxZ forms an anti parallel beta structure and forms a complex with SoxY. Sulphur oxidation occurs at the thiol of a conserved cysteine residue of the SoxY subunit PUBMED:11513876.

    \ ' '8604' 'IPR009076' '\

    Rapamycin and FK506 are potent immunosuppressive agents that bind to the FK506-binding protein (FKBP12), inhibiting its peptidyl-prolyl isomerase activity. The rapamycin-FKBP12 complex can then bind to and inhibit the FKBP12-rapamycin-associated protein (FRAP) in humans and RAFT1 in rats, causing cell-cycle arrest PUBMED:10089303. The FK506-FKBP12 complex cannot bind FRAP, but can bind to and inhibit calcineurin. Rapamycin is able to bind to two proteins, FKBP12 and FRAP, by simultaneously occupying two hydrophobic binding pockets, thereby linking these two proteins together to form a dimer PUBMED:8662507. The structure of the FKBP12-rapamycin-binding domain of FRAP consists of a core bundle of four helices arranged up-and-down in a left-handed twist.

    \

    FRAP has been shown to interact in vitro with CLIP-170, a protein involved in microtubule organisation and function PUBMED:12231510. FRAP is thought to act as a kinase to phosphorylate CLIP-170, thereby regulating its binding to microtubules. FRAP is also thought to cooperate with p85/p110 phosphatidylinositol 3-kinase (PI3K) to induce the activation of the serine/threonine kinase p70 S6 kinase (p70S6K), which in turn phosphorylates the 40S ribosomal protein S6, thereby altering the translation of ribosomal proteins and translation elongation factors PUBMED:11684675.

    \ ' '8605' 'IPR014881' '\

    This entry corresponds to a zinc ribbon and is found on the RNA binding protein NOB1.

    \ ' '8606' 'IPR014882' '\

    Cathepsin C (dipeptidyl peptidase I) is the physiological activator of a group of serine proteases. This protein corresponds to the exclusion domain whose structure excludes the approach of a polypeptide apart from its termini. It forms an enclosed beta barrel structure composed of 8 anti-parallel beta strands PUBMED:11726493. Based on a structural comparison and interaction data, it is suggested that the exclusion domain originates from a metallo-protease inhibitor PUBMED:11726493.

    \ ' '8607' 'IPR014883' '\

    This entry contains proteins with the VRR-NUC domain. It is associated with members of the PD-(D/E)XK nuclease superfamily, which include the type III restriction modification enzymes, for example StyLTI: ().

    \ ' '8608' 'IPR014884' '\

    ParB is a component of the par system which mediates accurate DNA partition during cell division. It recognises A-box and B-box DNA motifs. ParB forms an asymmetric dimer with 2 extended helix-turn-helix (HTH) motifs that bind to A-boxes. The HTH motifs emanate from a beta sheet coiled coil DNA binding module PUBMED:16306995. Both DNA binding elements are free to rotate around a flexible linker; this enables them to bind to complex arrays of A- and B-box elements on adjacent DNA arms of the looped partition site PUBMED:16306995.

    \ ' '8609' 'IPR014885' '\

    Vasodilator-stimulated phosphoprotein (VASP) is an actin cytoskeletal regulatory protein. This region corresponds to the tetramerisation domain which forms a right handed alpha helical coiled coil structure PUBMED:15569942.

    \ ' '8610' 'IPR014886' '\

    This protein is found in protein La which functions as an RNA chaperone during RNA polymerase III transcription, and can also stimulate translation initiation. It contains a five stranded beta sheet which forms an atypical RNA recognition motif PUBMED:12842046.

    \ ' '8611' 'IPR014887' '\

    Hypoxia inducible factor-1 alpha (HIF-1 alpha) is the regulatory subunit of the heterodimeric transcription factor HIF-1. It plays a key role in cellular response to low oxygen tension. This region corresponds to the C-terminal transactivation domain.

    \ ' '8612' 'IPR014888' '\

    The structure of the coronavirus X4 protein (also known as 7a and U122) shows similarities to the immunoglobulin like fold and suggests a binding activity to integrin I domains PUBMED:16328780.

    \ ' '8613' 'IPR010235' '\

    The member of this family from Haemophilus influenzae, HI0074, has been shown by crystal structure to resemble nucleotidyltransferase substrate binding proteins PUBMED:12486719. It forms a complex with HI0073 (), encoded by the adjacent gene, which contains a nucleotidyltransferase nucleotide binding domain (). Double- and single-stranded DNA binding assays showed no evidence of DNA binding to HI0074 or to HI0073/HI0074 complex despite the suggestive shape of the putative binding cleft formed by the HI0074 dimer PUBMED:12486719.

    \ ' '8614' 'IPR014889' '\

    DP forms a heterodimer with E2F and regulates genes involved in cell cycle progression. The transcriptional activity of E2F is inhibited by the retinoblastoma protein which binds to the E2F-DP heterodimer PUBMED:16360038 and negatively regulates the G1-S transition.

    \ ' '8615' 'IPR014890' '\

    c-SKI is an oncoprotein that inhibits TGF-beta signalling through interaction with Smad proteins PUBMED:15107821. This protein binds to Smad4 PUBMED:12419246.

    \ ' '8616' 'IPR014891' '\

    The ~75-residue DWNN (Domain With No Name) domain is highly conserved through eukaryotic species but is absent in prokaryotes. The DWNN domain is found only at the N-terminus of the RBBP6 family of proteins which includes:

    \ \

    \

    All of the identified RBBP6 homologues include the DWNN domain, a CCHC-type zinc finger (see ) and a RING-type zinc finger (see ). The three domain form is found in plants, protozoa, fungi and microsporidia. The RBBP6 homologues in vertebrates, insects and worms are longer and include additional domains. In addition to forming part of the full-length RBBP6 protein, the DWNN domain is also expressed in vertebrates as a small protein containing a DWNN domain and a short C-terminal tail (RBBP6 variant 3). The DWNN domain adopts a fold similar to the ubiquitin one, characterised by two alpha-helices and four beta-sheets ordered as beta-beta-alpha-beta-alpha-beta along the sequence. The similarity of DWNN domain to ubiquitin and the presence of the RING finger suggest that the DWNN domain may act as an ubiquitin-like modifier, possibly playing a role in the regulation of the splicing machinery PUBMED:15733535, PUBMED:16396680.

    \ ' '8617' 'IPR014892' '\

    This protein corresponds to the C-terminal of the single stranded DNA binding protein RPA (replication protein A). RPA is involved in many DNA metabolic pathways including DNA replication, DNA repair, recombination, cell cycle and DNA damage checkpoints.

    \ ' '8618' 'IPR014893' '\

    The non-homologous end joining (NHEJ) pathway is one method by which double stranded breaks in chromosomal DNA are repaired. Ku is a component of a multi-protein complex that is involved in the NHEJ. Ku has affinity for DNA ends and recruits the DNA-dependent protein kinase catalytic subunit (DNA-PKcs). This domain is found at the C-terminal of Ku which binds to DNA-PKcs PUBMED:14672664.

    \ ' '8619' 'IPR014894' '\

    This is a bacterial protein of unknown function. It forms an antiparallel beta sheet structure and contains some alpha helical regions.

    \ ' '8620' 'IPR014895' '\

    Alginate lyases are enzymes that degrade the linear polysaccharide alginate. They cleave the glycosidic linkage of alginate through a beta-elimination reaction. This region forms an all beta fold, which is different to the all alpha fold of .

    \ ' '8621' 'IPR014896' '\

    NHR2 (Nervy homology 2) is found in the ETO protein where it mediates oligomerisation and protein-protein interactions. It forms an alpha-helical tetramer PUBMED:16616331.

    \ ' '8622' 'IPR014897' '\

    The small PBCV-specific basic adaptor protein is found fused to S/T protein kinases and the 2-Cysteine domain PUBMED:16494962.

    \ ' '8623' 'IPR014898' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This C2HC zinc finger domain is found in LYAR proteins such as , which are involved in cell growth regulation.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '8625' 'IPR014900' '\

    This zinc ribbon protein is found associated with some viral A2L transcription factors PUBMED:16494962.

    \ ' '8626' 'IPR014901' '\

    The virus-specific 2-cysteine adaptor is found fused to OTU/A20-like peptidases and S/T protein kinases. The associations to these proteins indicate that they might function as viral adaptors connecting the kinases and OTU/A20 peptidases to specific targets PUBMED:16494962.

    \ ' '8627' 'IPR014902' '\

    GNA1870 is a surface exposed lipoprotein in Neisseria meningitidis that is a potent antigen and a potential candidate for a vaccine against meningococcal disease. The structure of the C-terminal domain consists of an anti-parallel beta barrel overlaid by a short alpha helical region PUBMED:16407174.

    \ ' '8628' 'IPR014903' '\

    The proteins in this entry are uncharacterised, but are related to papain-like cysteine peptidases.

    \ ' '8629' 'IPR014904' '\

    The function of this protein is unknown. It forms a central anti-parallel beta sheet with flanking alpha helical regions.

    \ ' '8630' 'IPR014905' '\

    The HIRAN protein (HIP116, Rad5p N-terminal) is found in the N-terminal regions of the SWI2/SNF2 proteins typified by HIP116 and Rad5p. HIRAN is found as a standalone protein in several bacteria and prophages, or fused to other catalytic domains, such as a nuclease of the restriction endonuclease fold and TDP1-like DNA phosphoesterases, in the eukaryotes PUBMED:16627993. It has been predicted that this protein functions as a DNA-binding domain that probably recognises features associated with damaged DNA or stalled replication forks PUBMED:16627993.

    \ ' '8631' 'IPR010179' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a family of Cas proteins, including CT1974 from Chlorobium tepidum. This family is found in a minority of Cas regions.

    \ ' '8632' 'IPR014906' '\

    This small protein is found on PRP4 ribonucleoproteins. PRP4 is a U4/U6 small nuclear ribonucleoprotein that is involved in pre-mRNA processing.

    \ ' '8633' 'IPR014907' '\

    This presumed protein is found at the N-terminus of VirE proteins.

    \ ' '8634' 'IPR014908' '\

    Nup133 is a nucleoporin that is crucial for nuclear pore complex (NPC) biogenesis. The N-terminal forms a seven-bladed beta propeller structure PUBMED:15557116.

    \ ' '8635' 'IPR014909' '\

    The cytochrome b6-f complex mediates electron transfer between photosystem II (PSII) and photosystem I (PSI), cyclic electron flow around PSI, and state transitions. The complex has 4 large subunits: cytochrome b6, subunit IV (17 kDa polypeptide, PetD), cytochrome f and the Rieske protein. The 4 small subunits are PetG, PetL, PetM and PetN. The complex functions as a dimer.

    This protein corresponds to the alpha helical transmembrane domain of the cytochrome b6-f complex Rieske iron-sulphur subunit.

    \ ' '8636' 'IPR014910' '\

    YdhR is a homodimeric protein that comprises a central four-stranded beta sheet and four surrounding alpha helices PUBMED:16260765. It shows structural homology to the ActVA-Orf6 and YgiN proteins, which indicates it could be a mono-oxygenase.

    \ ' '8637' 'IPR012339' '\

    This entry represents the phage single stranded-DNA (ssDNA) binding protein, gp32, from Bacteriophage T4. Gp32 is essential for T4 DNA replication, recombination and repair, acting to stimulate replisome processing and accuracy through its binding to ssDNA as the replication fork advances. The crystal structure of gp32 shows an ssDNA binding cleft comprised of regions from three structural subdomains, through which ssDNA can slide freely PUBMED:7630406. The structure of gp32 is similar to that of other phage ssDNA-binding proteins such as gp2.5 from bacteriophage T7, and gene V protein, both of which have a nucleic acid-binding OB-type fold. However, gp32 contains a zinc-finger subdomain at residues 63-111 that is not found in the other two phage proteins.

    \ ' '8638' 'IPR014911' '\

    Type IV pili are bacterial virulence-associated adhesins that promote bacterial attachment to host cells. In Salmonella typhi, the structural pilin protein PilS interacts with the cystic fibrosis transmembrane conductance regulator PUBMED:14500527. Mutagenesis studies suggest that residues on an alpha-beta loop and the C-terminal disulphide-bonded region of PilS might be involved in binding specificity of the pilus PUBMED:15159389.

    \ ' '8639' 'IPR014912' '\

    Sep15 and SelM are eukaryotic selenoproteins that have a thioredoxin-like domain and a surface accessible active site redox motif PUBMED:16319061. This suggests that they function as thiol-disulphide isomerases involved in disulphide bond formation in the endoplasmic reticulum PUBMED:16319061.

    \ ' '8640' 'IPR014913' '\

    This protein is found in many hypothetical proteins. The structure of one of the proteins in this family has been solved and it adopts an all alpha helical fold.

    \ ' '8641' 'IPR014914' '\

    This presumed protein contains 3 highly conserved polar groups that could form an active site. These are an arginine, a glutamate and a serine, hence the name RES domain. RES is widely distributed in bacteria and is about 150 residues in length.

    \ ' '8642' 'IPR014915' '\

    Members of this entry are about 100 amino acids in length and are uncharacterised.

    \ ' '8643' 'IPR014916' '\

    This bacterial protein forms an anti-parallel beta sheet with an extending alpha helical region.

    \ ' '8644' 'IPR014917' '\

    This is an entry of large bacterial proteins of unknown function.

    \ ' '8645' 'IPR014199' '\

    This uncharacterised protein is one of a number of proteins conserved in all known endospore-forming Firmicutes (low-GC Gram-positive bacteria), including Carboxydothermus hydrogenoformans, and it is not found in non-endospore forming species. It is uniformly distributed in the mother cell cytoplasm in Bacillus subtilis PUBMED:12662922.

    \ ' '8646' 'IPR014918' '\

    Proteins of this entry include phage tail proteins. They probably include bacterial Ig-like domains related to , which also includes a number of phage tail invasin proteins.

    \ ' '8647' 'IPR014919' '\

    The fdxN element, along with two other DNA elements, is excised from the chromosome during heterocyst differentiation in cyanobacteria. The xisH as well as the xisF and xisI genes are required PUBMED:9106215.

    \ ' '8648' 'IPR014920' '\

    This entry represents the interlocking domain of the eukaryotic nuclear receptor coactivators Ncoa1, Ncoa2 and Ncoa3. The interlocking domain forms a 3-helical non-globular array that forms interlocked heterodimers with its target.

    \

    Nuclear receptors are ligand-activated transcription factors involved in the regulation of many processes, including development, reproduction and homeostasis. Nuclear receptor coactivators act to modulate the function of nuclear receptors. Coactivators associate with promoters and enhancers primarily through protein-protein contacts to facilitate the interaction between DNA-bound transcription factors and the transcription machinery. In addition to their role as coactivators of various nuclear receptors, Ncoa1 and Ncoa3 both have histone acetyltransferase activity (), but Ncoa2 does not PUBMED:14757047, PUBMED:15145939.

    \ ' '8649' 'IPR014453' '\

    C-type lysozyme enzymes, such as hen egg white lysozyme (HEWL), provide anti-bacterial activity by cleaving peptidoglycan in Gram-positive bacterial cell walls. In humans, C-type lysozyme is found in all secretions, including tears and saliva. Certain Gram-positive bacteria can produce proteins with anti-lysozyme activity known as Inhibitor of Vertebrate Lysozyme (IVY), which act as virulence factors PUBMED:15141308, PUBMED:11278658. IVY proteins have a 3-layer alpha(2)/beta(5)/alpha(2) topology, and contain a protruding 5-residue loop that is essential for their inhibitory effect PUBMED:17405861.

    \ ' '8650' 'IPR014921' '\

    YukD is a bacterial protein that adopts a ubiquitin-like fold PUBMED:15978580. Ubiquitin covalently binds to proteins and flags them for degradation; however, conjugation assays have indicated that YukD lacks the capacity for covalent bond formation with other proteins PUBMED:15978580.

    \ ' '8651' 'IPR014922' '\

    This large entry of bacterial proteins is uncharacterised. They contain a presumed domain about 110 amino acids in length.

    \ ' '8652' 'IPR014923' '\

    The function of this family is unknown. This region is found associated with a suggesting they could be part of a restriction modification system.

    \ ' '8653' 'IPR014924' '\

    This small protein is found in one or two copies in bacteria. Its function is unknown.

    \ ' '8654' 'IPR014925' '\

    Proteins in this entry contain a highly conserved CGGC sequence in their central region. The region has many conserved cysteines and histidines, suggestive of a zinc binding function.

    \ ' '8655' 'IPR014926' '\

    This entry consists of a bacterial protein which is uncharacterised.

    \ ' '8656' 'IPR014927' '\

    This entry may be a peptidoglycan binding domain.

    \ ' '8657' 'IPR014928' '\

    This is a serine rich protein that is found in the docking protein p130(cas) (Crk-associated substrate). The protein folds into a four helix bundle which is associated with protein-protein interactions PUBMED:15795225.

    \ ' '8658' 'IPR014929' '\

    E1 and E2 enzymes play a central role in ubiquitin and ubiquitin-like protein transfer cascades. This is an E2 binding domain that is found on NEDD8 activating E1 enzyme. The protein resembles ubiquitin, and recruits the catalytic core of the E2 enzyme Ubc12 in a similar manner to that in which ubiquitin interacts with ubiquitin binding domains PUBMED:15694336.

    \ ' '8659' 'IPR014930' '\

    This protein is found in the myotonic dystrophy protein kinase (DMPK) and adopts a coiled coil structure. It plays a role in dimerisation PUBMED:12832055.

    \ ' '8660' 'IPR014931' '\

    This protein is found in bacteria and archaea and has an N-terminal tetramerisation region that is composed of beta sheets.

    \ ' '8661' 'IPR014932' '\

    Doublesex (DSX) is a transcription factor that regulates somatic sexual differences in Drosophila. The structure has revealed a novel dimeric arrangement of ubiquitin-associated folds that has not previously been identified in a transcription factor PUBMED:16049008.

    \ ' '8662' 'IPR014933' '\

    The alpha C protein (ACP) is found in Streptococcus and acts as an invasin which plays a role in the internalisation and translocation of the organism across human epithelial surfaces. Group B Streptococcus is the leading cause of diseases including bacterial pneumonia, sepsis and meningitis. The N-terminal of ACP is associated with virulence and forms a beta sandwich and a three helix bundle PUBMED:15753100, PUBMED:12427097, PUBMED:9371832.

    \ ' '8663' 'IPR014934' '\

    This entry consists of bacterial uncharacterised proteins. The structure of one of the proteins has been solved and it adopts a beta barrel-like structure.

    \ ' '8664' 'IPR011988' '\

    This entry represents the trimerisation domain of the MHC class II-associated invariant chain (Ii). Ii plays a critical role in the assembly of the MHC, as well as in MHC II antigen processing by stabilising peptide-free class II alpha/beta heterodimers in a complex soon after their synthesis and directing transport of the complex from the endoplasmic reticulum to compartments where peptide loading of class II takes place PUBMED:16337363. In antigen-presenting cells (APCs), loading of MHC II molecules with peptides is regulated by Ii, which blocks MHC II antigen-binding sites in pre-endosomal compartments PUBMED:16181341. Several molecules then act upon MHC II molecules in endosomes to facilitate peptide loading: Ii-degrading proteases, the peptide exchange factor, human leukocyte antigen-DM (HLA-DM), and its modulator, HLA-DO (DO).

    \

    The Invariant chain contains a single transmembrane domain. Ii first assembles into a trimer and then associates with three class II alpha/beta MHC heterodimers. Although the membrane-proximal region of the Ii luminal domain is structurally disordered, the C-terminal segment of the luminal domain is largely alpha-helical and contains a major interaction site for the Ii trimer PUBMED:9843486.

    \

    More information about these proteins can be found at Protein of the Month: MHC.

    \ ' '8665' 'IPR014935' '\

    This protein is found in steroid/nuclear receptor coactivators and contains two LXXLL motifs that are involved in receptor binding PUBMED:9744270. The coactivators include SRC-1/NcoA-1, NcoA-2/TIF2 and pCIP/ACTR/GRIP-1/AIB1.

    \ ' '8666' 'IPR014936' '\

    Proteins in this entry are found on the scaffolding protein Axin which is a component of the beta-catenin destruction complex. It competes with the tumour suppressor adenomatous polyposis coli protein (APC) for binding to beta-catenin PUBMED:14600025.

    \ ' '8669' 'IPR014937' '\

    This is a family of uncharacterised proteins. The structure of one of the members in this family has been solved and it adopts a mainly alpha helical structure.

    \ ' '8670' 'IPR014938' '\

    This entry consists of uncharacterised bacterial proteins. Some of the proteins are annotated as being transcriptional regulators (see , ). The structure of one of the proteins has revealed a beta-barrel-like structure with a helix-turn-helix-like motif.

    \ ' '8671' 'IPR014939' '\

    CDT1 is a component of the replication licensing system and promotes the loading of the mini-chromosome maintenance complex onto chromatin. Geminin is an inhibitor of CDT1 and prevents inappropriate re-initiation of replication on an already fired origin. This region of CDT1 binds to Geminin PUBMED:15286659.

    \ ' '8672' 'IPR014940' '\

    Acyl-CoA thioesterases are a group of enzymes that catalyse the hydrolysis of acyl-CoAs to the free fatty acid and coenzyme A (CoASH). They consequently have the potential to regulate intracellular levels of acyl-CoAs, free fatty acids and CoASH. They may also be involved in the metabolic regulation of peroxisome proliferation.

    \ \

    Thioesters play a central role in cells as they participate in metabolism, membrane synthesis, signal transduction, and gene regulation. Thioesterases catalyse the hydrolysis of thioesters to the thiol and carboxylic acid components. Many thioesterases have a hot dog fold, including YciA from Escherichia coli and its close sequence homologue HI0827 from Haemophilus influenzae (HiYciA) PUBMED:18247525.

    \ \

    In Helicobacter pylori, YbgC also belongs to the hot-dog family of proteins, with an epsilon-gamma tetrameric arrangement PUBMED:18338382. YbgC proteins are bacterial acyl-CoA thioesterases associated with the Tol-Pal system. This system is important for cell envelope integrity and is part of the cell division machinery.

    \ \

    The E. coli thioesterase II, however, reveals a new tertiary fold: a \'double hot dog\'. It has an internal repeat with a basic unit that is structurally similar to the recently described beta-hydroxydecanoyl thiol ester dehydrase PUBMED:10876240.

    \ \

    This catalytic protein is found at the C-terminal of acyl-CoA thioester hydrolases and bile acid-CoA:amino acid N-acyltransferases (BAAT).

    \ ' '8673' 'IPR009191' '\

    Diol dehydratase (propanediol dehydratase) and glycerol dehydratase undergo concomitant, irreversible inactivation by glycerol during catalysis PUBMED:889846, PUBMED:321014. This inactivation is mechanism-based and involves cleavage of the Co-C bond of the cobalamin cofactor, coenzyme B12 (AdoCbl), forming 5\'-deoxyadenosine and a modified coenzyme PUBMED:889846. Irreversible inactivation of the enzyme results from tight binding to the modified, inactive cobalamin PUBMED:889846, PUBMED:321014.

    The glycerol-inactivated enzyme undergoes rapid reactivation in the presence of free AdoCbl, ATP, and Mg2+ (or Mn2+) PUBMED:6752354. Reactivation is mediated by a complex of two proteins: a large subunit (DdrA/PduG) and a small subunit (DdrB/PduH, ) PUBMED:9362119, PUBMED:9405397.

    \

    The two subunits of the reactivating factor for glycerol dehydratase have been shown to form a tight complex that serves to reactivate the glycerol-inactivated holoenzyme, as well as O2-inactivated holoenzyme in vitro PUBMED:9920879. It is believed that this reactivating factor replaces an enzyme-bound, adenine-lacking inactive cobalamin with a free, adenine-containing active cobalamin PUBMED:9920879.

    \

    PduG and PduH, part of the propanediol utilization pdu operon, are believed to have a similar function in the reactivation of propanediol dehydratase. PduG was also proposed, on the basis of genetic tests, to be a cobalamin adenosyltransferase involved in the conversion of inactive cobalamin (B12) to AdoCbl PUBMED:9023178. However, this function has since been shown to belong to another protein, PduO (, ) PUBMED:11160088.

    Please see , , for more details on the propanediol utilization pathway and pdu operon, as well as on the glycerol breakdown pathway.

    \ ' '8674' 'IPR014941' '\

    Proteins of this entry may be lipoproteins, principally from bacilli. They are between 300 and 400 residues in length and are functionally uncharacterised.

    \ ' '8675' 'IPR014942' '\

    This large group of proteins is largely uncharacterised. Some are annotated as abortive infection proteins, but support for this annotation could not be found.

    \ ' '8676' 'IPR014943' '\

    Members of this entry are about 100 amino acids in length and are functionally uncharacterised.

    \ ' '8677' 'IPR014944' '\

    These short proteins have no known function. However, they do appear to be distantly related to HSP20.

    \ ' '8678' 'IPR014945' '\

    is associated with the domain suggesting this protein could have a role in phycobilisomes.

    \ ' '8679' 'IPR014946' '\

    Members of this entry are functionally uncharacterised.

    \ ' '8680' 'IPR014947' '\

    This protein is found in a small family of cyanobacterial proteins. These proteins are functionally uncharacterised.

    \ ' '8681' 'IPR014948' '\

    These proteins are functionally uncharacterised. Several are annotated as putative inner membrane proteins.

    \ ' '8682' 'IPR014949' '\

    This entry includes small functionally uncharacterised proteins of around 100 amino acids in length.

    \ ' '8683' 'IPR014950' '\

    These uncharacterised proteins are principally found in cyanobacteria.

    \ ' '8684' 'IPR014951' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8685' 'IPR014952' '\

    These proteins are functionally uncharacterised.

    \ ' '8686' 'IPR014953' '\

    This uncharacterised group of proteins are principally found in cyanobacteria.

    \ ' '8687' 'IPR014954' '\

    These proteins are uncharacterised and are principally found in cyanobacteria.

    \ ' '8688' 'IPR014955' '\

    These proteins are functionally uncharacterised.

    \ ' '8689' 'IPR014956' '\

    These proteins are probably distantly related to , suggesting that these uncharacterised proteins have a nuclease function.

    \ ' '8690' 'IPR014957' '\

    This short protein is found at the C-terminus of proteins in the UPF0302 family and is functionally uncharacterised. It is named after the sequence of the most conserved region in some members.

    \ ' '8691' 'IPR014958' '\

    This protein appears to be a zinc binding domain, based on the conservation of four potential chelating cysteines. The protein is named after a conserved central motif; its function is unknown.

    \ ' '8692' 'IPR014959' '\

    This presumed domain has no known function.

    \ ' '8693' 'IPR014960' '\

    These proteins are functionally uncharacterised.

    \ ' '8694' 'IPR014961' '\

    This short protein is usually associated with .

    \ ' '8695' 'IPR014962' '\

    These proteins are functionally uncharacterised. However it has been predicted that these proteins are functionally equivalent to the UmuD subunit of polymerase V from Gram-negative bacteria PUBMED:12137951.

    \ ' '8696' 'IPR014963' '\

    These proteins are functionally uncharacterised.

    \ ' '8697' 'IPR014964' '\

    This group of short proteins is functionally uncharacterised.

    \ ' '8698' 'IPR014965' '\

    These short proteins are functionally uncharacterised.

    \ ' '8699' 'IPR014966' '\

    This entry contains a conserved N-terminal (F/Y)RG motif. It is functionally uncharacterised.

    \ ' '8700' 'IPR014967' '\

    This entry contains proteins related to Bacillus subtilis YugN; they are functionally uncharacterised.

    \ ' '8701' 'IPR014968' '\

    The fdxN element, along with two other DNA elements, is excised from the chromosome during heterocyst differentiation in cyanobacteria. The xisH as well as the xisF and xisI genes are required PUBMED:9106215.

    \ ' '8702' 'IPR014969' '\

    This entry describes the DndE protein encoded by an operon associated with a sulphur-containing modification to DNA PUBMED:16102010. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndE is a putative carboxylase homologous to NCAIR synthetases.

    \ ' '8704' 'IPR014971' '\

    This protein is found in one or two copies in cyanobacterial proteins. It is named after a short sequence motif.

    \ ' '8705' 'IPR014972' '\

    This group of proteins are functionally uncharacterised. One member is the Gp37 protein from the FluMu prophage.

    \ ' '8706' 'IPR014973' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8707' 'IPR014974' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8708' 'IPR014975' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8709' 'IPR011235' '\

    This is a family of uncharacterised bacterial proteins.

    \ ' '8710' 'IPR014976' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8711' 'IPR014977' '\

    WRC is named after the conserved Trp-Arg-Cys motif; it contains two distinctive features: a putative nuclear localisation signal and a zinc-finger motif (C3H). It is suggested that WRC functions in DNA binding PUBMED:12974814.

    \ ' '8712' 'IPR014978' '\

    QLQ is named after the conserved Gln, Leu, Gln motif. QLQ is found at the N-terminus of SWI2/SNF2 protein, which has been shown to be involved in protein-protein interactions. QLQ has been postulated to be involved in mediating protein interactions PUBMED:12974814.

    \ ' '8713' 'IPR011058' '\

    Cyanovirin-N (CV-N) is an 11-kDa protein from the cyanobacterium Nostoc ellipsosporum that displays virucidal activity against several viruses, including human immunodeficiency virus (HIV). The virucidal activity of CV-N is mediated through specific high-affinity interactions with the viral surface envelope glycoproteins gp120 and gp41, as well as with high-mannose oligosaccharides found on the HIV envelope PUBMED:12678493. In addition, CV-N is active against rhinoviruses, human parainfluenza virus, respiratory syncytial virus, and enteric viruses. The virucidal activity of CV-N against influenza virus is directed towards viral haemagglutinin PUBMED:12878514. CV-N has a complex fold composed of a duplication of a tandem repeat of two homologous motifs comprising three-stranded beta-sheet and beta-hairpins PUBMED:12110688.

    \ ' '8714' 'IPR014979' '\

    Acetone carboxylase is the key enzyme of bacterial acetone metabolism, catalysing the condensation of acetone and CO2 to form acetoacetate PUBMED:12003937 according to the following reaction:\

    \ \

    It is an alpha(2)beta(2)gamma(2) multimer of 85-, 78-, and 20-kDa subunits. It is expressed to high levels (17 to 25% of soluble protein) in cells grown with acetone as the carbon source but is not present at detectable levels in cells grown with other carbon sources PUBMED:12003937. Acetone carboxylase may enable Helicobacter pylori to survive off acetone in the stomach of humans and other mammals where it is the etiological agent of peptic ulcer disease PUBMED:18215283.

    \

    This entry represents the family of gamma subunit-related acetone carboxylase proteins.

    \ ' '8715' 'IPR014980' '\

    This family of proteins is related to a DOPA 4,5-dioxygenase that is involved in synthesis of betalain. DOPA-dioxygenase is the key enzyme involved in betalain biosynthesis. It converts 3,4-dihydroxyphenylalanine to betalamic acid, a yellow chromophore.

    \ ' '8716' 'IPR014981' '\

    This protein is found in the central portion of bacterial flagellin FliC; it contains a structural motif called a beta-folium fold PUBMED:11268201. Although no specific function is assigned, its deletion leads to a reduction in filament stability PUBMED:7860589.

    \ ' '8717' 'IPR014982' '\

    This group of proteins are functionally uncharacterised. They have been named GSCFA after a highly conserved N-terminal motif in the alignment.

    \ ' '8718' 'IPR011718' '\

    This entry represents a rare family of glutamate--cysteine ligases, demonstrated first in Thiobacillus ferrooxidans and present in a few other Proteobacteria PUBMED:8828222. It is the first of two enzymes for glutathione biosynthesis. It is also called gamma-glutamylcysteine synthetase.

    \ ' '8719' 'IPR014983' '\

    This protein is functionally uncharacterised, but it appears to be distantly related to the GAD domain .

    \ ' '8720' 'IPR014984' '\

    Pathovars of Pseudomonas syringae interact with their plant hosts via the action of Hrp outer protein (Hop) effector proteins, injected into plant cells by the type III secretion system. The proteins are called HopJ after the original member HopPmaJ PUBMED:15828679.

    \ ' '8721' 'IPR014985' '\

    This family of proteins is functionally uncharacterised. However, it is found in an O-antigen gene cluster in Escherichia coli PUBMED:12843098 and other bacteria PUBMED:14687563, suggesting a role in O-antigen production. It has been suggested that wbnG may code for a glycine transferase PUBMED:14687563.

    \ ' '8722' 'IPR014986' '\

    This group of proteins are functionally uncharacterised. They are found in prophage sequence in various bacteria.

    \ ' '8723' 'IPR014987' '\

    This group of proteins are functionally uncharacterised. They are related to the short YfcL protein from Escherichia coli.

    \ ' '8724' 'IPR014988' '\

    This group of proteins are functionally uncharacterised. They include YqcI and YcgG from Bacillus subtilis. The alignment contains a conserved FPC motif at the N-terminus and CPF at the C-terminus.

    \ ' '8725' 'IPR014989' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8726' 'IPR014990' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8727' 'IPR014991' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8728' 'IPR014992' '\

    This protein is found at the N-terminus of proteins that are functionally uncharacterised.

    \ ' '8729' 'IPR014993' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8730' 'IPR014994' '\

    This protein is found at the C-terminus of a group of proteins that are functionally uncharacterised. The presumed domain is about 60 amino acid residues in length and is found independently in some proteins.

    \ ' '8731' 'IPR014995' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8732' 'IPR014996' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8733' 'IPR014997' '\

    This group of proteins are functionally uncharacterised. They contain 4 N-terminal cysteines that may form a zinc-binding domain.

    \ ' '8734' 'IPR014998' '\

    This group of proteins are functionally uncharacterised. The C-terminus contains a cluster of cysteines that are similar to the iron-sulphur cluster found at the N-terminus of .

    \ ' '8735' 'IPR014999' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8736' 'IPR015000' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8737' 'IPR015001' '\

    This entry contains proteins that are functionally uncharacterised. Some members of this family appear to be misannotated as RocC, an amino acid transporter from Bacillus subtilis.

    \ ' '8738' 'IPR015002' '\

    This protein is found at the C-terminus of a variety of proteins that are functionally uncharacterised.

    \ ' '8739' 'IPR015003' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8740' 'IPR015004' '\

    This group of proteins are functionally uncharacterised.

    \ ' '8741' 'IPR015005' '\

    This protein is functionally uncharacterised. It is found at the C-terminus of a number of ATP transporter proteins, suggesting it may be involved in ligand binding.

    \ ' '8742' 'IPR015006' '\

    This entry consists of several hypothetical eukaryotic proteins of unknown function.

    \ ' '8743' 'IPR015007' '\

    Nucleoporin 50 kDa (NUP50) acts as a cofactor for the importin-alpha:importin-beta heterodimer, which in turn allows for transportation of many nuclear-targeted proteins through nuclear pore complexes. The C terminus of NUP50 binds importin-beta through RAN-GTP, the N terminus binds the C terminus of importin-alpha, while a central domain binds importin-beta. NUP50:importin-alpha:importin-beta then binds cargo and can stimulate nuclear import. The N-terminal domain of NUP50 is also able to actively displace nuclear localisation signals from importin-alpha PUBMED:16222336.

    \ ' '8744' 'IPR015008' '\

    This domain is responsible for the recognition and binding of Rho binding domain-containing proteins (such as ROCK) to Rho, resulting in activation of the GTPase which in turn modulates the phosphorylation of various signalling proteins. This domain is within an amphipathic alpha-helical coiled-coil and interacts with Rho through predominantly hydrophobic interactions PUBMED:14660612.

    \ ' '8745' 'IPR015009' '\

    Vinculin binding sites are predominantly found in talin and talin-like molecules, enabling binding of vinculin to talin, stabilising integrin-mediated cell-matrix junctions. Talin, in turn, links integrins to the actin cytoskeleton. The consensus sequence for Vinculin binding sites is LxxAAxxVAxxVxxLIxxA, with a secondary structure prediction of four amphipathic helices. The hydrophobic residues that define the VBS are themselves \'masked\' and are buried in the core of a series of helical bundles that make up the talin rod PUBMED:16460027.

    \ ' '8746' 'IPR015010' '\

    Rap1 Myb adopts a canonical three-helix bundle tertiary structure, with the second and third helices forming a helix-turn-helix variant motif. The function is unclear: it may either interact with DNA via an adaptor protein or be involved only in protein-protein interactions PUBMED:11545594.

    \ ' '8747' 'IPR015011' '\

    This entry represents the archaea-specific editing domain of threonyl-tRNA synthetase, which has marked structural similarity to the D-amino acid deacylases found in eubacteria and eukaryotes. This domain can bind D-amino acids, and ensures high fidelity during translation. It is especially responsible for removing incorrectly attached serine from tRNA-Thr. The domain forms a fold that can be defined as two layers of beta-sheets (a three-stranded sheet and a five-stranded sheet), with two alpha-helices located adjacent to the five-stranded sheet PUBMED:15908961.

    \ ' '8748' 'IPR015012' '\

    The phenylalanine zipper consists of aromatic side chains from ten phenylalanine residues that are stacked within a hydrophobic core. This zipper mediates dimerisation of various proteins, such as APS, SH2-B and Lnk PUBMED:15378031.

    \ ' '8749' 'IPR015013' '\

    The Transforming growth factor beta receptor 2 ectodomain is a compact fold consisting of nine beta-strands and a single helix stabilised by a network of six intra-strand disulphide bonds. The folding topology includes a central five-stranded antiparallel beta-sheet, eight residues long at its centre, covered by a second layer consisting of two segments of two-stranded antiparallel beta-sheets (beta1-beta4, beta3-beta9) PUBMED:11850637.

    \ ' '8750' 'IPR015014' '\

    The PhoQ Sensor is required for the virulence of various Gram-negative bacteria by allowing interaction of PhoPQ with the intracellular membrane, resulting in remodelling of the bacterial cell surface and subsequent bacterial resistance to host antimicrobial peptides. The domain contains a major flat acidic surface, which binds to at least 3 calcium ions, neutralising the domain\'s negative charge and allowing interaction with the negatively charged membrane PUBMED:16406409.

    \ ' '8751' 'IPR015015' '\

    The F-actin binding domain forms a compact bundle of four antiparallel alpha-helices, which are arranged in a left-handed topology. Binding of F-actin to the F-actin binding domain may result in cytoplasmic retention and subcellular distribution of the protein, as well as possible inhibition of protein function PUBMED:16109371.

    \ ' '8752' 'IPR015016' '\

    This group of proteins consists of several eukaryotic splicing factor 3B subunit 1 proteins, which associate with p14 through a C-terminal beta-strand that interacts with beta-3 of the p14 RNA recognition motif (RRM) beta-sheet, which is in turn connected to an alpha-helix by a loop that makes extensive contacts with both the shorter C-terminal helix and RRM of p14. This subunit is required for \'A\' splicing complex assembly (formed by the stable binding of U2 snRNP to the branchpoint sequence in pre-mRNA) and \'E\' splicing complex assembly PUBMED:16432215.

    \ ' '8753' 'IPR015017' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8754' 'IPR015018' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8755' 'IPR015019' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases PUBMED:15320712.

    \ \

    This entry represents Mitogen-activated protein kinase kinase 1 interacting protein, which is a small subcellular adaptor protein required for MAPK signalling and ERK1/2 activation. The overall topology of this domain has a central five-stranded beta-sheet sandwiched between a two alpha-helix and a one alpha-helix layer PUBMED:15263099.

    \ ' '8756' 'IPR015020' '\

    This protein is found in a set of uncharacterised hypothetical bacterial proteins.

    \ ' '8757' 'IPR015021' '\

    The structure of this protein displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxH motif that coordinates a zinc ion, and an acetate anion at a site that likely supports the enzymatic activity of an ester hydrolase PUBMED:16522806.

    \ ' '8758' 'IPR015022' '\

    This protein is found in a set of hypothetical/structural eukaryotic proteins.

    \ ' '8760' 'IPR015024' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8761' 'IPR015025' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8762' 'IPR015026' '\

    This protein has no known function. It is found in various Streptococcal proteins.

    \ ' '8763' 'IPR015027' '\

    This protein has no known function. It is found in various putative receptor proteins from Lactococcus bacteriophages.

    \ ' '8764' 'IPR015028' '\

    This domain has no known function. It is found in various putative receptor proteins from Lactococcus bacteriophages.

    \ ' '8765' 'IPR015029' '\

    This protein has no known function. It is found in various hypothetical and conserved domain proteins.

    \ ' '8766' 'IPR015030' '\

    The Rb C-terminal domain is required for high-affinity binding to E2F-DP complexes and for maximal repression of E2F-responsive promoters, thereby acting as a growth suppressor by blocking the G1-S transition of the cell cycle. This domain has a strand-loop-helix structure, which directly interacts with both E2F1 and DP1, followed by a tail segment that lacks regular secondary structure PUBMED:16360038.

    \ ' '8767' 'IPR015031' '\

    This entry represents the VP4 coat protein found in Picornaviruses, small RNA-containing mammalian viruses such as Foot-and-mouth disease virus (FMDV) PUBMED:9927414, Mengo encephalomyocarditis virus PUBMED:2156078 and Theiler\'s murine encephalomyelitis virus (strain DA) (TMEV) PUBMED:1549565.

    \

    The viral capsid of Picornaviruses is composed of 60 icosahedral copies of four capsid proteins, VP1, VP2, VP3 and VP4, enclosing the viral positive-strand RNA genome. VP4 lies on the inner surface of the protein shell formed by the major capsid proteins, VP1, VP2 and VP3. The three major capsid proteins have a conserved beta-barrel fold, while VP4 has little regular secondary structure. The organisation of the three major capsid proteins leads to surface depressions, or pits, thought to be involved in receptor binding, while the variable outer rim is involved in antibody recognition. The small VP4 is thought to be involved in the initial disassembly and final assembly stages PUBMED:2156078.

    \ ' '8768' 'IPR014074' '\

    This entry describes a carboxysome shell protein that proves to be a novel class, designated epsilon, of carbonic anhydrase. It tends to be encoded near genes for RuBisCo and other carboxysome shell proteins.

    \ ' '8769' 'IPR015032' '\

    This protein adopts the flavodoxin fold, that is, five parallel beta-strands and four helical segments. The structure is a three-layer sandwich with alpha-1 and alpha-4 on one side of the beta-sheet, and alpha-2 and alpha-3 on the other side. It probably plays a role in signal transduction as a phosphorylation-independent conformational switch protein PUBMED:10964569. This domain is similar to the TIR domain PUBMED:18327267.

    \ ' '8770' 'IPR015033' '\

    This protein is found in various eukaryotic HBS1-like proteins.

    \ ' '8771' 'IPR015034' '\

    This protein is found in various hypothetical and basophilic leukaemia proteins. It has no known function.

    \ ' '8772' 'IPR015035' '\

    This protein is found in various hypothetical bacterial proteins, and has no known function.

    \ ' '8773' 'IPR015036' '\

    This protein interacts with the UBP deubiquitinating enzyme USP8.

    \ ' '8774' 'IPR015037' '\

    This protein has no known function. It is found in various hypothetical and putative bacterial proteins.

    \ ' '8775' 'IPR015038' '\

    This group of proteins consists of various bacterial proteins pertaining to the non-haem Fe(II)-dependent oxygenase family. CsiD of Escherichia coli is induced on carbon starvation. Its expression is sigma-S dependent and additionally requires activation by cAMP-CRP PUBMED:9512707. The exact function and role of CsiD is unknown, but a putative role may involve the control of utilisation of gamma-aminobutyric acid and glutamate accumulation in general stress adaption PUBMED:11910018.

    \ ' '8776' 'IPR015039' '\

    The C-terminal domain of the phagocyte NADPH oxidase subunit p47Phox contains conserved PxxP motifs that allow binding to SH3 domains, with subsequent activation of the NADPH oxidase, and generation of superoxide, which plays a crucial role in host defence against microbial infection PUBMED:12169629.

    \ ' '8777' 'IPR015040' '\

    Apoptosis, or programmed cell death (PCD), is a common and evolutionarily conserved property of all metazoans PUBMED:11341280. In many biological processes, apoptosis is required to eliminate supernumerary or dangerous (such as pre-cancerous) cells and to promote normal development. Dysregulation of apoptosis can, therefore, contribute to the development of many major diseases including cancer, autoimmunity and neurodegenerative disorders. In most cases, proteins of the caspase family execute the genetic programme that leads to cell death.

    \

    Bcl-2 proteins are central regulators of caspase activation, and play a key role in cell death by regulating the integrity of the mitochondrial and endoplasmic reticulum (ER) membranes PUBMED:12631689. At least 20 Bcl-2 proteins have been reported in mammals, and several others have been identified in viruses. Bcl-2 family proteins fall roughly into three subtypes, which either promote cell survival (anti-apoptotic) or trigger cell death (pro-apoptotic). All members contain at least one of four conserved motifs, termed Bcl-2 Homology (BH) domains. Bcl-2 subfamily proteins, which contain at least BH1 and BH2, promote cell survival by inhibiting the adapters needed for the activation of caspases.

    \ \

    Pro-apoptotic members potentially exert their effects by displacing the adapters from the pro-survival proteins; these proteins belong either to the Bax subfamily, which contain BH1-BH3, or to the BH3 subfamily, which mostly only feature BH3 PUBMED:9735050. Thus, the balance between antagonistic family members is believed to play a role in determining cell fate. Members of the wider Bcl-2 family, which also includes Bcl-x, Bcl-w and Mcl-1, are described by their similarity to Bcl-2 protein, a member of the pro-survival Bcl-2 subfamily PUBMED:9735050. Full-length Bcl-2 proteins feature all four BH domains, seven alpha-helices, and a C-terminal hydrophobic motif that targets the protein to the outer mitochondrial membrane, ER and nuclear envelope.

    \

    Members of this entry induce apoptosis. The isoform BimL is more potent than the isoform BimEL. They form heterodimers with a number of antiapoptotic Bcl-2 proteins including Mcl-1, Bcl-2, Bcl-X(L), BFL-1, and BHRF1, but do not heterodimerise with proapoptotic proteins such as BAD, BOK, BAX or BAK. They are peripheral membrane proteins, associated with intracytoplasmic membranes. The BH3 motif is required for Bcl-2 binding and cytotoxicity.

    \ \

    After antigen-driven expansion, the majority of T cells involved in an immune response die rapidly by apoptosis dependent on the Bcl-2 related proteins Bim and Bax or Bak PUBMED:14499110. Bcl-xL regulates Bax, and Bim is an important regulator of bcl-x deficiency-induced cell death during hematopoiesis and testicular development in mice PUBMED:18606610. Bim(L) displaces Bcl-x(L) in the mitochondria and promotes Bax translocation during TNFalpha-induced apoptosis PUBMED:18500555. A potent inhibitor of antiapoptotic Bcl-2 family members, including Bcl-X(L), is AT-101 PUBMED:18292288. The immunophilin protein FKBP8 and its splice variant are Bcl-XL-interacting proteins and regulate the apoptotic signalling pathways in the RPE PUBMED:18385096.

    \ \

    This protein is a long alpha helix, required for interaction with Bcl-x. It is found in BAM, Bim and Bcl2-like protein 11 PUBMED:14499110.

    \ ' '8778' 'IPR015041' '\

    The osmosensory transporter coiled coil is a C-terminal domain found in various bacterial osmoprotective transporters, such as ProP, Proline/betaine transporter, Proline permease 2 and the citrate proton symporters. It adopts an antiparallel coiled-coil structure, and is essential for osmosensory and osmoprotectant transporter function PUBMED:14643666.

    \ ' '8779' 'IPR015042' '\

    The BPS (Between PH and SH2) domain, composed of two beta strands and a C-terminal helix, is an approximately 45-residue region found in the adaptor proteins Grb7/10/14. It mediates inhibition of the tyrosine kinase domain of the insulin receptor: the N-terminal portion of the BPS domain binds to the substrate peptide groove of the kinase, acting as a pseudosubstrate inhibitor PUBMED:16246733.

    \ ' '8780' 'IPR015043' '\

    This protein has no known function. It is predominantly found in the N-terminus of bacteriophage spike proteins PUBMED:15525981.

    \ ' '8781' 'IPR015044' '\

    This protein has no known function. It is predominantly found in the C-terminus of bacteriophage spike proteins PUBMED:15525981.

    \ ' '8782' 'IPR015045' '\

    This hypothetical protein, found in bacteria and in the eukaryote Leishmania, has no known function.

    \ ' '8783' 'IPR015046' '\

    Enterocin A Immunity Protein is a pediocin-like immunity protein, conferring immunity to the bacteriocin enterocin A produced by lactic acid bacteria. The protein adopts a globular structure consisting of an antiparallel four alpha-helix bundle with a flexible hydrophobic C-terminal part, which appears to form a hairpin-like structure PUBMED:15753083.

    \ ' '8784' 'IPR015047' '\

    This protein, found in Synaptojanin, has no known function.

    \ ' '8785' 'IPR015048' '\

    This set of proteins is found in various eukaryotic proteins. The function is unknown.

    \ ' '8786' 'IPR015049' '\

    This protein is predominantly found in the structural protein coronin, and is duplicated in some sequences. It has no known function PUBMED:16172398.

    \ ' '8787' 'IPR015050' '\

    The C-terminal domain of the bacterial protein, bypass of forespore C, contains a three-stranded beta-sheet and three alpha-helices. The exact function is unknown PUBMED:16049010.

    \ ' '8788' 'IPR015051' '\

    This domain is found in a set of hypothetical bacterial proteins.

    \ ' '8790' 'IPR015053' '\

    This set of hypothetical proteins is produced by prokaryotes pertaining to the Bacillus genus.

    \ ' '8791' 'IPR015054' '\

    The CS domain, found in Ubiquitin specific peptidase 19 (USP-19), has no known function.

    \ ' '8792' 'IPR015055' '\

    This protein is found in a set of hypothetical viral and bacterial proteins.

    \ ' '8793' 'IPR015056' '\

    MIT can be found in Nuclear receptor-binding factor 2. It has no known function.

    \ ' '8794' 'IPR015057' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8795' 'IPR015058' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8796' 'IPR015059' '\

    This protein is found in a set of hypothetical bacterial and eukaryotic proteins, as well as in various calcium-dependent cell adhesion molecules.

    \ ' '8797' 'IPR015060' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8798' 'IPR015061' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8799' 'IPR014418' '\

    This group represents an uncharacterised conserved protein.

    \ ' '8800' 'IPR015062' '\

    This protein is found in a set of hypothetical proteins produced by bacteria of the Bacillus genus.

    \ ' '8801' 'IPR015063' '\

    This protein is predominantly found in the amino terminal region of Ubiquitin carboxyl-terminal hydrolase 8 (USP8). It has no known function.

    \ ' '8802' 'IPR015064' '\

    Members of this protein group contain two antiparallel alpha helices that are linked by a highly structured inter-helix loop to form a helical hairpin; the structure is stabilised by numerous hydrophobic and electrostatic interactions. These sporulation inhibitors are antikinases that bind to the histidine kinase KinA phosphotransfer domain and act as a molecular barricade that inhibits productive interaction between the ATP binding site and the phosphorylatable KinA His residue. This results in the inhibition of sporulation (by preventing phosphorylation of spo0A) PUBMED:15023339.

    \ ' '8803' 'IPR015065' '\

    Members of this protein family are involved in glycogen synthesis in Enterobacteria. The structure of the polypeptide chain comprises a bundle of two parallel amphipathic helices, alpha-1 and alpha-3, and a short hydrophobic helix alpha-2 sandwiched between them PUBMED:15161493.

    \ ' '8804' 'IPR015066' '\

    These prokaryotic proteins adopt a fold consisting of one alpha-helix and four beta-strands. Their function has not, as yet, been elucidated PUBMED:16250002.

    \ ' '8805' 'IPR015067' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8806' 'IPR015068' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8807' 'IPR015069' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8808' 'IPR015070' '\

    This protein is found predominantly in DJ binding protein. It has no known function.

    \ ' '8809' 'IPR015071' '\

    The N-terminal domain of bypass of forespore C is composed of a four-stranded beta-sheet covered by an alpha-helix. The beta-sheet has a beta2-beta1-beta4-beta3 topology, where strands beta1 and beta2 and strands beta3 and beta4 are connected by beta-turns, whereas strands beta2 and beta3 are joined by an alpha-helix that runs across one face of the beta-sheet. This domain is similar to the third immunoglobulin G-binding domain of protein G from Streptococcus, the latter belonging to a large and diverse group of cell surface-associated proteins that bind to immunoglobulins. It has been hypothesised that this domain may be a mediator of protein-protein interactions involved in proteolytic events at the cell surface PUBMED:16049010.

    \ ' '8810' 'IPR015072' '\

    This protein is found in various VP9 viral outer-coat proteins. It has no known function.

    \ ' '8811' 'IPR012031' '\

    There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '8812' 'IPR015073' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8814' 'IPR015075' '\

    This protein has no known function. It is found in various hypothetical bacterial and fungal proteins.

    \ ' '8815' 'IPR015076' '\

    This protein has no known function. It is found in the C-terminal segment of various vasopressin receptors.

    \ ' '8816' 'IPR015077' '\

    This protein has no known function. It is found in various hypothetical bacterial proteins.

    \ ' '8817' 'IPR015078' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8818' 'IPR015079' '\

    This protein is found in a set of hypothetical bacterial proteins.

    \ ' '8819' 'IPR015080' '\

    Proteins in this entry, which are synthesised by Saccharomycetes, adopt a structure consisting of a four-stranded beta-sheet, with strand order beta2-beta1-beta4-beta3, and two alpha-helices, with an overall topology of beta-beta-alpha-beta-beta-alpha. They have no known function PUBMED:12717036.

    \ ' '8820' 'IPR015081' '\

    The YscE protein, produced by the pathogen Yersinia, assumes a secondary structure composed of two anti-parallel alpha-helices separated by a flexible loop. The function of this protein is, as yet, unknown.

    \ ' '8821' 'IPR015082' '\

    This domain is found in a set of hypothetical bacterial proteins.

    \ ' '8822' 'IPR015083' '\

    The N-terminal docking domain found in modular polyketide synthase assumes an alpha-helical structure, wherein two alpha-helices are connected by a short loop. Two such N-terminal domains dimerise to form amphipathic parallel alpha-helical coiled coils: dimerisation is essential for protein function PUBMED:12954331.

    \ ' '8823' 'IPR009069' '\

    The mature-T-cell-proliferation (MTCP1) putative oncogene was identified for its involvement in t(X;14)(q28;q11)-associated T-cell leukaemia PUBMED:8634440. MTCP1 is alternatively spliced to produce two completely distinct proteins: the small mitochondrial protein, p8MTCP1, and the protein p13MTCP1, which shows strong homology to another oncogene product, p14TCL1. While p13MTCP1 expression appears to be restricted to mature T-cell proliferation with t(X;14) translocations, the mitochondrial p8MTCP1 is expressed at low levels in most human tissues, and is over-expressed in the proliferating T-cells. The biological function of p8MTCP1 is still unknown, but it appears to play a role in oncogenesis. The structure of p8MTCP1 reveals a disulphide-rich, irregular array of three helices PUBMED:9405159.

    \ ' '8824' 'IPR015084' '\

    This entry represents the gamma subunit of quinohemoprotein amine dehydrogenases (QHNDH), enzymes produced in the periplasmic space of certain Gram-negative bacteria, such as Paracoccus denitrificans and Pseudomonas putida, in response to primary amines, including n-butylamine and benzylamine. QHNDH catalyses the oxidative deamination of a wide range of aliphatic and aromatic amines through formation of a Schiff-base intermediate involving one of the quinone O atoms PUBMED:15234267. Catalysis requires the presence of a novel redox cofactor, cysteine tryptophylquinone (CTQ). CTQ is derived from the post-translational modification of specific residues, which involves the oxidation of the indole ring of a tryptophan residue to form tryptophylquinone, followed by covalent cross-linking with a cysteine residue PUBMED:12925784. There is one CTQ per subunit in QHNDH. In addition to CTQ, two haem c cofactors are present in QHNDH that mediate the transfer of the substrate-derived electrons from CTQ to an external electron acceptor, cytochrome c-550 PUBMED:12974623, PUBMED:12427036.

    \

    QHNDH is a heterotrimer of alpha, beta and gamma subunits. The alpha and beta subunits contain signal peptides necessary for the translocation of QHNDH to the periplasm. The alpha subunit is a 4-domain protein that contains the di-haem cytochrome c; the beta subunit is a 7-bladed beta-propeller that provides part of the active site; and the small, catalytic gamma subunit contains the novel cross-linked CTQ cofactor, as well as additional thioether cross-links between Cys and Asp/Glu residues that encage CTQ. The gamma subunit assumes a globular secondary structure with two short alpha-helices having many turns and bends PUBMED:11704672.

    \ ' '8825' 'IPR015085' '\

    The N-terminal domain of the T4 gene 59 helicase consists of six alpha-helices linked by loop segments and short turns; the surface of the domain contains large regions of exposed hydrophobic residues and clusters of acidic and basic residues. This domain has structural similarity to members of the high-mobility-group (HMG) family of DNA minor groove binding proteins including rat HMG1A and lymphoid enhancer-binding factor, and is required for binding of the helicase to the DNA minor groove PUBMED:10669611.

    \ ' '8826' 'IPR015086' '\

    The C-terminal domain of the T4 gene 59 helicase consists of seven alpha-helices with short intervening loops and turns; the surface of the domain contains large regions of exposed hydrophobic residues and clusters of acidic and basic residues. The hydrophobic region on the \'bottom\' surface of the domain near the C-terminal helix binds the leading strand DNA, whilst the hydrophobic region on the \'top\' surface of the domain lies between the two arms of the fork DNA, allowing for T4 gene 41 helicase binding and assembly into a hexameric complex around the lagging strand PUBMED:10669611.

    \ ' '8827' 'IPR015087' '\

    Necrosis inducing protein-1, a fungal avirulence protein produced by plants, consists of two parts containing beta-sheets of two and three anti-parallel strands, respectively. Five intramolecular disulphide bonds stabilise these parts and their position with respect to each other, providing a high level of stability PUBMED:12944393.

    \ ' '8828' 'IPR015088' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    The DNA Polymerase alpha zinc finger domain adopts an alpha-helix-like structure, followed by three turns, all of which involve proline. The resulting motif is a helix-turn-helix motif, in contrast to other zinc finger domains, which show anti-parallel sheet and helix conformation. Zinc binding occurs due to the presence of four cysteine residues positioned to bind the metal centre in a tetrahedral coordination geometry. The function of this domain is uncertain: it has been proposed that the zinc finger motif may be an essential part of the DNA binding domain PUBMED:14499601.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '8829' 'IPR015089' '\

    The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is an essential component of the mitochondrial cellular respiratory chain. This family represents the 6.4 kDa protein, which may be closely linked to the iron-sulphur protein in the complex and function as an iron-sulphur protein-binding factor PUBMED:15312779.

    \ ' '8830' 'IPR015090' '\

    The epsilon antitoxin, produced by various prokaryotes, forms part of a post-segregational killing system, which is involved in the initiation of programmed cell death of plasmid-free cells. The protein is folded into a three-helix bundle that directly interacts with the zeta toxin, inactivating it PUBMED:12571357.

    \ ' '8831' 'IPR015091' '\

    The N-terminal propeptide of surfactant protein C adopts an alpha-helical structure, with turn and extended regions. Its main function is the stabilisation of metastable surfactant protein C (SP-C), since the latter can irreversibly transform from its native alpha-helical structure to beta-sheet aggregates and form amyloid-like fibrils. The correct intracellular trafficking of proSP-C has also been reported to depend on the propeptide PUBMED:16478467.

    \ ' '8832' 'IPR009105' '\

    Colicins are plasmid-encoded protein antibiotics, or bacteriocins, produced by strains of Escherichia coli that kill closely related bacteria. Colicins are classified according to the cell-surface receptor they bind to, with colicin E3 binding to the BtuB receptor involved in vitamin B12 uptake. The lethal action of colicin E3 arises from its ability to inactivate the ribosome by site-specific RNase cleavage of the 16S ribosomal RNA, which is carried out by the catalytic, or ribonuclease, domain. Colicin E3 comprises three domains, each domain being involved in a different stage of infection: receptor binding, translocation and cytotoxicity. Colicin E3 is a Y-shaped molecule with the receptor-binding middle domain forming the stalk, the N-terminal translocation domain forming the two globular heads, and the C-terminal catalytic domain forming the two globular arms. To neutralise the toxic effects of colicin E3, the host cell produces an immunity protein, which binds to the C-terminal end of the ribonuclease domain and effectively suppresses its activity.

    \ \

    This entry represents the ribonuclease domain (also called catalytic or cytotoxic domain) found in various colicins. This domain confers cytotoxic activity to proteins, enabling the formation of nucleolytic breaks in 16S ribosomal RNA. The structure of the domain reveals a highly twisted central beta-sheet elaborated with a short N-terminal alpha-helix PUBMED:10986462, PUBMED:11741540.

    \ ' '8833' 'IPR012033' '\

    There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. The structure of the Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum) protein has been determined but no evidence as to the function is available yet.

    \ ' '8834' 'IPR015093' '\

    This entry represents a set of hypothetical bacterial proteins.

    \ ' '8835' 'IPR015094' '\

    The amino terminal domain of bacteriophage lambda integrase folds into a three-stranded, antiparallel beta-sheet that packs against a C-terminal alpha-helix, adopting a fold that is structurally related to the three-stranded beta-sheet family of DNA-binding domains (which includes the GCC-box DNA-binding domain and the N-terminal domain of Tn916 integrase). This domain is responsible for high-affinity binding to each of the five DNA arm-type sites and is also a context-sensitive modulator of DNA cleavage PUBMED:11904406.

    \ ' '8836' 'IPR015095' '\

    This domain is found in a set of hypothetical eukaryotic proteins.

    \ ' '8837' 'IPR015096' '\

    This domain is found in Psi proteins produced by Drosophila, and in various eukaryotic hypothetical proteins. It has no known function.

    \ ' '8838' 'IPR015097' '\

    This domain, predominantly found in lung surfactant protein D, forms a triple-helical parallel coiled coil, and mediates trimerisation of the protein PUBMED:12495025.

    \ ' '8839' 'IPR015098' '\

    This C-terminal domain allows interaction of EBP50 with FERM (four-point one ERM) domains, resulting in the activation of Ezrin-radixin-moesin (ERM), with subsequent cytoskeletal modulation and cellular growth control PUBMED:15020681.

    \ ' '8840' 'IPR009093' '\

    The tailspike protein of Salmonella bacteriophage P22 is a viral adhesion protein that mediates attachment of the viral protein to host cell-surface lipopolysaccharide. The tailspike protein displays both receptor binding and destroying properties, inactivating the receptor by endoglycosidase activity. The N-terminal, head-binding domain mediates the non-covalent attachment of the six homotrimeric tailspike molecules to the DNA injection apparatus PUBMED:9135118. The N-terminal domain of the P22 tailspike protein shows significant sequence similarity to the N-terminal domain of the Shigella phage Sf6 tailspike protein PUBMED:12424253.

    \ \ ' '8841' 'IPR015099' '\

    Prokaryotic exotoxin A catalyses the transfer of ADP ribose from nicotinamide adenine dinucleotide (NAD) to elongation factor-2 in eukaryotic cells, with subsequent inhibition of protein synthesis PUBMED:8692916.

    \ ' '8842' 'IPR015100' '\

    Anti-sigma factor A is a transcriptional inhibitor that inhibits sigma 70-directed transcription by weakening its interaction with the core of the host\'s RNA polymerase. It is an all-helical protein, composed of six helical segments and intervening loops and turns, as well as a helix-turn-helix DNA binding motif, although neither free anti-sigma factor nor anti-sigma factor bound to sigma-70 has been shown to interact directly with DNA. In solution, the protein forms a symmetric dimer of small (10.59 kDa) protomers, which are composed of helix and coil regions and are devoid of beta-strand/sheet secondary structural elements PUBMED:11830637.

    \ ' '8843' 'IPR015101' '\

    This domain is predominantly found in Maelstrom homologue proteins. It has no known function.

    \ ' '8844' 'IPR015102' '\

    This family contains several transcriptional regulators, including FeoC, which contain a HTH motif. FeoC acts as a [Fe-S] dependent transcriptional repressor PUBMED:16718600.

    \ ' '8845' 'IPR015103' '\

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation PUBMED:9818190, PUBMED:14625689. The PTP superfamily can be divided into four subfamilies PUBMED:12678841:

    \

    \

    Based on their cellular localisation, PTPases are also classified as:

    \

    \

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif PUBMED:9646865. Functional diversity between PTPases is endowed by regulatory domains and subunits.
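The C(X)5R signature described above can be read as a cysteine, any five residues, then an arginine. The following is a minimal sketch of how such a motif might be located in a protein sequence; the function name and the toy sequence are hypothetical and are not taken from any real phosphatase.

```python
import re

# C(X)5R: a cysteine, any five residues, then an arginine.
# The lookahead lets overlapping occurrences be reported as well.
PTP_SIGNATURE = re.compile(r"(?=(C.{5}R))")


def find_ptp_signature(sequence):
    """Return (0-based start, matched 7-mer) for every C(X)5R occurrence."""
    return [(m.start(), m.group(1)) for m in PTP_SIGNATURE.finditer(sequence.upper())]


# Hypothetical toy sequence used only for illustration.
toy_sequence = "MKVHCSAGIGRSGT"
hits = find_ptp_signature(toy_sequence)
```

A pattern match of this kind only flags candidate sites; it says nothing about whether the surrounding fold would support catalysis.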

    \ \

    This entry represents the N-terminal domain of YopH protein tyrosine phosphatase (PTP). This domain has a compact structure composed of four alpha-helices and two beta-hairpins. Helices alpha-1 and alpha-3 are parallel to each other and antiparallel to helices alpha-2 and alpha-4. This domain targets YopH for secretion from the bacterium and translocation into eukaryotic cells, and has phosphotyrosyl peptide-binding activity, allowing for recognition of p130Cas and paxillin PUBMED:11375498. YopH from Yersinia sp. is essential for pathogenesis, as it allows the bacteria to resist phagocytosis by host macrophages through its ability to dephosphorylate host proteins, thereby interfering with the host signalling process. Yersinia has one of the most active PTP enzymes known. YopH contains a loop of ten amino acids (the WPD loop) that covers the entrance of the active site of the enzyme during substrate binding PUBMED:17352459.

    \

    A homologous domain is found in YscM (Yop secretion protein M), which acts as a Yop protein translocation protein. Several Yop proteins are involved in pathogenesis. YscM is produced by the virulence operon virC, which contains thirteen genes, yscA-M PUBMED:1860816. Transcription of the virC operon is subject to the same regulation as the yop genes.

    \ ' '8846' 'IPR015104' '\

    The fifth domain of beta-2-glycoprotein-1 (b2GP-1) is composed of four well-defined anti-parallel beta-strands and two short alpha-helices, as well as a long highly flexible loop. It plays an important role in the binding of b2GP-1 to negatively charged compounds and subsequent capture for binding of anti-b2GP-1 antibodies PUBMED:11124037.

    \ ' '8847' 'IPR015105' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents NgoMIV-type prokaryotic DNA restriction enzymes exhibiting an alpha/beta structure, with a central region comprising a mixed six-stranded beta-sheet with alpha-helices on each side. A long \'arm\' protrudes out of the core of the domain between strands beta2 and beta3 and is mainly involved in the tetramerisation interface of the protein. These restriction enzymes recognise the double-stranded sequence GCCGGC and cleave after G-1 PUBMED:10966652.
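As a rough illustration of the recognition rule stated above, the sketch below checks that GCCGGC is palindromic (equal to its own reverse complement) and splits the top strand of a DNA string at each site. It assumes that cleaving after G-1 means cutting between the first G and the following C; the function names and the example sequence are hypothetical, and only the top strand is modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A site is palindromic when it equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


def find_sites(dna, site="GCCGGC"):
    """Return 0-based start positions of every (possibly overlapping) site."""
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def cut_top_strand(dna, site="GCCGGC", offset=1):
    """Split the top strand after the first offset bases of each site."""
    dna = dna.upper()
    fragments = []
    previous = 0
    for position in find_sites(dna, site):
        cut = position + offset
        fragments.append(dna[previous:cut])
        previous = cut
    fragments.append(dna[previous:])
    return fragments


# Hypothetical example sequence containing one site.
fragments = cut_top_strand("AAGCCGGCTT")
```

Because the site is palindromic, the same rule applied to the complementary strand gives a staggered cut, which this single-strand sketch does not represent.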

    \ ' '8848' 'IPR015106' '\

    Members of this family adopt a compact structure comprising five alpha helices. Charged and polar residues are exposed mostly on the surface, while most of the hydrophobic residues are buried inside the hydrophobic core of the helical bundle. The precise function of this domain is unknown, but it has been shown to induce secretion of periplasmic proteins, especially collagenase PUBMED:16318855.

    \ ' '8849' 'IPR015107' '\

    Microbial transglutaminase (MTG) catalyses an acyl transfer reaction by means of a Cys-Asp diad mechanism, in which the gamma-carboxyamide groups of peptide-bound glutamine residues act as the acyl donors. The MTG molecule forms a single, compact domain belonging to the alpha+beta folding class, containing 11 alpha-helices and 8 beta-strands. The alpha-helices and the beta-strands are concentrated mainly at the amino and carboxyl ends of the polypeptide, respectively. These secondary structures are arranged so that a beta-sheet is surrounded by alpha-helices, which are clustered into three regions PUBMED:12221081.

    \ ' '8850' 'IPR015108' '\

    The major capsid protein p3 from Bacteriophage PRD1 adopts a double-barrel structure comprising two eight-stranded viral beta-barrels or jelly rolls, each of which contains a 12-residue alpha-helix. This protein then trimerises through a \'trimerisation loop\' sequence, and is incorporated within the viral capsid PUBMED:11752778.

    \ ' '8851' 'IPR015109' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the C-terminal catalytic domain of the type II restriction endonuclease EcoRII, which has a restriction endonuclease-like fold with a central five-stranded mixed beta-sheet surrounded on both sides by alpha-helices. EcoRII cleaves DNA specifically at single 5\' CCWGG sites PUBMED:14659759.
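The CCWGG site uses the IUPAC ambiguity code W, which stands for A or T. The sketch below shows one way a degenerate site like this might be expanded into a regular expression and searched for; the helper names and the example sequence are hypothetical, and only a subset of the IUPAC codes is included.

```python
import re

# Subset of the IUPAC nucleotide codes, mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def site_to_regex(site):
    """Compile a degenerate site into a regex that also reports overlaps."""
    body = "".join(IUPAC[base] for base in site.upper())
    return re.compile("(?=(" + body + "))")


def find_degenerate_sites(dna, site="CCWGG"):
    """Return (0-based start, matched sequence) for every occurrence."""
    pattern = site_to_regex(site)
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


# Hypothetical example: CCAGG and CCTGG match, CCGGG does not.
matches = find_degenerate_sites("TTCCAGGAACCTGGCCGGG")
```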

    \ ' '8852' 'IPR015110' '\

    The N-terminal YopE domain targets YopE for secretion from the bacterium and translocation into eukaryotic cells PUBMED:12049734.

    \ ' '8853' 'IPR015111' '\

    The HutP protein family regulates the expression of Bacillus \'hut\' structural genes by an anti-termination complex, which recognises three UAG triplet units, separated by four non-conserved nucleotides on the RNA terminator region. L-histidine and Mg2+ ions are also required. These proteins exhibit the structural elements of alpha/beta proteins, arranged in the order: alpha-alpha-beta-alpha-alpha-beta-beta-beta in the primary structure, and the four antiparallel beta-strands form a beta-sheet in the order beta1-beta2-beta3-beta4, with two alpha-helices each on the front (alpha1 and alpha2) and at the back (alpha3 and alpha4) of the beta-sheet PUBMED:15758992.
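The RNA recognition rule above (three UAG triplets separated by four non-conserved nucleotides) can be written as a simple pattern. This is a minimal sketch that assumes each pair of neighbouring triplets is separated by exactly four nucleotides of any identity; the function name and the example sequence are hypothetical.

```python
import re

# UAG, four arbitrary nucleotides, UAG, four arbitrary nucleotides, UAG.
HUTP_SITE = re.compile(r"(?=(UAG[ACGU]{4}UAG[ACGU]{4}UAG))")


def find_hutp_sites(rna):
    """Return (0-based start, matched 17-mer) for every candidate site.

    DNA-style input is accepted: T is converted to U before searching.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in HUTP_SITE.finditer(rna)]


# Hypothetical example sequence with one candidate site.
candidates = find_hutp_sites("GGUAGCCCCUAGAAAAUAGGG")
```

A match only indicates that the triplet spacing is satisfied; binding additionally requires L-histidine and Mg2+, as noted in the text.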

    \ ' '8854' 'IPR015112' '\

    The staphostatin A polypeptide chain folds into a slightly deformed, eight-stranded beta-barrel, with strands beta-4 through beta-8 forming an antiparallel sheet while the N-terminus forms a psi-loop motif. Members of this family constitute a class of cysteine protease inhibitors distinct in the fold and the mechanism of action from any known inhibitors of these enzymes PUBMED:14621990.

    \ ' '8855' 'IPR015113' '\

    Staphostatin B inhibits the cysteine protease Staphopain B, produced by Staphylococcus aureus, by blocking the active site of the enzyme. The domain adopts an eight-stranded mixed beta-barrel structure, with a deviation from the up-down topology of canonical beta-barrels in the amino-terminal part of the molecule PUBMED:15644332.

    \ ' '8857' 'IPR013349' '\

    Proteins in this entry are type III secretion system effectors, named differently in different species and designated YopR (Yersinia outer protein R), encoded by the YscH (Yersinia secretion H) gene. This Yop protein is unusual in that it is released to the extracellular environment rather than injected directly into the target cell as are most Yop proteins.

    \ ' '8858' 'IPR015115' '\

    Centromere protein B (CENP-B) interacts with centromeric heterochromatin in chromosomes and binds to a specific subset of alphoid satellite DNA, called the CENP-B box. CENP-B may organise arrays of centromere satellite DNA into a higher order structure, which then directs centromere formation and kinetochore assembly in mammalian chromosomes.

    \

    The CENP-B dimerisation domain is composed of two alpha-helices, which are folded into an antiparallel configuration. Dimerisation of CENP-B is mediated by this domain, in which monomers dimerise to form a symmetrical, antiparallel, four-helix bundle structure with a large hydrophobic patch in which 23 residues of one monomer form van der Waals contacts with the other monomer. This CENP-B dimer configuration may be suitable for capturing two distant CENP-B boxes during centromeric heterochromatin formation PUBMED:14522975.

    \ \ ' '8859' 'IPR015116' '\

    The GTPase binding domain binds to the G protein Cdc42, inhibiting both its intrinsic and stimulated GTPase activity. The domain is largely unstructured in the absence of Cdc42 PUBMED:10360579.

    \ ' '8860' 'IPR015117' '\

    The bacterial protein Mac 1 adopts an alpha/beta fold, with 14 beta strands and 9 alpha helices. The N-terminal domain is made up predominantly of alpha helices, whereas the C-terminal domain consists predominantly of beta sheets. Mac 1 blocks polymorphonuclear opsonophagocytosis, inhibits the production of reactive oxygen species and has IgG endopeptidase activity.

    \ ' '8861' 'IPR015118' '\

    The N-terminal presequence domain found in 5-aminolevulinate synthase exists as an amphipathic helix, with a positively charged surface provided by lysine residues and no stable helix at the N-terminus. The domain is essential for the import process by which ALAS is transported into the mitochondria: translocase of the outer membrane (Tom) and translocase of the inner membrane protein complexes appear responsible for recognition and import through the mitochondrial membrane. The protein Tom20 is anchored to the mitochondrial outer membrane, and its interaction with presequences is thought to be the recognition step which allows subsequent import PUBMED:11566198.

    \ ' '8862' 'IPR014744' '\

    This entry represents the interlocking domain of the eukaryotic nuclear receptor coactivators CBP and P300. The interlocking domain forms a 3-helical non-globular array that forms interlocked heterodimers with its target.

    \

    Nuclear receptors are ligand-activated transcription factors involved in the regulation of many processes, including development, reproduction and homeostasis. Nuclear receptor coactivators act to modulate the function of nuclear receptors. Coactivators associate with promoters and enhancers primarily through protein-protein contacts to facilitate the interaction between DNA-bound transcription factors and the transcription machinery. Many of these coactivators are structurally related, including CBP (CREB-binding protein) and P300 PUBMED:11823864. CBP and P300 both have histone acetyltransferase activity. CBP/P300 proteins function synergistically to activate transcription, acting to remodel chromatin and to recruit RNA polymerase II and the basal transcription machinery. CBP is required for proper cell cycle control, differentiation and apoptosis. The interaction of CBP/P300 with transcription factors involves several small domains. The IBiD domain in the C-terminal region of CBP is responsible for CBP interaction with IRF-3, as well as with the adenoviral oncoprotein E1A, TIF-2 coactivator, and the IRF homologue KSHV IRF-1 PUBMED:11583620.

    \ ' '8864' 'IPR015120' '\

    The N-terminal domain of Siah interacting protein (SIP) adopts a helical hairpin structure with a hydrophobic core stabilised by a classic knobs-and-holes arrangement of side chains contributed by the two amphipathic helices. Little is known about this domain\'s function, except that it is crucial for interactions with Siah. It has also been hypothesised that SIP can dimerise through this N-terminal domain PUBMED:15996101.

    \ ' '8865' 'IPR015121' '\

    The C-terminal domain of DNA fragmentation factor 45 kDa (DFF-C) consists of four alpha-helices, which are folded in a helix-packing arrangement, with alpha-2 and alpha-3 packing against a long C-terminal helix (alpha-4). The main function of this domain is the inhibition of DFF40 by binding to its C-terminal catalytic domain through ionic interactions, thereby inhibiting the fragmentation of DNA in the apoptotic process. In addition to blocking the DNase activity of DFF40, the C-terminal region of DFF45 is also important for the DFF40-specific folding chaperone activity, as demonstrated by the ability of DFF45 to refold DFF40 PUBMED:12144788.

    \ ' '8866' 'IPR009095' '\

    TRADD is a signalling adaptor protein involved in tumour necrosis factor-receptor I (TNFR1)-associated apoptosis and cell survival. The decision between apoptosis and cell survival involves the interplay between two sequential signalling complexes. The plasma membrane-bound complex I is comprised of TNFR1, TRADD, the kinase RIP1, and TRAF2, which together mediate the activation of NF-kappaB. Subsequently, complex II is formed in the cytoplasm, where TRADD and RIP1 associate with FADD and caspase-8. If NF-kappaB is activated by complex I, then complex II will associate with the caspase-8 inhibitor FLIP(L) and the cell survives, while the failure to activate NF-kappaB leads to apoptosis PUBMED:12887920.

    \

    The TRADD C-terminal death domain is responsible for its association with TNFR1, and with the death-domain proteins FADD and RIP1, which promote apoptosis. The TRADD N-terminal domain binds TRAF2 and promotes TRAF2 recruitment to TNFR1, thereby mediating the activation of NF-kappaB and JNK/AP1, which promote cell survival PUBMED:10911999. The N-terminal TRADD domain is composed of an alpha-beta sandwich, where the beta strands form an antiparallel beta-sheet.

    \ \ ' '8867' 'IPR015122' '\

    The phage-encoded excisionase protein Tn916-Xis adopts a winged-helix structure that consists of a three-stranded anti-parallel beta-sheet that packs against a helix-turn-helix (HTH) motif and a third C-terminal alpha-helix. It is encoded by Tn916, which also codes for the integrase Tn916-Int. The protein interacts with DNA by the insertion of helix alpha-2 into the major groove and the contact of the hairpin that connects strands beta-2 and beta-3 with the adjacent phosphodiester backbone and/or minor groove. Tn916-Xis stimulates phage excision and inhibits viral integration by stabilising distorted DNA structures PUBMED:15733914.

    \ ' '8868' 'IPR015123' '\

    This entry represents the oligomerisation domain of the breakpoint cluster region oncoprotein Bcr, and the Bcr/Abl (Abelson-leukemia-virus) fusion protein created by a reciprocal (9;22) fusion PUBMED:17090304. Bcr displays serine/threonine protein kinase activity, acting as a GTPase-activating protein for RAC1 and CDC42. Bcr promotes the exchange of RAC or CDC42-bound GDP by GTP, thereby activating them PUBMED:15302586. The Bcr/Abl fusion protein loses some of the regulatory function of Bcr with regard to small Rho-like GTPases, with negative consequences on cell motility, in particular on the capacity to adhere to endothelial cells PUBMED:17090304.

    \ \

    The Bcr, Bcr/Abl oncoprotein oligomerisation domain consists of a short N-terminal helix (alpha-1), a flexible loop and a long C-terminal helix (alpha-2). Together these form an N-shaped structure, with the loop allowing the two helices to assume a parallel orientation. The monomeric domains associate into a dimer through the formation of an antiparallel coiled coil between the alpha-2 helices and domain swapping of two alpha-1 helices, where one alpha-1 helix swings back and packs against the alpha-2 helix from the second monomer. Two dimers then associate into a tetramer. The oligomerisation domain is essential for the oncogenicity of the Bcr-Abl protein PUBMED:11780146.

    \ ' '8869' 'IPR015124' '\

    Members of this family are essential for the biosynthesis of sulpholipid-1 in prokaryotes. They adopt a structure that belongs to the sulphotransferase superfamily, consisting of a single domain with a core four-stranded parallel beta-sheet flanked by alpha-helices PUBMED:15258569.

    \ ' '8870' 'IPR015125' '\

    This domain consists of ten beta-strands and a carboxy-terminal alpha-helix. The amino-terminal five beta-strands and the C-terminal five beta-strands adopt folds that are identical to each other. The domain is essential for the recruitment of proteins to double stranded breaks in DNA, which is mediated by interaction with methylated Lys 79 of histone H3 PUBMED:15525939.

    \ ' '8871' 'IPR015126' '\

    This domain is responsible for binding the DNA attachment sites at each end of the Mu genome. It adopts a secondary structure comprising a four helix bundle tightly packed around a hydrophobic core consisting of aliphatic and aromatic amino acid residues. Helices 1 and 2 are oriented antiparallel to each other. Helix 3 crosses helices 1 and 2 at angles of 60 and 120 degrees, respectively. Excluding the C-terminal helix 4, the fold of the I-gamma subdomain is remarkably similar to that of the homeodomain family of helix-turn-helix DNA-binding proteins, although their amino acid sequences are completely unrelated PUBMED:9367742.

    \ ' '8872' 'IPR015127' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    P-ATPases (sometimes known as E1-E2 ATPases) are found in bacteria and in a number of eukaryotic plasma membranes and organelles PUBMED:9419228. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.

    \

    This entry represents the N-terminal domain found in gastric H+/K+-transporter ATPases. This domain adopts an alpha-helical conformation under hydrophobic conditions. The domain contains tyrosine residues, phosphorylation of which regulates the function of the ATPase. Additionally, the domain also interacts with various structural proteins, including the spectrin-binding domain of ankyrin III PUBMED:12480547.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    \ ' '8873' 'IPR015128' '\

    The Aurora-A binding domain binds to two distinct sites on the Aurora kinase: the upstream residues bind at the N-terminal lobe, whilst the downstream residues bind in an alpha-helical conformation between the N- and C-terminal lobes. The two Aurora-A binding motifs are connected by a flexible linker that is variable in length and sequence across species. Binding of the domain results in strong activation of Aurora-A and protection from deactivating dephosphorylation by phosphatase PP1 PUBMED:14580337.

    \ ' '8874' 'IPR015129' '\

    The titin Z domain, which recognises and binds to the C-terminal calmodulin-like domain of alpha-actinin-2 (Act-EF34), adopts a helical structure and binds in a groove formed by the two planes between the helix pairs of Act-EF34. This interaction is essential for sarcomere assembly PUBMED:11573089.

    \ ' '8875' 'IPR015130' '\

    This domain is found in proteins involved in the 1,2 rearrangement of the terminal amino group of DL-lysine and of L-beta-lysine, using adenosylcobalamin (AdoCbl) and pyridoxal-5\'-phosphate as cofactors. The structure is predominantly a PLP-binding TIM barrel domain, with several additional alpha-helices and beta-strands at the N and C termini. These helices and strands form an intertwined accessory clamp structure that wraps around the sides of the TIM barrel and extends up toward the Ado ligand of the Cbl cofactor, providing most of the interactions observed between the protein and the Ado ligand of the Cbl, suggesting that its role is mainly in stabilising AdoCbl in the precatalytic resting state.

    \ ' '8876' 'IPR015131' '\

    Killer toxins are polypeptides secreted by some fungal species that kill sensitive cells of the same or related species, often functioning by creating pores in target cell membranes. The fungal killer toxin KP4 from the corn smut fungus, Ustilago maydis (Smut fungus), is encoded by a resident symbiotic double-stranded RNA virus, Ustilago maydis P4 virus (UmV4), within fungal cells. Unlike most killer toxins, KP4 is a single polypeptide PUBMED:8145639. KP4 inhibits voltage-gated calcium channels in mammalian cells, which in turn inhibits cell growth and division by blocking calcium import. KP4 adopts a structure consisting of a two-layer alpha/beta sandwich with a left-handed crossover PUBMED:7582897.

    \ ' '8877' 'IPR015132' '\

    The L27_2 domain is a protein-protein interaction domain capable of organising scaffold proteins into supramolecular assemblies by formation of heteromeric L27_2 domain complexes. L27_2 domain-mediated protein assemblies have been shown to play essential roles in cellular processes including asymmetric cell division, establishment and maintenance of cell polarity, and clustering of receptors and ion channels. Members of this family form specific heterotetrameric complexes, in which each domain contains three alpha-helices. The two N-terminal helices of each L27_2 domain pack together to form a tight, four-helix bundle in the heterodimer, whilst the third helix of each L27_2 domain forms another four-helix bundle that assembles the two units of the heterodimer into a tetramer PUBMED:15863617.

    \ ' '8878' 'IPR015133' '\

    The E3 ubiquitin ligase protein found in the bacterial protein AvrPtoB inhibits immunity-associated programmed cell death (PCD) when translocated into plant cells, probably by recruiting E2 enzymes and transferring ubiquitin molecules to cellular proteins involved in regulation of PCD and targeting them for degradation. The structure reveals a globular fold centred on a four-stranded beta-sheet that packs against two helices on one face and has three very extended loops connecting the elements of secondary structure, with remarkable homology to the RING-finger and U-box families of proteins involved in ubiquitin ligase complexes in eukaryotes PUBMED:16373536.

    \ ' '8879' 'IPR015134' '\

    The myocyte enhancer factor-2 (MEF2) binding domain, predominantly found in the calcineurin-binding protein CABIN 1, adopts an amphipathic alpha-helical structure, which allows it to bind a hydrophobic groove on the MEF2S domain, forming a triple-helical interaction. Interaction of this domain with MEF2 causes repression of transcription PUBMED:12700764.

    \ ' '8880' 'IPR000655' '\

    Bacteriophage lambda encodes two repressors: the Cro repressor that acts to turn off early gene transcription during the lytic cycle, and the lambda or cI repressor that is required to maintain lysogenic growth. Together the Cro and cI repressors form a helix-turn-helix (HTH) superfamily. The lambda Cro repressor binds to DNA as a highly flexible dimer. The crystal structure of the lambda Cro repressor PUBMED:9653036 reveals a HTH DNA-binding protein with an alpha/beta fold that differs from other Cro family members, possibly by an evolutionary fold change PUBMED:2598646. Most Cro proteins, such as Enterobacteria phage P22 Cro and Bacteriophage 434 Cro, have an all-alpha structure that is thought to be ancestral to lambda Cro, where the fourth and fifth helices are replaced by a beta-sheet, possibly as a result of secondary structure switching rather than by nonhomologous replacement PUBMED:15062080. This entry represents the lambda-type Cro repressor with an alpha/beta topology.

    \ ' '8881' 'IPR015135' '\

    This region consists of a single highly hydrophobic transmembrane helix that traverses the lipid bilayer at a 20 degree angle with respect to the membrane normal. It contains a conserved cysteine residue (Cys32) that, together with Cys34 found in the stannin unstructured linker domain, constitutes the putative trimethyltin-binding site that resides at the end of the transmembrane domain close to the lipid/solvent interface PUBMED:16246365.

    \ ' '8882' 'IPR015136' '\

    This entry represents an unstructured protein region which connects two adjacent stannin helical domains. It contains a conserved CXC metal-binding motif and a putative 14-3-3-zeta binding domain. Upon coordinating dimethyltin, considerable structural or dynamic changes in the flexible loop region of SNN may take place, recruiting other binding partners such as 14-3-3-zeta, and thereby initiating the apoptotic cascade PUBMED:16246365.

    \ ' '8883' 'IPR015137' '\

    This domain forms a distorted cytoplasmic helix that is partially absorbed into the plane of the lipid bilayer with a tilt angle of approximately 80 degrees from the membrane normal. It interacts with the surface of the lipid bilayer, and contributes to the initiation of the apoptotic cascade on binding of the unstructured linker domain to dimethyltin PUBMED:16246365.

    \ ' '8884' 'IPR015138' '\

    Salmonella invasion protein A is an actin-binding protein that contributes to host cytoskeletal rearrangements by stimulating actin polymerisation and counteracting F-actin destabilising proteins. Members of this family possess an all-helical fold consisting of eight alpha-helices arranged so that six long, amphipathic helices form a compact fold that surrounds a final, predominantly hydrophobic helix in the middle of the molecule PUBMED:16507363.

    \ ' '8885' 'IPR015139' '\

    Helicobacter pylori (Campylobacter pylori) clinical isolates can be classified into two types according to their degree of pathogenicity. Type I strains are associated with a severe disease pathology, express functional VacA (vacuolating cytotoxin A) and contain an insertion of 40 kb of foreign DNA: the cag (cytotoxin-associated gene) pathogenicity island (cagPAI). Type II strains lack the 40 kb cagPAI insert. The cagPAI may be divided into two regions, cag I and cag II, which contain approximately 16 and 15 genes, respectively. The cagPAI encodes a type IV secretion system (T4SS), which delivers CagA into the cytosol of gastric epithelial cells through a rigid needle structure covered by Cag7 or CagY, a VirB10-homologous protein, and CagT, a VirB7-homologous protein, at the base PUBMED:17172510. The CagA protein is the virulence factor that induces morphological changes in host cells, which may be associated with the development of peptic ulcer and gastric carcinoma PUBMED:16933206.

    \ \

    CagZ is a 23 kDa protein consisting of a single compact L-shaped domain, composed of seven alpha-helices that run antiparallel to each other. 70% of the residues are in alpha-helix conformation and no beta-sheet is present. CagZ is essential for the translocation of the pathogenic protein CagA into host cells PUBMED:15223328.

    \ ' '8887' 'IPR014123' '\

    This entry represents nickel-dependent superoxide dismutase (NiSOD), a SOD enzyme that uses nickel, rather than iron, manganese, copper, or zinc. All SOD enzymes catalyse the dismutation of toxic superoxide radical anions to oxygen and hydrogen peroxide in order to protect cells from oxidative damage. The catalytic cycle of NiSOD consists of two half-reactions, each initiated by the successive approach of substrate to the metal centre. The first (reductive) phase involves Ni(III) reduction to Ni(II), and the second (oxidative) phase involves the metal reoxidation back to its resting state PUBMED:16756300. NiSOD has a novel SOD fold and assembly, consisting of a hexameric assembly of 4-helix bundles of up-and-down topology, which contains a 9-residue nickel-hook structural motif that is critical for metal binding and catalysis PUBMED:15209499. A gene for a required protease (NiSOD maturation protease) is adjacent to the NiSOD gene.

    \ ' '8888' 'IPR015141' '\

    This entry represents bacterial and fungal phospholipase A2 proteins, as well as various hypothetical and putative proteins. They enable the liberation of fatty acids and lysophospholipid by hydrolysing the 2-ester bond of 1,2-diacyl-3-sn-phosphoglycerides. The phospholipase domain adopts an alpha-helical secondary structure, consisting of five alpha-helices and two helical segments PUBMED:11897785.

    \ ' '8889' 'IPR015142' '\

    This entry represents Smac (Second Mitochondria-derived Activator of Caspases) and DIABLO (Direct IAP-Binding protein with Low pI) proteins and their homologues. Smac promotes apoptosis by activating caspases in the cytochrome c/Apaf-1/caspase-9 pathway, and by opposing the inhibitory activity of inhibitor of apoptosis proteins (XIAP-BIR3). The protein assumes an elongated three-helix bundle structure, and forms a dimer in solution PUBMED:11140638.

    \ ' '8890' 'IPR015143' '\

    The L27 domain is a protein interaction module that exists in a large family of scaffold proteins, functioning as an organisation centre of large protein assemblies required for the establishment and maintenance of cell polarity. L27 domains form specific heterotetrameric complexes, in which each domain contains three alpha-helices PUBMED:15048107.

    \ ' '8891' 'IPR015144' '\

    This domain is composed of two pairs of parallel alpha-helices, and interacts with the bacterial protein YopN via hydrophobic residues located on the helices. Association of TyeA with the C terminus of YopN is accompanied by conformational changes in both polypeptides that create order out of disorder: the resulting structure then serves as an impediment to type III secretion of YopN PUBMED:15701523.

    \ ' '8892' 'IPR015145' '\

    The L27_N domain plays a role in the biogenesis of tight junctions and in the establishment of cell polarity in epithelial cells. Each L27_N domain consists of three alpha-helices, the first two of which form an antiparallel coiled-coil. Two L27 domains come together to form a four-helical bundle with the antiparallel coiled-coils formed by the first two helices. The third helix of each domain forms another coiled-coil packing at one end of the four-helix bundle, creating a large hydrophobic interface: the hydrophobic interactions are the major force that drives heterodimer formation PUBMED:15241471.

    \ ' '8893' 'IPR015146' '\

    The Stirrup domain, found in the prokaryotic protein ribonucleotide reductase, has a molecular mass of 9 kDa and is folded into an alpha/beta structure. It allows for binding of the reductase to DNA via electrostatic interactions, since it has a predominance of positive charges distributed on its surface PUBMED:10891276.

    \ ' '8894' 'IPR015147' '\

    Ribonucleotide reductase from Pyrococcus species has been shown to contain two inteins, PI-PfuI and PI-PfuII. The endonuclease domain from the PI-PfuI intein is composed of two subdomains, each of which assumes an alpha-beta-beta-alpha-beta-beta-alpha-alpha topology. The four stranded beta-sheet forms a saddle-shaped surface and assembles together through an interface made of alpha-helices. The presence of 14 basic residues on the surface of the beta-sheets suggests that this large groove may be involved in DNA binding PUBMED:10891276. This entry represents the endonuclease subdomain found towards the N-terminus.

    \ ' '8895' 'IPR015148' '\

    Members of this family form the capsid of Pseudomonas phage PP7. They adopt a secondary structure consisting of a six stranded beta sheet and an alpha helix PUBMED:10739912.

    \ ' '8896' 'IPR015149' '\

    This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin\'s active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C PUBMED:10761923.

    \ ' '8897' 'IPR015150' '\

    Members of this family adopt a secondary structure consisting of five short beta-strands (beta1-beta5), which are arranged in two antiparallel distorted sheets formed by strands beta1-beta4-beta5 and beta2-beta3 facing each other. This beta-sandwich is stabilised by six enclosed cysteines arranged in a [1-2, 3-5, 4-6] disulphide pairing resulting in a disulphide-rich hydrophobic core that is largely inaccessible to bulk solvent. The close proximity of disulphide bonds [3-5] and [4-6] organises haemadin into four distinct loops. The N-terminal segment of this domain binds to the active site of thrombin, inhibiting it PUBMED:11060016.

    \ ' '8898' 'IPR015151' '\

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport PUBMED:15261670. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors PUBMED:17449236, PUBMED:11598180.

    \

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes PUBMED:15107467. AP2 associates with the plasma membrane and is responsible for endocytosis PUBMED:12952931. AP3 is responsible for protein trafficking to lysosomes and other related organelles PUBMED:16542748. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins PUBMED:11080148. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface PUBMED:17254016.

    \

    This entry represents a subdomain of the appendage (ear) domain of beta-adaptin from AP clathrin adaptor complexes. This domain has a three-layer arrangement, alpha-beta-alpha, with a bifurcated antiparallel beta-sheet PUBMED:10430869. This domain is required for binding to clathrin, and its subsequent polymerisation. Furthermore, a hydrophobic patch present in the domain also binds to a subset of D-phi-F/W motif-containing proteins that are bound by the alpha-adaptin appendage domain (epsin, AP180, eps15) PUBMED:10944104.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin.

    \ ' '8899' 'IPR015152' '\

    Members of this family interact with erythropoietin (EPO), with subsequent initiation of the downstream chain of events associated with binding of EPO to the receptor, including EPO-induced erythroblast proliferation and differentiation through induction of the JAK2/STAT5 signalling cascade. The domain adopts a secondary structure composed of a short amino-terminal helix, followed by two beta-sandwich regions PUBMED:9774108.

    \ ' '8900' 'IPR015153' '\

    Like other EF hand domains, this domain forms a helix-loop-helix motif, though since it does not contain the canonical pattern of calcium binding residues found in many EF hand domains, it does not bind calcium ions. The main function of this domain is the provision of specificity in beta-dystroglycan recognition, though in dystrophin it serves an additional role: stabilisation of the WW domain, enhancing dystroglycan binding PUBMED:10932245.

    \ ' '8901' 'IPR015154' '\

    Like other EF hand domains, this domain forms a helix-loop-helix motif, though since it does not contain the canonical pattern of calcium binding residues found in many EF hand domains, it does not bind calcium ions. The main function of this domain is the provision of specificity in beta-dystroglycan recognition, though in dystrophin it serves an additional role: stabilisation of the WW domain, enhancing dystroglycan binding PUBMED:10932245.

    \ ' '8902' 'IPR015155' '\

    PFU is the ubiquitin-binding domain of Doa1 in Saccharomyces cerevisiae and its mammalian homologue. The C-terminal PUL domain of Doa1 interacts directly with Cdc48 and is necessary for Doa1 function PUBMED:16428438. Doa1 is required for maintaining ubiquitin levels PUBMED:18508771.

    \ ' '8903' 'IPR015156' '\

    This C-terminal domain is found in the prokaryotic protein glycosyltrehalose trehalohydrolase; it assumes a gamma-crystallin-type fold with a five-stranded anti-parallel beta-sheet that packs against the C-terminal side of a beta-alpha barrel. The domain is common to family 13 glycosidases and typically contains a five to ten strand beta-sheet; however, its precise fold varies PUBMED:10926520.

    \ ' '8904' 'IPR015157' '\

    TMA7 plays a role in protein translation. Deletion of the TMA7 gene results in altered protein synthesis rates PUBMED:16702403.

    \ ' '8905' 'IPR015158' '\

    BUD22 has been shown in yeast to be a nuclear protein involved in bud-site selection. It plays a role in positioning the proximal bud pole signal PUBMED:11452010.

    \ ' '8906' 'IPR015159' '\

    Mer2 (Rec107) forms part of a complex that is required for meiotic double strand DNA break formation. Mer2 increases in abundance and is phosphorylated during prophase of cell division PUBMED:16783010. Blocking double strand break formation results in delayed dephosphorylation and dissociation of Mer2 from the chromosome PUBMED:16783010.

    \ ' '8907' 'IPR015160' '\

    Members of this family assume a helical secondary structure, with two alpha helices forming a disulphide cross-linked alpha-helical hairpin. The disulphide bonds are crucial for the toxic activity of the protein, and are required for maintenance of the tertiary structure, and subsequent interaction with the particulate form of guanylate cyclase, increasing cyclic GMP levels within the host intestinal epithelial cells PUBMED:8528070.

    \ ' '8908' 'IPR015161' '\

    Members of this family assume a beta-gamma-crystallin fold, wherein nine beta-strands are connected by loops, and are separated into two sheets, each sheet forming the Greek key motif. The two Greek key motifs face each other in the global topology. The three-dimensional structure of the molecule is a \'sandwich\'-shaped beta-barrel structure: hydrophobic side-chains are packed in the large interface area of the beta-sheets. This domain confers a cytocidal effect to the toxin, causing cell death in both budding and fission yeasts, and morphological changes in yeasts and filamentous fungi PUBMED:11114251.

    \ ' '8909' 'IPR009084' '\

    Bacteriophage Mu can integrate into the host bacterial genome and replicate via transposition. Mu requires the activity of four proteins for DNA transposition. Two of these proteins are the phage-encoded A and B transposition proteins, while the other two are host-specified accessory factors HU and IHF. These four proteins can form nucleoprotein complexes (transpososomes), which enable strand transfer. The stable protein-DNA intermediate is subsequently disassembled prior to DNA replication by host proteins.

    \

    The Mu B transposition protein is an ATP-dependent, DNA-binding protein required for target capture and immunity, as well as for activating transpososome function PUBMED:12791691. The C-terminal domain of the B transposition protein is believed to be involved in both DNA-binding and protein-protein contacts with the Mu A transposition protein. The structure of the C-terminal domain consists of four helices in an irregular array PUBMED:11060014.

    \ \ ' '8910' 'IPR015162' '\

    The CheY binding domain is found in the response regulator histidine kinase CheA. It adopts a secondary structure consisting of an open-face beta/alpha sandwich, with four antiparallel beta-strands and two alpha-helices. It binds to a corresponding domain on CheY, with subsequent phosphorylation of the CheY Asp57 residue, and activation of CheY, which then affects flagellar rotation PUBMED:11134926.

    \ ' '8911' 'IPR015163' '\

    The C-terminal domain of CDC6 assumes a winged helix fold, with a five alpha-helical bundle (alpha15-alpha19) structure, backed on one side by three beta strands (beta6-beta8). It has been shown that this domain acts as a DNA-localisation factor; however, its exact function is, as yet, unknown. Putative functions include: (1) mediation of protein-protein interactions and (2) regulation of nucleotide binding and hydrolysis. Mutagenesis studies have shown that this domain is essential for appropriate Cdc6 activity PUBMED:11030343.

    \ ' '8912' 'IPR015164' '\

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles PUBMED:12910258, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    \

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi\'s sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus PUBMED:11056549.

    \ \

    This domain adopts a secondary structure consisting of a five alpha-helix cyclin fold. Interaction with cyclin dependent kinases (CDKs) at a PSTAIRE sequence motif within the catalytic cleft of CDK results in the regulation of CDK activity PUBMED:11124804.

    \ ' '8913' 'IPR015165' '\

    This domain, which is found in a set of prokaryotic amylases, has no known function PUBMED:11272837.

    \ ' '8914' 'IPR015166' '\

    Members of this family consist of a beta-sheet region followed by an alpha-helix and an unstructured C-terminus. The beta-sheet region contains a CXCX...XCXC sequence with Cys residues located in two proximal loops and pointing towards each other. The precise function of this set of bacterial proteins is, as yet, unknown PUBMED:11017201.

    \ ' '8915' 'IPR015167' '\

    This domain is found in maltosyltransferases, adopting a secondary structure that consists of eight antiparallel beta-strands forming an open-sided \'jelly roll\' Greek key beta-barrel. Its exact function is, as yet, unknown PUBMED:11545590.

    \ ' '8916' 'IPR015168' '\

    This entry is found in the NMT1 and THI5 proteins. These proteins are proposed to be required for the biosynthesis of the pyrimidine moiety of thiamine PUBMED:12271461, PUBMED:2358444, PUBMED:12777485. They are regulated by thiamine PUBMED:2358444.

    \ ' '8917' 'IPR015169' '\

    This domain is found in mucosal vascular addressin cell adhesion molecule 1 proteins (MAdCAM-1). These are cell adhesion molecules expressed on the endothelium in mucosa that guide the specific homing of lymphocytes into mucosal tissues. MAdCAM-1 belongs to a subclass of the immunoglobulin superfamily (IgSF), the members of which are ligands for integrins PUBMED:9655832. The crystal structure of this domain has been reported; it adopts an immunoglobulin-like beta-sandwich structure, with seven strands arranged in two beta-sheets in a Greek-key topology PUBMED:11807247, PUBMED:9655832.

    \ ' '8918' 'IPR015170' '\

    This entry is found in a set of bacterial proteins, including Cytochrome c-type protein. It is functionally uncharacterised.

    \ ' '8919' 'IPR015171' '\

    This domain is found at the N terminus of cyclomaltodextrinase. The domain assumes a beta-sandwich structure composed of eight antiparallel beta-strands. A ten residue linker is also present at the C-terminal end, which connects the N-terminal domain to a distal domain in the protein. This domain participates in oligomerisation of the protein, wherein the N-terminal domain of one subunit contacts the active centre of the other subunit, and is also required for binding of cyclodextrin to substrate PUBMED:12752453.

    \ ' '8920' 'IPR015172' '\

    This entry represents an MIF4G-like domain. MIF4G domains share a common structure but can differ in sequence. This entry is designated "type 1", and is found in nuclear cap-binding proteins and eIF4G.

    \ \

    The MIF4G domain is a structural motif with an ARM (Armadillo) repeat-type fold, consisting of a 2-layer alpha/alpha right-handed superhelix. Proteins usually contain two or more structurally similar MIF4G domains connected by unstructured linkers. MIF4G domains are found in several proteins involved in RNA metabolism, including eIF4G (eukaryotic initiation factor 4-gamma), eIF-2b (translation initiation factor), UPF2 (regulator of nonsense transcripts 2), and nuclear cap-binding proteins (CBP80, CBC1, NCBP1), although the sequence identity between them may be low PUBMED:10958635.

    \ \

    The nuclear cap-binding complex (CBC) is a heterodimer. Human CBC consists of a large CBP80 subunit and a small CBP20 subunit, the latter being critical for cap binding. CBP80 contains three MIF4G domains connected with long linkers, while CBP20 has an RNP (ribonucleoprotein)-type domain that associates with domains 2 and 3 of CBP80 PUBMED:11545740. The complex binds to the 5\'-cap of eukaryotic RNA polymerase II transcripts, such as mRNA and U snRNA. The binding is important for several mRNA nuclear maturation steps and for nonsense-mediated decay. It is also essential for nuclear export of U snRNAs in metazoans PUBMED:16043498.

    \ \

    Eukaryotic translation initiation factor 4 gamma (eIF4G) plays a critical role in protein expression, and is at the centre of a complex regulatory network. Together with the cap-binding protein eIF4E, it recruits the small ribosomal subunit to the 5\'-end of mRNA and promotes the assembly of a functional translation initiation complex, which scans along the mRNA to the translation start codon. The activity of eIF4G in translation initiation could be regulated through intra- and inter-protein interactions involving the ARM repeats PUBMED:16156639. In eIF4G, the MIF4G domain binds eIF4A, eIF3, RNA and DNA.

    \ ' '8921' 'IPR015173' '\

    This domain adopts a right-handed triple-stranded beta-helix fold, and is found in the central region of the phage short tail fibre protein gp12 PUBMED:11743729.

    \ ' '8922' 'IPR015174' '\

    This entry represents an MIF4G-like domain. MIF4G domains share a common structure but can differ in sequence. This entry is designated "type 2", and is found in nuclear cap-binding proteins and eIF4G.

    \ \

    The MIF4G domain is a structural motif with an ARM (Armadillo) repeat-type fold, consisting of a 2-layer alpha/alpha right-handed superhelix. Proteins usually contain two or more structurally similar MIF4G domains connected by unstructured linkers. MIF4G domains are found in several proteins involved in RNA metabolism, including eIF4G (eukaryotic initiation factor 4-gamma), eIF-2b (translation initiation factor), UPF2 (regulator of nonsense transcripts 2), and nuclear cap-binding proteins (CBP80, CBC1, NCBP1), although the sequence identity between them may be low PUBMED:10958635.

    \ \

    The nuclear cap-binding complex (CBC) is a heterodimer. Human CBC consists of a large CBP80 subunit and a small CBP20 subunit, the latter being critical for cap binding. CBP80 contains three MIF4G domains connected with long linkers, while CBP20 has an RNP (ribonucleoprotein)-type domain that associates with domains 2 and 3 of CBP80 PUBMED:11545740. The complex binds to the 5\'-cap of eukaryotic RNA polymerase II transcripts, such as mRNA and U snRNA. The binding is important for several mRNA nuclear maturation steps and for nonsense-mediated decay. It is also essential for nuclear export of U snRNAs in metazoans PUBMED:16043498.

    \ \

    Eukaryotic translation initiation factor 4 gamma (eIF4G) plays a critical role in protein expression, and is at the centre of a complex regulatory network. Together with the cap-binding protein eIF4E, it recruits the small ribosomal subunit to the 5\'-end of mRNA and promotes the assembly of a functional translation initiation complex, which scans along the mRNA to the translation start codon. The activity of eIF4G in translation initiation could be regulated through intra- and inter-protein interactions involving the ARM repeats PUBMED:16156639. In eIF4G, the MIF4G domain binds eIF4A, eIF3, RNA and DNA.

    \ ' '8924' 'IPR015176' '\

    This entry represents a domain predominantly found in chondroitin ABC lyase I, adopting a jelly-roll fold topology consisting of a two-layered bent beta-sheet sandwich with one short alpha-helix. The convex beta sheet is composed of five antiparallel strands, whilst the concave beta-sheet contains five antiparallel beta-strands with a loop between two consecutive strands folding back onto the concave surface. This domain is required for binding of the protein to long glycosaminoglycan chains PUBMED:12706721.

    \ ' '8925' 'IPR015177' '\

    This domain is predominantly found in chondroitin ABC lyase I, adopting a helical structure, with fifteen alpha-helices which are at least two turns long and several short helical turns. The bulk of the domain is formed by ten alpha-helices forming five hairpin-like pairs and arranged into an incomplete toroid, the (alpha/alpha)5 fold. Additionally, two long and two short alpha-helices at the N terminus of the domain wrap around the toroid. At the C-terminal end of the toroid there is one additional short alpha-helix. This domain is required for degradation of polysaccharides containing 1,4-beta-D-hexosaminyl and 1,3-beta-D-glucuronosyl or 1,3-alpha-L-iduronosyl linkages to disaccharides containing 4-deoxy-beta-D-gluc-4-enuronosyl groups PUBMED:12706721.

    \ ' '8926' 'IPR015178' '\

    This entry represents a domain found in a set of prokaryotic transferases. It adopts an immunoglobulin/albumin-binding domain-like fold, with a bundle of three alpha-helices, though its function is, as yet, unknown PUBMED:12618437.

    \ ' '8927' 'IPR015179' '\

    Alpha-amylase is classified as family 13 of the glycosyl hydrolases and is present in archaea, bacteria, plants and animals. Alpha-amylase is an essential enzyme in alpha-glucan metabolism, acting to catalyse the hydrolysis of alpha-1,4-glucosidic bonds of glycogen, starch and related polysaccharides. Although all alpha-amylases possess the same catalytic function, they can vary with respect to sequence. In general, they are composed of three domains: a TIM barrel containing the active site residues and chloride ion-binding site (domain A), a long loop region inserted between the third beta strand and the alpha-helix of domain A that contains calcium-binding site(s) (domain B), and a C-terminal beta-sheet domain that appears to show some variability in sequence and length between amylases (domain C) PUBMED:11141191. Amylases have at least one conserved calcium-binding site, as calcium is essential for the stability of the enzyme. Chloride binding functions to activate the enzyme, which acts by a two-step mechanism involving a catalytic nucleophile base (usually an Asp) and a catalytic proton donor (usually a Glu) that are responsible for the formation of the beta-linked glycosyl-enzyme intermediate.

    \

    This entry represents a domain found in prokaryotic alpha-amylase and 4-alpha-glucanotransferase. This domain adopts a beta-sandwich fold, in which two layers of anti-parallel beta-sheets are arranged in a nearly parallel fashion. The exact function of this domain is, as yet, unknown; however, it has been proposed that it may play a role in transglycosylation reactions PUBMED:12618437.

    \

    More information about this protein can be found at Protein of the Month: alpha-Amylase.

    \ ' '8928' 'IPR015180' '\

    This domain adopts a beta barrel structure with a Greek key topology, which is topologically similar to the FMN-binding split barrel. It is found at the C-terminus of the Gp27 protein, a structural component of the viral baseplate PUBMED:11823865.

    \ ' '8929' 'IPR015181' '\

    This domain adopts a beta barrel structure with a Greek key topology, which is topologically similar to the FMN-binding split barrel. It is found at the N-terminus of the Gp27 protein, a structural component of the viral baseplate PUBMED:11823865.

    \ ' '8930' 'IPR015182' '\

    This domain is predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase, assuming a secondary structure which consists of two alpha helices. It contains two cysteine residues that are involved in thioether linkages to haem PUBMED:12925784.

    \ ' '8931' 'IPR015183' '\

    This domain is predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase, adopting an immunoglobulin-like beta-sandwich fold, with seven strands arranged into two beta sheets; the fold is possibly related to the immunoglobulin and/or fibronectin type III superfamilies. The precise function of this domain has not, as yet, been defined PUBMED:12925784.

    \ ' '8932' 'IPR015184' '\

    This domain is predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase, adopting an immunoglobulin-like beta-sandwich fold, with seven strands arranged into two beta sheets; the fold is possibly related to the immunoglobulin and/or fibronectin type III superfamilies. The precise function of this domain has not, as yet, been defined PUBMED:12925784.

    \ ' '8933' 'IPR015185' '\

    This domain is found in Pseudomonas aeruginosa exotoxin A, and is responsible for binding of the toxin to the alpha-2-macroglobulin receptor, with subsequent internalisation into endosomes. It adopts a thirteen-strand antiparallel beta jelly roll topology, which belongs to the concanavalin A-like lectins/glucanases fold superfamily PUBMED:11734000.

    \ ' '8934' 'IPR015186' '\

    This domain, found in Pseudomonas aeruginosa exotoxin A, is responsible for transmembrane targeting of the toxin, as well as transmembrane translocation of the catalytic domain into the cytoplasmic compartment. A furin cleavage site is present within the domain: cleavage generates a 37 kDa carboxy-terminal fragment, which includes the enzymatic domain, which is then translocated into the cytoplasm. It adopts a helical structure, with six alpha-helices forming a bundle PUBMED:11734000.

    \ ' '8935' 'IPR015187' '\

    This domain assumes an OB fold, which consists of a highly curved five-stranded beta-sheet that closes on itself to form a beta-barrel. OB1 has a shallow groove formed by one face of the curved sheet and is demarcated by two loops, one between beta 1 and beta 2 and another between beta 4 and beta 5, which allows for weak single strand DNA binding. The domain also binds the 70-amino acid DSS1 (deleted in split-hand/split foot syndrome) protein, which was originally identified as one of three genes that map to a 1.5-Mb locus deleted in an inherited developmental malformation syndrome PUBMED:12228710.

    \ ' '8936' 'IPR015188' '\

    This domain assumes an OB fold, which consists of a highly curved five-stranded beta-sheet that closes on itself to form a beta-barrel. OB3 has a pronounced groove formed by one face of the curved sheet and is demarcated by two loops, one between beta 1 and beta 2 and another between beta 4 and beta 5, which allows for strong ssDNA binding PUBMED:12228710.

    \ ' '8937' 'IPR015189' '\

    This entry represents a domain with a winged helix-type fold, which consists of a closed 3-helical bundle with a right-handed twist, and a small beta-sheet wing PUBMED:12145214. Different winged helix domains share a common structure, but can differ in sequence. This entry is designated "type 1".

    \ \

    The winged helix motif is involved in both DNA and RNA binding. In the elongation factor SelB, the winged helix domains recognise RNA, allowing the complex to wrap around the small ribosomal subunit. In bacteria, the incorporation of the amino acid selenocysteine into proteins requires elongation factor SelB, which binds both transfer RNA (tRNA) and mRNA. SelB binds to an mRNA hairpin formed by the selenocysteine insertion sequence (SECIS) with extremely high specificity PUBMED:15665870.

    \ ' '8938' 'IPR015190' '\

    This entry represents a domain with a winged helix-type fold, which consists of a closed 3-helical bundle with a right-handed twist, and a small beta-sheet wing PUBMED:12145214. Different winged helix domains share a common structure, but can differ in sequence. This entry is designated "type 2".

    \ \

    The winged helix motif is involved in both DNA and RNA binding. In the elongation factor SelB, the winged helix domains recognise RNA, allowing the complex to wrap around the small ribosomal subunit. In bacteria, the incorporation of the amino acid selenocysteine into proteins requires elongation factor SelB, which binds both transfer RNA (tRNA) and mRNA. SelB binds to an mRNA hairpin formed by the selenocysteine insertion sequence (SECIS) with extremely high specificity PUBMED:15665870.

    \ ' '8939' 'IPR015191' '\

    This entry represents a domain with a winged helix-type fold, which consists of a closed 3-helical bundle with a right-handed twist, and a small beta-sheet wing PUBMED:12145214. Different winged helix domains share a common structure, but can differ in sequence. This entry is designated "type 3".

    \ \

    The winged helix motif is involved in both DNA and RNA binding. In the elongation factor SelB, the winged helix domains recognise RNA, allowing the complex to wrap around the small ribosomal subunit. In bacteria, the incorporation of the amino acid selenocysteine into proteins requires elongation factor SelB, which binds both transfer RNA (tRNA) and mRNA. SelB binds to an mRNA hairpin formed by the selenocysteine insertion sequence (SECIS) with extremely high specificity PUBMED:15665870.

    \ ' '8940' 'IPR015192' '\

    This domain, found in sex-determining protein Xol-1, adopts a secondary structure consisting of five alpha helices and six antiparallel beta sheets, in a beta-alpha-beta-beta-beta-alpha-beta-alpha-alpha-alpha-beta arrangement. The fold of this family is similar to that found in ribosomal protein S5 domain 2-like PUBMED:12672694. The active site of the enzyme is found at the interface between this domain and the C-terminal GHMP-like domain.

    \ ' '8941' 'IPR015193' '\

    This domain, found in sex-determining protein Xol-1, adopts a secondary structure consisting of five alpha helices and seven antiparallel beta sheets, in a beta-alpha-beta-alpha-alpha-alpha-beta-beta-alpha-beta-beta-beta arrangement. The fold of this family is structurally similar to that found in the C-terminal domain of GHMP kinase PUBMED:12672694. The active site of the enzyme is found at the interface between this domain and the N-terminal domain.

    \ ' '8942' 'IPR015194' '\

    Nucleosome remodelling is an energy-dependent process that alters histone-DNA interactions within nucleosomes, thereby rendering nucleosomal DNA accessible to regulatory factors. The ATPases involved belong to the SWI2/SNF2 subfamily of DEAD/H-helicases, which contain a conserved ATPase domain characterised by seven motifs. Proteins within this family differ with regard to domain organisation, their associated proteins and the remodelling complex in which they reside.

    \

    The ATPase ISWI is a member of this family. ISWI can be divided into two regions: an N-terminal region that contains the SWI2/SNF2 ATPase domain, and a C-terminal region that is responsible for substrate recognition. The C-terminal region contains 12 alpha-helices and can be divided into three domains and a spacer region: a HAND domain (named because its 4-helical structure resembles an open hand), a SANT domain (c-Myb DNA-binding like), a spacer helix, and a SLIDE domain (SANT-like but with several insertions).

    \

    This entry represents the HAND domain, which adopts a secondary structure consisting of four alpha helices, three of which (H2, H3, H4) form an L-like configuration. Helix H2 runs antiparallel to helices H3 and H4, packing closely against helix H4, whilst helix H1 reposes in the concave surface formed by these three helices and runs perpendicular to them. This domain confers DNA and nucleosome binding properties to the protein PUBMED:14536084.

    \ ' '8943' 'IPR015195' '\

    The SLIDE domain adopts a secondary structure comprising a main core of three alpha-helices. It has a role in DNA binding, contacting DNA target sites similar to c-Myb () repeats or homeodomains PUBMED:14536084.

    \ ' '8944' 'IPR015196' '\

    This domain adopts an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. It is similar in topology to many viral capsid proteins, as well as lectins and several glucanases. This domain allows the protein to bind sugars and catalyses the complete removal of N-linked oligosaccharide chains from glycoproteins PUBMED:7881905.

    \ ' '8945' 'IPR015197' '\

    This domain adopts an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. It is similar in topology to many viral capsid proteins, as well as lectins and several glucanases. This domain allows the protein to bind sugars and catalyses the complete removal of N-linked oligosaccharide chains from glycoproteins PUBMED:7881905.

    \ ' '8946' 'IPR015198' '\

    Transcription factor MotA is required for the activation of middle promoters in Bacteriophage T4, in addition to phage T4 co-activator AsiA, and sigma-70-containing Escherichia coli RNA polymerase. Phage T4 middle promoters have the sigma70 -10 DNA element, but not the -35 element; instead, they have a MotA box at -30 to which the transcription factor MotA binds PUBMED:16996538. MotA and AsiA interact with the C-terminal of sigma70 (region 4), which normally binds the -35 element and the beta-flap, thereby diverting sigma70 away from host promoters that require -35 element-binding to phage T4 middle promoters.

    \

    Transcription factor MotA has two domains: an N-terminal domain required for binding to sigma70, and a C-terminal domain required for binding to the -30 MotA box element in the phage T4 middle promoter. This entry represents the N-terminal (activation) domain of MotA factors that binds sigma70. The N-terminal domain adopts an almost completely alpha-helical topology, with five alpha-helices and a short, two-stranded, beta-ribbon. Four alpha helices (alpha1, alpha3, alpha4 and alpha5) are amphipathic and pack their hydrophobic surfaces around the central helix alpha2 PUBMED:9155025.

    \ ' '8947' 'IPR015199' '\

    This entry represents a domain which is predominantly found in prokaryotic DNA polymerase III, assuming an alpha helical structure with a core of five alpha helices and an additional small helix. This domain is essential for the formation of the polymerase clamp loader PUBMED:9363942.

    \ ' '8948' 'IPR015200' '\

    This domain is essential for the interaction of the gp45 sliding clamp with the corresponding polymerase. It adopts a DNA clamp fold, consisting of two alpha helices and two beta sheets - the fold is duplicated and has internal pseudo two-fold symmetry PUBMED:10535734.

    \ ' '8949' 'IPR015201' '\

    MiAMP1 is a highly basic protein from the nut kernel of Macadamia integrifolia (Macadamia nut), which inhibits the growth of several microbial plant pathogens in vitro while having no effect on mammalian or plant cells. It consists of eight beta-strands which are arranged in two Greek key motifs. These Greek key motifs then associate to form a Greek key beta-barrel PUBMED:10543955.

    \ ' '8950' 'IPR015202' '\

    This domain adopts a secondary structure consisting of a bundle of seven, mostly antiparallel, beta-strands surrounding a hydrophobic core. The seven strands are arranged in two sheets, in a Greek-key topology. Its precise function has not, as yet, been defined, though it is mostly found in sugar-utilising enzymes, such as galactose oxidase PUBMED:11698678.

    \ ' '8951' 'IPR015203' '\

    Members of this family bind the chaperone SicP, which is required both to maintain the stability of SptP and to ensure the eventual secretion of the protein. The domain is found in the Salmonella effector protein SptP, which interacts with SicP chaperone dimers mainly through four regions of its chaperone-binding domain. The structure of the SptP-SicP complex contains four molecules of SicP, aligned in a linear fashion and arranged in two sets of tightly bound homodimers that bind two SptP molecules. The SicP homodimers do not interact with each other, but are held together by a molecular interface formed between two SptP molecules. Each SptP molecule is wrapped around by three SicP chaperones (two chaperones from one homodimer and a third one from the opposite homodimer pair) PUBMED:11689946.

    \ ' '8953' 'IPR015205' '\

    This domain adopts a secondary structure consisting of a pair of long, antiparallel alpha-helices (the stem) that support a three-helix bundle (3HB) at their end. The 3HB contains a helix-turn-helix motif and is similar to the DNA binding domains of the bacterial site-specific recombinases, and of eukaryotic Myb and homeodomain transcription factors. The Tower domain has an important role in the tumour suppressor function of BRCA2, and is essential for appropriate binding of BRCA2 to DNA PUBMED:12228710.

    \ ' '8954' 'IPR015206' '\

    This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase PUBMED:12952945.

    \ ' '8955' 'IPR015207' '\

    This entry represents a set of hypothetical bacterial proteins containing a core of six alpha-helices, where one central helix is surrounded by the other five. The exact function of this family has not, as yet, been determined PUBMED:16287087.

    \ ' '8956' 'IPR015208' '\

    This entry represents a dimerisation domain predominantly found in Bacteriophage T4 recombination endonuclease VII. It adopts a helical secondary structure, with three alpha helices oriented parallel to each other. As well as mediating dimerisation of the protein, this domain is also involved in binding to the DNA major groove PUBMED:11327769.

    \ ' '8957' 'IPR015209' '\

    This N-terminal domain forms the transmembrane region in subunit II of cytochrome c oxidase from Thermus thermophilus. This domain adopts a tertiary structure consisting of two antiparallel transmembrane helices, in a transmembrane helix hairpin fold PUBMED:10775261.

    \ ' '8958' 'IPR015210' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the C-terminal domain of the restriction endonuclease NaeI and other DNA-binding proteins, which adopts a secondary structure consisting of nine alpha-helices, six 3-10 helices and 13 beta-strands. NaeI binds two GCC-CGG recognition sequences to cleave DNA into blunt-ended products PUBMED:10856254.

    \ ' '8959' 'IPR015211' '\

    This C-terminal domain is found in peptidases belonging to MEROPS peptidase family M1, particularly: aminopeptidase-1 of Caenorhabditis elegans, aminopeptidase O, aminopeptidase B and the bifunctional leukotriene A4 hydrolase/aminopeptidase.

    The domain adopts a structure consisting of two layers of parallel alpha-helices, five in the inner layer and four in the outer, arranged in an antiparallel manner, with perpendicular loops containing short helical segments on top. It is required for the formation of a deep cleft harbouring the catalytic Zn2+ site in leukotriene A4 hydrolase PUBMED:11175901.

    \ ' '8960' 'IPR015212' '\

    This entry represents a domain consisting of twelve helices that fold into a compact structure that contains the overall structural scaffold observed in other regulator of G protein signalling (RGS) proteins and three additional helical elements that pack closely to it. Helices 1-9 comprise the RGS fold, in which helices 4-7 form a classic antiparallel bundle adjacent to the other helices. Like other RGS structures, helices 7 and 8 span the length of the folded domain and form essentially one continuous helix with a kink in the middle. Helices 10-12 form an apparently stable C-terminal extension of the structural domain, and although other RGS proteins lack this structure, these elements are intimately associated with the rest of the structural framework by hydrophobic interactions. This domain binds to active G-alpha proteins, promoting GTP hydrolysis by the alpha subunit of heterotrimeric G proteins, thereby inactivating the G protein and rapidly switching off G protein-coupled receptor signalling pathways PUBMED:11470431.

    \ ' '8961' 'IPR015213' '\

    The substrate-binding domain found in cholesterol oxidase is composed of an eight-stranded mixed beta-pleated sheet and six alpha-helices. This domain is positioned over the isoalloxazine ring system of the FAD cofactor bound by the FAD-binding domain () and forms the roof of the active site cavity, allowing for catalysis of oxidation and isomerisation of cholesterol to cholest-4-en-3-one PUBMED:11397813.

    \ ' '8962' 'IPR015814' '\

    This domain has been found in a number of eukaryotic and prokaryotic proteins, some of which are predicted to be 6-phosphogluconate dehydrogenase, NAD-binding proteins.

    \ ' '8963' 'IPR015214' '\

    Many Bacillus species produce crystals of insecticidal toxins during spore formation. When an insect ingests these proteins, they are activated by proteolytic cleavage. The N-terminus is cleaved in all of the proteins and a C-terminal extension is cleaved in some members. Once activated, the endotoxin binds to the gut epithelium and causes cell lysis by the formation of cation-selective channels, which leads to death. The activated region of the delta toxin is composed of three distinct structural domains: an N-terminal helical bundle domain () involved in membrane insertion and pore formation; a beta-sheet central domain involved in receptor binding; and a C-terminal beta-sandwich domain () that interacts with the N-terminal domain to form a channel PUBMED:7490762, PUBMED:11468393.

    \ \

    This entry represents the central beta-sheet domain, which consists of three four-stranded beta-sheets, each with a Greek key fold, with internal pseudo threefold symmetry. Thus, it acts as a receptor binding beta-prism, binding to insect-specific receptors of gut epithelial cells PUBMED:11377201. This entry is found almost exclusively in Bacillus thuringiensis.

    \ ' '8964' 'IPR015215' '\

    This entry represents a domain that is often, though not exclusively, found in short-chain scorpion toxins. It forms a cysteine-stabilised alpha/beta scaffold consisting of a short 3-10-helix and a two-stranded antiparallel beta-sheet. The biological method of action of the toxins has not yet been defined, and the function of this domain remains unknown PUBMED:15683869.

    \ ' '8965' 'IPR015216' '\

    The SANTA domain (SANT associated) is approximately 90 amino acids in length and is conserved in eukaryotes. It is sometimes found in association with the SANT domain (, also known as the Myb-like DNA-binding domain) implying a putative function in regulating chromatin remodelling. Sequence analysis has shown that the SANTA domain is likely to form four central beta-sheets with three flanking alpha-helices. Many conserved hydrophobic residues are present, which implies a possible role in protein-protein interactions.

    \ ' '8966' 'IPR015217' '\

    This domain adopts a structure consisting of an immunoglobulin-like beta-sandwich, with seven strands in two beta-sheets, arranged in a Greek-key topology. It forms part of the extracellular region of the protein, which can be expressed as a soluble protein (Inv497) that binds integrins and promotes subsequent uptake by cells when attached to bacteria PUBMED:10514372.

    \ ' '8967' 'IPR015218' '\

    This entry represents a family of fungal proteins, including Alb1, that are involved in ribosomal biogenesis PUBMED:16651379.

    \ ' '8968' 'IPR015219' '\

    The glucodextranase B domain adopts a structure consisting of seven/eight-strand antiparallel beta-sheets, in a Greek-key topology, similar to the immunoglobulin beta-sandwich fold. It acts as a cell wall anchor, interacting with the S-layer present in the cell wall of Gram-positive bacteria by hydrophobic interactions. In glucodextranase, domain B is buried in the S-layer, and a flexible linker located between domain B and the catalytic unit confers motion to the catalytic unit, which is capable of efficient hydrolysis of the substrates located close to the cell surface PUBMED:14660574.

    \ ' '8969' 'IPR015220' '\

    Glucodextranase domain N, uniquely found in bacterial and archaeal glucoamylases and glucodextranases, adopts a structure consisting of 17 antiparallel beta-strands. These beta-strands are divided into two beta-sheets, and one of the beta-sheets is wrapped by an extended polypeptide, which appears to stabilise the domain. This domain, together with glycoside hydrolase domain A (), is mainly involved with catalytic activity, hydrolysing alpha-1,6-glucosidic linkages of dextran to release beta-D-glucose from the non-reducing end via an inverting reaction mechanism PUBMED:14660574.

    \ ' '8970' 'IPR015221' '\

    Ubiquitin related modifier 1 (Urm1) is a ubiquitin related protein that modifies proteins in the yeast ubiquitin-like urmylation pathway PUBMED:16864801. Structural comparisons and phylogenetic analysis of the ubiquitin superfamily have indicated that Urm1 has the most conserved structural and sequence features of the common ancestor of the entire superfamily PUBMED:14551258.

    \ ' '8971' 'IPR015222' '\

    MMp37 is a mitochondrial matrix protein that functions in the translocation of proteins across the mitochondrial inner membrane PUBMED:16790493.

    \ ' '8972' 'IPR015223' '\

    MipZ is an ATPase that forms a complex with the chromosome partitioning protein ParB near the chromosomal origin of replication PUBMED:16839883. It is responsible for the temporal and spatial regulation of FtsZ ring formation PUBMED:16839883.

    \ ' '8973' 'IPR015224' '\

    This domain adopts a structure consisting of five alpha helices that fold into a bundle. It contains a Vinculin binding site (VBS) composed of a hydrophobic surface spanning five turns of helix four. Activation of the VBS causes subsequent recruitment of Vinculin, which enables maturation of small integrin/talin complexes into more stable adhesions. Formation of the complex between VBS and Vinculin requires prior unfolding of this middle domain: once released from the talin hydrophobic core, the VBS helix is then available to induce the \'bundle conversion\' conformational change within the vinculin head domain thereby displacing the intramolecular interaction with the vinculin tail, allowing vinculin to bind actin PUBMED:15272303.

    \ ' '8974' 'IPR015225' '\

    TruB is responsible for the pseudouridine residue present in the T loops of virtually all tRNAs. TruB recognises the preformed 3-D structure of the T loop primarily through shape complementarity. It accesses its substrate uridyl residue by flipping out the nucleotide and disrupts the tertiary structure of tRNA PUBMED:11779468. The C-terminal domain adopts a secondary structure consisting of a four-stranded beta sheet and one alpha helix, similar to that found in PUA domains. It is predominantly involved in RNA-binding, being mostly found in tRNA pseudouridine synthase B (TruB) PUBMED:15028724.

    \

    Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are four distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an alpha+beta structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are PUBMED:10529181:

    \ \

    \ \ ' '8975' 'IPR015226' '\

    Members of this family of plant pathogenic proteins adopt an elongated structure somewhat reminiscent of a mushroom that can be divided into \'stalk\' and \'head\' subdomains. The stalk subdomain is composed of the N-terminal helix (alpha1) and beta strands beta3-beta4. An antiparallel beta sheet (beta5, beta7-beta8) forms the base of the head subdomain that interacts with the stalk. A pair of twisted antiparallel beta sheets (beta1 and beta6; beta2 and beta9/9\') supported by alpha2 form the dome of the head. The head subdomain possesses weak structural similarity with the catalytic portion of a number of ADP-ribosyltransferase toxins PUBMED:15341731.

    \ ' '8976' 'IPR015227' '\

    Members of this family of Yersinia pseudotuberculosis mitogens adopt a sandwich structure consisting of nine strands in two beta sheets, in a jelly-roll topology. As with other superantigens, they are able to excessively activate T cells by binding to the T cell receptor PUBMED:14725774.

    \ ' '8977' 'IPR015228' '\

    Ubiquitin-associated domains contain approximately 40 residues and bind ubiquitin noncovalently. They adopt a secondary structure consisting of three alpha-helices, and have been identified in various modular proteins involved in protein trafficking, clathrin assembly/disassembly, DNA repair, proteasomal degradation, and cell cycle regulation PUBMED:14997574.

    \ ' '8978' 'IPR015229' '\

    BmKK2 (a member of scorpion toxin subfamily alpha-KTx 14) is a novel short-chain peptide from the Asian scorpion Mesobuthus martensii Karsch. It is a K+ channel blocker and is composed of 31 amino acid residues PUBMED:15146482. The peptide adopts a classical alpha/beta-scaffold for alpha-KTxs. BmKK2 selectively inhibits the delayed rectifier K+ current, but does not affect the fast transient K+ current PUBMED:15208022.

    \ \

    The alpha helix is shorter and the beta-sheet element is smaller (each strand consists of only two residues). There is an alpha-mode binding between the toxin and the channels. It has a lower activity towards Kv channels and it is predicted that it may prefer a type of SK channel with a narrower entryway as a specific receptor PUBMED:15146482.\

    \ ' '8979' 'IPR015230' '\

    This domain is predominantly found in carbapenam synthetase, and is composed of two antiparallel six-stranded beta-sheets that form a sandwich, flanked on each side by two alpha-helices. Its exact function has not, as yet, been determined PUBMED:12890666.

    \ ' '8980' 'IPR015231' '\

    This entry represents a family of hypothetical bacterial proteins. Their precise function has not, as yet, been defined.

    \ ' '8981' 'IPR015232' '\

    This entry represents a conserved region found in various bacterial and eukaryotic hypothetical proteins, as well as in the cysteine protease calpain. Its function has not, as yet, been defined.

    \ ' '8982' 'IPR015233' '\

    Carotenoids such as beta-carotene, lycopene, lutein and beta-cryptoxanthin are produced in plants and certain bacteria, algae and fungi, where they function as accessory photosynthetic pigments and as scavengers of oxygen radicals for photoprotection. They are also essential dietary nutrients in animals. Orange carotenoid-binding proteins (OCP) were first identified in cyanobacterial species, where they occur associated with phycobilisome in the cellular thylakoid membrane. These proteins function in photoprotection, and are essential for inhibiting white and blue-green light non-photochemical quenching (NPQ) PUBMED:17307930, PUBMED:16531492. Carotenoids improve the photoprotectant activity by broadening OCP\'s absorption spectrum and facilitating the dissipation of absorbed energy. OCP acts as a homodimer, and binds one molecule of carotenoid (3\'-hydroxyechinenone) and one chloride ion per subunit, where the carotenoid binding site is lined with a striking number of methionine residues. The carotenoid 3\'-hydroxyechinenone is not found in higher plants. OCP has two domains: an N-terminal helical domain and a C-terminal domain that resembles an NTF2 (nuclear transport factor 2) domain. OCP can be proteolytically cleaved into a red form (RCP), which lacks 15 residues from the N-terminus and approximately 150 residues from the C-terminus PUBMED:16034528.

    \

    This entry represents the N-terminal domain found predominantly in prokaryotic orange carotenoid proteins and related carotenoid-binding proteins. It adopts an alpha-helical structure consisting of two four-helix bundles PUBMED:12517340.

    \ ' '8983' 'IPR015234' '\

    This domain is found in a set of hypothetical archaeal proteins. Its exact function has not, as yet, been defined.

    \ ' '8984' 'IPR015235' '\

    This entry represents a set of hypothetical bacterial proteins whose exact function has not, as yet, been described.

    \ ' '8985' 'IPR015236' '\

    This domain, which is predominantly found in the archaeal protein O6-alkylguanine-DNA alkyltransferase, adopts a secondary structure consisting of a three stranded antiparallel beta-sheet and three alpha helices. The exact function has not, as yet, been defined, though it has been postulated that this domain may confer thermostability to the protein PUBMED:10497033.

    \ ' '8986' 'IPR015237' '\

    This domain, which is predominantly found in archaeal alpha-amylases, adopts a secondary structure consisting of an eight-stranded antiparallel beta-sheet containing a Greek key motif. Its exact function has not, as yet, been determined PUBMED:12482867.

    \ ' '8987' 'IPR015238' '\

    This family adopts a secondary structure consisting of six alpha helices, with four long helices (alpha1, alpha2, alpha5, alpha6) forming a left-handed, antiparallel alpha helical bundle. The function of this family of archaeal hypothetical proteins has not, as yet, been defined PUBMED:15704011.

    \ ' '8988' 'IPR015239' '\

    Anthrax toxin is a plasmid-encoded toxin complex produced by the Gram-positive, spore-forming bacterium Bacillus anthracis. The toxin consists of three non-toxic proteins: the protective antigen (PA), the lethal factor (LF) and the edema factor (EF) PUBMED:14570563. These component proteins self-assemble at the surface of host cell receptors, yielding a series of toxic complexes that can produce shock-like symptoms and death. Anthrax toxin is one of a large group of Bacillus and Clostridium exotoxins referred to as binary toxins, forming independent enzymatic (A moiety) and binding (B moiety) components. The LF and EF proteins are the enzymes (A moiety) that act on cytosolic substrates, while PA is a multi-functional protein (B moiety) that binds to cell surface receptors, mediates the assembly and internalisation of the complexes, and delivers them to the host cell endosome PUBMED:17335404. Once PA is attached to the host receptor PUBMED:17381430, it must then be cleaved by a host cell surface (furin family) protease before it is able to bind EF and LF. The cleavage of the N-terminus of PA enables the C-terminal fragment to self-associate into a ring-shaped heptameric complex (prepore) that can bind LF or EF competitively. The PA-LF/EF complex is then internalised by endocytosis, and delivered to the endosome, where PA forms a pore in the endosomal membrane in order to translocate LF and EF to the cytosol. LF is a Zn-dependent metalloprotease that cleaves and inactivates mitogen-activated protein (MAP) kinases, kills macrophages, and causes death of the host by inhibiting cell proliferation PUBMED:14616089, PUBMED:11700563. EF is a calcium- and calmodulin-dependent adenylyl cyclase that can cause edema (fluid-filled swelling) when associated with PA. EF is not toxic by itself, and is required for the survival of germinated Bacillus spores within macrophages at the early stages of infection. EF dramatically elevates the level of host intracellular cAMP, a ubiquitous messenger that integrates many processes of the cell; increases in cAMP can interfere with host intracellular signalling PUBMED:15131111.

    \

    This entry represents the central domain found in the lethal factor protein of anthrax toxin.

    \ ' '8989' 'IPR015240' '\

    TruB is responsible for the pseudouridine residue present in the T loops of virtually all tRNAs. TruB recognises the preformed 3-D structure of the T loop primarily through shape complementarity. It accesses its substrate uridyl residue by flipping out the nucleotide and disrupts the tertiary structure of tRNA PUBMED:11779468. The C-terminal domain adopts a secondary structure consisting of a four-stranded beta sheet and one alpha helix, similar to that found in PUA domains. It is predominantly involved in RNA-binding, being mostly found in tRNA pseudouridine synthase B (TruB) PUBMED:15028724.

    \

    Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are four distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an alpha+beta structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are PUBMED:10529181:

    \ \

    \ \ ' '8990' 'IPR015241' '\

    Transcription factor MotA is required for the activation of middle promoters in Bacteriophage T4, in addition to phage T4 co-activator AsiA, and sigma-70-containing Escherichia coli RNA polymerase. Phage T4 middle promoters have the sigma70 -10 DNA element, but not the -35 element; instead, they have a MotA box at -30 to which the transcription factor MotA binds PUBMED:16996538. MotA and AsiA interact with the C-terminal of sigma70 (region 4), which normally binds the -35 element and the beta-flap, thereby diverting sigma70 away from host promoters that require -35 element-binding to phage T4 middle promoters.

    \

    Transcription factor MotA has two domains: an N-terminal domain required for binding to sigma70, and a C-terminal domain required for binding to the -30 MotA box element in the phage T4 middle promoter. This entry represents the C-terminal domain of MotA factors, which adopts a compact alpha/beta structure comprising three alpha-helices and six beta-strands in the order: alpha1-beta1-beta2-beta3-beta4-alpha2-beta5-beta6-alpha3. In this architecture, the domain\'s hydrophobic core is at the sheet-helix interface, and the second surface of the beta-sheet is completely exposed. It contains a DNA-binding motif, with a consensus sequence containing nine base pairs (5\'-TTTGCTTTA-3\'), that appears to bind to various mot boxes, allowing access to the minor groove towards the 5\'-end of this sequence and the major groove towards the 3\'-end PUBMED:11918797.

    \ ' '8991' 'IPR015242' '\

    This domain forms a ribonuclease H fold consisting of two beta sheets and one alpha helix, arranged as a beta-alpha-beta motif. Each beta sheet has five strands, arranged in a 32145 order, with the second strand being antiparallel to the rest. Proteins containing this domain are capable of resolving Holliday junctions and cleave DNA after 5\'-CT-3\' and 5\'-TT-3\' sequences PUBMED:11726496.

    \ ' '8992' 'IPR015243' '\

    This domain adopts a secondary structure consisting of a beta sandwich, with nine strands arranged in two sheets in a Greek key topology. It is predominantly found in bacterial mannose-specific adhesins, and is capable of binding to D-mannose PUBMED:12010488.

    \ ' '8994' 'IPR015245' '\

    This domain adopts a structure consisting of an alpha+beta sandwich with an antiparallel beta-sheet, arranged in a 2(beta-alpha-beta) motif. It is mainly found in mRNA export factors, and mediates the sequence-nonspecific nuclear export of cellular mRNAs as well as the sequence-specific export of retroviral mRNAs bearing the constitutive transport element PUBMED:11854490.

    \ ' '8995' 'IPR015246' '\

    The transmembrane domain of the beta subunit of formate dehydrogenase consists of a single transmembrane helix. This domain acts as a transmembrane anchor, allowing the conduction of electrons within the protein PUBMED:11884747.

    \ ' '8996' 'IPR015247' '\

    This domain is predominantly found in Vitamin D binding proteins, and adopts a multihelical structure. It is required for formation of an actin \'clamp\', allowing the protein to bind to actin PUBMED:12048248.

    \ ' '8997' 'IPR015248' '\

    The N-terminal domain of the ubiquinol-cytochrome c reductase iron-sulphur subunit adopts a structure consisting of many antiparallel beta sheets, with few alpha helices, in a non-globular arrangement. It is required for proper functioning of the respiratory chain PUBMED:12269811.

    \ ' '8998' 'IPR015249' '\

    This entry represents the catalytic domain of biliverdin reductase, which adopts a structure containing a six-stranded beta-sheet that is flanked on one face by several alpha-helices. This domain contains the catalytic active site, which reduces the gamma-methene bridge of the open tetrapyrrole, biliverdin IX alpha, to bilirubin with the concomitant oxidation of an NADH or NADPH cofactor PUBMED:12079357.

    \ ' '8999' 'IPR015250' '\

    This domain, found in bacterial proteins, assumes a beta-sandwich structure consisting of two antiparallel beta-sheets (similar to an immunoglobulin-like fold), and an additional small, antiparallel beta-sheet. The longer-stranded beta-sheet is made up of four antiparallel beta-strands. The shorter-stranded beta-sheet consists of five beta-strands, four of which form an antiparallel beta-sheet. The exact function of this domain is unknown, though a putative role includes involvement in host-bacterial interactions involved in endocytosis or phagocytosis, possibly during bacterial internalisation PUBMED:12441386.

    \ ' '9000' 'IPR015251' '\

    This N-terminal domain adopts a secondary structure consisting of a helical bundle of eight alpha helices and three beta strands, with the last alpha helix connecting to the first strand of the catalytic domain. The first strand of the N-terminus also forms a small parallel beta sheet with strand five of the catalytic domain. This domain mediates dimerisation of the protein, with two proline residues present in the domain being critical for interaction PUBMED:12377124.

    \ ' '9001' 'IPR015252' '\

    The mechanism of double-strand break repair (DSBR) is evolutionarily conserved and can be divided into two main pathways: homologous recombination (HR) and non-homologous end joining (NHEJ). NHEJ involves the ligation of broken DNA ends, requires little or no sequence homology and may be mutagenic. HR can occur conservatively by gene conversion using a homologous strand of DNA as a template or non-conservatively by single-strand annealing, which takes place between repeat sequences and can lead to the loss of intervening genomic DNA or translocations. Brca2 cancer susceptibility protein functions in double-strand DNA break repair by homologous recombination PUBMED:17822964. Double-strand breaks in DNA elicit a coordinated response that results in either repair of the damage or elimination of the cell. Mutations in Brca2 lead to various types of cancer, and also give rise to the multi-faceted Fanconi anaemia syndrome PUBMED:17768402. Brca2 is an essential component of the homologous recombination repair pathway and the Fanconi Anaemia complex PUBMED:17786054.

    \

    This entry represents a domain found in Brca2 proteins. This domain adopts a helical structure, consisting of a four-helix cluster core (alpha 1, alpha 8, alpha 9, alpha 10) and two successive beta-hairpins (beta 1 to beta 4). An approximately 50-amino acid segment that contains four short helices (alpha 2 to alpha 4), meanders around the surface of the core structure. In BRCA2, the alpha 9 and alpha 10 helices pack with BRCA-2_OB1 () through van der Waals contacts involving hydrophobic and aromatic residues, and also through side-chain and backbone hydrogen bonds. This domain binds the 70-amino acid DSS1 (deleted in split-hand/split foot syndrome) protein, which was originally identified as one of three genes that map to a 1.5-Mb locus deleted in an inherited developmental malformation syndrome PUBMED:12228710.

    \ ' '9002' 'IPR015253' '\

    This domain is found in a set of hypothetical eukaryotic proteins, as well as in oligonucleotide/oligosaccharide-binding fold-containing protein-1.

    \ ' '9003' 'IPR015254' '\

    This entry represents a set of known and suspected archaeal N-glycosylase/DNA lyases. These DNA repair enzymes are part of the base excision repair (BER) pathway; they protect from oxidative damage by removing the major product of DNA oxidation, 8-oxoguanine (GO), from single- and double-stranded DNA substrates PUBMED:15604455. Cleavage of the N-glycosidic bond between the aberrant base and the sugar-phosphate backbone generates an apurinic (AP) site. Subsequently, the phosphodiester bond 3\' from the AP site is cleaved by an elimination reaction, leaving a 3\'-terminal unsaturated sugar and a product with a terminal 5\'-phosphate. The protein contains two alpha-helical subdomains, with the 8-oxoguanine binding site located in a cleft at their interface. A helix-hairpin-helix (HhH) structural motif and a Gly/Pro-rich sequence followed by a conserved Asp (HhH-GPD motif) are present PUBMED:15642264.

    \ ' '9004' 'IPR015255' '\

    Vitellinogen precursors provide the major egg yolk proteins that are a source of nutrients during early development of oviparous vertebrates and invertebrates. Vitellinogen precursors are multi-domain apolipoproteins that are cleaved into distinct yolk proteins. Different vitellinogen precursors exist, which are composed of variable combinations of yolk protein components; however, the cleavage sites are conserved PUBMED:17314313, PUBMED:9692232.

    \

    In vertebrates, a complete vitellinogen is composed of an N-terminal signal peptide for export, followed by four regions that can be cleaved into yolk proteins: lipovitellin-1, phosvitin, lipovitellin-2, and a von Willebrand factor type D domain (YGP40). Vitellinogens are post-translationally glycosylated and phosphorylated in the endoplasmic reticulum and Golgi complex of hepatocytes, before being secreted into the circulatory system to be taken up by oocytes. In the ovary, vitellinogens bind to specific Vtgr receptors on oocyte membranes to become internalised by endocytosis, where they are cleaved into yolk proteins by cathepsin D. YGP40 is released into the yolk plasma before or during compartmentation of lipovitellin-phosvitin complex into the yolk granule.

    \

    The different yolk proteins have distinct roles. Phosvitins are important in sequestering calcium, iron and other cations for the developing embryo. Phosvitins are one of the most phosphorylated (10%) proteins in nature, the high concentration of phosphate groups providing efficient metal-binding sites in clusters PUBMED:17189915, PUBMED:8838584. Lipovitellins are involved in lipid and metal storage, and contain a heterogeneous mixture of about 16% (w/w) noncovalently bound lipid, most being phospholipid. Lipovitellin-1 contains two chains, LV1N and LV1C PUBMED:12135361, PUBMED:9687371.

    \ \ \

    This entry represents the open beta-sheet domain found in vitellinogen, which generally corresponds to a domain within the lipovitellin-1 peptide product. This domain adopts a structure consisting of several large open beta-sheets PUBMED:12135361, and is almost always found C-terminal to .

    \ ' '9005' 'IPR015256' '\

    This entry represents a domain which is found in the translation initiation factor eIF2 and the elongation factor EF-Tu, adopting a beta barrel structure with Greek key topology. It is required for formation of the ternary complex with GTP and initiator tRNA PUBMED:11927566.

    \ ' '9006' 'IPR015257' '\

    Maf1 is a negative regulator of RNA polymerase III PUBMED:11438659, PUBMED:16762835. It targets the initiation factor TFIIIB PUBMED:12504022.

    \ ' '9007' 'IPR015258' '\

    Vitellinogen precursors provide the major egg yolk proteins that are a source of nutrients during early development of oviparous vertebrates and invertebrates. Vitellinogen precursors are multi-domain apolipoproteins that are cleaved into distinct yolk proteins. Different vitellinogen precursors exist, which are composed of variable combinations of yolk protein components; however, the cleavage sites are conserved PUBMED:17314313, PUBMED:9692232.

    \

    In vertebrates, a complete vitellinogen is composed of an N-terminal signal peptide for export, followed by four regions that can be cleaved into yolk proteins: lipovitellin-1, phosvitin, lipovitellin-2, and a von Willebrand factor type D domain (YGP40). Vitellinogens are post-translationally glycosylated and phosphorylated in the endoplasmic reticulum and Golgi complex of hepatocytes, before being secreted into the circulatory system to be taken up by oocytes. In the ovary, vitellinogens bind to specific Vtgr receptors on oocyte membranes to become internalised by endocytosis, where they are cleaved into yolk proteins by cathepsin D. YGP40 is released into the yolk plasma before or during compartmentation of lipovitellin-phosvitin complex into the yolk granule.

    \

    The different yolk proteins have distinct roles. Phosvitins are important in sequestering calcium, iron and other cations for the developing embryo. Phosvitins are one of the most phosphorylated (10%) proteins in nature, the high concentration of phosphate groups providing efficient metal-binding sites in clusters PUBMED:17189915, PUBMED:8838584. Lipovitellins are involved in lipid and metal storage, and contain a heterogeneous mixture of about 16% (w/w) noncovalently bound lipid, most being phospholipid. Lipovitellin-1 contains two chains, LV1N and LV1C PUBMED:12135361, PUBMED:9687371.

    \ \ \

    This entry represents the beta-sheet shell domain found in vitellinogen, which generally corresponds to the lipovitellin-2 peptide product. This domain consists of several large open beta-sheets PUBMED:12135361. It is often found C-terminal to and .

    \ ' '9008' 'IPR015259' '\

    Prokaryotic methylene-tetrahydromethanopterin dehydrogenase catalyses the dehydrogenation of methylene-tetrahydromethanopterin during growth on one-carbon compounds such as methanol. It can also catalyse the reversible dehydrogenation of methylene-tetrahydrofolate, though at much lower efficiency PUBMED:12176390. The pterin domain of this protein is composed of two alpha-beta segments found at the N- and C-terminal ends of the polypeptide, respectively. This entry represents the N-terminal segment of the pterin domain, with a core comprising three alpha/beta/alpha layers in which each sheet contains four strands.

    \ ' '9009' 'IPR015260' '\

    Members of this entry, which are found in the amino terminus of various SNARE proteins, adopt a structure consisting of an antiparallel three-helix bundle. Their exact function has not been determined, though it is known that they regulate the SNARE motif, as well as mediate various protein-protein interactions involved in membrane-transport PUBMED:12082176.

    \ ' '9010' 'IPR015261' '\

    Members of this entry, which are predominantly found in prokaryotic 4-alpha-glucanotransferase, adopt a structure composed of six antiparallel beta-strands, four of which form a beta-sheet while the other two form a type I beta-hairpin. The role of this family of domains has not, as yet, been defined PUBMED:12139940.

    \ ' '9011' 'IPR015262' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    This entry represents the substrate-binding domain of lysidine-tRNA(Ile) synthetase, which ligates lysine onto the cytidine present at position 34 of the AUA codon-specific tRNA(Ile) that contains the anticodon CAU, in an ATP-dependent manner. Cytidine is converted to lysidine, thus changing the amino acid specificity of the tRNA from methionine to isoleucine. The N-terminal region contains the highly conserved SGGXDS motif, predicted to be a PP-loop motif involved in ATP binding.

    \ \

    The only examples in which the wobble position of a tRNA must discriminate between G and A of mRNA are AUA (Ile) versus AUG (Met) and UGA (stop) versus UGG (Trp). In all bacteria, the wobble position of the tRNA(Ile) recognizing AUA is lysidine, a lysine derivative of cytidine. This domain is found, apparently, in all bacteria in a single copy. Eukaryotic sequences appear to be organellar. The domain architecture of this protein is variable; some, including characterised proteins of Escherichia coli and Bacillus subtilis known to be tRNA(Ile)-lysidine synthetase, include a conserved 50-residue domain that many other members lack. This protein belongs to the ATP-binding PP-loop family. It appears in the literature and protein databases as TilS, YacA, and putative cell cycle protein MesJ (a misnomer).
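
    The claim about A-versus-G discrimination at the wobble position can be checked against the standard genetic code. The following is a minimal sketch, not part of the original entry: it assumes the standard (nuclear) code written in the usual compact TCAG ordering, and the names CODON_TABLE and wobble_a_vs_g_conflicts are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumption:
# translation table 1; organellar codes differ and are not covered here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def wobble_a_vs_g_conflicts():
    """Return codon pairs that differ only by A/G in the third position
    yet carry different meanings (amino acid or stop)."""
    conflicts = []
    for first, second in product(BASES, repeat=2):
        with_a = first + second + "A"
        with_g = first + second + "G"
        if CODON_TABLE[with_a] != CODON_TABLE[with_g]:
            conflicts.append((
                with_a.replace("T", "U"), CODON_TABLE[with_a],
                with_g.replace("T", "U"), CODON_TABLE[with_g],
            ))
    return conflicts


if __name__ == "__main__":
    for codon_a, aa_a, codon_g, aa_g in wobble_a_vs_g_conflicts():
        print(codon_a, aa_a, "vs", codon_g, aa_g)
```

    Under these assumptions only the two pairs named in the text are reported: AUA (I) versus AUG (M), and UGA (stop, shown as *) versus UGG (W).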

    \ \

    The PP-loop motif appears to be a modified version of the P-loop of nucleotide binding domain that is involved in phosphate binding PUBMED:7731953. It is named the PP-motif because it appears to be part of a previously uncharacterised ATP pyrophosphatase domain. ATP sulfurylases, E. coli NtrL, and B. subtilis OutB consist of this domain alone. In other proteins, the pyrophosphatase domain is associated with amidotransferase domains (type I or type II), a putative citrulline-aspartate ligase domain or a nitrilase/amidase domain. The HUP domain class (after HIGH-signature proteins, UspA, and PP-ATPase) groups together PP-loop ATPases, the nucleotide-binding domains of class I aminoacyl-tRNA synthetases, UspA protein (USPA domains), photolyases, and electron transport flavoproteins (ETFP). The HUP domain is a distinct class of alpha/beta domain PUBMED:12012333.

    \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '9012' 'IPR016061' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    Prolyl tRNA synthetase () exists in two forms, which are loosely related. The first form is present in the majority of eubacteria species. The second one, present in some eubacteria, is essentially present in archaea and eukaryota. Prolyl-tRNA synthetase belongs to class IIa.

    \ \

    This domain is found at the C-terminal in archaeal and eukaryotic enzymes, as well as in certain bacterial ones.

    \ ' '9013' 'IPR015264' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    This domain is found predominantly in prolyl-tRNA synthetases from archaeal Methanococci species. It contains a zinc binding site, and adopts a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif PUBMED:12578991.

    \ ' '9014' 'IPR015265' '\

    The N-terminal domain of the bacterial purine repressor PuR is a winged-helix domain, a subdivision of the HTH structural family. It consists of a canonical arrangement of secondary structures: a1-b1-a2-T-a3-b2-W-b3, where a2-T-a3 is the HTH motif, a3 is the recognition helix, and W is the wing. The domain allows for recognition of a conserved CGAA sequence in the centre of a DNA PurBox, resulting in binding to the major groove of DNA PUBMED:12837783.

    \ ' '9015' 'IPR015266' '\

    Members of this entry are a set of hypothetical archaeal proteins. Their exact function has not, as yet, been defined.

    \ ' '9016' 'IPR015267' '\

    PPP4R2 (protein phosphatase 4 core regulatory subunit R2) is the regulatory subunit of the histone H2A phosphatase complex. It has been shown to confer resistance to the anticancer drug cisplatin in yeast PUBMED:16857015, and may confer resistance in higher eukaryotes.

    \ ' '9017' 'IPR015268' '\

    Members of this family of Mycoplasma hypothetical proteins adopt a helical structure, with one central alpha-helix surrounded by five others, in a NusB-like fold. Their function has not, as yet, been determined PUBMED:15146506.

    \ ' '9018' 'IPR015269' '\

    Members of this entry are a set of functionally uncharacterised hypothetical bacterial proteins. They adopt a ferredoxin-like fold, with a beta-alpha-beta-beta-alpha-beta arrangement PUBMED:15103642.

    \ \

    This entry contains the protein Impact, which is a translational regulator that ensures constant high levels of translation under amino acid starvation. It acts by interacting with Gcn1/Gcn1L1, thereby preventing activation of Gcn2 protein kinases (EIF2AK1 to 4) and subsequent down-regulation of protein synthesis. It is evolutionarily conserved from eukaryotes to archaea PUBMED:11116084.

    \ ' '9019' 'IPR015270' '\

    Members of this family are a set of functionally uncharacterised hypothetical eukaryotic proteins PUBMED:16511118.

    \ ' '9020' 'IPR015271' '\

    Members of this family of Mycoplasma hypothetical proteins adopt a helical structure, with a buried central helix. Their function has not, as yet, been determined.

    \ ' '9021' 'IPR015272' '\

    In Escherichia coli, the MoaD protein plays a central role in the conversion of precursor Z to molybdopterin (MPT) during molybdenum cofactor biosynthesis. MoaD has a fold similar to that of ubiquitin and contains a highly conserved C-terminal Gly-Gly motif, which in its active form contains a transferrable sulphur in the form of a thiocarboxylate group PUBMED:17223713.

    \

    This entry represents a domain found in MoaD-related proteins, but with a different structure from the MoaD domain; this domain consists of a TBP-like fold of beta/alpha/beta(4)/alpha. These proteins are found in Thermus thermophilus and contain a ubiquitin-like MoaD domain at their N-terminal, and the domain represented by this entry at their C-terminal. One of these proteins is threonine synthase (), which catalyses the conversion of O-phospho-L-homoserine to L-threonine and phosphate.

    \ ' '9022' 'IPR015273' '\

    The aminoacyl-tRNA synthetases () catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases PUBMED:.

    \

    This DALR domain is found in cysteinyl-tRNA-synthetases PUBMED:10447505.

    \ ' '9023' 'IPR015274' '\

    This domain adopts an immunoglobulin-like beta-sandwich with seven strands in 2 beta sheets, in a Greek key topology. It is predominantly found in the extracellular portion of CD4 proteins, where it enables interaction with major histocompatibility complex class II antigens PUBMED:8493535.

    \ ' '9024' 'IPR015275' '\

    This domain assumes a secondary structure consisting of eight beta strands and 11 alpha-helices, organised in two lobes. It is predominantly found in actin-fragmin kinase, where it is the catalytic domain that mediates the phosphorylation of actin PUBMED:10357805.

    \ ' '9025' 'IPR015276' '\

    This domain represents the extracellular N-terminal region of the cholecystokinin A receptor, where it adopts a tertiary structure consisting of a few helical turns and a disulphide-cross-linked loop. It is required for interaction of the cholecystokinin A receptor with its corresponding hormonal ligand PUBMED:10555959.

    \ ' '9026' 'IPR015277' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the AvaI and BsoBI restriction endonucleases, both of which recognise the double-stranded sequence CYCGRG (where Y = T/C, and R = A/G) and cleave after C-1 PUBMED:11250198.
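
    The degenerate recognition sequence can be illustrated with a short pattern-matching sketch. This is an illustration only, not the way the entry was derived: it scans one strand of a made-up test sequence, expands Y and R with a small hypothetical IUPAC table, and places the cut after the first base of each match (cleavage after C-1).

```python
import re

# Only the IUPAC codes needed for CYCGRG are listed (assumption: top strand,
# unmodified DNA, uppercase input).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "R": "[AG]"}


def cut_positions(seq, site="CYCGRG", offset=1):
    """Return indices in seq where the top strand would be cut.

    offset=1 corresponds to cleavage after the first base of the site.
    A lookahead is used so that overlapping sites are also reported.
    """
    pattern = "(?=" + "".join(IUPAC[base] for base in site) + ")"
    return [match.start() + offset for match in re.finditer(pattern, seq)]


def fragments(seq, cuts):
    """Split seq at the given cut indices."""
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    test_seq = "AACTCGAGTTCCCGGGAA"  # contains CTCGAG and CCCGGG
    cuts = cut_positions(test_seq)
    print(cuts, fragments(test_seq, cuts))
```

    Both CTCGAG and CCCGGG in the test sequence satisfy CYCGRG, so the sketch reports two cut positions and three fragments.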

    \ ' '9027' 'IPR015278' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents BglII restriction endonucleases, which recognise AGATCT and cleave after A-1 PUBMED:10655616, PUBMED:11175900. BglII adopts a structure consisting of an alpha/beta core containing a six-stranded beta-sheet surrounded by five alpha-helices, two of which are involved in homodimerisation of the endonuclease.

    \ ' '9028' 'IPR015279' '\

    This domain is found in the Archaeal protein maltooligosyl trehalose synthase produced by Sulfolobus spp. Its function has not, as yet, been defined.

    \ ' '9029' 'IPR015280' '\

    Members of this entry, which are predominantly found in the yeast protein Rap1, assume a secondary structure consisting of a three-helix bundle and an N-terminal arm. They contain an Arg-Asp-Arg-Lys sequence that interacts with an ACA region in the 3\' region of the DNA-binding site PUBMED:8620531.

    \ ' '9030' 'IPR015281' '\

    Members of this family are DNA-modifying enzymes encoded by bacteriophage T4 that transfer glucose from uridine diphosphoglucose to 5-hydroxymethyl cytosine bases of phage T4 DNA PUBMED:11493010.

    \ ' '9031' 'IPR015282' '\

    This domain is found in various staphylococcal toxins, and adopts an OB fold, wherein the domain folds into a five-stranded beta-barrel. The exact manner in which they confer pathogenic properties to the protein has not, as yet, been determined PUBMED:12082105.

    \ ' '9032' 'IPR015283' '\

    Monellin, a protein produced by the West African plant Dioscoreophyllum cumminsii (Serendipity berry), is approximately 70,000 times sweeter than sucrose on a molar basis. The protein adopts an alpha-beta structure, with a cystatin-like fold, where each helix packs against a coiled antiparallel beta-sheet PUBMED:8230222.

    \ ' '9033' 'IPR015284' '\

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5\' and 3\' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    \

    In prokaryotes, the SR receptor is a monomer consisting of the loosely membrane-associated SR-alpha homologue FtsY, while the eukaryotic SR receptor is a heterodimer of SR-alpha (70 kDa) and SR-beta (25 kDa), both of which contain a GTP-binding domain PUBMED:12654246. SR-alpha regulates the targeting of SRP-ribosome-nascent polypeptide complexes to the translocon PUBMED:10859309. SR-alpha binds to the SRP54 subunit of the SRP complex. The SR-beta subunit is a transmembrane GTPase that anchors the SR-alpha subunit (a peripheral membrane GTPase) to the ER membrane PUBMED:7844142. SR-beta interacts with the N-terminal SRX-domain of SR-alpha, which is not present in the bacterial FtsY homologue. SR-beta also functions in recruiting the SRP-nascent polypeptide to the protein-conducting channel.

    \

    This entry represents a homologue of the alpha subunit of the SR receptor. Members of this entry consist of a central six-stranded anti-parallel beta-sheet sandwiched by helix alpha1 on one side and helices alpha2-alpha4 on the other. They interact with the small GTPase SR-beta, forming a complex that matches a class of small G protein-effector complexes, including Rap-Raf, Ras-PI3K(gamma), Ras-RalGDS, and Arl2-PDE(delta) PUBMED:12654246.

    \ ' '9034' 'IPR015285' '\

    This N-terminal domain is found in RIO2 kinases, and is structurally homologous to the winged helix (wHTH) domain. It adopts a structure consisting of four alpha helices followed by two beta strands and a fifth alpha helix. The domain confers DNA binding properties to the protein, as per other winged helix domains PUBMED:15341724.

    \ ' '9035' 'IPR015286' '\

    MspA is a membrane porin produced by Mycobacteria, allowing hydrophilic nutrients to enter the bacterium. The protein forms a tightly interconnected octamer with eightfold rotation symmetry that resembles a goblet and contains a central channel. Each subunit fold contains a beta-sandwich of Ig-like topology and a beta-ribbon arm that forms an oligomeric transmembrane barrel PUBMED:14976314.

    \ ' '9036' 'IPR015287' '\

    Colicin D, which is synthesised by various prokaryotes, adopts an antiparallel four helical bundle fold: the helices are tightly packed, forming a compact cylindrical molecule. The protein specifically cleaves the anticodon loop of all four tRNA-Arg isoacceptors, thereby inactivating prokaryotic protein synthesis and leading to cell death PUBMED:15014439.

    \ ' '9037' 'IPR015288' '\

    Members of this family are found in hypothetical proteins synthesised by the Archaeal organism Sulfolobus. Their exact function has not, as yet, been determined.

    \ ' '9038' 'IPR015289' '\

    This domain, found in fungal alpha-L-arabinofuranosidase B, adopts a beta-sandwich fold similar to that of concanavalin A-like lectins/glucanase. The beta-sandwich fold consists of two anti-parallel beta-sheets with seven and six strands, respectively. In addition, there are four helices outside of the beta-strands. The beta-sandwich strands are closely packed and curved with a jelly roll topology, creating a small catalytic pocket. The domain catalyses the hydrolysis of alpha-1,2-, alpha-1,3- and alpha-1,5-L-arabinofuranosidic bonds in L-arabinose-containing hemicelluloses such as arabinoxylan and L-arabinan PUBMED:15292273.

    \ ' '9039' 'IPR015290' '\

    Members of this entry, which are produced by Williopsis fungi, adopt a secondary structure consisting of eight strands in two beta sheets, in a Greek-key topology PUBMED:8756320.

    \ ' '9040' 'IPR015291' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents type II restriction endonucleases such as MspI, which recognise the palindromic tetranucleotide sequence 5\'-CCGG and cleave between the first and second nucleotides, leaving 2-base 5\' overhangs. They fold into an alpha/beta architecture, with a five-stranded mixed beta-sheet sandwiched on both sides by alpha-helices PUBMED:15341737.

    \ ' '9041' 'IPR015292' '\

    This entry represents the C-terminal domain found in the hypothetical transcriptional regulator YbiH from bacteria such as Salmonella typhimurium and Escherichia coli. YbiH is a member of the TetR (tetracycline resistance) transcriptional regulator family of proteins. The C-terminal domains of YbiH and TetR share a multi-helical, interlocking structure.

    \ ' '9042' 'IPR015293' '\

    This domain is found in a set of hypothetical bacterial proteins. Its exact function has not, as yet, been defined.

    \ ' '9043' 'IPR015294' '\

    Penicillin-binding proteins are beta-lactam antibiotic-sensitive bacterial enzymes required for the growth and maintenance of the peptidoglycan layer of the bacterial cell wall that protects the cell from osmotic stress. Penicillin-binding protein 4 (PBP4) functions as a transpeptidase, and belongs to MEROPS peptidase family S11 (clan SE). PBP4 acts co-operatively with PBP2 in staphylococcal cell wall biosynthesis and susceptibility to antimicrobial agents PUBMED:16411754. This entry represents the C-terminal domain of PBP4.

    \ ' '9044' 'IPR015295' '\

    This domain is found in carbohydrate binding proteins that bind to beta-1,4-mannooligosaccharides, carob galactomannan, and konjac glucomannan, but not to cellulose (insoluble and soluble) or soluble birchwood xylan. The region adopts a beta sandwich structure comprising 13 beta strands with a single, small alpha-helix and a single metal atom PUBMED:12791255.

    \ ' '9045' 'IPR015296' '\

    This entry represents a group of viral chemokine binding proteins, which bind with CC-chemokine MCP-1, acting as cytokine decoy receptors PUBMED:12419245. For example, the murine herpesvirus decoy receptor M3 acts as an immune system saboteur by altering host anti-viral inflammatory responses. M3 adopts a structure consisting of two different beta-sandwich domains of partial topological similarity to immunoglobulin-like folds.

    \ ' '9046' 'IPR015297' '\

    This entry represents adsorption protein P2 (synonym: receptor-binding protein P2) from the bacteriophage PRD1. Adsorption protein P2 is a multi-beta-sheet protein whose complicated topology forms an elongated seahorse-shaped molecule with a distinct head, containing a pseudo-beta propeller structure with approximate 6-fold symmetry, and tail (beta-sandwich). It is required for the attachment of the phage to the host conjugative DNA transfer complex. This is a poorly understood large transmembrane complex of unknown architecture, with at least 11 different proteins PUBMED:12623018.

    \ ' '9047' 'IPR015298' '\

    Members of this family of viral baseplate structural proteins adopt a structure consisting of a three-layer beta-sandwich with two finger-like loops containing an alpha-helix at the opposite sides of the sandwich. The two peripheral, five-stranded, antiparallel beta-sheets are stacked against the middle, four-stranded, antiparallel beta-sheet. Attachment of this family of proteins to the baseplate during assembly creates a binding site for subsequent attachment of Gp6 PUBMED:12729757.

    \ ' '9048' 'IPR015299' '\

    Members of this family are essential for gametocytogenesis in Plasmodium falciparum. They contain a fold composed of two pseudo dyad-related repeats of the helix-turn-helix motif, serving as a platform for RNA and Src homology-3 (SH3) binding PUBMED:12577051.

    \ ' '9049' 'IPR015300' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the N-terminal effector-binding domain of the type II restriction endonuclease EcoRII, which has a DNA recognition fold, allowing for binding to 5\'-CCWGG sequences. It assumes a structure composed of an eight-stranded beta-sheet with the strands in the order of b2, b5, b4, b3, b7, b6, b1 and b8. They are mostly antiparallel to each other except that b3 is parallel to b7. Alternatively, it may also be viewed as consisting of two mini beta-sheets of four antiparallel beta-strands, sheet I from beta-strands b2, b5, b4, b3 and sheet II from strands b7, b6, b1, b8, folded into an open mixed beta-barrel with a novel topology. Sheet I has a simple Greek key motif while sheet II does not PUBMED:14659759.

    \ ' '9050' 'IPR012056' '\

    Energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type) form a distinct group within the [NiFe] hydrogenase family PUBMED:15168611, PUBMED:16645307. Members of this subgroup include:

    \ \

    Energy-converting [NiFe] hydrogenases are membrane-bound enzymes with a six-subunit core: the large and small hydrogenase subunits, plus two hydrophilic proteins and two integral membrane proteins. Their large and small subunits show little sequence similarity to other [NiFe] hydrogenases, except for key conserved residues coordinating the active site and [FeS] cluster. However, they show considerable sequence similarity to the six-subunit, energy-conserving NADH:quinone oxidoreductases (complex I), which are present in cytoplasmic membranes of many bacteria and in inner mitochondrial membranes. The reactions they catalyse nevertheless differ significantly from those of complex I. Energy-converting [NiFe] hydrogenases function as ion pumps.

    \ \

    Eha and Ehb hydrogenases contain extra subunits in addition to those shared by other energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type). Eha contains a 6[4Fe-4S] polyferredoxin, a 10[4Fe-4S] polyferredoxin, ten other predicted integral membrane proteins (EhaA, EhaB, EhaC, EhaD, EhaE, EhaF, EhaG, EhaI, EhaK, EhaL) and four hydrophilic subunits (EhaM, EhaR, EhaS, EhaT) PUBMED:10491142. The ten predicted integral membrane proteins are absent from Ech, Coo, Hyc and Hyf complexes, which may have simpler membrane components than Eha. Eha and Ehb catalyse the reduction of low-potential redox carriers (e.g. ferredoxins or polyferredoxins), which then might function as electron donors to oxidoreductases.

    \

    [NiFe] hydrogenases function in H2 metabolism in a variety of microorganisms, enabling them to use H2 as a source of reducing equivalents under aerobic and anaerobic conditions. [NiFe] hydrogenases consist of two subunits, hydrogenase large and hydrogenase small. The large subunit contains the binuclear [NiFe] active site, while the small subunit binds at least one [4Fe-4S] cluster PUBMED:15119826.

    \

    This entry represents proteins that are predicted to be the hydrophilic EhaM subunits of Eha-type energy-converting [NiFe] hydrogenase complexes.

    \ ' '9052' 'IPR015302' '\

    Members of this entry include the major coat protein of the Saccharomyces cerevisiae virus L-A (ScV-L-A) PUBMED:12244300. The major coat protein is a large polypeptide without apparent domain division.

    \ ' '9053' 'IPR009086' '\

    Bacteriocin AS-48 is a cyclic peptide antibiotic produced by the eubacterium Enterococcus faecalis (Streptococcus faecalis) that shows a broad antimicrobial spectrum against both Gram-positive and Gram-negative bacteria. Bacteriocin AS-48 is encoded by the pheromone-responsive plasmid pMB2, and acts on the plasma membrane, in which it opens pores leading to ion leakage and cell death PUBMED:11005847. The globular structure of bacteriocin AS-48 comprises five alpha helices enclosing a hydrophobic core. The mammalian NK-lysin effector protein of T and natural killer cells has a similar structure, though it lacks sequence homology with bacteriocin AS-48.

    \ ' '9054' 'IPR015303' '\

    This entry represents the carbohydrate-specific lectin domain found in bacterial fimbrial adhesins. It adopts a compact, elongated structure consisting of a beta-sandwich with two major sheets: one consisting of five long strands in mixed orientations, and a front sheet with four antiparallel strands, forming an immunoglobulin-like fold PUBMED:12864853.

    \ ' '9055' 'IPR015304' '\

    Members of this family of prokaryotic domains have been identified as part of the response of bacteria to a challenge with the toxic heavy metal cadmium. They are able to bind to cadmium, and ensure its subsequent elimination PUBMED:12909634.

    \ ' '9056' 'IPR015305' '\

    Members of this family are found in a set of hypothetical bacterial proteins. Their exact function has not, as yet, been determined.

    \ ' '9057' 'IPR015306' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the type II restriction endonuclease PvuII, which recognises the double-stranded DNA sequence 5\'-CAGCTG-3\' and cleaves after G-3 PUBMED:9878366.

    \ ' '9058' 'IPR015307' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the prokaryotic restriction endonuclease HincII, which recognises the double-stranded sequence 5\'-GTYRAC-3\' and cleaves after Y-3 PUBMED:15476804.

    \ ' '9059' 'IPR015308' '\

    Members of this family of fungal proteins are functionally uncharacterised PUBMED:14747700.

    \ ' '9060' 'IPR015309' '\

    Members of this family of transcriptional repressors adopt a T-shaped structure, with a core composed of two antiparallel alpha-helices. These proteins can be divided into two parts, a globular head and an elongated tail, and they negatively regulate conjugation and the expression of tra genes by antagonising traR/AAI-dependent activation PUBMED:15044488.

    \ ' '9061' 'IPR015310' '\

    This domain is predominantly found in the protein \'Activator of Hsp90 ATPase\'. It adopts a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. Members of this entry bind to the molecular chaperone HSP82 and stimulate its ATPase activity PUBMED:15039704.

    \ ' '9062' 'IPR015311' '\

    Apoptosis, or programmed cell death (PCD), is a common and evolutionarily conserved property of all metazoans PUBMED:11341280. In many biological processes, apoptosis is required to eliminate supernumerary or dangerous (such as pre-cancerous) cells and to promote normal development. Dysregulation of apoptosis can, therefore, contribute to the development of many major diseases including cancer, autoimmunity and neurodegenerative disorders. In most cases, proteins of the caspase family execute the genetic programme that leads to cell death.

    \

    DNA fragmentation factor (DFF) is a complex of the DNase DFF40 (CAD) and its chaperone/inhibitor DFF45 (ICAD-L). In its inactive form, DFF is a heterodimer composed of a 45-kDa chaperone inhibitor subunit (DFF45 or ICAD), and a 40-kDa latent endonuclease subunit (DFF40 or CAD). Upon caspase-3 cleavage of DFF45, DFF40 forms active endonuclease homo-oligomers. It is activated during apoptosis to induce DNA fragmentation. DNA binding by DFF is mediated by the nuclease subunit, which can also form stable DNA complexes after release from DFF PUBMED:17626049, PUBMED:15572351. The nuclease subunit is inhibited in DNA cleavage but not in DNA binding PUBMED:15572351. DFF45 can also be cleaved and inactivated by caspase-7 but not by caspase-6 and caspase-8. The cleaved DFF45 fragments dissociate from DFF40, allowing DFF40 to oligomerise, forming a large complex that cleaves DNA by introducing double strand breaks. Histone H1 confers DNA binding ability to DFF and stimulates the nuclease activity of DFF40 PUBMED:10318789.

    \ ' '9063' 'IPR015312' '\

    Members of this family are core structural proteins found in the double-stranded RNA virus Phytoreovirus. They are large proteins without apparent domain division, with a number of all-alpha regions and one all-beta domain near the C-terminal end PUBMED:14527391.

    \ ' '9064' 'IPR015313' '\

    Her-1 adopts an all-helical structure with two subdomains: residues 19-80 comprise a left-handed three-helix bundle with an overhand connection between the second and third helices, whilst residues 81-164 comprise a left-handed anti-parallel four-helix bundle in which the first helix consists of four consecutive turns of 3-10-helix. Fourteen Cys are conserved in all known HER-1 sequences and form seven disulphide bonds. The protein dictates male development in Caenorhabditis elegans, probably by playing a direct role in cell signalling during C. elegans sex determination. It also inhibits the function of tra-2a PUBMED:15289613.

    \ ' '9065' 'IPR015314' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents EcoRV prokaryotic restriction endonucleases, which recognise the double-stranded sequence 5\'-GATATC-3\' and cleave after T-3 PUBMED:15170321.
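
    The statement that recognition sites of homodimeric enzymes are usually palindromic means that the site reads the same on both strands, that is, it equals its own reverse complement. The following is a minimal sketch of that check, not part of the original entry; the names COMPLEMENT, reverse_complement and is_palindromic are hypothetical, and only unambiguous bases are handled.

```python
# Watson-Crick pairing for unambiguous bases only (an illustrative assumption;
# degenerate codes such as Y or R are not handled here).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read in its own 5-prime to 3-prime direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(site):
    """True when the site is identical to its reverse complement."""
    return reverse_complement(site) == site.upper()


if __name__ == "__main__":
    # GATATC is the EcoRV site quoted in this entry; GAGAG is a non-palindromic control.
    for site in ("GATATC", "GAGAG"):
        print(site, is_palindromic(site))
```

    A site of odd length can never pass this check, because its central base would have to be its own complement.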

    \ ' '9066' 'IPR015315' '\

    This entry is a set of hypothetical bacterial proteins. Their function has not, as yet, been described.

    \ ' '9067' 'IPR015316' '\

    The fungal Ste50p SAM domain consists of five helices, which form a compact, globular fold. It is required for mediation of homodimerisation and heterodimerisation (and in some cases oligomerisation) of the protein PUBMED:14573615.

    \ ' '9068' 'IPR015317' '\

    Alpha-haemoglobin stabilising protein (AHSP) acts as a molecular chaperone for free alpha-haemoglobin, preventing the harmful aggregation of alpha-haemoglobin during normal erythroid cell development: it specifically protects free alpha-haemoglobin from precipitation. AHSP adopts a helical secondary structure consisting of an elongated antiparallel three alpha-helix bundle PUBMED:15178680.

    \ ' '9069' 'IPR015318' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    Members of this entry bind to a 5\'-GAGAG-3\' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence PUBMED:9033593.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    \ ' '9070' 'IPR015319' '\

    Members of this family are related in overall topology to fibronectin type III modules and fold into a sandwich comprising seven antiparallel beta strands arranged in a three-strand and a four-strand beta-pleated sheet. They are required for binding of interleukin-4 to the receptor alpha chain, which is a crucial event for the generation of a Th2-dominated early immune response PUBMED:10219247.

    \ ' '9071' 'IPR015320' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils PUBMED:7980433.

    \

    This entry represents subunit B of topoisomerase VI, an ATP-dependent type IIB enzyme. Members of this family adopt a structure consisting of a four-stranded beta-sheet backed by three alpha-helices, the last of which is over 50 amino acids long and extends from the body of the protein by several turns. This domain has been proposed to mediate intersubunit communication by structurally transducing signals from the ATP binding and hydrolysis domains to the DNA binding and cleavage domains of the gyrase holoenzyme PUBMED:12505993.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    \ ' '9072' 'IPR015321' '\

    Members of this family adopt a structure consisting of an immunoglobulin-like beta-sandwich, with seven strands in two beta-sheets, in a Greek-key topology. They are required for binding to the cytokine Interleukin-6 PUBMED:12461182.

    \ ' '9073' 'IPR015322' '\

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles PUBMED:12910258, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    \

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi\'s sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus PUBMED:11056549.

    \ \

    Members of this family of viral cyclins adopt a helical structure consisting of five alpha-helices, with one helix surrounded by the others. They specifically activate CDK6 of host cells to a very high degree PUBMED:10368294.

    \ ' '9074' 'IPR015323' '\

    This entry represents the flavin-binding domain of flavocytochrome c sulphide dehydrogenase (FCSD), enzymes found in sulphur-oxidising bacteria such as the purple phototrophic bacterium Chromatium vinosum PUBMED:7939681, PUBMED:15544340. These enzymes are complexes of a flavoprotein and a dihaem cytochrome that carry out hydrogen sulphide-dependent cytochrome c reduction. The dihaem cytochrome folds into two domains, each of which resembles mitochondrial cytochrome c, with the two haem groups bound to the interior of the subunit. The flavoprotein subunit has a glutathione reductase-like fold consisting of a beta(3,4)-alpha(3) core, and an alpha+beta sandwich. The active site of the flavoprotein subunit contains a catalytically important disulphide bridge located above the pyrimidine portion of the flavin ring PUBMED:7939681. Electrons are transferred from the flavin to one of the haem groups in the cytochrome. This entry represents a flavoprotein domain required for binding to flavin, and subsequent electron transfer.

    \ ' '9075' 'IPR015324' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Rsm22 has been identified as a mitochondrial small ribosomal subunit protein PUBMED:16835444 and is a methyltransferase. In Schizosaccharomyces pombe (Fission yeast), Rsm22 is tandemly fused to Cox11 (a factor required for copper insertion into cytochrome oxidase) and the two proteins are proteolytically cleaved after import into the mitochondria PUBMED:16835444. This entry consists of mitochondrial Rsm22 and homologous sequences from bacteria.

    \ ' '9076' 'IPR015325' '\

    This domain is C-terminal to the catalytic sucrose phosphorylase beta/alpha barrel domain. It adopts a beta-sandwich fold, with Greek-key topology and is functionally uncharacterised PUBMED:14756551.

    \ ' '9077' 'IPR015326' '\

    Mycoplasma arthritidis-derived mitogen (MA-Mit) adopts a completely alpha-helical structure consisting of ten alpha helices arranged in two orthogonal bundles. MA-Mit is a superantigen that can activate large fractions of T cells bearing particular TCR V-beta elements. Two MA-Mit molecules form an asymmetric dimer and cross-link two MHC antigens to form a dimerised MA-Mit-MHC complex PUBMED:14962388.

    \ ' '9078' 'IPR015327' '\

    The PHAT (pseudo-HEAT analogous topology) domain assumes a structure consisting of a layer of three parallel helices packed against a layer of two antiparallel helices, into a cylindrical shaped five-helix bundle. It is found in the RNA-binding protein Smaug, where it is essential for high-affinity RNA binding PUBMED:12820967.

    \ ' '9079' 'IPR009067' '\

    In eukaryotes, the general transcription factor TFIID helps to regulate transcription by RNA polymerase II from class II promoters. TFIID consists of TATA-box-binding proteins (TBP) and TBP-associated factors (TAFIIs), which together mediate both activation and inhibition of transcription. In Drosophila, the N-terminal region of TAFII-230 (the TFIID 230kDa subunit) binds directly to TBP, thereby inhibiting the binding of TBP to the TATA box. The structure of TAFII-230 comprises three short helices in an irregular array, which forms the core that occupies the DNA-binding surface of TBP PUBMED:9741622. Note that the Gene3D model in this entry is hitting fewer proteins than it should and is under revision.

    \ ' '9080' 'IPR015328' '\

    Members of this family of fungal domains adopt a structure that consists of an alpha/beta motif. Their exact function has not, as yet, been determined PUBMED:14690425.

    \ ' '9081' 'IPR015329' '\

    This domain adopts a structure consisting of a five-helix bundle core. It is predominantly found in archaeal tRNA nucleotidyltransferases, following the catalytic nucleotidyltransferase domain PUBMED:14636575.

    \ ' '9082' 'IPR015330' '\

    Members of this family adopt a structure consisting of a core of antiparallel beta sheets. They are found in various bacterial hypothetical proteins, and have been shown to harbour both primase and polymerase activities PUBMED:14730355.

    \ ' '9083' 'IPR015331' '\

    Members of this entry contain a domain that adopts a structure consisting of a single-stranded right-handed beta-helix, which in turn is made of parallel beta-strands and short turns. They are required for recognition of the O-antigenic repeating units of the cell surface, and for subsequent infection of the bacterial cell by the phage PUBMED:8855221.

    \ ' '9084' 'IPR015332' '\

    Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans. The nomenclature system uses the first three letters of the genus, followed by the first letter of the species name, followed by a number (additional letters can be added to the name as required to discriminate between similar designations).
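
    The naming rule described above can be sketched in a few lines of Python. This is only an illustration: the function name allergen_name is hypothetical, the spaced output style (as in Fel d 1) is assumed, and the optional extra letters mentioned in the rule are not handled. The example inputs are Olea europaea (Common olive) and, as an assumption, the older species name Felis domesticus for the cat.

```python
def allergen_name(genus: str, species: str, number: int) -> str:
    """Build an allergen designation from genus, species and a number.

    Rule sketched here: first three letters of the genus, then the first
    letter of the species name, then the allergen number. The optional
    disambiguating letters are deliberately not modelled.
    """
    genus_part = genus.strip()[:3].capitalize()   # e.g. "Olea" -> "Ole"
    species_part = species.strip()[0].lower()     # e.g. "europaea" -> "e"
    return f"{genus_part} {species_part} {number}"


if __name__ == "__main__":
    # Olive pollen allergen 6 and cat allergen 1 (species names as assumed above).
    print(allergen_name("Olea", "europaea", 6))
    print(allergen_name("Felis", "domesticus", 1))
```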

    \

    Fel d 1 is allergen 1 from Felis silvestris catus (Cat), which is an important agent in human allergic reactions PUBMED:17543334. The protein is expressed in saliva and sebaceous glands. The complete primary structure of Fel d 1 has been determined PUBMED:12851385. The allergen is a tetrameric glycoprotein consisting of two disulphide-linked heterodimers of chains 1 and 2, which have been shown to be encoded by different genes. Fel d 1 chains 1 and 2 share structural similarity with uteroglobin, a secretoglobin superfamily member; chain 2 is a glycoprotein with N-linked oligosaccharides.

    \

    This entry represents Fel d 1 chain 2.

    \ ' '9085' 'IPR015333' '\

    This entry represents pollen allergens, such as ole-e-6, a small acidic protein from Olea europaea (Common olive) which mediates olive allergy. Members of this family have an alpha-helical hairpin structure cross-linked by three disulphides, followed by a long, unstructured C-terminal tail PUBMED:15247256.

    \ ' '9086' 'IPR015334' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the C-terminal domain of FokI restriction endonucleases, which adopts a structure consisting of an alpha/beta/alpha core containing a five-stranded beta-sheet. FokI recognises the double-stranded DNA sequence 5\'-GGATG-3\' and cleaves DNA phosphodiester groups 9 base pairs away on this strand and 13 base pairs away on the complementary strand PUBMED:9724743, PUBMED:12093751.
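
    The offset cleavage described above can be illustrated with a minimal Python sketch. Several things here are assumptions rather than part of the entry: the function name foki_cut_positions is hypothetical, offsets are counted from the last base of the recognition site, positions are 0-based indices on the given strand, and only the given strand is scanned for GGATG (sites on the reverse complement are ignored).

```python
SITE = "GGATG"          # FokI recognition sequence, read 5-prime to 3-prime
TOP_OFFSET = 9          # bases downstream of the site on the same strand
BOTTOM_OFFSET = 13      # bases downstream of the site on the complementary strand


def foki_cut_positions(seq: str) -> list[tuple[int, int, int]]:
    """Return (site_start, top_cut, bottom_cut) for each GGATG in seq.

    top_cut and bottom_cut are indices into seq marking where the backbone
    would be cut, counted from the end of the recognition site. Sites too
    close to the end of seq to be cut on both strands are skipped.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(SITE)
    while start != -1:
        site_end = start + len(SITE)
        top_cut = site_end + TOP_OFFSET
        bottom_cut = site_end + BOTTOM_OFFSET
        if bottom_cut <= len(seq):
            cuts.append((start, top_cut, bottom_cut))
        start = seq.find(SITE, start + 1)
    return cuts


if __name__ == "__main__":
    example = "AA" + SITE + "C" * 20
    print(foki_cut_positions(example))
```

    The four-base difference between the two cut positions corresponds to the staggered ends implied by the 9 and 13 base pair offsets.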

    \ ' '9087' 'IPR015335' '\

    Members of this family represent the F1 capsule antigen Caf1 synthesised by Yersinia bacteria. They adopt a structure consisting of seven strands arranged in two beta-sheets, in a Greek-key topology, and mediate targeting of the bacterium to sites of infection PUBMED:12787500.

    \ ' '9088' 'IPR015336' '\

    Cytokines can be grouped into a family on the basis of sequence, functional and structural similarities PUBMED:8095800, PUBMED:1377364. Tumour necrosis factor (TNF) (also known as TNF-alpha or cachectin) is a monocyte-derived cytotoxin that has been implicated in tumour regression, septic shock and cachexia PUBMED:2989794, PUBMED:3349526. The protein is synthesised as a prohormone with an unusually long and atypical signal sequence, which is absent from the mature secreted cytokine PUBMED:2268312. A short hydrophobic stretch of amino acids serves to anchor the prohormone in lipid bilayers PUBMED:2777790. Both the mature protein and a partially-processed form of the hormone are secreted after cleavage of the propeptide PUBMED:2777790.

    There are a number of different families of TNF, but all these cytokines seem to form homotrimeric (or heterotrimeric in the case of LT-alpha/beta) complexes that are recognised by their specific receptors.

    \

    Members of this entry, which are predominantly found in the tumour necrosis factor receptor superfamily member 13c, BAFF-R, are required for binding to tumour necrosis factor ligand TALL-1 PUBMED:12721620.

    \ ' '9089' 'IPR015337' '\

    Cytokines can be grouped into a family on the basis of sequence, functional and structural similarities PUBMED:8095800, PUBMED:1377364. Tumour necrosis factor (TNF) (also known as TNF-alpha or cachectin) is a monocyte-derived cytotoxin that has been implicated in tumour regression, septic shock and cachexia PUBMED:2989794, PUBMED:3349526. The protein is synthesised as a prohormone with an unusually long and atypical signal sequence, which is absent from the mature secreted cytokine PUBMED:2268312. A short hydrophobic stretch of amino acids serves to anchor the prohormone in lipid bilayers PUBMED:2777790. Both the mature protein and a partially-processed form of the hormone are secreted after cleavage of the propeptide PUBMED:2777790.

    There are a number of different families of TNF, but all these cytokines seem to form homotrimeric (or heterotrimeric in the case of LT-alpha/beta) complexes that are recognised by their specific receptors.

    \

    Members of this entry, which are predominantly found in the tumour necrosis factor receptor superfamily member 17, BCMA, are required for binding to tumour necrosis factor ligand TALL-1 PUBMED:12721620.

    \ ' '9090' 'IPR015338' '\

    Members of this entry catalyse the transfer of N-acetylglucosamine and N-acetylgalactosamine from the respective UDP-sugars to the non-reducing end of [glucuronic acid]beta 1-3[galactose]beta 1-O-naphthalenemethanol, an acceptor substrate analogue of the natural common linker of various glycosaminoglycans. They are also required for the biosynthesis of heparan-sulphate PUBMED:12562774.

    \ ' '9091' 'IPR015339' '\

    This entry represents FIP-Fve (Fungal Immunomodulatory Protein Fve), a major fruiting body protein from Flammulina velutipes, a mushroom possessing immunomodulatory activity. It stimulates lymphocyte mitogenesis, suppresses systemic anaphylaxis reactions and oedema, enhances transcription of IL-2, IFN-gamma and TNF-alpha, and haemagglutinates red blood cells. It appears to be a lectin with specificity for complex cell-surface carbohydrates. Fve adopts a tertiary structure consisting of an immunoglobulin-like beta-sandwich, with seven strands arranged in two beta sheets, in a Greek-key topology. It forms a non-covalently linked homodimer containing no Cys, His or Met residues; dimerisation occurs by 3-D domain swapping of the N-terminal helices and is stabilised predominantly by hydrophobic interactions PUBMED:12948495.

    \ ' '9092' 'IPR015340' '\

    Alpha-amylase is classified as family 13 of the glycosyl hydrolases and is present in archaea, bacteria, plants and animals. Alpha-amylase is an essential enzyme in alpha-glucan metabolism, acting to catalyse the hydrolysis of alpha-1,4-glucosidic bonds of glycogen, starch and related polysaccharides. Although all alpha-amylases possess the same catalytic function, they can vary with respect to sequence. In general, they are composed of three domains: a TIM barrel containing the active site residues and chloride ion-binding site (domain A), a long loop region inserted between the third beta strand and the alpha-helix of domain A that contains calcium-binding site(s) (domain B), and a C-terminal beta-sheet domain that appears to show some variability in sequence and length between amylases (domain C) PUBMED:11141191. Amylases have at least one conserved calcium-binding site, as calcium is essential for the stability of the enzyme. Chloride binding functions to activate the enzyme, which acts by a two-step mechanism involving a catalytic nucleophile (usually an Asp) and a catalytic proton donor (usually a Glu) that are responsible for the formation of the beta-linked glycosyl-enzyme intermediate.

    \

    This domain is found in various fungal alpha-amylase proteins. Its exact function has not, as yet, been defined PUBMED:9283074.

    \ ' '9093' 'IPR015341' '\

    Members of this entry belong to the glycosyl hydrolase family 38. This domain, which is found in the central region, adopts a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. The domain is predominantly found in the enzyme alpha-mannosidase PUBMED:12634058.

    \ ' '9094' 'IPR015342' '\

    This domain adopts a double psi beta-barrel fold, similar in structure to the Cdc48 N-terminal domain. It has been suggested that this domain may be involved in interactions with ubiquitin, ubiquitin-like protein modifiers, or ubiquitin-like domains, such as Ubx. Furthermore, the domain may possess a putative adaptor or substrate binding site, allowing for peroxisomal biogenesis, membrane fusion and protein translocation PUBMED:15328346.

    \ ' '9095' 'IPR015343' '\

    This domain adopts a Cdc48 domain 2-like fold, with a beta-alpha-beta(3) arrangement. It has been suggested that this domain may be involved in interactions with ubiquitin, ubiquitin-like protein modifiers, or ubiquitin-like domains, such as Ubx. Furthermore, the domain may possess a putative adaptor or substrate binding site, allowing for peroxisomal biogenesis, membrane fusion and protein translocation PUBMED:15328346.

    \ ' '9096' 'IPR015344' '\

    This domain is predominantly found in Vibrio cholerae sialidase, and adopts a beta sandwich structure consisting of 12-14 strands arranged in two beta-sheets. It binds to lectins with high affinity, helping to target the protein to sialic acid-rich environments and thereby enhancing the catalytic efficiency of the enzyme PUBMED:15226294.

    \ ' '9097' 'IPR015345' '\

    This domain adopts an alpha+beta sandwich structure with an antiparallel beta-sheet, in a ferredoxin-like fold. It is predominantly found in plant cytokinin dehydrogenase 1, where it is capable of binding both FAD and cytokinin substrates. The substrate displays a \'plug-into-socket\' binding mode that seals the catalytic site and precisely positions the carbon atom undergoing oxidation in close contact with the reactive locus of the flavin PUBMED:15321719.

    \ ' '9098' 'IPR015346' '\

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis PUBMED:12042765, PUBMED:11395412. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA PUBMED:12596227.

    \

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    \

    This entry represents a domain found predominantly in viral DNA topoisomerase I, a type IB enzyme. This domain assumes a beta(2)-alpha-beta-alpha-beta(2) fold, with a left-handed crossover between strands beta2 and beta3 PUBMED:7994576.

    \

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    \ ' '9099' 'IPR015347' '\

    The STAT protein (Signal Transducers and Activators of Transcription) family contains transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors; hence they act as signal transducers in the cytoplasm and transcription activators in the nucleus PUBMED:12039028. Binding of these factors to cell-surface receptors leads to receptor autophosphorylation at a tyrosine, the phosphotyrosine being recognised by the STAT SH2 domain, which mediates the recruitment of STAT proteins from the cytosol and their association with the activated receptor. The STAT proteins are then activated by phosphorylation via members of the JAK family of protein kinases, causing them to dimerise and translocate to the nucleus, where they bind to specific promoter sequences in target genes. In mammals, STATs comprise a family of seven structurally and functionally related proteins: Stat1, Stat2, Stat3, Stat4, Stat5a, Stat5b and Stat6. STAT proteins play a critical role in regulating innate and acquired host immune responses. Dysregulation of at least two STAT signalling cascades (i.e. Stat3 and Stat5) is associated with cellular transformation.

    \ \

    Signalling through the JAK/STAT pathway is initiated when a cytokine binds to its corresponding receptor. This leads to conformational changes in the cytoplasmic portion of the receptor, initiating activation of receptor associated members of the JAK family of kinases. The JAKs, in turn, mediate phosphorylation at the specific receptor tyrosine residues, which then serve as docking sites for STATs and other signalling molecules. Once recruited to the receptor, STATs also become phosphorylated by JAKs, on a single tyrosine residue. Activated STATs dissociate from the receptor, dimerise, translocate to the nucleus and bind to members of the GAS (gamma activated site) family of enhancers.

    \ \

    The seven STAT proteins identified in mammals range in size from 750 to 850 amino acids. The chromosomal distribution of these STATs, as well as the identification of STATs in more primitive eukaryotes, suggest that this family arose from a single primordial gene. STATs share structurally and functionally conserved domains including: an N-terminal domain that strengthens interactions between STAT dimers on adjacent DNA-binding sites; a coiled-coil STAT domain that is implicated in protein-protein interactions; a DNA-binding domain with an immunoglobulin-like fold similar to p53 tumour suppressor protein; an EF-hand-like linker domain connecting the DNA-binding and SH2 domains; an SH2 domain that acts as a phosphorylation-dependent switch to control receptor recognition and DNA-binding; and a C-terminal transactivation domain PUBMED:9630226. The crystal structure of the N-terminus of Stat4 reveals a dimer. The interface of this dimer is formed by a ring-shaped element consisting of five short helices. Several studies suggest that this N-terminal dimerisation promotes cooperativity of binding to tandem GAS elements and with the transcriptional coactivator CBP/p300.

    \ \

    This entry represents a domain found in Dictyostelium STAT proteins. This domain adopts a structure consisting of four long alpha-helices, folded into a coiled coil. It is responsible for nuclear export of the protein PUBMED:15053873.

    \ ' '9100' 'IPR015348' '\

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport PUBMED:15261670. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors PUBMED:17449236, PUBMED:11598180.

    \

    Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion PUBMED:15752139, PUBMED:16806884. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase PUBMED:16734666. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process PUBMED:15261670. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins PUBMED:16699812.

    \

    This entry represents the core motif for the alpha-helical zigzag linker region connecting the conserved N-terminal beta-propeller region to the C-terminal alpha-alpha-superhelical region in clathrin heavy chains PUBMED:9827808.

    \

    More information about these proteins can be found at Protein of the Month: Clathrin.

    \ ' '9101' 'IPR015349' '\

    The Obg family comprises a group of ancient P-loop small G proteins (GTPases) belonging to the TRAFAC (for translation factors) class and can be subdivided into several distinct protein subfamilies PUBMED:17430889. OBG GTPases have been found in both prokaryotes and eukaryotes PUBMED:15827604. The structure of the OBG GTPase from Thermus thermophilus has been determined PUBMED:15019792.

    \

    This entry represents a C-terminal domain found in certain OBG GTPases. This domain contains a four-stranded beta sheet and three alpha helices flanked by an additional beta strand. It is predominantly found in the bacterial GTP-binding protein Obg, and is functionally uncharacterised.

    \ ' '9102' 'IPR015350' '\

    This DNA-binding domain adopts a beta-trefoil fold, that is, a capped beta-barrel with internal pseudo threefold symmetry. In the DNA-binding protein LAG-1, it is also the site of mutually exclusive interactions with NotchIC (and the viral protein EBNA2) and corepressors (SMRT/N-Cor and CIR) PUBMED:15297877.

    \ ' '9103' 'IPR015351' '\

    This domain is found in various eukaryotic hypothetical proteins and in the DNA-binding protein LAG-1. It adopts a beta sandwich structure, with nine strands in two beta-sheets, in a Greek-key topology, and allows for DNA binding PUBMED:15297877.

    \ ' '9104' 'IPR015352' '\

    This entry represents the extracellular domain of the serine protease hepsin. The domain is formed primarily by three elements of regular secondary structure: a 12-residue alpha helix, a twisted five-stranded antiparallel beta sheet, and a second, two-stranded, antiparallel sheet. The two beta-sheets lie at roughly right angles to each other, with the helix nestled between the two, adopting an SRCR fold. The exact function of this domain has not been identified, though it may serve to orient the protease domain or place it in the vicinity of its substrate PUBMED:12962630.

    \ ' '9105' 'IPR015353' '\

    This domain adopts a multihelical structure, with an irregular array of long and short alpha-helices. It allows binding of the protein to substrate, such as the N-terminal tails of histones H3 and H4 and the large subunit of the Rubisco holoenzyme complex PUBMED:12819771.

    \ ' '9106' 'IPR015354' '\

    This domain is found in a family of plasmid partition proteins; it adopts a ribbon-helix-helix fold, with a core of four alpha-helices. The proteins are an essential component of the DNA partition complex of the multidrug resistance plasmid TP228 PUBMED:14622405.

    \ ' '9107' 'IPR015355' '\

    Members of this family of Bordetella pertussis toxins adopt a structure consisting of an OB fold, with a closed or partly opened beta-barrel in a Greek-key topology PUBMED:8075982.

    \ ' '9108' 'IPR015356' '\

    Members of this family of Bordetella pertussis toxins adopt a structure consisting of an OB fold, with a closed or partly opened beta-barrel in a Greek-key topology PUBMED:8075982.

    \ ' '9109' 'IPR015357' '\

    Docking domains are found in prokaryotic erythronolide synthase. They adopt a structure consisting of a bundle of four alpha-helices, and mediate homodimerisation of the protein, stabilising the resulting complex PUBMED:12954331.

    \ ' '9110' 'IPR015358' '\

    This entry represents a family of DNA-binding domains that are predominantly found in the prokaryotic transcriptional regulator MerR. They adopt a structure consisting of a core of three alpha helices, with an architecture that is similar to that of the \'winged helix\' fold PUBMED:12958362.

    \ ' '9111' 'IPR015359' '\

    This domain is predominantly found in the enzyme phosphoinositol-specific phospholipase C. It adopts a structure consisting of a core of four alpha helices, in an EF like fold, and is required for functioning of the enzyme PUBMED:8784353.

    \ ' '9112' 'IPR015360' '\

    Members of this entry adopt a structure consisting of four alpha helices, arranged in an array. They bind specifically and directly to the xeroderma pigmentosum group C protein (XPC) to initiate nucleotide excision repair PUBMED:15885096.

    \ ' '9113' 'IPR015361' '\

    This domain is found in prokaryotic Taq DNA polymerase (thermostable), where it assumes a ribonuclease H-like motif. The domain confers 5\'-3\' exonuclease activity to the polymerase PUBMED:10449720.

    \ ' '9114' 'IPR015362' '\

    Members of this family adopt a structure consisting of a small globular all-beta-domain, with a three-stranded beta-sheet and a contiguous beta-hairpin. They bind to Mago alpha-helices via extensive electrostatic interactions and at a beta2-beta3 loop via hydrophobic interactions PUBMED:14968132.

    \ ' '9116' 'IPR015364' '\

    This domain is found in the prokaryotic enzyme rhamnogalacturonase B; it adopts a structure consisting of a beta supersandwich, with eighteen strands in two beta-sheets. The exact function of the domain is unknown, but a putative role is carbohydrate binding PUBMED:15135077.

    \ ' '9117' 'IPR015365' '\

    These nucleic acid binding domains are predominantly found in elongation factor P, where they adopt an OB-fold, with five beta-strands forming a beta-barrel in a Greek-key topology PUBMED:15210970.

    \ ' '9118' 'IPR015366' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This domain is found at the N-terminus of peptidases belonging to MEROPS peptidase family S53 (sedolisin, clan SB). The domain adopts a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptidase PUBMED:15242607.

    \ ' '9119' 'IPR015367' '\

    This family of DNA-binding domains is found in the Caenorhabditis elegans transcription factor CEP-1, which is related to human p53. It adopts a beta sandwich structure, with nine strands in two beta-sheets, in a Greek-key topology PUBMED:15242600.

    \ ' '9120' 'IPR015368' '\

    This C-terminal domain is found in ubiquitin-binding proteins; it adopts a structure consisting of a three alpha-helix bundle. This domain is predominantly found in fungi PUBMED:15328341.

    \ ' '9121' 'IPR015369' '\

    Proteins containing this domain are predominantly found in osteonectin and follistatin. They adopt an EGF-like structure PUBMED:12867435, PUBMED:9501084. Follistatin is involved in diverse activities from embryonic development to cell secretion.

    \ ' '9122' 'IPR015426' '\

    This C-terminal domain is found in prokaryotic acetaldehyde dehydrogenases; it adopts a structure consisting of an alpha-beta-alpha-beta(3) core, which mediates dimerisation of the protein PUBMED:12764229.

    \ \

    The acetaldehyde dehydrogenase family of bacterial enzymes catalyses the formation of acetyl-CoA from acetaldehyde in the 3-hydroxyphenylpropionate degradation pathway. It occurs as a late step in the meta-cleavage pathways of a variety of compounds, including catechol, biphenyl, toluene and salicylate PUBMED:1732207.

    \ ' '9123' 'IPR015370' '\

    This domain is found in mammalian T-cell antigen receptors; it adopts an immunoglobulin-like beta-sandwich fold, with seven strands in two beta-sheets in a Greek-key topology. The exact function of members of this entry has not, as yet, been determined.

    \ ' '9124' 'IPR015371' '\

    This domain is predominantly found in Endonuclease VIII-like 1 proteins and adopts a glucocorticoid receptor-like fold. Structural analysis reveals a zincless finger motif that is required for glycosylase activity PUBMED:15232006.

    \ ' '9125' 'IPR015372' '\

    This domain is found in T4 RNaseH ribonuclease, and adopts a SAM domain-like fold, consisting of a bundle of four/five helices. These residues may have a role in providing a docking site for other proteins or enzymes in the replication fork PUBMED:8674116.

    \ ' '9126' 'IPR015373' '\

    Members of this family adopt a secondary structure consisting of seven beta-strands arranged in an immunoglobulin-like beta-sandwich, in a Greek-key topology. They are required for binding to interferon-alpha PUBMED:12842042.

    \ ' '9127' 'IPR015374' '\

    ChAPs (Chs5p-Arf1p-binding proteins) are required for the export of specialised cargo from the Golgi. They physically interact with Chs3, Chs5 and the small GTPase Arf1, and they also interact with each other PUBMED:16498409.

    \ ' '9128' 'IPR015375' '\

    This entry represents the N-terminal domain found in NADH pyrophosphatase. Nitrate reductase inactivator (NRI) protein shares 51.1-68.3% of its amino acid sequence with three types of the nucleotide pyrophosphatase-like protein from Arabidopsis thaliana.

    \ ' '9129' 'IPR015376' '\

    This domain has a zinc ribbon structure and is often found between two NUDIX domains.

    \ ' '9130' 'IPR015377' '\

    Fumarylacetoacetase (also known as fumarylacetoacetate hydrolase or FAH) catalyses the hydrolytic cleavage of a carbon-carbon bond in fumarylacetoacetate to yield fumarate and acetoacetate as the final step in phenylalanine and tyrosine degradation PUBMED:11154690. This is an essential metabolic function in humans, the lack of FAH causing type I tyrosinaemia, which is associated with liver and kidney abnormalities and neurological disorders PUBMED:16602095, PUBMED:9101289. The enzyme mechanism involves a catalytic metal ion, a Glu/His catalytic dyad, and a charged oxyanion hole PUBMED:10508789. FAH folds into two domains: an N-terminal SH3-like beta-barrel domain, and a C-terminal domain with an unusual fold consisting of three layers of beta-sheet structures PUBMED:10508789.

    \ \

    This entry represents the N-terminal domain of fumarylacetoacetase.

    \ ' '9131' 'IPR015378' '\

    This domain is found in various prokaryotic integrases and transposases. It adopts a beta-barrel structure with Greek-key topology PUBMED:7628012.

    \ ' '9132' 'IPR015379' '\

    Members of this family form the minor capsid protein of various Tectiviridae PUBMED:15525981.

    \ ' '9133' 'IPR015380' '\

    Members of this family consist of various uncharacterised viral hypothetical proteins.

    \ ' '9134' 'IPR015381' '\

    XLF (also called Cernunnos) interacts with the XRCC4-DNA ligase IV complex to promote DNA non-homologous end-joining. It directly interacts with the XRCC4-Ligase IV complex and siRNA-mediated downregulation of XLF in human cell lines leads to radio-sensitivity and impaired DNA non-homologous end-joining PUBMED:16439205. XLF is homologous to the yeast non-homologous end-joining factor Nej1 PUBMED:16571728.

    \ ' '9135' 'IPR015382' '\

    This domain is found in the cytoplasmic N-terminus of KCNMB2, the beta-2 subunit of large conductance calcium and voltage-activated potassium channels. It is responsible for the fast inactivation of these channels PUBMED:11517232.

    \ ' '9136' 'IPR015383' '\

    This domain is predominantly found in the actin-bundling protein cortexillin I from Dictyostelium discoideum (Slime mold). The domain has a structure consisting of an 18-heptad-repeat alpha-helical coiled-coil, and is a prerequisite for the assembly of Cortexillin I PUBMED:10745004.

    \ ' '9137' 'IPR015384' '\

    Cytokines can be grouped into a family on the basis of sequence, functional and structural similarities PUBMED:8095800, PUBMED:1377364, PUBMED:. Tumor necrosis factor (TNF) (also known as TNF-alpha or cachectin) is a monocyte-derived cytotoxin that has been implicated in tumour regression, septic shock and cachexia PUBMED:2989794, PUBMED:3349526. The protein is synthesised as a prohormone with an unusually long and atypical signal sequence, which is absent from the mature secreted cytokine PUBMED:2268312. A short hydrophobic stretch of amino acids serves to anchor the prohormone in lipid bilayers PUBMED:2777790. Both the mature protein and a partially-processed form of the hormone are secreted after cleavage of the propeptide PUBMED:2777790.

    There are a number of different families of TNF, but all these cytokines seem to form homotrimeric (or heterotrimeric in the case of LT-alpha/beta) complexes that are recognised by their specific receptors.

    \

    This entry represents a cysteine-rich domain found in the TACI family of proteins. Members of this family are predominantly found in tumour necrosis factor receptor superfamily, member 13b (TACI), and are required for binding to the ligands APRIL and BAFF PUBMED:15542592.

    \ ' '9138' 'IPR015385' '\

    Members of this family of scaffolding proteins are produced by various bacteriophages PUBMED:10764583.

    \ ' '9139' 'IPR015386' '\

    This domain is found in MHC class II-associated invariant chain (Ii), and in class II invariant chain-associated peptide (CLIP), and is required for association with class II major histocompatibility complex (MHC II) in the MHC II processing pathway PUBMED:12589760. Ii plays a critical role in the assembly of the MHC, as well as in MHC II antigen processing by stabilising peptide-free class II alpha/beta heterodimers in a complex soon after their synthesis and directing transport of the complex from the endoplasmic reticulum to compartments where peptide loading of class II takes place PUBMED:16337363. In antigen-presenting cells (APCs), loading of MHC II molecules with peptides is regulated by Ii, which blocks MHC II antigen-binding sites in pre-endosomal compartments PUBMED:16181341. Several factors modulate the surface expression of MHC II molecules via post-Golgi mechanisms, including CLIP.

    \

    The Invariant chain contains a single transmembrane domain. Ii first assembles into a trimer and then associates with three class II alpha/beta MHC heterodimers. Although the membrane-proximal region of the Ii luminal domain is structurally disordered, the C-terminal segment of the luminal domain is largely alpha-helical and contains a major interaction site for the Ii trimer PUBMED:9843486.

    \

    More information about these proteins can be found at Protein of the Month: MHC PUBMED:.

    \ ' '9140' 'IPR015387' '\

    This entry represents the periplasmic sensor domain of the prokaryotic protein LuxQ, which assumes a structure consisting of two tandem Per/ARNT/Simple-minded (PAS) folds PUBMED:15916958.

    \ ' '9141' 'IPR015388' '\

    The C-terminal domain of FCP-1 is required for interaction with the carboxy terminal domain of RAP74. Interaction relies extensively on van der Waals contacts between hydrophobic residues situated within alpha-helices in both domains PUBMED:12732728.

    \ ' '9142' 'IPR015389' '\

    Members of this family are transcriptional co-activators that specifically associate with either OCT1 or OCT2, through recognition of their POU domains. They are essential for the response of B-cells to antigens and required for the formation of germinal centres PUBMED:10541551.

    \ ' '9143' 'IPR015390' '\

    This domain is predominantly found in Rabaptin and allows for binding to the GTPase Rab5. This interaction is necessary and sufficient for Rab5-dependent recruitment of Rabaptin5 to early endosomal membranes PUBMED:15378032.

    \ ' '9144' 'IPR015391' '\

    This domain is found at the N-terminus of the chaperone SurA. It is a helical domain of unknown function. The C-terminus of the SurA protein also folds back and forms part of this domain, but is not included in the current alignment.

    \ ' '9145' 'IPR015392' '\

    This uncharacterised domain is predominantly found in bacterial Tellurite resistance proteins.

    \ ' '9146' 'IPR015393' '\

    This domain is functionally uncharacterised and found in bacterial glycosyltransferases and rhamnosyltransferases.

    \ ' '9147' 'IPR015394' '\

    These functionally uncharacterised domains are found in various eukaryotic calcium-dependent chloride channels.

    \ ' '9148' 'IPR015395' '\

    This entry represents the C-terminal domain of the proto-oncogene c-myb and the viral transforming protein myb. Truncation of the domain results in \'activation\' of c-myb and subsequent tumourigenesis PUBMED:2670562.

    \ ' '9149' 'IPR015396' '\

    This C-terminal domain is functionally uncharacterised and is predominantly found in various prokaryotic acyl-coenzyme A dehydrogenases.

    \ ' '9150' 'IPR015397' '\

    This domain is functionally uncharacterised and found predominantly in the N-terminal region of various prokaryotic alpha-glucosyltransferases.

    \ ' '9151' 'IPR015398' '\

    Members of this family are found in a set of hypothetical Mycoplasmal proteins. Their exact function has not, as yet, been defined.

    \ ' '9152' 'IPR015399' '\

    This C-terminal domain is functionally uncharacterised and predominantly found in DnaJ-like proteins.

    \ ' '9153' 'IPR015400' '\

    This domain is found in various hypothetical proteins produced by the bacterium Chlamydia pneumoniae. Their exact function has not, as yet, been identified. This entry includes the IncA proteins.

    \ ' '9154' 'IPR015401' '\

    This N-terminal domain is functionally uncharacterised and found in various Oryza sativa (Rice) mutator-like transposases.

    \ ' '9155' 'IPR015402' '\

    Members of this family are found in a set of prokaryotic hypothetical proteins. Their exact function has not, as yet, been defined.

    \ ' '9156' 'IPR015403' '\

    This domain is functionally uncharacterised and found in various plant and yeast protein transport proteins. It is normally associated with, and C-terminal to, the SEC7 domain. The SEC7 domain was named after the first protein found to contain such a region PUBMED:3042778. It has been shown to be linked with guanine nucleotide exchange function PUBMED:9072969, PUBMED:9442017.

    \ ' '9157' 'IPR015404' '\

    Vps5 is a sorting nexin that functions in membrane trafficking. This is the C-terminal dimerisation domain PUBMED:12181349.

    \ ' '9158' 'IPR015405' '\

    This C-terminal domain is functionally uncharacterised and is found in various prokaryotic NADH dehydrogenases including NADH-quinone oxidoreductase, chain G.

    \ ' '9159' 'IPR015406' '\

    This domain, which is functionally uncharacterised, is found in various bacteriophage host specificity proteins.

    \ ' '9160' 'IPR015407' '\

    This entry represents the C-terminal region of plant phytochelatin synthases (also known as glutathione gamma-glutamylcysteinyltransferase), which is involved in the synthesis of phytochelatins (PC) and homophytochelatins (hPC), the heavy-metal-binding peptides of plants. This enzyme is required for detoxification of heavy metals such as cadmium and arsenate. The N-terminal region of phytochelatin synthase contains the active site, as well as four highly conserved cysteine residues that appear to play an important role in heavy-metal-induced phytochelatin catalysis. The C-terminal region is rich in cysteines, and may act as a metal sensor, whereby the Cys residues bind cadmium ions to bring them into closer proximity and transfer them to the activation site in the N-terminal catalytic domain PUBMED:18270423. The C-terminal region displays homology to the functional domains of metallothionein and metallochaperone.

    \ ' '9161' 'IPR015408' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This zinc finger domain is found in Mcm10 proteins and DnaG-type primases PUBMED:16704411.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '9162' 'IPR015409' '\

    Members of this entry are predominantly found in prokaryotic D-lactate dehydrogenase, forming the cap-membrane-binding domain, which consists of a large seven-stranded antiparallel beta-sheet flanked on both sides by alpha-helices. They allow for membrane association PUBMED:10944213.

    \ ' '9163' 'IPR015410' '\

    This domain is functionally uncharacterised; it is found in a set of Arabidopsis thaliana (Mouse-ear cress) hypothetical proteins.

    \ ' '9164' 'IPR015411' '\

    Mcm10 is a eukaryotic DNA replication factor that regulates the stability and chromatin association of DNA polymerase alpha PUBMED:15494305.

    \ ' '9165' 'IPR015412' '\

    Eukaryotes have developed an evolutionarily conserved process, termed autophagy, to survive starvation conditions. The vacuole or lysosome mediates the turnover and recycling of non-essential intracellular material for re-use in critical biosynthetic reactions. ATG2 (also known as Apg2) is required for the formation and/or completion of cytosolic sequestering vesicles that are needed for vacuolar import through both the Cvt pathway and autophagy, as well as for the specific degradation of peroxisomes. Apg2 is a peripheral membrane protein that localises to the previously identified perivacuolar compartment that contains Apg9 PUBMED:11382760. This entry represents the C-terminal region of Apg2.

    \ ' '9166' 'IPR015413' '\

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases PUBMED:.

    \

    This domain is found in methionyl and leucyl tRNA synthetases.

    \ ' '9167' 'IPR015414' '\

    This entry contains SNARE-associated Golgi proteins. The yeast member of this family localises with the t-SNARE Tlg2 PUBMED:16107716.

    \ ' '9168' 'IPR015415' '\

    This domain is found at the C-terminus of ATPase proteins involved in vacuolar sorting. It forms an alpha-helical structure and is required for oligomerisation PUBMED:16704411.

    \ ' '9169' 'IPR015416' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents an H2C2-type zinc finger that binds to histone upstream activating sequence (UAS) elements found in histone gene promoters PUBMED:16415340.

    \

    More information about these proteins can be found at Protein of the Month: Zinc Fingers PUBMED:.

    \ ' '9170' 'IPR015417' '\

    This is a family of glycine reductase, sarcosine reductase and betaine reductases. These enzymes catalyse the following reactions:\

    \ ' '9171' 'IPR005471' '\

    The many bacterial transcription regulation proteins which bind DNA through a\ \'helix-turn-helix\' motif can be classified into subfamilies on the basis of\ sequence similarities. One of these subfamilies, called \'iclR\', groups several proteins including:\ \

    \

    \ \

    These proteins have\ a Helix-Turn-Helix motif at the N-terminus that is similar to that of other DNA-binding proteins PUBMED:1840643.

    \ ' '9172' 'IPR015418' '\

    The NuA4 histone acetyltransferase (HAT) multisubunit complex is responsible for acetylation of histone H4 and H2A N-terminal tails in yeast PUBMED:10487762. NuA4 complexes are highly conserved in eukaryotes and play primary roles in transcription, cellular response to DNA damage, and cell cycle control PUBMED:14966270.

    \ ' '9173' 'IPR015419' '\

    Pcc1 is a transcription factor that functions in regulating genes involved in cell cycle progression and polarised growth PUBMED:16874308.

    \ ' '9174' 'IPR015420' '\

    This domain is found in serine proteases and is predicted to contain disulphide bonds.

    \ ' '9175' 'IPR011740' '\

    This model represents a family of conserved hypothetical proteins. It is usually (but not always) found in apparent phage-derived regions of bacterial chromosomes.

    \ ' '9176' 'IPR010148' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a family of Cas proteins, which includes CT1975 of Chlorobium tepidum.

    \ ' '9177' 'IPR018530' '\

    This family of proteins is functionally uncharacterised.

    \ ' '9178' 'IPR018958' '\

    Yeast members of this family are involved in the regulation of cell wall assembly. Saccharomyces cerevisiae (Baker\'s yeast) protein KNR4 (SMI1) has a regulatory role in chitin deposition and in cell wall assembly PUBMED:10206705. It was originally identified as a regulator of chitin synthase expression (acting as a repressor) PUBMED:10206705 and of 1,3-beta-glucan synthase levels PUBMED:8289782. It was shown to localise in patches at presumptive bud sites in unbudded cells and at the incipient bud site during bud emergence PUBMED:10206705.

    \ \

    KNR4 is believed to connect the PKC1-SLT2 MAPK pathway with cell proliferation. It has been shown to interact with BCK2, a gene involved in cell cycle progression in S. cerevisiae (forming a complex) to allow PKC1 to coordinate the cell cycle (cell proliferation) with cell wall integrity PUBMED:12185498, PUBMED:12823808. PKC1 plays an essential role in cell wall integrity and cell proliferation through a bifurcated PKC1/mitogen-activated protein (MAP) kinase pathway. KNR4 also interacts with the tyrosine-tRNA synthetase protein encoded by TYS1 and is involved in the sporulation process PUBMED:11410349.

    \

    Note: previously reported evidence that KNR4 may interact with nuclear matrix-association region PUBMED:8516310 may be due to an artefact PUBMED:10206705.

    \

    Proteins in this family are involved in the regulation of 1,3-beta-glucan synthase activity and cell-wall formation PUBMED:7937796, PUBMED:8289782.

    \ ' '9179' 'IPR018959' '\

    This entry represents proteins that are functionally uncharacterised.

    \ ' '9180' 'IPR018960' '\

    This entry represents proteins that are functionally uncharacterised.

    \ ' '9181' 'IPR018020' '\

    The proteins in this entry are OHCU decarboxylase, an enzyme of purine catabolism that catalyses the conversion of OHCU into S(+)-allantoin PUBMED:16462750; it is the third step of the conversion of uric acid (a purine derivative) to allantoin. Step one is catalysed by urate oxidase and step two is catalysed by hydroxyisourate hydrolase.

    \ ' '9182' 'IPR018961' '\

    This entry represents a family of proteins that may have a role in protein folding or as a chaperone. DnaJ is a member of the J-protein family, which are defined by the presence of a J domain that can regulate the activity of 70-kDa heat-shock proteins PUBMED:15170475. Some of the proteins in this entry contain a J domain.

    \ ' '9183' 'IPR018531' '\

    This family of proteins are functionally uncharacterised.

    \ ' '9184' 'IPR018532' '\

    This family of proteins are functionally uncharacterised.

    \ ' '9185' 'IPR018962' '\

    This family of proteins are functionally uncharacterised.

    \ ' '9186' 'IPR018533' '\

    This presumed domain is found in the C-terminal region of Hepatocyte Nuclear Factor 3 alpha and beta chains. Its specific function is uncertain. The N-terminal region of this presumed domain contains an EH1 (engrailed homology 1) motif, that is characterised by the FxIxxIL sequence PUBMED:16309560.

    \ ' '9187' 'IPR018963' '\

    This family of proteins are functionally uncharacterised. They are found in a variety of bacteriophages.

    \ ' '9188' 'IPR018964' '\

    This entry describes the C-terminal region of a family of proteins found almost exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus (Rhodopseudomonas capsulata) gene transfer agent, which packages DNA. An apparent exception is Wolbachia pipientis wMel, a bacterial endosymbiont of the fruit fly, which has several candidate phage-related genes physically separate from obvious prophage regions.

    \ ' '9189' 'IPR018534' '\

    Human colonic Bacteroides species harbour a family of large conjugative transposons, called tetracycline resistance (Tcr) elements. Activities of these elements are enhanced by pregrowth of bacteria in medium containing tetracycline, indicating that at least some Tcr element genes are regulated by tetracycline. An insertional disruption in the rteC gene abolished self-transfer of the Tcr element to Bacteroides recipients, indicating that the gene was essential for self-transfer PUBMED:8407786.

    \ ' '9190' 'IPR018965' '\

    This presumed domain, found at the C terminus of ubiquitin-activating enzyme E1 proteins, is functionally uncharacterised.

    \ ' '9191' 'IPR018966' '\

    This presumed domain is found in the yeast vacuolar transport chaperone proteins VTC2, VTC3 and VTC4. This domain is also found in a variety of bacterial proteins.

    \ ' '9192' 'IPR018967' '\

    This entry represents iron-sulphur domain-containing proteins that have a CDGSH sequence motif (although the Ser residue can also be an Ala or Thr); they are found in a wide range of organisms with the exception of fungi. The CDGSH-type domain binds a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family PUBMED:17376863.

    \

    CDGSH-type domains are found in mitoNEET, an iron-containing integral protein of the outer mitochondrial membrane (OMM). MitoNEET forms a dimeric structure with a NEET fold, and contains two domains: a beta-cap region and a cluster-binding domain that coordinates two acid-labile 2Fe-2S clusters (one bound to each protomer) PUBMED:17766440. The CDGSH iron-sulphur domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, () PUBMED:17584744, PUBMED:17766440. The whole protein regulates oxidative capacity and may function in electron transfer, for instance in redox reactions with metabolic intermediates, cofactors and/or proteins localized at the OMM.

    \ \ ' '9193' 'IPR018968' '\

    This entry describes a group of phasins that associate with polyhydroxyalkanoate (PHA) inclusions, the most common of which consist of polyhydroxybutyrate (PHB).

    \

    Phasins (or granule-associated proteins) are surface proteins found covering polyhydroxyalkanoate (PHA) storage granules in bacteria. Polyhydroxyalkanoates are linear polyesters produced by bacterial fermentation of sugar or lipids for the purpose of storing carbon and energy, and are accumulated as intracellular granules by many bacteria under unfavorable conditions, enhancing their fitness and stress resistance PUBMED:17965215. The layer of phasins stabilises the granules and prevents coalescence of separated granules in the cytoplasm and nonspecific binding of other proteins to the hydrophobic surfaces of the granules. For example, in Ralstonia eutropha (strain ATCC 17699/H16/DSM 428/Stanier 337) (Cupriavidus necator (strain ATCC 17699 / H16 / DSM 428 / Stanier 337)), the major surface protein of polyhydroxybutyrate (PHB) granules is phasin PhaP1(Reu), which occurs along with three homologues (PhaP2, PhaP3, and PhaP4) that have the capacity to bind to PHB granules but are present at minor levels PUBMED:18223073, PUBMED:15256572. These four phasins lack a highly conserved domain but share homologous hydrophobic regions.

    \ ' '9194' 'IPR018535' '\

    This family of proteins are functionally uncharacterised.

    \ ' '9195' 'IPR018969' '\

    Phosphoketolases (PK) are key enzymes of the pentose phosphate pathway of heterofermentative and facultative homofermentative lactic acid bacteria and of the D-fructose 6-phosphate shunt of bifidobacteria. PK activity has been sporadically reported in other microorganisms including eukaryotic yeasts. Xylulose-5-phosphate/fructose-6-phosphate phosphoketolase is a thiamine diphosphate (ThdP)-dependent enzyme found in bacteria such as Bifidobacterium sp PUBMED:11292814, PUBMED:15899413. This enzyme has dual-specificity with the following catalytic activities:

    \

    \

    This family is distantly related to transketolases, e.g. .

    \ \ ' '9196' 'IPR018970' '\

    Phosphoketolases (PK) are key enzymes of the pentose phosphate pathway of heterofermentative and facultative homofermentative lactic acid bacteria and of the D-fructose 6-phosphate shunt of bifidobacteria. PK activity has been sporadically reported in other microorganisms including eukaryotic yeasts. Xylulose-5-phosphate/fructose-6-phosphate phosphoketolase is a thiamine diphosphate (ThdP)-dependent enzyme found in bacteria such as Bifidobacterium sp PUBMED:11292814, PUBMED:15899413. This enzyme has dual-specificity with the following catalytic activities:

    \

    \

    This family is distantly related to transketolases, e.g. .

    \ \ ' '9197' 'IPR012808' '\

    Members of this family are widely (though sparsely) distributed in bacteria, where they are about 230 residues in length, and are also found in fungi, where they are around 400 residues in length. All members have a motif RxxRDxRFxxx[DN]KxxY. The function of this protein family is unknown.

    \ ' '9198' 'IPR018971' '\

    This family of proteins are functionally uncharacterised.

    \ ' '9199' 'IPR018536' '\

    This family, which includes CpeS proteins, is functionally uncharacterised.

    \ ' '9200' 'IPR018972' '\

    This entry represents the C-terminal domain of the Something about silencing protein 10 (Sas10), which is essential for gene silencing and has a role in the structure of silenced chromatin PUBMED:9611201, PUBMED:12756328. Sas10 plays a role in the developing brain, and may bind RNA. SAS10 from Saccharomyces cerevisiae (Baker\'s yeast) is primarily required at the G2/M phase and is essential for viability, being involved in nucleolar processing of pre-18S ribosomal RNA as part of the ribosomal small subunit (SSU) processome PUBMED:17330950.

    \ ' '9201' 'IPR018973' '\

    This entry represents proteins that are functionally uncharacterised. They are mainly found in helicase proteins, so could be RNA-binding, and include a probable zinc-binding motif at the C terminus.

    \ ' '9202' 'IPR009215' '\ Members of this family are predicted to have a TIM barrel fold, based on PSI-BLAST analysis (iteration 4) and on SCOP prediction (using SMART). Interestingly, this novel domain also exists as an N-terminal domain of sigma54-dependent transcriptional activators (enhancer-binding proteins). Because sigma54 dependent activators typically have a three-domain structure: the variable N-terminal regulatory (activation) domain involved in signal recognition/receiving, the central AAA-type ATPase domain, and the DNA-binding domain (see , , , , for details), the proteins of the current entry may be predicted to play a role in signal recognition/receiving and signal transduction.\ ' '9203' 'IPR018974' '\

    This presumed domain is found at the N terminus of . This protein defines a novel family of prokaryotic transcriptional accessory factors PUBMED:8755871.

    \ ' '9204' 'IPR018272' '\

    This presumed domain is found at the C-terminus of a variety of Pox virus proteins. The PRANC (Pox proteins Repeats of ANkyrin, C-terminal) domain is also found on its own in some proteins PUBMED:16025237. The function of this domain is unknown, but it appears to be related to the F-box domain and may play a similar role.

    \ ' '9205' 'IPR018975' '\

    Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum) is a methanogenic Gram-positive microorganism with a cell wall consisting of pseudomurein. This repeat specifically binds to pseudomurein. This repeat is found at the N terminus of PeiW and PeiP which are pseudomurein binding phage proteins.

    \ ' '9206' 'IPR018537' '\

    This family contains a potential peptidoglycan binding domain.

    \ ' '9207' 'IPR018976' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The imelysin peptidase was first identified in Pseudomonas aeruginosa. The active site residues have not been identified. However, His201 and Glu204 are completely conserved in the family and occur in an HXXE motif that is also found in family M14.

    \ ' '9208' 'IPR018977' '\

    This family includes NurA, a nuclease from the hyperthermophilic archaeon Sulfolobus acidocaldarius that exhibits both single-stranded endonuclease activity and 5\'-3\' exonuclease activity on single-stranded and double-stranded DNA PUBMED:12052775.

    \ ' '9209' 'IPR018978' '\

    This entry represents the C-terminal domain of proteins that are highly conserved in species ranging from archaea to vertebrates and plants PUBMED:12496757. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS, OMIM 260400) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. It is characterised by bone marrow failure and leukemia predisposition. Members of this family play a role in RNA metabolism PUBMED:15701631, PUBMED:15701634. In yeast, Sdo1 is involved in the biogenesis of the 60S ribosomal subunit and translational activation of ribosomes. Together with the EF-2-like GTPase RIA1 (EfI1), it triggers the GTP-dependent release of TIF6 from 60S pre-ribosomes in the cytoplasm, thereby activating ribosomes for translation competence by allowing 80S ribosome assembly and facilitating TIF6 recycling to the nucleus, where it is required for 60S rRNA processing and nuclear export. These data link defective late 60S subunit maturation to an inherited bone marrow failure syndrome associated with leukemia predisposition PUBMED:17353896.

    \ \ \

    A number of uncharacterised hydrophilic proteins of about 30 kDa share regions of similarity. These include,

    \ \ \ ' '9210' 'IPR018538' '\

    The HAS barrel is named after HerA-ATP Synthase. In ATP synthases, this domain is implicated in the assembly of the catalytic toroid and docking of accessory subunits, such as the subunit of the ATP synthase complex. Similar roles in docking of the functional partner, the NurA nuclease, and assembly of the HerA toroid complex appear likely for the HAS-barrel of the HerA family PUBMED:15466593.

    \ ' '9211' 'IPR018979' '\

    The FERM domain (F for 4.1 protein, E for ezrin, R for radixin and M for moesin) is a widespread protein module involved in localising proteins to the plasma membrane PUBMED:9757824. FERM domains are found in a number of cytoskeletal-associated proteins that associate with various proteins at the interface between the plasma membrane and the cytoskeleton. The FERM domain is located at the N-terminus of the majority of FERM-containing proteins PUBMED:9757824, PUBMED:10847681, which includes:

    \

    \

    Ezrin, moesin, and radixin are highly related proteins (ERM protein family), but the other proteins in which the FERM domain is found do not share any region of similarity outside of this domain. ERM proteins are made of three domains, the FERM domain, a central helical domain and a C-terminal tail domain, which binds F-actin. The amino-acid sequence of the FERM domain is highly conserved among ERM proteins and is responsible for membrane association by direct binding to the cytoplasmic domain or tail of integral membrane proteins. ERM proteins are regulated by an intramolecular association of the FERM and C-terminal tail domains that masks their binding sites for other molecules. For cytoskeleton-membrane cross-linking, the dormant molecules become activated and the FERM domain attaches to the membrane by binding specific membrane proteins, while the last 34 residues of the tail bind actin filaments. Aside from binding to membranes, the activated FERM domain of ERM proteins can also bind the guanine nucleotide dissociation inhibitor of Rho GTPase (RhoGDI), which suggests that in addition to functioning as a cross-linker, ERM proteins may influence Rho signalling pathways. The crystal structure of the FERM domain reveals that it is composed of three structural modules (F1, F2, and F3) that together form a compact clover-shaped structure PUBMED:10970839.

    \

    The FERM domain has also been called the amino-terminal domain, the 30kDa domain, 4.1N30, the membrane-cytoskeletal-linking domain, the ERM-like domain, the ezrin-like domain of the band 4.1 superfamily, the conserved N-terminal region, and the membrane attachment domain PUBMED:9757824.

    \

    This domain is the N-terminal ubiquitin-like structural domain of the FERM domain.

    \ ' '9212' 'IPR018980' '\

    The FERM domain (F for 4.1 protein, E for ezrin, R for radixin and M for moesin) is a widespread protein module involved in localising proteins to the plasma membrane PUBMED:9757824. FERM domains are found in a number of cytoskeletal-associated proteins that associate with various proteins at the interface between the plasma membrane and the cytoskeleton. The FERM domain is located at the N-terminus of the majority of FERM-containing proteins PUBMED:9757824, PUBMED:10847681, which includes:

    \

    \

    Ezrin, moesin, and radixin are highly related proteins (ERM protein family), but the other proteins in which the FERM domain is found do not share any region of similarity outside of this domain. ERM proteins are made of three domains, the FERM domain, a central helical domain and a C-terminal tail domain, which binds F-actin. The amino-acid sequence of the FERM domain is highly conserved among ERM proteins and is responsible for membrane association by direct binding to the cytoplasmic domain or tail of integral membrane proteins. ERM proteins are regulated by an intramolecular association of the FERM and C-terminal tail domains that masks their binding sites for other molecules. For cytoskeleton-membrane cross-linking, the dormant molecules become activated and the FERM domain attaches to the membrane by binding specific membrane proteins, while the last 34 residues of the tail bind actin filaments. Aside from binding to membranes, the activated FERM domain of ERM proteins can also bind the guanine nucleotide dissociation inhibitor of Rho GTPase (RhoGDI), which suggests that in addition to functioning as a cross-linker, ERM proteins may influence Rho signalling pathways. The crystal structure of the FERM domain reveals that it is composed of three structural modules (F1, F2, and F3) that together form a compact clover-shaped structure PUBMED:10970839.

    \

    The FERM domain has also been called the amino-terminal domain, the 30kDa domain, 4.1N30, the membrane-cytoskeletal-linking domain, the ERM-like domain, the ezrin-like domain of the band 4.1 superfamily, the conserved N-terminal region, and the membrane attachment domain PUBMED:9757824.

    \

    This entry, however, represents the PH-like domain found at the C terminus of the eukaryote proteins moesin, ezrin and radixin.

    \ ' '9213' 'IPR018981' '\

    Porins are channel proteins in the outer membrane of Gram-negative bacteria which mediate the uptake of molecules required for growth and survival. Escherichia coli OmpG forms a 14-stranded beta-barrel and, in contrast to most porins, appears to function as a monomer PUBMED:16797588. The central pore of OmpG is wider than that of other E. coli porins and it is speculated that it may form a non-specific channel for the transport of larger oligosaccharides PUBMED:16797588.

    \ ' '9214' 'IPR018982' '\

    This entry represents the RQC domain, which is a DNA-binding domain found only in RecQ family enzymes. RecQ family helicases can unwind G4 DNA, and play important roles at G-rich domains of the genome, including the telomeres, rDNA, and immunoglobulin switch regions. This domain has a helix-turn-helix structure and acts as a high affinity G4 DNA binding domain PUBMED:16530788. Binding of RecQ to Holliday junctions involves both the RQC and the HRDC domains.

    \ ' '9215' 'IPR018449' '\

    This domain is found at the C terminus of ABC transporter proteins involved in D-methionine transport as well as a number of ferredoxin-like proteins. This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family.

    \ ' '9216' 'IPR018983' '\

    This entry represents the C-terminal domain of the U3 small nucleolar RNA-associated protein 15 (UTP15). This protein is involved in nucleolar processing of pre-18S ribosomal RNA, and is required for optimal pre-ribosomal RNA transcription by RNA polymerase I together with a subset of U3 proteins required for transcription (t-UTPs). UTP15 is a component of the ribosomal small subunit (SSU) processome, which is a large ribonucleoprotein (RNP) required for processing of precursors to the small subunit RNA, the 18S, of the ribosome PUBMED:15590835, PUBMED:15489292. This domain is found C-terminal to the WD40 repeat (). UTP15 associates with U3 snoRNA, which is ubiquitous in eukaryotes and is required for nucleolar processing of pre-18S ribosomal RNA PUBMED:12068309.

    \ \ ' '9217' 'IPR018984' '\

    This domain is found at the N terminus of sensor histidine kinase proteins.

    \ ' '9218' 'IPR018985' '\

    ParD is a plasmid anti-toxin that forms a ribbon-helix-helix DNA binding structure PUBMED:11743881. It stabilises plasmids by inhibiting ParE toxicity in cells that express ParD and ParE. ParD forms a dimer and also regulates its own promoter (parDE).

    \ ' '9219' 'IPR018539' '\

    MRP1 and MRP2 are mitochondrial RNA binding proteins that form a heteromeric complex. The MRP1/MRP2 heterotetrameric complex binds to guide RNAs and stabilises them in an unfolded conformation suitable for RNA-RNA hybridisation. Each MRP subunit adopts a \'whirly\' transcription factor fold PUBMED:16923390.

    \ ' '9220' 'IPR018540' '\

    Spore formation is an extreme response to starvation and can also be a component of disease transmission. Sporulation is controlled by an expanded two-component system where starvation signals result in sensor kinase activation and phosphorylation of the master sporulation response regulator Spo0A. Phosphatases such as Spo0E dephosphorylate Spo0A thereby inhibiting sporulation. This is a family of Spo0E-like phosphatases. The structure of a Bacillus anthracis member of this family has revealed an anti-parallel alpha-helical structure PUBMED:17001075.

    \ ' '9222' 'IPR018987' '\

    This family contains a putative Fe-S binding reductase () whose structure adopts an alpha and beta fold.

    \ ' '9223' 'IPR018988' '\

    This is a family of proteins of unknown function. The structure of one of the proteins in this family has been shown to adopt an alpha beta fold.

    \ ' '9224' 'IPR011841' '\

    Type III secretion systems translocate proteins, usually virulence factors, out across both inner and outer membranes of certain Gram-negative bacteria and further across the plasma membrane and into the cytoplasm of the host cell. This protein, termed YscF in Yersinia, and EscF, PscF, EprI, etc. in other systems, forms the needle of the injection apparatus PUBMED:14580388.

    \ ' '9225' 'IPR018989' '\

    This entry includes protein XkdM () from the Phage-like element PBSX in Bacillus subtilis. The structure of XkdM adopts a beta barrel flanked with alpha helical regions. Its function is unknown. PBSX, a defective prophage of B. subtilis, is a chromosomally based element which encodes a non-infectious phage-like particle with bactericidal activity. PBSX is induced by agents which elicit the SOS response PUBMED:125016.

    \ \ \ ' '9226' 'IPR018990' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    Chagasin is a reversible inhibitor of papain-like cysteine proteases PUBMED:11719560. Chagasin has a beta-barrel structure, which is a unique variant of the immunoglobulin fold with homology to human CD8alpha PUBMED:17011790, PUBMED:17502099.

    \ \ \ ' '9228' 'IPR018992' '\

    Thrombin is a Na+-activated, allosteric serine protease that functions in blood homeostasis, inflammation and wound healing PUBMED:18329094. Thrombin (or coagulation factor II) is an enzyme that cleaves bonds after Arg and Lys, converts fibrinogen to fibrin and activates factors V, VII, VIII, and (in complex with thrombomodulin) protein C PUBMED:18282807. Sodium binding is the major driving force behind the procoagulant, prothrombotic and signaling functions of the enzyme, but is dispensable for cleavage of the anticoagulant protein C PUBMED:15656349. Prothrombin is activated on the surface of a phospholipid membrane where factor Xa removes the activation peptide and cleaves the remaining part into light and heavy chains. This domain corresponds to the light chain of thrombin.

    \ ' '9229' 'IPR018541' '\

    This domain directs oriented DNA translocation and forms a winged helix structure PUBMED:17057717. Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding PUBMED:17057717.

    \ ' '9230' 'IPR018993' '\

    Fibroblast growth factor receptor 1 (FGFR1) oncogene partner (FOP) is a centrosomal protein that is involved in anchoring microtubules to centrosomes. This domain includes a Lis-homology motif. It forms an alpha-helical bundle and is involved in dimerisation PUBMED:16690081.

    \ ' '9231' 'IPR018542' '\

    This is a family of proteins found in SARS coronavirus (SARS-CoV) (Severe acute respiratory syndrome coronavirus). The protein has a novel fold which forms a dimeric tent-like beta structure with an amphipathic surface, and a central hydrophobic cavity that binds lipid molecules PUBMED:16843897. This cavity is likely to be involved in membrane attachment PUBMED:16843897.

    \ ' '9232' 'IPR018994' '\

    This entry represents a group of putative cytoplasmic proteins. The structure of these proteins forms an antiparallel beta sheet and contains some alpha helical regions.

    \ ' '9233' 'IPR018995' '\

    Non-structural protein 10 (NSP10) is involved in RNA synthesis. It is synthesised as part of a replicase polyprotein, whose cleavage generates many non-structural proteins PUBMED:17634238. NSP10 contains two zinc binding motifs and forms two anti-parallel helices which are stacked against an irregular beta sheet PUBMED:16873246. A cluster of basic residues on the protein surface suggests a nucleic acid-binding function.

    \ \

    This entry contains cysteine peptidase belonging to MEROPS peptidase families, C30 (porcine transmissible gastroenteritis virus-type main peptidase, clan PA(C)) and C16B (murine hepatitis coronavirus papain-like peptidase 1, clan CA).

    \ ' '9234' 'IPR018996' '\

    This entry represents the inner nuclear membrane proteins MAN1 (also known as LEM domain-containing protein 3) and LEM domain-containing protein 2 (or LEM protein 2). Emerin and MAN1 are LEM domain-containing integral membrane proteins of the vertebrate nuclear envelope PUBMED:12684533. MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin-associated proteins and plays a role in nuclear organisation. The C-terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad PUBMED:16648637. LEM protein 2 is an essential protein involved in chromosome segregation and cell division, probably via its interaction with lmn-1, the main component of nuclear lamina. It has some overlapping function with emr-1.

    \ ' '9235' 'IPR018543' '\

    FadA (Fusobacterium adhesin A) is an adhesin which forms two alpha helices.

    \ ' '9236' 'IPR018544' '\

    This is a family of proteins of unknown function which adopt an alpha helical and beta sheet structure.

    \ ' '9237' 'IPR018545' '\

    This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5\' nucleotide PUBMED:16923391, PUBMED:14973490.

    \ ' '9238' 'IPR018546' '\

    This is a family of proteins with unknown function. The structure of one of the proteins in this family has revealed a novel alpha-beta fold PUBMED:16374782.

    \ ' '9239' 'IPR018547' '\

    This is a family of proteins with unknown function.

    \ ' '9240' 'IPR018548' '\

    Spike is an envelope glycoprotein which aids viral entry into the host cell. This domain is the immunogenic receptor binding domain of the protein, which binds to angiotensin-converting enzyme 2 (ACE2) PUBMED:16597622.

    \ ' '9241' 'IPR018997' '\

    The PUB (also known as PUG) domain is found in peptide N-glycanase where it functions as an AAA ATPase binding domain PUBMED:16807242. This domain is also found in other proteins linked to the ubiquitin-proteasome system.

    \ ' '9243' 'IPR018550' '\

    PagL is an outer membrane protein with lipid A 3-O-deacylase activity. It forms an 8 stranded beta barrel structure PUBMED:16632613.

    \ ' '9244' 'IPR018998' '\

    This entry represents endoribonucleases involved in RNA biosynthesis, named XendoU in Xenopus laevis (African clawed frog). XendoU is a U-specific, metal-dependent enzyme that produces products with 2\'-3\' cyclic phosphate termini.

    \ ' '9245' 'IPR018551' '\

    This is a family of proteins with unknown function.

    \ ' '9246' 'IPR012647' '\

    Members of this family ligate (seal breaks in) RNA. Members so far include phage proteins that can counteract a host defence of cleavage of specific tRNA molecules, and trypanosome ligases involved in RNA editing, but no prokaryotic host proteins.

    \ ' '9247' 'IPR018552' '\

    This is a family of eukaryotic proteins whose function has not been characterised.

    \ ' '9248' 'IPR018999' '\

    UPF1 (or regulator of nonsense transcripts 1 homologue) is an essential RNA helicase that detects mRNAs containing premature stop codons and triggers their degradation. This domain contains 3 zinc binding motifs and forms interactions with another protein (UPF2) that is also involved in nonsense-mediated mRNA decay (NMD) PUBMED:16931876.

    \ ' '9250' 'IPR018553' '\

    This is a eukaryotic family of proteins with unknown function.

    \ ' '9251' 'IPR019001' '\

    This is a family of proteins which show sequence similarity to the HAD superfamily of hydrolases.

    \ ' '9252' 'IPR019002' '\

    Nop16 is a protein involved in the biogenesis of the 60S ribosomal subunit.

    \ ' '9253' 'IPR018554' '\

    The frequency clock protein is the central component of the frq-based circadian negative feedback loop and regulates various aspects of the circadian clock in Neurospora crassa PUBMED:11226160. This protein has been shown to interact with itself via a coiled-coil PUBMED:11226160.

    \ ' '9254' 'IPR019003' '\

    WTX (Wilms\' tumor gene on the X chromosome) is an X chromosome gene PUBMED:18720004. The WTX protein is encoded by a gene mutated in Wilms\' tumours and forms a complex with beta-catenin, AXIN1 and beta-TrCP2 (beta-transducin repeat-containing protein 2) PUBMED:17510365. The WTX protein is found to be inactivated in one third of Wilms\' tumours PUBMED:17204608.

    \ ' '9255' 'IPR018946' '\

    This entry contains a number of putative proteins as well as Alkaline phosphatase D which catalyses the reaction:\

    \ ' '9256' 'IPR019004' '\

    Putative protein of unknown function; the authentic protein is detected in highly purified mitochondria in high-throughput studies; YOR215C is not an essential gene.

    \ ' '9257' 'IPR018467' '\

    The short CCT (CO, COL, TOC1) motif is found in a number of plant proteins, including Constans (CO), Constans-like (COL) and TOC1. The CCT motif is about 45 amino acids long and contains a putative nuclear localisation signal within its second half PUBMED:10926537. The CCT motif is found in the Arabidopsis circadian rhythm protein TOC1, an autoregulatory response regulator homologue that controls photoperiodic flowering through its clock function PUBMED:10926537.

    \ ' '9258' 'IPR019005' '\

    This entry represents the N-terminal domain of vacuolar R-SNARE Nyv1, which adopts a longin fold PUBMED:16855025. Vacuolar v-SNARE is required for docking and is only involved in homotypic vacuole fusion. Nyv1 is required for Ca(2+) efflux from the vacuolar lumen, a required signal for subsequent membrane fusion events, by inhibiting vacuolar Ca(2+)-ATPase PMC1 and promoting Ca(2+) release when forming trans-SNARE assemblies during the docking step. In yeast, the N-terminal domain of Nyv1 is sufficient to direct the transport of Nyv1 to the limiting membrane of the vacuole PUBMED:16855025.

    \ ' '9259' 'IPR019006' '\

    This domain is found at the C-terminal of a family of ER membrane bound transcription factors called sterol regulatory element binding proteins (SREBP).

    \ ' '9260' 'IPR018555' '\

    This is a family of fungal proteins whose function is unknown.

    \ ' '9261' 'IPR019007' '\

    Synonym(s): Rsp5 or WWP domain

    \

    The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple-stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins PUBMED:7846762, PUBMED:7802651, PUBMED:7828727,\ PUBMED:7641887. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline-motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs PUBMED:7644498, PUBMED:11911877. It is frequently associated with other domains typical for proteins in signal transduction processes.

    \ \

    A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker\'s yeast) RSP5, similar to NEDD-4 in its molecular organization; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; and Nicotiana tabacum (Common tobacco) DB10 protein, amongst others.

    \

    This entry represents WW domain-binding protein 11, which may play a role in the regulation of pre-mRNA processing.

    \ ' '9262' 'IPR019008' '\

    This is a eukaryotic family of uncharacterised proteins.

    \ ' '9263' 'IPR018556' '\

    This region is found at the C-terminal of a group of cytoskeletal proteins.

    \ ' '9264' 'IPR018557' '\

    The THO complex plays a role in coupling transcription elongation to mRNA export. It is composed of subunits THP2, HPR1, THO2 and MFT1 PUBMED:11060033.

    \ ' '9265' 'IPR018558' '\

    The THO complex plays a role in coupling transcription elongation to mRNA export. It is composed of subunits THP2, HPR1, THO2 and MFT1 PUBMED:11060033.

    \ ' '9267' 'IPR018559' '\

    This entry represents uncharacterised proteins found in fungi.

    \ ' '9268' 'IPR018560' '\

    This entry represents the N-terminal of proteins that contain a ubiquitin domain.

    \ ' '9269' 'IPR018291' '\

    This entry represents a group of proteins containing five transmembrane regions. These proteins are found exclusively in Schizosaccharomyces pombe (Fission yeast).

    \ ' '9270' 'IPR018561' '\

    This domain is found on proteins including ubiquitin, cysteine synthases and JAB peptidases.

    \ ' '9271' 'IPR019009' '\

    The signal recognition particle (SRP) is a multimeric protein, which along with its cognate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5\' and 3\' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    \

    The beta subunit of the signal recognition particle receptor (SR) is a transmembrane GTPase, which anchors the alpha subunit to the endoplasmic reticulum membrane PUBMED:7844142.

    \

    The bacterial SR receptor is a monomer consisting of the loosely membrane-associated SR-alpha homologue FtsY, while the eukaryotic SR receptor is a heterodimer of SR-alpha (70 kDa) and SR-beta (25 kDa), both of which contain a GTP-binding domain PUBMED:12654246. SR-alpha regulates the targeting of SRP-ribosome-nascent polypeptide complexes to the translocon PUBMED:10859309. SR-alpha binds to the SRP54 subunit of the SRP complex. The SR-beta subunit is a transmembrane GTPase that anchors the SR-alpha subunit (a peripheral membrane GTPase) to the ER membrane PUBMED:7844142. SR-beta interacts with the N-terminal SRX-domain of SR-alpha, which is not present in the bacterial FtsY homologue. SR-beta also functions in recruiting the SRP-nascent polypeptide to the protein-conducting channel.

    \ ' '9272' 'IPR019010' '\

    This entry represents the N-terminal domain of subunit 6 (or e) (eIF3e) of the translation initiation factor eIF3. eIF3 is required for protein synthesis in mammalian cells and, together with other initiation factors, stimulates binding of initiator methionyl-tRNAi and mRNA to the 40S ribosomal subunit to form the 48S initiation complex PUBMED:17322308. The eIF3 complex also prevents premature association of the 40S and 60S ribosomal subunits and interacts with other initiation factors involved in start codon selection. eIF3 has at least 13 protein components (eIF3a-m or 1-13), where subunits h, i, k, and m are likely to be on the periphery of the complex PUBMED:17322308. Subunit 6 is produced by the int6 gene, one of the frequent integration sites for mouse mammary tumor viruses PUBMED:18502752.

    \ ' '9273' 'IPR018562' '\

    This DNA-binding protein binds to the autonomously replicating sequence (ARS) binding element. It may play a role in regulating the cell cycle response to stress signals PUBMED:9488484.

    \ ' '9274' 'IPR018563' '\

    This acid-adaptive protein, possibly of physiological significance when Helicobacter pylori (Campylobacter pylori) colonises the human stomach, adopts a unique four alpha-helical triangular conformation. The biologically active form is thought to be a tetramer. The protein is expressed along with six other proteins, some of which are related to iron storage and haem biosynthesis PUBMED:16395670.

    \ ' '9275' 'IPR019011' '\

    This entry represents the CFC domain found in the membrane protein Cripto (or teratocarcinoma-derived growth factor), a protein overexpressed in many tumours PUBMED:12919325, PUBMED:17125258 and structurally similar to the C-terminal extracellular portions of Jagged 1 and Jagged 2 PUBMED:12919325. CFC is approximately 40 residues long, compacted by three internal disulphide bridges, and binds Alk4 via a hydrophobic patch. CFC is structurally homologous to the VWFC-like domain PUBMED:12919325.

    \ \

    The protein Cripto is the founding member of the extracellular EGF-CFC growth factors, which are composed of two adjacent cysteine-rich domains: the EGF-like and the CFC domains. Members of the EGF-CFC family play key roles in embryonic development and are also implicated in tumourigenesis PUBMED:19035567. The Cripto protein could play a role in the determination of the epiblastic cells that subsequently give rise to the mesoderm. Although both the EGF and CFC domains are involved in the tumourigenic activity of Cripto proteins, the CFC domain appears to play a crucial role, as it is through the CFC domain that Cripto interferes with the onco-suppressive activity of Activins, either by blocking the Activin receptor ALK4 or by antagonising proteins of the TGF-beta family PUBMED:19035567.

    \ \

    The Cryptic protein is involved in the correct establishment of the left-right axis. It may play a role in mesoderm and/or neural patterning during gastrulation.

    \ \ ' '9276' 'IPR018564' '\

    This putative domain is found to be the most conserved region in mediator of replication checkpoint protein 1.

    \ ' '9277' 'IPR019012' '\

    RNA cap guanine-N2 methyltransferases such as Schizosaccharomyces pombe (Fission yeast) trimethylguanosine synthase (Tgs1) and Giardia lamblia (Giardia intestinalis) Tgs2, catalyse the methylation step(s) for the conversion of the 7-monomethylguanosine (m(7)G) caps of snRNAs and snoRNAs to a 2,2,7-trimethylguanosine (m(2,2,7)G) cap structure PUBMED:17284461, PUBMED:18840651, PUBMED:15590684. Trimethylguanosine synthase is specific for guanine, and N7 methylation must precede N2 methylation. This enzyme is required for pre-mRNA splicing, pre-rRNA processing and small ribosomal subunit synthesis. As such, this enzyme plays a role in transcriptional regulation.

    \ ' '9278' 'IPR019013' '\

    The vacuolar ATPase assembly integral membrane protein VMA21 is required for the assembly of the integral membrane sector (V0 component) of the vacuolar ATPase (V-ATPase) in the endoplasmic reticulum PUBMED:15356264. This entry represents a putative short domain found in VMA21-like proteins, and which appears to contain two potential transmembrane helices.

    \ ' '9279' 'IPR018565' '\

    This entry includes the Cnl2 kinetochore protein PUBMED:17035632.

    \ ' '9280' 'IPR018566' '\

    MmlI is a short protein of approximately 115 residues, comprising two alpha helices and four beta strands. It is involved in the catabolism of methyl-substituted aromatics via a modified oxo-adipate pathway in bacteria. The enzyme appears to be monomeric in some species PUBMED:2241929 and tetrameric in others PUBMED:2818569. The known structure shows that two copies of the protein form a dimeric alpha beta barrel.

    \ ' '9281' 'IPR018567' '\

    Protein of unknown function found in bacteria.

    \ ' '9282' 'IPR018568' '\

    Protein of unknown function found in bacteria.

    \ ' '9283' 'IPR018939' '\

    Autophagy is a degradative transport pathway that delivers cytosolic proteins to the lysosome (vacuole) PUBMED:11058089 and is induced by starvation PUBMED:9190802. Cytosolic proteins appear inside the vacuole enclosed in autophagic vesicles. Autophagy significantly differs from other transport pathways by using double membrane layered transport intermediates, called autophagosomes PUBMED:11675007, PUBMED:18472412. The breakdown of vesicular transport intermediates is a unique feature of autophagy PUBMED:11058089. Autophagy can also function in the elimination of invading bacteria and antigens PUBMED:18472412.

    \ \

    There are more than 25 AuTophaGy-related (ATG) genes that are essential for autophagy, although it is still not known how the autophagosome is made. Atg9 is a potential membrane carrier to deliver lipids that are used to form the vesicle. Atg27 is another transmembrane protein, and is a cycling protein PUBMED:17297289.

    \ \

    It acts as an effector of VPS34 phosphatidylinositol 3-phosphate kinase signalling and regulates the cytoplasm to vacuole transport (Cvt) vesicle formation. It is also required for autophagy-dependent cycling of ATG9.

    \ ' '9284' 'IPR019014' '\

    The endosomal sorting complex required for transport (ESCRT) complexes play a critical role in receptor down-regulation and retroviral budding. A new component of the ESCRT-I complex was identified PUBMED:17145965, multivesicular body sorting factor of 12 kDa (Mvb12), which binds to the coiled-coil domain of the ESCRT-I subunit vacuolar protein sorting 23 (Vps23) PUBMED:17145965.

    \ ' '9285' 'IPR019015' '\

    The HirA B (Histone regulatory homologue A binding) motif, of approximately 40 residues, is the essential binding interface between HIRA and ASF1a. It forms an antiparallel beta-hairpin that binds perpendicular to the strands of the beta-sandwich of the ASF1a N-terminal core domain, via beta-sheet, salt bridge and van der Waals interactions PUBMED:16980972. The two histone chaperone proteins, HIRA and ASF1a, form a heterodimer with histones H3 and H4. HIRA is the human orthologue of Hir proteins known to silence histone gene expression and create transcriptionally silent heterochromatin in yeast, flies, plants and humans.

    \ \

    The HIR complex is composed of HIR1, HIR2, HIR3 and HPC2, and interacts with ASF1. The HIR complex cooperates with ASF1 to promote replication-independent chromatin assembly. The HIR complex is also required for the periodic repression of three of the four histone gene loci during cell cycle as well as for autogenous regulation of the HTA1-HTB1 locus by H2A and H2B. DNA-binding by the HIR complex may repress transcription by inhibiting nucleosome remodeling by the SWI/SNF complex. The HIR complex may also be required for transcriptional silencing of centromeric, telomeric and mating-type loci in the absence of CAF-1.

    \ ' '9286' 'IPR017916' '\

    The Endosomal Sorting Complex Required for Transport (ESCRT) complexes form the machinery driving protein sorting from endosomes to lysosomes. ESCRT complexes are central to receptor down-regulation, lysosome biogenesis, and budding of HIV. Yeast ESCRT-I consists of three protein subunits, VPS23, VPS28, and VPS37. In humans, ESCRT-I comprises TSG101, VPS28, and one of four potential human VPS37 homologues. The main role of ESCRT-I is to recognise ubiquitinated cargo via the UEV domain of the VPS23/TSG101 subunit. The assembly of the ESCRT-I complex is directed by the C-terminal steadiness box (SB) of VPS23, the N-terminal half of VPS28, and the C-terminal half of VPS37. The structure is primarily composed of three long, parallel helical hairpins, each corresponding to a different subunit. The additional domains and motifs extending beyond the core serve as gripping tools for ESCRT-I critical functions PUBMED:16615893, PUBMED:16615894.

    \

    This entry represents the Steadiness box domain.

    \ ' '9287' 'IPR019016' '\

    Cas are a group of proteins associated with clustered regularly interspaced short palindromic repeats (CRISPR) of DNA found in nearly half of bacterial and archaeal genomes. The family describes Cas proteins of about 400 residues that include the motif [VIL]-D-x-[ST]-H-[GS]. The CRISPR and associated proteins are thought to be involved in the evolution of host resistance. The exact molecular function of this family is currently unknown.

    \ ' '9288' 'IPR019017' '\

    This domain is found in the C terminus of the signal transduction response regulator (phospho-relay) kinase RcsC, between the ATP-binding region and the receiver region. This domain forms a discrete alpha/beta/loop structure PUBMED:17005198. The Rcs signalling pathway controls a variety of physiological functions, such as capsule synthesis, cell division or motility, in prokaryotes. The Rcs regulation cascade, involving a multi-step phosphorelay between the two membrane-bound hybrid sensor kinases RcsC and RcsD and the global regulator RcsB, is, up to now, one of the most complicated regulatory systems in bacteria PUBMED:17005198.

    \ ' '9289' 'IPR019018' '\

    The FIP domain is the Rab11-binding domain (RBD) at the C terminus of a family of Rab11-interacting proteins (FIPs). The Rab proteins constitute the largest family of small GTPases (>60 members in mammals). Among them Rab11 is a well characterised regulator of endocytic and recycling pathways. Rab11 associates with a broad range of post-Golgi organelles, including recycling endosomes PUBMED:17030804.

    \ \

    Rab11-interacting protein is an effector protein involved in protein trafficking from apical recycling endosomes to the apical plasma membrane. It is also involved in controlling membrane trafficking along the phagocytic pathway and phagocytosis PUBMED:19141279.

    \ \

    Rab6-interacting ERC1 is the regulatory subunit of the IKK complex and probably recruits IkappaBalpha/NFKBIA to the complex. It may be involved in the organisation of the cytomatrix at the nerve terminals active zone (CAZ) which regulates neurotransmitter release. It may also be involved in vesicle trafficking at the CAZ, as well as in Rab-6 regulated endosomes to Golgi transport PUBMED:19119858.

    \ \ \ ' '9290' 'IPR019019' '\

    The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates PUBMED:16704980. It is sometimes found in association with the C-terminal domain of coagulation factor F5/8.

    \ ' '9291' 'IPR019020' '\

    This entry represents a haem-binding domain found in cytochromes b558/566 (subunit A), c-551 and c-552, as well as in type-II members of the microbial dimethyl sulphoxide (DMSO) reductase family.

    \ \

    The DMSO reductase family is a large and rapidly expanding group of enzymes found in bacteria and archaea that share a common form of molybdenum cofactor known as bis(molybdopterin guanine dinucleotide)Mo PUBMED:15311335. In addition to the molybdopterin subunit, these enzymes also contain an iron-sulphur subunit. These include two distinct but very closely related periplasmic proteins of anaerobic respiration: selenate reductase and chlorate reductase PUBMED:15866716. Other proteins containing this subunit include dimethyl sulphide dehydrogenase and ethylbenzene dehydrogenase PUBMED:11294876, PUBMED:12067345, PUBMED:16030201.

    \ \ \

    One member of the DMSO reductase family is ethylbenzene dehydrogenase, a heterotrimer (alpha, beta and gamma subunits) that catalyses the anaerobic degradation of hydrocarbons. This entry matches the gamma subunit, whose structure is known PUBMED:16962969. The alpha subunit contains the catalytic centre as a molybdenum cofactor complex. This removes an electron pair from the hydrocarbon and passes it along an electron transport system involving iron-sulphur complexes held in the beta subunit and a haem b molecule contained in the gamma subunit. The electron pair is then passed to an as yet unknown receiver. The enzyme is found in a variety of different bacteria.

    \ ' '9292' 'IPR018569' '\

    This domain consists of the adjacent Saf-Nte and Saf-pilin chains of the pilus-forming complex. Pilus assembly in Gram-negative bacteria involves a donor-strand exchange mechanism between the C- and the N-termini of this domain. The C-terminal subunit forms an incomplete Ig-fold which is then complemented by the 10-18 residue N terminus of another, incoming, pilus subunit which is not involved in the Ig-fold. The N-terminal sequences contain a motif of alternating hydrophobic residues that occupy the P2 to P5 binding pockets in the groove of the first pilus subunit PUBMED:16793551.

    \ ' '9293' 'IPR018570' '\

    PcF is a 52 residue protein factor of two alpha helices, containing a 4-hydroxyproline and three cysteine bridges. The presence of the hydroxyproline is unique in relation to other fungal phytotoxic proteins. The protein has a high content of acidic side-chains implying a lack of binding with lipid-rich components of membranes and appears to be an extracellular phytotoxin that causes leaf necrosis in strawberries.

    \ ' '9294' 'IPR019021' '\

    This entry represents a conserved region found in the Mus7 protein PUBMED:17307401 and in methyl methanesulphonate-sensitivity protein 22 (MMS22). Mus7 is involved in the repair of replication-associated DNA damage in Schizosaccharomyces pombe (Fission yeast). Mus7 functions in the same pathway as Mus81, a subunit of the Mus81-Eme1 structure-specific endonuclease, which has been implicated in the repair of the replication-associated DNA damage PUBMED:17307401.

    \ \ \ \

    MMS22, along with MMS1, is involved in protection against replication-dependent DNA damage. MMS22 may act by restoring active replication forks, repairing unusual DNA structures, and/or preventing aberrant DNA rearrangement at arrested replication forks, including the repair of double-stranded DNA breaks created by the cleavage reaction of topoisomerase II PUBMED:15718301.

    \ ' '9295' 'IPR018571' '\

    Opy2p acts as a membrane anchor in the HOG signalling pathway PUBMED:16543225.

    \ ' '9297' 'IPR019023' '\

    The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope PUBMED:17251381.

    \ ' '9298' 'IPR018474' '\

    The hypothetical protein YqaI is expressed in bacteria, particularly Bacillus subtilis. It forms a homo-dimer, with each monomer containing an alpha helix and four beta strands.

    \ ' '9299' 'IPR018572' '\

    This hypothetical protein is expressed in bacteria, particularly Bacillus subtilis. It forms homo-dimers, with each monomer consisting of one alpha helix and three beta strands.

    \ ' '9300' 'IPR019024' '\

    This entry represents the non-catalytic subunit B of RNase H2, an endonuclease that specifically degrades RNA when annealed to a complementary DNA, and which is present in all living organisms. RNase H2 participates in DNA replication, possibly by mediating the removal of lagging-strand Okazaki fragment RNA primers. It mediates the excision of single ribonucleotides from DNA:RNA duplexes. In Saccharomyces cerevisiae (Baker\'s yeast), RNase H2 is a heterotrimer composed of the catalytic subunit RNH201 and of the non-catalytic subunits RNH202 and RNH203 (or Rnh2Ap, Ydr279p and Ylr154p); this family represents the homologues of RNH202 (or Ydr279p) PUBMED:14734815. It is not known whether non-yeast proteins in this family fulfil the same function.

    \ ' '9301' 'IPR019025' '\

    The Cordon-bleu protein domain is highly conserved among vertebrates. The sequence contains three repeated lysine, arginine, and proline-rich regions, the KKRAP motif. The exact function of the protein is unknown but it is thought to be involved in mid-brain neural tube closure. It is expressed specifically in the node PUBMED:14512015.

    \ ' '9302' 'IPR015667' '\

    Telethonin is found at the Z-disc of sarcomeres. It is the phosphorylation target of the kinase domain of titin, and is thought to play a role in muscle development PUBMED:9804419, PUBMED:10481174. Deletion of the C-terminus of titin, including the kinase domain, has been found to impair myofibrillogenesis PUBMED:14600266. Mutations of telethonin cause limb-girdle muscular dystrophy type 2G PUBMED:12379311.

    \ \ \

    Telethonin is a 167-residue protein which complexes with the large muscle protein, titin. The very N-terminus of titin, composed of two immunoglobulin-like (Ig) domains, referred to as Z1 and Z2, interacts with the N-terminal region (residues 1-53) of telethonin, mediating the antiparallel assembly of two Z1Z2 domains. The C terminus of telethonin appears to induce dimerisation of this 2:1 titin/telethonin structure, which thus forms a complex necessary for myofibril assembly and maintenance of the intact Z-disc of skeletal and cardiac muscles PUBMED:16713295.

    \ ' '9303' 'IPR019026' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This is a family of highly selective metallo-endopeptidases belonging to the MEROPS peptidase family M64 (IgA peptidase, clan MA). The primary structure of the Clostridium ramosum IgA peptidase shows no significant overall similarity to any other known metallo-endopeptidase PUBMED:11815614.

    \ ' '9304' 'IPR013347' '\

    Many archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This domain is mostly found in MtrF, where it covers the entire length of the protein. This polypeptide is one of eight subunits of the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase complex found in methanogenic archaea. This is a membrane-associated enzyme complex that uses methyl-transfer reactions to drive a sodium-ion pump PUBMED:7737157. MtrF itself is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. Subsequently, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase.

    \

    In some organisms this domain is found at the C-terminal region of what appears to be a fusion of the MtrA and MtrF proteins PUBMED:15466049, PUBMED:15353801. The function of these proteins is unknown, though it is likely that they are involved in C1 metabolism.

    \ ' '9306' 'IPR012672' '\

    Members of this family are encoded within bacterial type III secretion gene clusters. Among species with type III secretion, this protein is found in those that target animal rather than plant cells. The member of this family in Yersinia was shown by mutation to be required for type III secretion of Yops effector proteins and is therefore believed to be part of the secretion machinery PUBMED:9882687.

    \ ' '9307' 'IPR013365' '\

    Proteins in this entry are the IcmQ component of Dot/Icm secretion systems, as found in the obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation PUBMED:15661013.

    \ ' '9308' 'IPR019027' '\

    Proteins in this entry consist of a pilus biogenesis protein, CpaD, from Caulobacter, and homologues in other bacteria, including three in the root nodule bacterium Bradyrhizobium japonicum. The molecular function of the homologues is not known.

    \ ' '9309' 'IPR013348' '\

    YscG is a molecular chaperone for YscE, where both are part of the type III secretion system that in Yersinia is designated Ysc (Yersinia secretion). The secretion system delivers effector proteins, designated Yops (Yersinia outer proteins), in Yersinia. This entry consists of YscG from Yersinia, and functionally equivalent type III secretion proteins in other species: e.g. AscG in Aeromonas and LscG in Photorhabdus luminescens.

    \ ' '9310' 'IPR019028' '\

    Carbohydrate-binding modules (CBMs) of microbial glycoside hydrolases play a central role in the recycling of photosynthetically fixed carbon through their binding to specific plant structural polysaccharides PUBMED:11598143. CBMs can recognise both crystalline and amorphous cellulose forms PUBMED:15136030. CBMs are the most common non-catalytic modules associated with enzymes active in plant cell-wall hydrolysis. Many putative CBMs have been identified by amino acid sequence alignments but only a few representatives have been shown experimentally to have a carbohydrate-binding function PUBMED:15210353.

    \ \

    binds both beta-1,4-glucan and beta-1,3-1,4-mixed linked glucans. binds to xylan and xylooligosaccharides. CBM25 has a starch-binding function. binds to amorphous cellulose and soluble beta-1,4-glucans, with a minimal binding requirement of cellotriose and optimal affinity for cellohexaose. Family 17 CBMs appear to have a very shallow binding cleft that may be more accessible to cellulose chains in non-crystalline cellulose than the deeper binding clefts of family 4 CBMs PUBMED:11733998. CBM28 does not compete with CBM17 modules when binding to non-crystalline cellulose but does have a "beta-jelly roll" topology, which is similar in structure to the CBM17 domains. Sequence and structural conservation in families 17 and 28 suggests that they have evolved through gene duplication and subsequent divergence PUBMED:15136030.

    \ \

    This domain is found at the C-terminal of cellulases, and in vitro binding studies have shown it to bind to crystalline cellulose PUBMED:17322304.

    \ ' '9311' 'IPR013378' '\

    This model describes a conserved core region of about 43 residues, which occurs in at least two families of tandem repeats. These include 78-residue repeats which occur from 2 to 15 times in some proteins of Tannerella forsythensis ATCC 43037, and 70-residue repeats found in families of internalins of Listeria species. Single copies are found in proteins of Fibrobacter succinogenes, Geobacter sulfurreducens, and a few other bacteria.

    \ ' '9312' 'IPR019029' '\

    In Salmonella, the gene encoding this protein is part of a four-gene operon PrgHIJK, while in other organisms it is found in type III secretion operons. PrgH has been shown to be required for type III secretion and is a structural component of the needle complex, which is the core component of type III secretion systems.

    \ ' '9313' 'IPR013381' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the Cse1 family of Cas proteins, which includes CT1972 from Chlorobium tepidum PUBMED:16292354. These proteins are found in the CRISPR/Cas subtype Escherichia coli regions of many bacteria (most of which are mesophiles), and not in Archaea.

    \ ' '9314' 'IPR013388' '\

    This protein is encoded by genes which are found in type III secretion operons, and has been shown to be essential for the invasion phenotype in Salmonella and a component of the secretion apparatus PUBMED:10816487. The protein is known as OrgA in Salmonella due to its oxygen-dependent expression pattern in which low-oxygen levels up-regulate the gene PUBMED:8063389. In Shigella the gene is called MxiK and has been shown to be essential for the proper assembly of the needle complex, which is the core component of type III secretion systems PUBMED:12864857.

    \ ' '9315' 'IPR013390' '\

    This entry represents proteins encoded by genes which are always found in type III secretion operons, although their function in the processes of secretion and virulence is unclear PUBMED:12730176. Hpa stands for Hrp-associated gene, where Hrp stands for hypersensitivity response and virulence.

    \ ' '9316' 'IPR013389' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a minor class of Cas proteins found in at least five prokaryotic genomes: Methanosarcina mazei, Sulfurihydrogenibium azorense, Thermotoga maritima, Carboxydothermus hydrogenoformans, and Dictyoglomus thermophilum, the first of which is archaeal while the rest are bacterial PUBMED:16292354.

    \ ' '9317' 'IPR013382' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the Cse2 family of Cas proteins, which includes CT1973 from Chlorobium tepidum. These proteins are found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea PUBMED:16292354.

    \ ' '9318' 'IPR013392' '\

    This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow range of species including Xanthomonas, Burkholderia and Ralstonia.

    \ ' '9319' 'IPR013391' '\

    This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow group of species including Xanthomonas, Burkholderia and Ralstonia.

    \ ' '9320' 'IPR012812' '\

    This family consists of examples of mannosyl-3-phosphoglycerate synthase (MPGS), which together with mannosyl-3-phosphoglycerate phosphatase (MPGP), comprises a two-step pathway for mannosylglycerate biosynthesis. Mannosylglycerate is a compatible solute that tends to be restricted to extreme thermophiles of archaea and bacteria. Note that in Rhodothermus marinus (Rhodothermus obamensis), this pathway is one of two; the other is condensation of GDP-mannose with D-glycerate by mannosylglycerate synthase.

    \ ' '9321' 'IPR012667' '\

    This entry represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems PUBMED:12869542. Evidence for this assignment includes 1) prediction of a single transmembrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a protein (CbtA) predicted to have five additional transmembrane segments.

    \ ' '9322' 'IPR012666' '\

    This entry represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems PUBMED:12869542. Evidence for this assignment includes 1) prediction of five transmembrane segments, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a small protein (CbtB) having a single additional transmembrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site.

    \ ' '9323' 'IPR018573' '\

    This entry includes AlwI (recognises GGATC), Bsp6I (recognises GC^NGC), BstNBI (recognises GASTC), PleI (recognises GAGTC) and MlyI (recognises GAGTC) restriction endonucleases.

    \ ' '9324' 'IPR012669' '\

    Members of this family are isozymes of pectate lyase (), also called polygalacturonic transeliminase and alpha-1,4-D-endopolygalacturonic acid lyase.

    \ ' '9325' 'IPR012663' '\

    Members of this family are small hypothetical proteins of 60 to 100 residues from Cyanobacteria and some Proteobacteria. Prochlorococcus marinus strains have two members, other species one only. Interestingly, of the eight most conserved residues, four are aromatic and three are invariant tryptophans. It appears all species that encode this protein can synthesize tryptophan de novo.

    \ ' '9326' 'IPR018574' '\

    The Slx4 protein is a heteromeric structure-specific endonuclease found in fungi. Slx4 with Slx1 acts as a nuclease on branched DNA substrates, particularly simple-Y, 5\'-flap, or replication fork structures by cleaving the strand bearing the 5\' non-homologous arm at the branch junction and thus generating ligatable nicked products from 5\'-flap or replication fork substrates PUBMED:12832395.

    \ ' '9327' 'IPR019034' '\

    This protein is highly conserved, but its function is unknown. It can be isolated from HeLa cell nucleoli and is found to be homologous with Leydig cell tumour protein whose function is unknown [1, supplementary Table I].

    \ ' '9328' 'IPR018464' '\

    The CENP-O class proteins form a stable complex and are required for proper kinetochore function. They are involved in the prevention of premature sister chromatid separation during recovery from spindle damage PUBMED:18007590. CENP-O mediates the attachment of the centromere to the mitotic spindle by forming essential interactions between the microtubule-associated outer kinetochore proteins and the centromere-associated inner kinetochore proteins. It has been shown to be involved in chromosome segregation via regulation of the spindle in both yeast PUBMED:16079914 and human PUBMED:16622419.

    \

    Chromosome segregation in eukaryotes requires the kinetochore, a multi-protein structure that assembles on centromeric DNA, and which acts to link chromosomes to spindle microtubules. Kinetochore structure and composition is highly conserved among vertebrates. The inner kinetochore is essential for kinetochore assembly, and is involved in chromosome segregation via regulation of the spindle. Inner kinetochore components include the multi-subunit CENP-H/I complex, which may function, in part, in directing centromere protein A (CENP-A) deposition to centromeres, where CENP-A is a centromere-specific histone H3 variant required for the organisation of centromeric chromatin during interphase. The CENP-H/I complex contains three functional classes of proteins PUBMED:16622420, PUBMED:18094054:

    \

    \ \ ' '9329' 'IPR019035' '\

    Med12 is a component of the evolutionarily conserved Mediator complex PUBMED:17088561. The Med12 subunit may specifically regulate transcription of targets of the Wnt signaling pathway and SHH signaling pathway. Med12 is a negative regulator of the Gli3-dependent sonic hedgehog signaling pathway via its interaction with Gli3 within the Mediator. A complex is formed between Med12, Med13, CDK8 and CycC which is responsible for suppression of transcription PUBMED:18394596.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '9330' 'IPR012661' '\

    This family consists of small hypothetical proteins, about 100 amino acids in length. The family includes five members (three in tandem) in Pseudomonas aeruginosa PAO1, and also in Pseudomonas putida (strain KT2440), four in Pseudomonas syringae pv. tomato str. DC3000, and single members in several other Proteobacteria. The function is unknown.

    \ ' '9331' 'IPR019036' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes R.ApaLI and R.XbaI restriction endonucleases. ApaLI recognises and cleaves the sequence GTGCAC.

    \ ' '9332' 'IPR012660' '\

    This entry consists of a broadly distributed uncharacterised domain often found as a standalone protein. The member from Shewanella oneidensis is described from crystallography work as a putative thioesterase. About half of the members of this family are fused to an N-terminal acetyltransferase domain (). The function of these proteins is unknown.

    \ ' '9333' 'IPR012655' '\

    Members of this family are very small proteins, about 47 residues each, in the genus Bacillus. Single members are found in Bacillus subtilis and Bacillus halodurans, while arrays of six members in tandem are found in Bacillus cereus and Bacillus anthracis. An EIxxE motif present in most members of this family resembles cleavage sites by the germination protease GPR in a number of small acid-soluble spore proteins (SASP). A role in sporulation is possible.

    \ ' '9334' 'IPR013393' '\

    This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow range of species including Xanthomonas, Burkholderia and Ralstonia.

    \ ' '9336' 'IPR019037' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the restriction endonuclease Bsp6I, which recognises and cleaves the double-stranded sequence GC^NGC.

    \ ' '9337' 'IPR012653' '\

    This family consists of dimethylamine methyltransferases from the genus Methanosarcina. It is found in three nearly identical copies in each of Methanosarcina acetivorans, Methanosarcina barkeri, and Methanosarcina mazei. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with trimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates dimethylamine, leaving monomethylamine, and methylates the prosthetic group of the small corrinoid protein MtbC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence.

    \ ' '9338' 'IPR012765' '\

    Proteins in this family are glucosylglycerol-phosphate phosphatases, with the gene symbol stpA (Salt Tolerance Protein A). A motif characteristic of acid phosphatases is found, but otherwise this family shows little sequence similarity to other phosphatases. This enzyme acts on glucosylglycerol phosphate, the product of glucosylglycerol phosphate synthase and the immediate precursor of the osmoprotectant glucosylglycerol.

    \ ' '9339' 'IPR012711' '\

    The gene which codes for this protein in gut bacteria is located in a novel putative operon for galactose metabolism. The protein appears to be a carbohydrate-processing phosphorolytic enzyme (), unlike either glycoside hydrolases or glycoside lyases. Intestinal colonisation by Bifidobacteria is important for human health, especially in paediatrics, because colonisation seems to prevent infection by some pathogenic bacteria that cause diarrhoea or other illnesses. The operon seems to be involved in intestinal colonisation by Bifidobacteria mediated by metabolism of mucin sugars. In addition, it may also resolve the question of the nature of the bifidus factor in human milk as the lacto-N-biose structure found in milk oligosaccharides.

    \ ' '9340' 'IPR019038' '\

    This protein forms the C subunit of DNA polymerase delta. It carries the essential residues for binding to the Pol1 subunit of polymerase alpha, from residues 293-332, which are characterised by the motif D--G--VT, referred to as the DPIM motif. The first 160 residues of the protein form the minimal domain for binding to the B subunit, Cdc1, of polymerase delta, while the final 10 C-terminal residues, 362-372, form the binding motif for the DNA sliding clamp, PCNA.

    \ ' '9341' 'IPR012654' '\

    This entry consists of a relatively rare prokaryotic protein family (about 8 occurrences per 200 genomes). Genes for members of this family appear to be associated variously with phage and plasmid regions, restriction system loci, transposons, and housekeeping genes. Their function is unknown.

    \ ' '9342' 'IPR018304' '\

    Rtt102p (Regulator of Ty1 Transposition Protein 102) is a transcription regulator protein found in fungi that appears to be integrally associated with both the Swi-Snf and the RSC chromatin remodelling complexes PUBMED:14660704. RSC is involved in transcription regulation and nucleosome positioning, and is responsible for the transfer of a histone octamer from a nucleosome core particle to naked DNA. The reaction requires ATP and involves an activated RSC-nucleosome intermediate. The remodelling reaction also involves DNA translocation, DNA twist and conformational change. As a reconfigurer of centromeric and flanking nucleosomes, the RSC complex is required both for proper kinetochore function in chromosome segregation and, via a PKC1-dependent signalling pathway, for organisation of the cellular cytoskeleton. Rtt102p is also a probable component of the SWI/SNF complex, an ATP-dependent chromatin-remodelling complex that is required for the positive and negative regulation of the expression of a large number of genes. SWI/SNF changes chromatin structure by altering DNA-histone contacts within a nucleosome, leading eventually to a change in nucleosome position, thus facilitating or repressing binding of gene-specific transcription factors.

    \ ' '9343' 'IPR019039' '\

    Members of this family are phage proteins with ATP-dependent RNA ligase activity. Host defence to phage may include cleavage and inactivation of specific tRNA molecules; members of this family act to reverse this RNA damage. The enzyme is adenylated, transiently, on a Lys residue in a motif KXDGSL. This entry includes putative bifunctional polynucleotide kinase/RNA ligases.

    \ ' '9344' 'IPR012652' '\

    Levels of thiamine pyrophosphate (TPP) or thiamine regulate transcription or translation of a number of thiamine biosynthesis, salvage, or transport genes in a wide range of prokaryotes. The mechanism involves direct binding, with no protein involved, to a structural element called THI found in the untranslated upstream region of thiamine metabolism gene operons. This element is called a riboswitch and is seen also for other metabolites such as FMN and glycine. This protein family consists of proteins identified in operons controlled by the THI riboswitch and designated ThiW. The hydrophobic nature of this protein and reconstructed metabolic background suggests that this protein acts in transport of a thiazole precursor of thiamine.

    \ ' '9346' 'IPR019041' '\

    Protein SSX1 can repress transcription, and this has been attributed to a putative Kruppel associated box (KRAB) repression domain at the N terminus. However, analysis of deletion constructs revealed further repression activity at the C terminus of SSX1, in a region that has been called the SSXRD (SSX Repression Domain). The potent repression exerted by full-length SSX1 appears to localise to this region PUBMED:9788446.

    \ ' '9347' 'IPR012651' '\

    Members of this protein family have been assigned as thiamine transporters by a phylogenomic analysis of families of genes regulated by the THI element, a broadly conserved RNA secondary structure element through which thiamine pyrophosphate (TPP) levels can regulate transcription of many genes related to thiamine transport, salvage, and de novo biosynthesis. Species with this protein always lack the ThiBPQ ABC transporter. In some species (e.g. Streptococcus mutans and Streptococcus pyogenes), yuaJ is the only THI-regulated gene. Evidence from Bacillus cereus indicates thiamine uptake is coupled to proton translocation.

    \ ' '9348' 'IPR019042' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the CfrBI restriction endonuclease which recognises and cleaves C^CWWGG.

    \ ' '9349' 'IPR018575' '\

    This entry includes the restriction endonuclease Eco29kI, which recognises and cleaves CCGC^GG.

    \ ' '9350' 'IPR019043' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the HindIII restriction endonuclease which recognises and cleaves A^AGCTT.

    \ ' '9351' 'IPR019044' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the HindVP restriction endonuclease, which recognises GRCGYC, but the cleavage site is unknown.

    \ ' '9352' 'IPR019045' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the restriction endonuclease MjaII, which recognises the double-stranded sequence GGNCC, but the cleavage site is unknown.

    \ ' '9353' 'IPR019046' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA-binding proteins, restriction enzymes are also capable of non-specific DNA binding, which is a prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the restriction endonuclease NgoPII, which recognises and cleaves the double-stranded sequence GG^CC.

    \ ' '9354' 'IPR018576' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA-binding proteins, restriction enzymes are also capable of non-specific DNA binding, which is a prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes R.Pab1 from Pyrococcus abyssi. R.Pab1 is homodimeric and has a curved anti-parallel beta-sheet, which forms a three-dimensional shape resembling a \'half pipe\'. DNA-binding analyses identify the \'half pipe\' as the double-stranded DNA-binding site. The enzyme is able to catalyse the cleavage of 5\'-GTAC, generating a TA-3\' overhang, in the absence of Mg2+ PUBMED:17332011.

    \ \ ' '9355' 'IPR012659' '\

    Members of this family are bacterial hypothetical proteins, about 160 amino acids in length, found in various proteobacteria, including members of the genera Pseudomonas and Vibrio. The C-terminal region is poorly conserved and is not included in the model.

    \ ' '9356' 'IPR011741' '\

    This entry represents the conserved C-terminal domain of a family of proteins found exclusively in bacteriophage and in bacterial prophage regions. The functions of this domain and the proteins containing it are unknown.

    \ ' '9358' 'IPR012658' '\

    Members of this family are small proteins, about 70 residues in length, with a basic triplet near the N-terminus and a probable metal-binding motif CPXCX(18)CXXC. Members are found in various proteobacteria.

    \ ' '9359' 'IPR011744' '\

    This model represents a protein found encoded in F1F0-ATPase operons in several genomes, including Methanosarcina barkeri (archaeal) and Chlorobium tepidum (bacterial). It is a small protein (about 100 amino acids) with long hydrophobic stretches and is presumed to be a subunit of the enzyme PUBMED:9425287.

    \ ' '9360' 'IPR019048' '\

    This entry represents a family of immunodominant surface proteins that contain an 80 amino acid (240 nucleotide) tandem repeat (Ehrlichia repeat), found in a variable number of copies in an immunodominant outer membrane protein of Ehrlichia sp. such as Ehrlichia chaffeensis, a tick-borne obligate intracellular pathogen.

    \ ' '9361' 'IPR011737' '\

    This entry represents a family of hydrophobic proteins with seven predicted transmembrane alpha helices. Members include ywaF from Bacillus subtilis and TP0381 from Treponema pallidum, with others found in Streptococcus pyogenes, Rhodococcus erythropolis, etc.

    \ ' '9362' 'IPR011742' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a family of Cas proteins, including TM1812 from Thermotoga maritima. Family members are also found in Vibrio vulnificus (strain YJ016), Nitrosomonas europaea (strain ATCC 19718), a large plasmid of Synechocystis sp. (strain PCC 6803), and Fibrobacter succinogenes subsp. succinogenes S85.

    \ ' '9363' 'IPR019049' '\

    Ndc1 is a nucleoporin that is a component of the Nuclear Pore Complex and, in fungi, also of the Spindle Pole Body. It consists of six transmembrane segments and three luminal loops, both concentrated at the N terminus, and cytoplasmic domains largely at the C terminus, all of which are well conserved.

    \ ' '9364' 'IPR019050' '\

    This motif is found in the C-terminal region of Sm-like proteins PUBMED:15225602. Sm and Sm-like proteins of the Lsm (like Sm) domain family are generally involved in essential RNA-processing tasks.

    \ ' '9365' 'IPR011755' '\

    This family consists of at least 9 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. One appears truncated toward the N-terminus; the others are predicted lipoproteins. The function is unknown.

    \ ' '9366' 'IPR019051' '\

    Members of this family are predicted transmembrane proteins with four membrane-spanning helices. Members are found in the Actinobacteria (Mycobacterium, Corynebacterium, Streptomyces), always associated with genes for tryptophan biosynthesis.

    \ ' '9367' 'IPR011750' '\

    This entry consists of at least 10 paralogous proteins from Myxococcus xanthus that lack detectable sequence similarity to any other protein family. An imperfectly conserved CXXCG motif, a probable binding site, appears twice in the multiple sequence alignment.

    \ ' '9368' 'IPR011751' '\

    This family consists of a set of at least 17 paralogous proteins in Myxococcus xanthus (strain DK 1622). Members are about 200 amino acids in length. No other homologs are known; the function is unknown.

    \ ' '9369' 'IPR019052' '\

    Members of this protein family are found mostly in the Proteobacteria, although one member is found in the marine planctomycete Rhodopirellula baltica. The function is unknown.

    \ ' '9370' 'IPR012644' '\

    Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.

    \ ' '9371' 'IPR012645' '\

    Members of this uncharacterised protein family are found in a number of alphaproteobacteria, including root nodule bacteria, Brucella suis, Caulobacter crescentus (Caulobacter vibrioides), and Rhodopseudomonas palustris. Conserved residues include two well-separated cysteines, suggesting a disulphide bond. The function is unknown.

    \ ' '9374' 'IPR019053' '\

    This pair of motifs is found in the C-terminal region of Sm-like proteins PUBMED:15225602. Sm and Sm-like proteins of the Lsm (like Sm) domain family are generally involved in essential RNA-processing tasks.

    \ ' '9375' 'IPR011753' '\

    This family consists of at least 7 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. The function is unknown.

    \ ' '9376' 'IPR011754' '\

    This family consists of at least 8 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. The function is unknown.

    \ ' '9377' 'IPR019054' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA-binding proteins, restriction enzymes are also capable of non-specific DNA binding, which is a prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the restriction endonuclease AccI, which recognises and cleaves the double-stranded sequence GT^MKAC.

    \ ' '9378' 'IPR014194' '\

    This entry represents the stage III sporulation protein AE, which is encoded in a spore formation operon spoIIIAABCDEFGH under the control of sigma G PUBMED:12662922. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species.

    \ ' '9379' 'IPR014201' '\

    This entry is designated stage IV sporulation protein A. It acts in the mother cell compartment and plays a role in spore coat morphogenesis PUBMED:12662922. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species.

    \ ' '9380' 'IPR014198' '\

    This entry represents the stage III sporulation protein AB, which is encoded in a spore formation operon: spoIIIAABCDEFGH that is under sigma G regulation PUBMED:12662922. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species.

    \ ' '9381' 'IPR018577' '\

    This entry includes the restriction endonuclease Bpu10I, which recognises and cleaves CCTNAGC (-5/-2).

    \ ' '9382' 'IPR019056' '\

    This entry describes a family of proteins found exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus (Rhodopseudomonas capsulata) gene transfer agent, which packages DNA.

    \ ' '9383' 'IPR014202' '\

    This entry is designated stage II sporulation protein R. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species. SpoIIR is a signalling protein that links the activation of sigma E to the transcriptional activity of sigma F during sporulation PUBMED:7892217, PUBMED:12662922.

    \ ' '9384' 'IPR018578' '\

    This entry represents the restriction endonuclease BstXI, which recognises and cleaves the sequence CCANNNNN^NTGG.

    \ ' '9385' 'IPR019057' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA-binding proteins, restriction enzymes are also capable of non-specific DNA binding, which is a prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the restriction endonuclease Eco47II, which recognises GGNCC, but the cleavage site is not known.

    \ ' '9386' 'IPR019058' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA-binding proteins, restriction enzymes are also capable of non-specific DNA binding, which is a prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This family includes the Type-2 restriction enzymes NgoBI and HaeII. They recognise the double-stranded sequence RGCGCY and cleave after C-5.

    \ ' '9388' 'IPR019059' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA-binding proteins, restriction enzymes are also capable of non-specific DNA binding, which is a prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the restriction endonuclease HaeIII, which recognises and cleaves the double-stranded sequence GG^CC.

    \ ' '9389' 'IPR019060' '\

    This entry describes an uncharacterised domain, sometimes found in association with a PRC-barrel domain (which is also found in rRNA processing protein RimM and in a photosynthetic reaction centre complex protein). This domain is found in proteins from Bacillus subtilis, Deinococcus radiodurans, Anabaena sp. PCC 7120, Myxococcus xanthus, and several other species. The function is not known.

    \ ' '9390' 'IPR014271' '\

    Two members of this family are found in Colwellia psychrerythraea (strain 34H / ATCC BAA-681) and one each in various other species of Colwellia and Shewanella. One member from C. psychrerythraea is of special interest because it is preceded by the same cis-regulatory site as a number of genes that have the PEP-CTERM domain.

    \ ' '9391' 'IPR014174' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    Members of this entry resemble the Cas6 proteins in having a C-terminal motif GXGXXXXXGXG, where the single X of each GXG is hydrophobic and the spacer XXXXX has at least one Lys or Arg. Examples are found in cas gene operons of CRISPR regions in Anabaena variabilis (strain ATCC 29413/PCC 7937), Leptospira interrogans, Gemmata obscuriglobus UQM 2246, and twice in Myxococcus xanthus (strain DK 1622). Oddly, an orphan member is found in Thiobacillus denitrificans (strain ATCC 25259), whose genome does not seem to contain other evidence of CRISPR repeats or cas genes.

    \ ' '9392' 'IPR019061' '\

    This entry represents the C-terminal region of the sporulation protein YunB. In Bacillus subtilis, the expression of YunB is controlled by sigmaE. The gene YunB seems to code for a protein involved, at least indirectly, in the pathway leading to the activation of sigmaK. Inactivation of YunB delays sigmaK activation and results in reduced sporulation efficiency PUBMED:12662922.

    \ ' '9393' 'IPR019062' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA-binding proteins, restriction enzymes are also capable of non-specific DNA binding, which is a prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This family includes HpaII, which recognises the double-stranded sequence CCGG and cleaves after C-1.

    \ ' '9394' 'IPR019063' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA-binding proteins, restriction enzymes are also capable of non-specific DNA binding, which is a prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the restriction endonuclease LlaMI, which recognises and cleaves the double-stranded sequence CC^NGG.

    \ ' '9395' 'IPR018579' '\

    This entry includes the restriction endonuclease LlaJI, which recognises GACGC.

    \ ' '9396' 'IPR019064' '\

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA-binding proteins, restriction enzymes are also capable of non-specific DNA binding, which is a prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the restriction endonuclease NgoBV, which recognises the sequence GGNNCC, but whose cleavage site is unknown.
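
Whether a recognition site such as GGNNCC is palindromic (equal to its own reverse complement) can be checked mechanically. The sketch below is illustrative only and not part of the InterPro entry; the helper names are hypothetical, and ambiguity codes are complemented using the standard IUPAC pairing.

```python
# Complement of each IUPAC nucleotide code, ambiguity codes included.
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(site):
    """Return the reverse complement of an IUPAC-coded site."""
    return "".join(COMPLEMENT[b] for b in reversed(site.upper()))


def is_palindromic(site):
    """True if the site reads the same on both strands; "^" is ignored."""
    bases = site.replace("^", "").upper()
    return bases == reverse_complement(bases)


print(is_palindromic("GGNNCC"))  # True
print(is_palindromic("GACGC"))   # False
```

Under this check, sites such as GGNNCC, GCSGC and CTAG are palindromic, whereas GACGC is not.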

    \ ' '9397' 'IPR019065' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the NgoFVII restriction endonuclease, which recognises GCSGC, but whose cleavage site is unknown.

    \ ' '9398' 'IPR019066' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This family includes the SacI restriction endonuclease, which recognises and cleaves GAGCT^C.

    \ ' '9399' 'IPR019067' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the MamI restriction endonuclease, which recognises and cleaves GATNN^NNATC.

    \ ' '9400' 'IPR019068' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the MjaI restriction endonuclease, which recognises CTAG, but whose cleavage site is unknown.

    \ ' '9401' 'IPR019069' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the restriction endonuclease ScaI, which recognises and cleaves the double-stranded sequence AGT^ACT.

    \ ' '9402' 'IPR019070' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the restriction endonuclease SinI, which recognises and cleaves the double-stranded sequence G^GWCC.

    \ ' '9403' 'IPR019071' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the restriction endonuclease XcyI, which recognises and cleaves the double-stranded sequence C^CCGGG.

    \ ' '9404' 'IPR019072' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry includes the XamI restriction endonuclease, which recognises GTCGAC, but whose cleavage site is unknown.

    \ ' '9405' 'IPR019073' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents the restriction endonuclease TaqI, which recognises and cleaves the double-stranded sequence T^CGA.

    \ ' '9406' 'IPR014175' '\

    This very small protein (about 46 amino acids) consists largely of a single predicted membrane-spanning region. It is found in Photobacterium profundum SS9 and in three species of Vibrio, always near periplasmic nitrate reductase genes, but far from the periplasmic nitrate reductase genes in Aeromonas hydrophila ATCC 7966.

    \ ' '9407' 'IPR014220' '\

    This entry represents a group of small acid-soluble proteins (SASP) from Bacillus species, which are present in spores but not in growing cells. The sspJ gene is transcribed in the forespore compartment by RNA polymerase with the forespore-specific sigmaG. Loss of SspJ causes a slight decrease in the rate of spore outgrowth in an otherwise wild-type background PUBMED:9852018.

    \ ' '9408' 'IPR014368' '\

    Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen PUBMED:6307356. In fungi (as in other eukaryotes) this enzyme complex is located in the mitochondrial inner membrane.

    \

    In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptidic subunits. This family is composed of the heart and liver isoforms of cytochrome c oxidase subunit VIIa.

    \

    This entry represents fungal-type cytochrome-c oxidase, subunit VIIa.

    \ ' '9409' 'IPR014231' '\

    Proteins in this entry, typified by YpjB, are restricted to a subset of the endospore-forming bacteria which includes Bacillus species, but not species. In Bacillus subtilis, ypjB was found to be part of the sigma-E regulon PUBMED:12662922. Sigma-E is a sporulation sigma factor that regulates expression in the mother cell compartment. Null mutants of ypjB show a sporulation defect, but this gene is not part of the endospore formation minimal gene set.

    \ ' '9410' 'IPR019074' '\

    This protein is predicted to span the membrane several times. It is only found in genomes of species that perform sporulation, such as Bacillus subtilis, Clostridium tetani, and other members of the Firmicutes (low-GC Gram-positive bacteria). Mutation of this sigmaE-dependent gene blocks development of the spore cortex. The length of the C-terminal region, which includes some hydrophobic regions, is variable.

    \ ' '9411' 'IPR014229' '\

    Proteins in this entry, exemplified by YtfJ of Bacillus subtilis, are encoded by bacterial genomes if, and only if, the species is capable of endospore formation. YtfJ was confirmed in spores of B. subtilis; it appears to be expressed in the forespore under control of SigF PUBMED:12480901.

    \ ' '9412' 'IPR019076' '\

    This entry contains YhcN and YlaJ, which are predicted lipoproteins that have been detected as spore proteins but not vegetative proteins in Bacillus subtilis. Both appear to be expressed under control of the RNA polymerase sigma-G factor. The YlaJ-like members of this family have a low-complexity, strongly acidic, 40-residue C-terminal domain.

    \ ' '9413' 'IPR014245' '\

    This family represents the stage III sporulation protein AF (SpoIIIAF) of the bacterial endospore formation program, which exists in some but not all members of the Firmicutes (formerly called low-GC Gram-positives). The C-terminal region of these proteins is poorly conserved.

    \ ' '9414' 'IPR014287' '\

    Proteins in this entry include Anf1 from Rhodobacter capsulatus (Rhodopseudomonas capsulata) and AnfO from Azotobacter vinelandii. They are found exclusively in species which contain the iron-only nitrogenase, and are encoded immediately downstream of the structural genes for the nitrogenase enzyme in these species.

    \ ' '9415' 'IPR014318' '\

    This protein previously was designated yjbO in Escherichia coli. It is found only in genomes that have the phage shock operon (psp), but it is only rarely encoded near other psp genes. The psp regulon is upregulated in response to a number of stress conditions, including ethanol, expression of the filamentous phage secretin protein IV and other secretins and heat shock.

    \ ' '9416' 'IPR014321' '\

    Members of this entry are phage shock protein PspD; they are found in a minority of bacteria that carry the defining genes of the phage shock regulon (pspA, pspB, pspC, and pspF). PspD is found in Escherichia coli, Yersinia pestis, and closely related species, where it is part of the phage shock operon. It is known to be expressed but its function is unknown.

    \ ' '9417' 'IPR011719' '\

    This family consists of few members, broadly distributed. It occurs so far in several Firmicutes (twice in Oceanobacillus), one Cyanobacterium, one alpha Proteobacterium, and (with a long prefix) in plants. The function is unknown. The alignment includes a perfectly conserved motif GxGxDxHG near the N-terminus.
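
A motif written as GxGxDxHG (where x is any residue) translates directly into a regular expression. The following is a minimal sketch for locating it in a protein sequence; the function name find_motif and the example sequence are made up for illustration and are not taken from any member of this family.

```python
import re

# GxGxDxHG: fixed residues G, G, D, H, G with any single residue at each x.
MOTIF = re.compile("G.G.D.HG")


def find_motif(protein):
    """Return 0-based start positions of non-overlapping motif matches."""
    return [m.start() for m in MOTIF.finditer(protein.upper())]


# Hypothetical sequence with the motif starting at index 3 (GAGKDLHG).
print(find_motif("MSTGAGKDLHGAA"))  # [3]
```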

    \ ' '9418' 'IPR018580' '\

    This protein is a conserved membrane protein PUBMED:16306698. The yfhO gene is transcribed in Difco sporulation medium and the transcription is affected by the YvrGHb two-component system. Some members of this family have been annotated as glycosyl transferases of the PMT family.

    \ ' '9419' 'IPR019079' '\

    This protein is a putative poly-gamma-glutamate capsule biosynthesis protein found in bacteria. Poly-gamma-glutamate is a natural polymer that may be involved in virulence and may help bacteria survive in high salt concentrations. It is a surface-associated protein PUBMED:16689787.

    \ ' '9420' 'IPR019080' '\

    This protein is found in many different bacterial species but is of viral origin. The protein forms an oligomer and functions as a processive alkaline exonuclease that digests linear double-stranded DNA in a Mg(2+)-dependent reaction. It has a preference for 5\'-phosphorylated DNA ends. It thus forms part of the two-component SynExo viral recombinase functional unit PUBMED:12670970.

    \ ' '9421' 'IPR018581' '\

    HrpA is an essential component of the type III secretion system (TTSS) which pathogens use to inject virulence factors directly into their host cells, and to cause disease. The TTSS has an Hrp pilus appendage for channelling effector proteins through the plant cell wall and this pilus elongates by the addition of HrpA pilin subunits at the distal end PUBMED:11953310.

    \ ' '9422' 'IPR018582' '\

    This protein is found in feline immunodeficiency retrovirus. From residues 610-780 it is envelope glycoprotein gp36 with homology to HIV gp41 PUBMED:15799719. The process of lentiviral env glycoprotein-mediated fusion of membranes is essential for viral entry and syncytia formation PUBMED:15051387.

    \ ' '9423' 'IPR019081' '\

    This protein is found in eukaryotic, parasitic microsporidia. Its function is unknown.

    \ ' '9424' 'IPR006484' '\

    The sequences in this group represent a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism.

    \ ' '9425' 'IPR018583' '\

    Cotton leaf-curl disease - CLCuD - is of major economic importance in cotton-growing areas of the far-east. The infectious agent appears to be a single-stranded DNA molecule of approx 1350 nucleotides in length, which, when inoculated with the Begomovirus into cotton, induces symptoms typical of CLCuD. This molecule requires the Begomovirus for replication and encapsidation PUBMED:11437658. DNA beta encodes a single protein, betaC1. The intracellular distribution of betaC1 is consistent with the hypothesis that it has a role in transporting the DNA A of Begomovirus from the nuclear site of replication to the plasmodesmatal exit sites of the infected cell. The DNA beta-encoded protein, betaC1, is the determinant of both pathogenicity and suppression of gene silencing PUBMED:17872543.

    \ ' '9426' 'IPR018584' '\

    This is a putative transmembrane protein from bacteria. It is likely to be conserved between Mycobacterium species PUBMED:11234002.

    \ ' '9427' 'IPR018585' '\

    This is a viral attachment glycoprotein from region G of metaviruses. It is high in serine and threonine suggesting it is highly glycosylated PUBMED:16555281.

    \ ' '9428' 'IPR019082' '\

    This entry represents the N-terminal domain found in a family of neurogenic mastermind-like proteins (MAMLs), which act as critical transcriptional co-activators for Notch signaling PUBMED:18758483, PUBMED:18758478. Notch receptors are cleaved upon ligand engagement and the intracellular domain of Notch shuttles to the nucleus. MAMLs form a functional DNA-binding complex with the cleaved Notch receptor and the transcription factor CSL, thereby regulating transcriptional events that are specific to the Notch pathway. MAML proteins may also act as key transcriptional co-activators in other signal transduction pathways, including muscle differentiation and myopathies (MEF2C) PUBMED:16510869, the tumour suppressor pathway (p53) PUBMED:17317671 and colon carcinoma survival (beta-catenin) PUBMED:17875709. MAML proteins could mediate cross-talk among the various signaling pathways, and the diverse activities of the MAML proteins converge to impact normal biological processes and human diseases, including cancers.

    \ \

    The N-terminal domain of MAML proteins adopts an elongated kinked helix that wraps around ANK and CSL, forming one of the complexes in the build-up of the Notch transcriptional complex for recruiting general transcription factors. This N-terminal domain is responsible for the interaction of MAML with the ankyrin repeat region of the Notch proteins NOTCH1 PUBMED:16880534, NOTCH2 PUBMED:17699740, NOTCH3 PUBMED:19150886 and NOTCH4. It forms a DNA-binding complex with Notch proteins and RBPSUH/RBP-J kappa/CBF1, and also binds CREBBP/CBP PUBMED:15961999 and CDK8 PUBMED:15546612. The C-terminal region is required for transcriptional activation.

    \ \ ' '9429' 'IPR019083' '\

    This entry is found in fungal and plant proteins and contains a conserved IGR motif. Its function is unknown.

    \ ' '9430' 'IPR019084' '\

    This entry is found at the N-terminal region of the Stm1 protein. Stm1 is a G4 quadruplex and purine motif triplex nucleic acid-binding protein. It has been implicated in many biological processes including apoptosis and telomere biosynthesis. Stm1 is known to interact with CDC13 PUBMED:12207228, and is known to associate with ribosomes and nuclear telomere cap complexes PUBMED:15044472.

    \ ' '9431' 'IPR005427' '\

    Some Gram-negative animal enteropathogens express a specialised secretion system to directly "inject" exotoxins into the cytoplasm of host cells. The system is composed of structural proteins and exotoxin effectors; these are often encoded on large virulence plasmids or on the bacterial chromosome itself PUBMED:11018143. Members may be referred to as invasins, pathogenicity island effectors, and cell invasion proteins PUBMED:7608068.

    \

    The Shigella flexneri invasion plasmid antigen (ipa) genes are found on such a plasmid, and are highly regulated PUBMED:3057506. Homologues of the ipa genes (SipC/SspC) have been found in Salmonella typhimurium PUBMED:7608068, and both pathogens utilise their type III system to translocate their invasive effectors. These are responsible for interacting directly with the host eukaryotic cell. IpaC can activate cellular kinase activity once in the host cytoplasm, and thus promotes cellular uptake of S. flexneri PUBMED:9784535. In addition, it has been found that SipC interacts with another Salmonella protein, SipA, to enhance its reorganisation of the host cell actin cytoskeleton, thereby facilitating cellular uptake of the Salmonella bacterium by the eukaryotic cell PUBMED:9784535.\

    \ ' '9432' 'IPR011846' '\

    This entry describes a small protein of unknown function, about 100 amino acids in length, essentially always found in an operon with CydAB, subunits of the cytochrome d terminal oxidase. It appears to be an integral membrane protein. It is found so far only in the Proteobacteria PUBMED:9068659.

    \ ' '9433' 'IPR011727' '\

    This conserved hypothetical protein is predominantly found in proteobacteria. Its function is unknown and its genome context is not well conserved. It is found amid urease genes in at least one species.

    \ ' '9434' 'IPR011728' '\

    This entry describes a protein found in polyhydroxyalkanoic acid (PHA) gene regions and incorporated into PHA inclusions in Bacillus cereus and Bacillus megaterium. The role of the protein may include amino acid storage PUBMED:9882674.

    \ ' '9435' 'IPR011871' '\

    This domain of about 175 to 200 amino acids is found in one to five copies in over 50 proteins in Fibrobacter succinogenes subsp. succinogenes S85, an obligate anaerobe of the rumen. Many members of this family have an apparent lipoprotein signal sequence. Conserved cysteine residues, suggestive of disulphide bond formation, are also consistent with an extracytoplasmic location for this domain. This domain can also be found in small numbers of proteins in Chlorobium tepidum and Bacteroides thetaiotaomicron.

    \ ' '9436' 'IPR011726' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    P-ATPases (sometimes known as E1-E2 ATPases) are found in bacteria and in a number of eukaryotic plasma membranes and organelles PUBMED:9419228. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.

    \

    This entry represents the F subunit (KdpF) of a P-type K+-translocating ATPase (Kdp). KdpF is a very small integral membrane peptide. The kdpABC operon of Escherichia coli codes for the high affinity K+-translocating Kdp complex PUBMED:10608856. KdpF is found upstream of the KdpA subunit. Because of its very small size and highly hydrophobic character, it is sometimes missed in genome annotation.

    \ \

    More information about this protein can be found at Protein of the Month: ATP Synthases PUBMED:.

    \ ' '9437' 'IPR011733' '\

    This family consists of strongly hydrophobic proteins about 190 amino acids in length with a strongly basic motif near the C-terminus. It is found in rather few species, but in paralogous families of 12 members in the oral pathogenic spirochaete Treponema denticola and 2 in Streptococcus pneumoniae (strain ATCC BAA-255 / R6).

    \ ' '9438' 'IPR019087' '\

    The proteins in this entry represent subunit Med15 of the Mediator complex. They contain a single copy of the approximately 70 residue ARC105 domain. The ARC105 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, ARC105 is a critical transducer of gene activation signals that control early metazoan development PUBMED:16799563.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '9439' 'IPR018586' '\

    This DNA-binding domain is the first approx. 100 residues of the N-terminal end of Brinker. The structure of this domain in complex with DNA consists of four alpha-helices that contain a helix-turn-helix DNA recognition motif specific for GC-rich DNA. The Brinker nuclear repressor is a major element of the Drosophila Decapentaplegic morphogen signalling pathway PUBMED:16876822.

    \ ' '9440' 'IPR019088' '\

    This entry consists of predicted transmembrane proteins of about 270 amino acids. They are found predominantly, though not exclusively, in alphaproteobacteria, generally only once in each genome.

    \ ' '9441' 'IPR019089' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a rare CRISPR-associated protein. So far, members are only found in Geobacter sulfurreducens, Gemmata obscuriglobus and Actinomyces naeslundii. CRISPR-associated proteins typically are found near CRISPR repeats and other CRISPR-associated proteins, have low levels of sequence identity, have sequence relationships that suggest lateral transfer, and show some sequence similarity to DNA-active proteins such as helicases and repair proteins.

    \ ' '9442' 'IPR011732' '\

    This entry represents the N-terminal region of a family of large, virulence-associated proteins in Mycoplasma arthritidis and smaller proteins in Mycoplasma capricolum. It includes a probable signal sequence or signal anchor, which, in most instances, has four consecutive Lys residues before the hydrophobic stretch.

    \ ' '9443' 'IPR013397' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry, typified by YPO2465 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy1, for CRISPR/Cas Subtype Ypest protein 1.

    \ ' '9444' 'IPR011735' '\

    The protein from this rare, uncharacterised protein family is designated HtrL or YibB in Escherichia coli, where its gene is found in a region of LPS core biosynthesis genes PUBMED:8157607. Homologs are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein.

    \ ' '9445' 'IPR013394' '\

    This family of proteins is encoded by genes found within type III secretion operons in a limited range of species including Xanthomonas, Ralstonia and Burkholderia.

    \ ' '9446' 'IPR013398' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry, typified by YPO2464 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy2, for CRISPR/Cas Subtype Ypest protein 2.

    \ ' '9447' 'IPR013399' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry, typified by YPO2463 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy3, for CRISPR/Cas Subtype Ypest protein 3.

    \ ' '9449' 'IPR013403' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry is found in CRISPR-associated (cas) proteins in the genomes of Geobacter sulfurreducens PCA and Desulfotalea psychrophila LSv54 (both Desulfobacterales from the Deltaproteobacteria), Gemmata obscuriglobus (a Planctomycete), and Actinomyces naeslundii MG1 (Actinobacteria).

    \ ' '9450' 'IPR013396' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This protein family, typified by YPO2462 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy4, for CRISPR/Cas Subtype Ypest protein 4.

    \ ' '9451' 'IPR013400' '\

    This entry is encoded within type III secretion operons. The protein has been characterised PUBMED:15292137 as a chaperone for the outer membrane pore component YscC. YscW is a lipoprotein that is itself localised to the outer membrane and is believed to facilitate the oligomerisation and localisation of YscC.

    \ ' '9452' 'IPR013409' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the Csx3 family of Cas proteins, which is encoded in CRISPR-associated gene clusters near CRISPR repeats in the genomes of several different thermophiles: Archaeoglobus fulgidus (archaeal), Aquifex aeolicus (Aquificae), Dictyoglomus thermophilum (Dictyoglomi), and a thermophilic Synechococcus (Cyanobacteria). It is not yet assigned to a specific CRISPR/cas subtype (hence the x designation csx3).

    \ ' '9453' 'IPR013405' '\

    This family of proteins is encoded within type III secretion operons and has been characterised in Yersinia as a regulator of the Low-Calcium Response (LCR) PUBMED:1695896.

    \ ' '9454' 'IPR013416' '\

    This entry is found in Anabaena sp. (strain PCC 7120), Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus in a conserved two-gene neighbourhood. Proteins containing this entry appear to span the membrane seven times.

    \ ' '9455' 'IPR019092' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a Cas protein family found in both bacteria and archaea. The function of these proteins is unknown.

    \ ' '9456' 'IPR013417' '\

    The function of this protein is unknown. It is always found as part of a two-gene operon with , a protein that appears to span the membrane seven times. It has so far been found in the bacteria Anabaena sp. (strain PCC 7120), Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus.

    \ ' '9457' 'IPR018587' '\

    VP9 is a protein containing a ferredoxin fold. Two dimers come together to form one asymmetric unit which possesses a DNA recognition fold and specific metal binding sites possibly for zinc. It is postulated that being a non-structural protein VP9 is involved in the transcriptional regulation of the White spot syndrome virus (WSSV), from which it comes. WSSV is the major viral pathogen in shrimp aquaculture PUBMED:16956937. VP9 is found N-terminal to the .

    \ ' '9458' 'IPR018588' '\

    Dihaem cytochrome c (DHC) is a soluble c-type cytochrome that folds into two distinct domains, each binding a single haem group and connected by a small linker region. Despite little sequence similarity, the N-terminal domain (residues 12-75) is a class I type cytochrome c, that binds one of the haems, but the domain surrounding the other haem is structurally unique. DHC binds electrostatically to an oxygen-binding protein, sphaeroides haem protein (SHP), as a component of a conserved electron transfer pathway. DHC acts as the physiological electron donor for SHP during phototrophic growth PUBMED:16700547. In certain species DHC is found upstream of .

    \ ' '9459' 'IPR018589' '\

    This hypothetical protein of 125 residues is expressed in bacteria but is thought to be plasmid in origin. It forms a six beta-strand barrel with three accompanying alpha helices and is probably a homo-dimer in the cell. It may be involved in pheromone-inducible conjugation PUBMED:17302827.

    \ ' '9460' 'IPR018590' '\

    Yvfg is a hypothetical protein of 71 residues expressed in some bacteria. The monomer consists of two parallel alpha helices, and the protein crystallises as a homo-dimer.

    \ ' '9461' 'IPR018591' '\

    YorP is a 71 residue protein found in bacteria. As it is also found in a bacteriophage it might be of viral origin. The structure is of an alpha helix between two of five beta strands. The function is unknown.

    \ ' '9462' 'IPR018592' '\

    This protein of 86 residues is expressed in bacteria. It consists of four alpha helices and two beta strands. Its function is unknown. One UniProt entry gives the gene name as Traf5.

    \ ' '9463' 'IPR018593' '\

    The Sen15 subunit of the tRNA intron-splicing endonuclease is one of the two structural subunits of this heterotetrameric enzyme. Residues 36-157 of this subunit possess a novel homodimeric fold. Each monomer consists of three alpha-helices and a mixed antiparallel/parallel beta-sheet. Two monomers of Sen15 fold with two monomers of Sen34, one of the two catalytic subunits, to form an alpha2-beta2 tetramer as part of the functional endonuclease assembly.

    \ ' '9464' 'IPR019093' '\

    The Rac1-binding domain is the C-terminal portion of YpkA from Yersinia. It is an all-helical molecule consisting of two distinct subdomains connected by a linker. The N-terminal end of this domain (residues 434-615) consists of six helices organised into two three-helix bundles packed against each other. This region is involved with binding to GTPases. The C-terminal end (residues 705-732) is a novel and elongated fold consisting of four helices clustered into two pairs, and this fold carries the helix implicated in actin activation. The Rac1-binding domain mimics host guanine nucleotide dissociation inhibitors (GDIs) of the Rho GTPases, thereby inhibiting nucleotide exchange in Rac1 and causing cytoskeletal disruption in the host PUBMED:16959567. It is usually found downstream of .

    \ ' '9465' 'IPR018594' '\

    This protein of approx. 120 residues consists of three beta strands and five alpha helices, thought to fold into a homo-dimer.

    \ ' '9466' 'IPR018595' '\

    This protein is produced from gene PA1123 in Pseudomonas. It contains three alpha helices and six beta strands and is thought to be monomeric. It appears to be present in the biofilm layer and may be a lipoprotein.

    \ ' '9467' 'IPR018285' '\

    This entry represents the N-terminal domain of methionyl-tRNA synthetase (MetRS). This N-terminal appended domain mediates non-catalytic complex formation through its interaction with a domain in the tRNA aminoacylation cofactor Arc1p. The interacting domains of MetRS, GluRS (glutamyl-tRNA synthetase) and Arc1p form a ternary complex resembling a classical GST homo-dimer PUBMED:16914447. Domain-swapping between symmetrically related MetRS-N and Arc1p-N domains generates a 2:2 tetramer held together by van der Waals forces. This domain is necessary for formation of the aminoacyl-tRNA synthetase complex necessary for tRNA nuclear export and shuttling as part of the translational apparatus.

    \ ' '9468' 'IPR019094' '\

    This entry includes protein XkdW from the phage-like element PBSX in Bacillus subtilis. XkdW is approximately 100 residues long and contains two alpha helices and two beta strands, and is probably monomeric. XkdW is expressed in bacteria but is probably viral in origin. Its function is unknown. PBSX, a defective prophage of B. subtilis, is a chromosomally based element which encodes a non-infectious phage-like particle with bactericidal activity. PBSX is induced by agents which elicit the SOS response PUBMED:125016.

    \ ' '9469' 'IPR019095' '\

    Med18 is one subunit of the Mediator complex and a component of the head module that is involved in stimulating basal RNA polymerase II (PolII) transcription. Med18 consists of an eight-stranded beta-barrel with a central pore and three flanking helices. It complexes with Med8 and Med20 proteins by forming a heterodimer of two-fold symmetry with Med20 and binding the C-terminal alpha-helix region of Med8 across the top of its barrel. This complex creates a multipartite TBP-binding site that can be modulated by transcriptional activators PUBMED:16964259.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '9470' 'IPR018596' '\

    This is a hypothetical protein from Pyrococcus horikoshii of unknown function. It contains six alpha helices and eight beta strands and is thought to be monomeric.

    \ ' '9471' 'IPR018597' '\

    YjcQ is a protein of approx. 100 residues containing four alpha helices and three beta strands. It is expressed in bacteria and also in viruses. It appears to be under the regulation of SigD RNA polymerase which is responsible for the expression of many genes encoding cell-surface proteins related to flagellar assembly, motility, chemotaxis and autolysis in the late exponential growth phase. The exact function of YjcQ is unknown PUBMED:15033535. However, it is thought to be a prophage head protein in viruses PUBMED:15016546.

    \ ' '9472' 'IPR018598' '\

    This protein domain is of unknown function, though it is putatively involved in DNA mismatch repair. It is associated with .

    \ ' '9473' 'IPR018599' '\

    This protein of approx. 100 residues is found in bacteria. It contains up to five alpha helices and up to seven beta strands and is probably monomeric. Its function is unknown. It is cited as a major prophage head protein PUBMED:15016546, so might generally be of viral origin.

    \ ' '9474' 'IPR018600' '\

    YonK protein is expressed by the bacterial prophage SPbetaC PUBMED:10376821. It is a 63-residue protein that associates into a homo-octamer in the form of a beta-stranded barrel with four outer helical features at points of the compass. Its function is unknown.

    \ ' '9475' 'IPR019096' '\

    YopX is a protein of plasmid origin found in bacteria. It is of approx. 135 residues and is largely helical, with three identical chains probably complexing into a twelve-chain structure. Yop proteins are a subset of pathogenicity factors known as (Yersinia) outer proteins - Yops - which act as chaperones for other proteins such as X. They are exported by the type III secretion system (TTSS) upon bacterial infection of host cells. The TTSS is encoded on a virulence plasmid and is necessary for the survival and replication of the bacterium within host lymphoid tissues PUBMED:15847602.

    \ ' '9476' 'IPR019097' '\

    This protein of 129 residues is expressed in bacteria. It consists of three identical chains of five alpha helices. Two copies of each chain associate into a complex of six units of possible biological significance but of unknown function.

    \ ' '9477' 'IPR018601' '\

    F-112 protein is of 70-110 residues and is found in viruses. Its winged-helix structure suggests a DNA-binding function.

    \ ' '9478' 'IPR018602' '\

    This protein of 154 residues consists of a unit of helices and beta sheets that crystallises into an asymmetrical dodecameric barrel structure made up of two six-membered rings stacked one on top of the other. It is expressed in bacteria but is of viral origin, as it is found in Burkholderia phage BcepMu, and is probably a pathogenesis factor PUBMED:15184022.

    \ ' '9480' 'IPR018604' '\

    This domain is exclusively found in YycI proteins in the low GC content Gram-positive species. The domains share the same structural fold with domains two and three of YycH () PUBMED:17307848. Both YycH and YycI are always found as a pair on the chromosome, downstream of the essential histidine kinase YycG. Additionally, both proteins share a function in regulating the YycG kinase with which they appear to form a ternary complex. Lastly, the two proteins always contain an N-terminal transmembrane helix and are localized to the periplasmic space as shown by PhoA fusion studies.

    \ ' '9481' 'IPR019098' '\

    This domain is highly conserved from yeasts to humans and is part of the chaperone protein HIRIP3 in vertebrates, which interacts with the H3.3 chaperone HIRA, implicated in histone replacement during transcription. The N- and C-termini of Chz family members are relatively divergent but do contain similar acidic stretches rich in Glu/Asp residues, characteristic of all histone chaperones PUBMED:17289584.

    \ ' '9482' 'IPR013433' '\

    Proteins in this entry are encoded by genes involved in either polyhydroxyalkanoic acid (PHA) biosynthesis or utilisation, including proteins found at the surface of PHA granules. These proteins have so far been found in the Pseudomonadales, Xanthomonadales, and Vibrionales, all of which belong to the Gammaproteobacteria.

    \ ' '9483' 'IPR013442' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a conserved region of about 150 amino acids found in a family of Cas proteins in at least five archaeal and three bacterial species. In six of eight species, the protein is encoded in the vicinity of a CRISPR/Cas locus.

    \ ' '9484' 'IPR013443' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPR repeats. In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas genes.

    \ ' '9486' 'IPR013472' '\

    These conserved hypothetical proteins have so far been found only in the Cyanobacteria. They are about 170 amino acids long and contain a CxxCx(14)CxxH motif near the N-terminus.

    \ ' '9487' 'IPR013481' '\

    Proteins in this entry are found in the Cyanobacteria, and are mostly encoded near nitrate reductase and molybdopterin biosynthesis genes. Molybdopterin guanine dinucleotide is a cofactor for nitrate reductase. These proteins are sometimes annotated as nitrate reductase-associated proteins, though their function is unknown.

    \ ' '9488' 'IPR019099' '\

    This entry is found in putative proteins of about 150 amino acids in length that contain three predicted transmembrane helices and an unusual motif with consensus sequence PGPGW.

    \ ' '9489' 'IPR013487' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the Cxs8 family of Cas proteins, whose function is unknown. These proteins are encoded in the midst of a cas gene operon PUBMED:16292354.

    \ ' '9490' 'IPR013488' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the Cxs9 family of Cas proteins found in archaea. These proteins are encoded in the midst of a cas gene operon PUBMED:16292354.

    \ ' '9491' 'IPR013489' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the Csm6 family of Cas proteins PUBMED:16292354.

    \ ' '9492' 'IPR013493' '\

    Proteins in this entry are encoded within a conserved four-gene neighbourhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. (strain EbN1) (Aromatoleum aromaticum (strain EbN1)) and Ralstonia solanacearum (Betaproteobacteria).

    \ ' '9493' 'IPR013494' '\

    Proteins in this entry are encoded within a conserved four-gene neighbourhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. (strain EbN1) (Aromatoleum aromaticum (strain EbN1)) and Ralstonia solanacearum (Betaproteobacteria).

    \ ' '9494' 'IPR014097' '\

    Members of this protein family are the gamma subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. The gamma subunit has no known homologues.

    \ ' '9495' 'IPR014086' '\

    Members of this family are ring-opening amidohydrolases, including cyanuric acid amidohydrolase () (AtzD and TrzD) and barbiturase. Note that barbiturase does not act as defined for (barbiturate + water = malonate + urea) but rather catalyses the ring opening of barbituric acid to ureidomalonic acid PUBMED:11485332.

    \ ' '9496' 'IPR013495' '\

    Proteins in this entry are encoded within a conserved four-gene neighbourhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. (strain EbN1) (Aromatoleum aromaticum (strain EbN1)) and Ralstonia solanacearum (Betaproteobacteria).

    \ ' '9497' 'IPR014328' '\

    Type II restriction endonucleases () are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin PUBMED:15770420. However, there is still considerable diversity amongst restriction endonucleases PUBMED:14576294, PUBMED:11827971. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone PUBMED:11557805.

    \

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5\'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements PUBMED:15121719, PUBMED:12665693, as summarised below:

    \

    \ \

    This entry represents type II restriction endonucleases of the Alw26I/Eco31I/Esp3I family PUBMED:, whose recognition sequences are 5\'-GTCTC-3\' (Alw26I), 5\'-GGTCTC-3\' (Eco31I) and 5\'-CGTCTC-3\' (Esp3I).

    \ ' '9498' 'IPR018605' '\

    Sororin is an essential, cell cycle-dependent mediator of sister chromatid cohesion PUBMED:15837422. The protein is nuclear in interphase cells, dispersed from the chromatin in mitosis, and interacts with the cohesin complex PUBMED:15837422.

    \ ' '9499' 'IPR019102' '\

    This entry represents a conserved region found in the HMG box transcription factor BBX. This protein is necessary for cell cycle progression from the G1 to the S phase PUBMED:11680820.

    \ ' '9500' 'IPR019103' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised PUBMED:1455179. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin PUBMED:10625704 and archaean preflagellin have been described PUBMED:16983194, PUBMED:14622420.

    \ \

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases.\ All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    \ \

    This family of eukaryotic aspartyl proteases has a fold similar to that of retroviral proteases, which implies that they function proteolytically during regulated protein turnover PUBMED:17010377.

    \ ' '9501' 'IPR019104' '\

    This entry represents a family of proteins which are encoded in temperate phages and bacterial prophage regions. They include the product of the late operon rha gene from lambdoid phage phi-80. The presence of this gene interferes with the infection of bacterial strains lacking the integration host factor that regulates the rha gene. Rha is thought to function as a phage regulatory protein.

    \ ' '9502' 'IPR014082' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a family of Cas proteins encoded exclusively in the vicinity of CRISPR repeats and other Cas proteins in Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum), Thermus thermophilus (Deinococcus-Thermus), Chloroflexus aurantiacus (Chloroflexi), and Thermomicrobium roseum (Thermomicrobia).

    \ ' '9503' 'IPR014099' '\

    Members of this protein family are the spore coat protein GerQ of endospore-forming Firmicutes (low GC Gram-positive bacteria). This protein is cross-linked by a spore coat-associated transglutaminase.

    \ ' '9505' 'IPR019106' '\

    This entry represents TrbC, a protein that is an essential component of the F-type conjugative pilus assembly system (aka type 4 secretion system) for the transfer of plasmid DNA PUBMED:205063, PUBMED:16138100. The N-terminal portion of these proteins is heterogeneous.

    \ ' '9506' 'IPR014127' '\

    Members of this uncharacterised protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighbourhoods show little conservation.

    \ ' '9507' 'IPR014131' '\

    Members of this entry are encoded by genes in chlamydiaphage such as Chp2. These viruses have around eight genes and infect obligately intracellular bacterial pathogens of the genus Chlamydia. This protein is annotated as VP3 or structural protein (as if a protein of mature viral particles); however, it is displaced from procapsids as DNA is packaged, and therefore is more correctly described as a scaffolding protein.

    \ ' '9508' 'IPR014118' '\

    This entry represents TraV, a component of a conjugative type IV secretion system. TraV is an outer membrane lipoprotein that is believed to interact with the secretin TraK PUBMED:11722740, PUBMED:12855161, PUBMED:16138100. This protein contains three conserved cysteines in the N-terminal half.

    \ ' '9509' 'IPR014115' '\

    This entry represents TrbI, an essential component of the F-type conjugative transfer system for plasmid DNA transfer that has been shown to be localized to the periplasm PUBMED:1355084, PUBMED:16138100.

    \ ' '9510' 'IPR019108' '\

    This entry represents the CtaG protein required for the assembly of active caa3-type cytochrome c oxidase in Bacillus subtilis, and related proteins.

    \ ' '9511' 'IPR014112' '\

    This entry represents TraQ, a protein that makes a specific interaction with pilin (TraA) to aid its transfer through the inner membrane during the process of F-type conjugative pilus assembly PUBMED:10564517, PUBMED:16138100.

    \ ' '9512' 'IPR010070' '\

    This entry represents a family of hypothetical proteins, half of which are 40 residues or less in length. Members are found only in spore-forming species. A Gly-rich variable region is followed by a strongly conserved, highly hydrophobic region, predicted to form a transmembrane helix, ending with an invariant Gly. The consensus for this stretch is FALLVVFILLIIV.

    \ ' '9513' 'IPR010056' '\

    This entry represents the N-terminal domain of a small family of phage proteins. The protein contains a region of low-complexity sequence that reflects DNA direct repeats able to function as an origin of phage replication. The region is N-terminal to the low-complexity region.

    \ ' '9514' 'IPR010026' '\

    This entry identifies a family of putative phage holins from a number of phage and prophage regions of Gram-positive bacteria. Like other holins, these proteins are small (about 100 amino acids) with stretches of hydrophobic sequence and are encoded adjacent to lytic enzymes.

    \ ' '9515' 'IPR006540' '\

    These sequences represent bacteriocins related to lactococcin 972 PUBMED:10589723. Members tend to be found in association with a seven transmembrane putative immunity protein.

    \ ' '9516' 'IPR006521' '\

    These sequences represent the family of phage P2 protein I and related tail proteins from a number of temperate phage of Gram-negative bacteria.

    \ ' '9517' 'IPR019109' '\

    Chloroplast function requires the import of nuclear encoded proteins from the cytoplasm across the chloroplast double membrane. This is accomplished by two protein complexes, the Toc complex located at the outer membrane and the Tic complex located at the inner membrane. The Toc complex recognises specific proteins by a cleavable N-terminal sequence and is primarily responsible for translocation through the outer membrane, while the Tic complex translocates the protein through the inner membrane. This entry represents Tic20, a core member of the Tic complex, and related proteins. Tic20 is deeply embedded in the inner envelope membrane and is thought to function as a protein conducting component of the Tic complex.

    \ ' '9518' 'IPR019110' '\

    This entry identifies a family of proteins, around 100 amino acids in length, that include a predicted signal sequence and a perfectly conserved motif, RAQPRD, towards the C terminus. They are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae pv. tomato str. DC3000. The function of these proteins is unknown.

    \ ' '9519' 'IPR019111' '\

    These sequences contain a conserved sequence region of about 60 amino acids found in over 40 predicted proteins of Plasmodium falciparum. It is not currently found elsewhere, even in closely related Plasmodium species. No member of this family has been functionally characterised.

    \ ' '9520' 'IPR006496' '\

    This entry represents a set of protein sequences found in Plasmodium species. An interesting feature is five perfectly conserved Trp residues.

    \ ' '9521' 'IPR006489' '\

    This repeat is found in the products of only 2 genes in Plasmodium yoelii; in each of these proteins it is repeated 9 times. It is found in no other organism.

    \ ' '9522' 'IPR006488' '\

    This entry represents the N-terminal domain of a paralogous family of Plasmodium yoelii proteins that are preferentially encoded in the subtelomeric regions of the chromosomes. There are no obvious homologues to these proteins in other organisms. The C-terminal portions of the proteins are divergent and some contain other Plasmodium-specific paralogous domains such as PYST-C2 ().

    \ ' '9523' 'IPR019114' '\

    This family comprises lipoproteins from gamma proteobacterial species: the pullulanase secretion protein PulS of Klebsiella pneumoniae (P20440), the lipoprotein OutS of Erwinia chrysanthemi (Q01567) and the functionally uncharacterised type II secretion protein EtpO (Q7BSV3) from Escherichia coli O157:H7. PulS and OutS have been shown to interact with and facilitate insertion of secretins into the outer membrane, suggesting a chaperone-like, or piloting, function for members of this family.

    \ ' '9524' 'IPR018606' '\

    Arb1 is required for histone H3 Lys9 (H3-K9) methylation, heterochromatin assembly and siRNA generation in fission yeast PUBMED:17310250.

    \ ' '9525' 'IPR010022' '\

    This entry identifies a family of small (about 50 amino acid) phage proteins, found in at least 12 different phage and prophage regions of Gram-positive bacteria. In a number of these phage, the gene for this protein is found near the holin and endolysin genes.

    \ ' '9526' 'IPR010239' '\

    This entry represents a conserved hypothetical protein about 240 residues in length found so far in Proteobacteria including Shewanella oneidensis and Ralstonia solanacearum, usually as part of a paralogous family. The function is unknown.

    \ ' '9527' 'IPR006513' '\

    These are sequences from gammaproteobacteria that are related to the Escherichia coli protein, YtfJ.

    \ ' '9528' 'IPR018607' '\

    Ctf8 (chromosome transmission fidelity 8) is a component of the Ctf18 RFC-like complex, which is a DNA clamp loader involved in sister chromatid cohesion.

    \ ' '9529' 'IPR019115' '\

    This family of proteins of unknown function is found in Porphyromonas gingivalis (Bacteroides gingivalis).

    \ ' '9530' 'IPR010176' '\

    This motif occurs from three to eight times in eight different proteins of Geobacter sulfurreducens. The final CXXCH motif matches the cytochrome c family haem-binding site signature, suggesting that the sequence may be involved in haem-binding.

    \ ' '9531' 'IPR010177' '\

    This entry represents a domain of about 41 amino acids that contains, among other motifs, two copies of the CXXCH motif associated with haem binding. Most proteins in this entry have at least three copies of this domain (i.e. at least six copies of CXXCH) and are predicted to be high molecular weight c-type cytochromes. These proteins are found mostly in species of Shewanella, Geobacter, and Vibrio.

    \ ' '9532' 'IPR019117' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a highly divergent family of Cas proteins, found in at least ten different archaeal and bacterial species, including TM1793 from Thermotoga maritima.

    \ ' '9533' 'IPR010160' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a family of Cas proteins as represented by TM1791.1 from Thermotoga maritima. This family of Cas proteins is found in both archaeal and bacterial species.

    \ ' '9534' 'IPR010157' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a minor family of Cas proteins found in various species of Sulfolobus and Pyrococcus (all archaeal). It is found with two different CRISPR loci in Sulfolobus solfataricus.

    \ ' '9535' 'IPR010184' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a family of Cas proteins that tends to be found near CRISPR repeats. The species range for family members, so far, is exclusively archaeal. It is found so far in only four different species, and includes two tandem genes in Pyrococcus furiosus DSM 3638.

    \ ' '9536' 'IPR010155' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a small Cas family represented by CT1134 from Chlorobium tepidum. This family belongs to a set of several Cas protein families, one each for a number of different CRISPR/Cas subtypes, that share a region of N-terminal sequence similarity. This family represents the Dvulg subtype of CRISPR/Cas locus.

    \ ' '9538' 'IPR019121' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a conserved region of about 65 amino acids found in otherwise highly divergent proteins encoded in CRISPR-associated regions. This region features two CXXC motifs.

    \ ' '9539' 'IPR010152' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents a minor branch of the Cas2 family of CRISPR-associated proteins which are found in . Cas2 is one of four protein families (Cas1 to Cas4) that are associated with CRISPR elements and always occur near a repeat cluster, usually in the order cas3-cas4-cas1-cas2. The function of Cas2 (and Cas1) is unknown. Cas3 proteins appear to be helicases while Cas4 proteins resemble RecB-type exonucleases, suggesting that these genes are involved in DNA metabolism or gene expression PUBMED:11952905.

    \ ' '9541' 'IPR010144' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the Csd1 (CRISPR/Cas Subtype DVULG protein 1) family of Cas proteins, which tend to be found near CRISPR repeats of the DVULG subtype of CRISPR/Cas locus. The species range for this subtype, so far, is exclusively bacterial and mesophilic, although CRISPR loci in general are particularly common among archaea and thermophilic bacteria.

    \ ' '9542' 'IPR019122' '\

    This entry represents a family of six predicted lipoproteins from a region of about 20 tandemly arranged genes in the Treponema denticola genome. Two other neighbouring genes share the lipoprotein signal peptide region but do not show more extensive homology. The function of this locus is unknown.

    \ ' '9543' 'IPR010146' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    This entry represents the Csn2 family of Cas proteins, which are found only in CRISPR-containing species, near other CRISPR-associated proteins (cas), as part of the NMENI subtype of CRISPR/Cas loci. The species range so far for this subtype is animal pathogens and commensals only. This protein is present in some but not all NMENI CRISPR/Cas loci.

    \ ' '9544' 'IPR010123' '\

    Poly(R)-hydroxyalkanoic acids (PHAs) function as carbon and energy storage polymers in many bacteria. This entry represents the PhaE subunit of the heterodimeric class (class III) of PHA synthases. The most common PHA is polyhydroxybutyrate but about 150 different constituent hydroxyalkanoic acids (HAs) have been identified in various species.

    \ ' '9545' 'IPR006476' '\

    This plant-specific family of proteins is defined by an uncharacterised region 57 residues in length. It is found toward the N terminus of most proteins that contain it. Examples include at least several proteins from Arabidopsis thaliana (Mouse-ear cress) and Oryza sativa (Rice). The function of the proteins is unknown.

    \ ' '9547' 'IPR006410' '\

    These sequences represent an uncharacterised family consisting of a small number of hypothetical proteins of the malaria parasite Plasmodium falciparum (isolate 3D7).

    \ ' '9548' 'IPR006389' '\

    These sequences represent a family of proteins from the malaria parasite Plasmodium falciparum, several of which have been shown to be expressed specifically in the ring stage, and from the rodent parasite Plasmodium yoelii PUBMED:11163452. A homologue from Plasmodium chabaudi was localized to the parasitophorous vacuole membrane PUBMED:8139619. Members have an initial hydrophobic, Phe/Tyr-rich stretch long enough to span the membrane, a highly charged region rich in Lys, a second putative transmembrane region, and a second highly charged, low complexity sequence region. Some members have up to 100 residues of additional C-terminal sequence. These genes have been shown to be found in the sub-telomeric regions of both Plasmodium falciparum and P. yoelii chromosomes.

    \ ' '9549' 'IPR006387' '\

    This entry represents a domain of about 61 residues in length with six well-conserved cysteine residues and six well-conserved aromatic sites. The domain can be found in tandem repeats. It is named for motifs of CPxxW and (less well conserved) WPC. Its function is unknown.

    \ ' '9550' 'IPR019125' '\

    This entry represents a relatively well-conserved region near the C terminus of the tape measure protein from bacteriophage lambda and related phages. The protein is typically about 1000 residues in length, often containing both low-complexity sequences and insertions/deletions. Mutational studies suggest a ruler or template role in the determination of phage tail length. Similar behaviour is attributed to proteins from distantly related or unrelated families in other phages.

    \ ' '9551' 'IPR010181' '\

    This entry represents a putative redox-active protein of about 140 residues, with four perfectly conserved Cys residues. It includes a CGAXXG motif. Most members are found within one or two loci of transporter or oxidoreductase genes. A member from Geobacter sulfurreducens, located in a molybdenum transporter operon, has a TAT (twin-arginine translocation) signal sequence for Sec-independent transport across the plasma membrane, a hallmark of bound prosthetic groups such as FeS clusters.

    \ ' '9552' 'IPR013406' '\

    This entry defines several short bacterial proteins, typically about 75 amino acids long, which are always found as part of a pair (at least) of small genes. The other protein in the pair always belongs to a family of plasmid stabilisation proteins (). It is likely that this protein and its partner comprise some form of addiction module - a pair of genes consisting of a stable toxin and an unstable antitoxin which mediate programmed cell death PUBMED:10547685 - although these gene pairs are usually found on the bacterial main chromosome.

    \ ' '9553' 'IPR019127' '\

    Proteins in this entry are mostly designated exosortase, analogous to the sortase in cell wall sorting mediated by LPXTG domains in Gram-positive bacteria. Members of this entry are integral membrane proteins with eight predicted transmembrane helices in common. Some proteins in this entry have long trailing sequences past the region described by this model. The best characterised protein in this entry is EpsH of Methylobacillus sp. 12S, where it is part of a locus associated with biosynthesis of the exopolysaccharide methanolan.

    \ ' '9554' 'IPR011979' '\

    Proteins in this family are found almost exclusively in the Proteobacteria, but also in Gloeobacter violaceus PCC 7421, a cyanobacterium. The function is unknown.

    \ ' '9555' 'IPR013429' '\

    This entry represents a region of about 41 amino acids found in a number of small proteins in a wide range of bacteria. The region usually begins with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One protein in this entry has been noted as a putative regulatory protein, designated FmdB PUBMED:8841393. Most proteins in this entry have a C-terminal region containing highly degenerate sequence.

    \ ' '9556' 'IPR019128' '\

    Sister chromatid cohesion protein DCC1 is a component of the RFC-like complex CTF18-RFC. This complex is required for the efficient establishment of chromosome cohesion during S-phase and may load or unload POL30/PCNA. During a clamp loading cycle, the RFC:clamp complex binds to DNA and the recognition of the double-stranded/single-stranded junction stimulates ATP hydrolysis by RFC. The complex presumably provides bipartite ATP sites in which one subunit supplies a catalytic site for hydrolysis of ATP bound to the neighbouring subunit. Dissociation of RFC from the clamp leaves the clamp encircling DNA PUBMED:11389843, PUBMED:15964801.

    \ ' '9557' 'IPR019129' '\

    This entry represents the full-length proteins in which, in higher eukaryotes, the nested domain EDSLL lies. Fra10Ac1 is a highly conserved nuclear protein of unknown function that is highly expressed in brain tissue PUBMED:15203205.

    \ ' '9558' 'IPR019130' '\

    This entry represents the multi-pass transmembrane protein Macoilin, which is highly conserved in eukaryotes.

    \ ' '9559' 'IPR019131' '\

    This entry represents the first approximately 600 residues of cortactin-binding protein 2. In addition to being a positional candidate for autism, this protein is expressed at highest levels in the brain in humans. Towards the C-terminal end of this entry are a series of proline-rich regions which are likely to be the points of interaction with the SH3 domain of cortactin. The human protein has six associated ankyrin repeat domains () towards the C terminus of the protein which act as protein-protein interaction domains PUBMED:11707066.

    \ ' '9560' 'IPR019132' '\

    Taxilin contains an extraordinarily long coiled-coil domain in its C-terminal half and is ubiquitously expressed. It is a novel binding partner of several syntaxin family members and is possibly involved in Ca(2+)-dependent exocytosis in neuroendocrine cells PUBMED:12558796. Gamma-taxilin, described as leucine zipper protein Factor Inhibiting ATF4-mediated Transcription (FIAT), localises to the nucleus in osteoblasts and dimerises with ATF4 to form inactive dimers, thus inhibiting ATF4-mediated transcription PUBMED:16831913.

    \ ' '9561' 'IPR018608' '\

    In Schizosaccharomyces pombe (Fission yeast) the gti1 protein promotes the onset of gluconate uptake upon glucose starvation PUBMED:9372449. In S. pombe the Pac2 protein controls the onset of sexual development, by inhibiting the expression of ste11, in a pathway that is independent of the cAMP cascade PUBMED:8536311.

    \ ' '9562' 'IPR018477' '\

    BicD proteins consist of three coiled-coil domains and are involved in dynein-mediated minus end-directed transport from the Golgi apparatus to the endoplasmic reticulum (ER) PUBMED:19018277. Glycogen synthase kinase-3beta (GSK-3beta) is required for the binding of BicD to dynein but not to dynactin, acting to maintain the anchoring of microtubules to the centrosome PUBMED:17139249. It appears that amino-acid residues 437-617 of BicD and the kinase activity of GSK-3 are necessary for the formation of a complex between BicD and GSK-3beta in intact cells PUBMED:17139249.

    \ ' '9563' 'IPR019133' '\

    Mitofilin controls mitochondrial cristae morphology. Mitofilin is enriched in the narrow space between the inner boundary and the outer membranes, where it forms a homotypic interaction and assembles into a large multimeric protein complex PUBMED:8886976. The first 78 amino acids contain a typical amino-terminal-cleavable mitochondrial presequence (residues 1-43) rich in positively charged and hydroxylated residues and a membrane anchor domain (residues 47-66). In addition, it has three centrally located coiled-coil domains (residues 200-240, 280-310 and 400-420) PUBMED:15647377.

    \ ' '9564' 'IPR019134' '\

    This entry represents the C-terminal 200 residues of the cactin protein which is necessary for the association of cactin with IkappaB-cactus, as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development and in the establishment of dorsal-ventral polarity in the early embryo PUBMED:10842059. Most members of the family also have the conserved mid region of cactin () further upstream.

    \ ' '9565' 'IPR019135' '\

    The VEFS-Box is found in the C-terminal region of the VRN2, EMF2, FIS2, and Su(z)12 polycomb proteins. This region is characterised by an acidic cluster and a tryptophan/methionine-rich sequence, the acidic-W/M domain PUBMED:11701882. In some proteins the VEFS-Box is associated with a zinc-finger domain located roughly 100 residues towards the N terminus. These proteins are part of the polycomb cluster of proteins which control HOX gene transcription as it functions in heterochromatin-mediated repression PUBMED:11546753.

    \ ' '9566' 'IPR019136' '\

    Transcription factor IIIC (TFIIIC) is a multisubunit DNA binding factor that serves as a dynamic platform for assembly of pre-initiation complexes on class III genes. This entry represents subunit 5 (also known as the tau 95 subunit) which holds a key position in TFIIIC, exerting both upstream and downstream influence on the TFIIIC-DNA complex by rendering the complex more stable PUBMED:12533520. Once bound to tDNA-intragenic promoter elements, TFIIIC directs the assembly of TFIIIB on the DNA, which in turn recruits the RNA polymerase III (pol III) and activates multiple rounds of transcription.

    \ ' '9567' 'IPR019137' '\

    Expression of this protein was found to be markedly reduced in patients with Alzheimer\'s disease PUBMED:10673335. Nck-associated protein 1 is part of a lamellipodial complex that controls Rac-dependent actin remodeling.\ \ It associates preferentially with the first SH3 domain of Nck and is a component of the WAVE2 complex composed of ABI1, CYFIP1/SRA1, NCKAP1/NAP1 and WASF2/WAVE2. It is also a component of the WAVE1 complex composed of ABI2, CYFIP2, C3orf10/HSPC300, NCKAP1 and WASF1/WAVE1. CYFIP2 binds to activated RAC1 which causes the complex to dissociate, releasing activated WASF1. The complex can also be activated by NCK1.

    \ ' '9568' 'IPR018609' '\

    This entry is characterised by proteins with alternating conserved and low-complexity regions. It is suggested that the proteins may function in pre-mRNA splicing.

    \ ' '9569' 'IPR019138' '\

    This is the C-terminal conserved 400 residues of Det1 proteins of approximately 550 amino acids PUBMED:12589545. Det1 (de-etiolated-1) is an essential negative regulator of plant light responses, and it is a component of the Arabidopsis CDD complex containing DDB1 and COP10 ubiquitin E2 variant. Mammalian Det1 forms stable DDD-E2 complexes, consisting of DDB1, DDA1 (DET1, DDB1 Associated 1) and a member of the UBE2E group of canonical ubiquitin-conjugating enzymes, and modulates Cul4A function PUBMED:17452440.

    \ ' '9570' 'IPR019139' '\

    This entry represents transcriptional repressors which preferentially bind to the GC-rich consensus sequence (5\'-AGCCCCCGGCG-3\') and may regulate expression of TNF, EGFR and PDGFA. They may control smooth muscle cell proliferation following artery injury through PDGFA repression and may also bind double-stranded RNA. They interact with the leucine-rich repeat domain of human flightless-I (FliI) protein.

    \ ' '9571' 'IPR019140' '\

    This entry is of proteins of approximately 600 residues in length containing alternating regions of conservation and low complexity. The function is unknown.

    \ ' '9572' 'IPR018610' '\

    This is a 100-residue conserved region of a family of proteins found from fungi to humans. This region contains three conserved cysteines and a motif of {CP}{y/l}{HG}.

    \ ' '9573' 'IPR019141' '\

    This entry is the conserved 250 residues of proteins of approximately 450 amino acids. It contains several highly conserved motifs including a CVxLxxxD motif. The function is unknown.

    \ ' '9574' 'IPR019142' '\

    Dymeclin (Dyggve-Melchior-Clausen syndrome protein) contains a large number of leucine and isoleucine residues and a total of 17 repeated dileucine motifs. It is characteristically about 700 residues long and present in plants and animals. In humans, mutations in the gene coding for this protein give rise to a disorder called Dyggve-Melchior-Clausen syndrome (DMC, MIM 223800), which is an autosomal-recessive disorder characterised by the association of spondylo-epi-metaphyseal dysplasia and mental retardation PUBMED:12554689.

    \ ' '9575' 'IPR018611' '\

    This entry is the conserved N-terminal 300 residues of a group of proteins found from protozoa to humans. The function is unknown.

    \ ' '9576' 'IPR019143' '\

    This entry represents the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. Most of the proteins in this entry have a RhoGEF domain () at their C-terminal end.

    \ ' '9577' 'IPR018612' '\

    This entry is a conserved domain of approximately 130 residues of proteins conserved from fungi to humans. The proteins do contain a coiled-coil domain, but the function is unknown.

    \ ' '9578' 'IPR019144' '\

    Membralin is evolutionarily highly conserved, though it appears to be a unique protein family. It contains several predicted transmembrane regions, and in humans it is expressed in certain cancers, particularly ovarian cancers PUBMED:16084606.

    \ ' '9579' 'IPR018613' '\

    This entry is of sequences of two conserved domains separated by a region of low complexity, spanning some 200 residues. The function is unknown.

    \ ' '9580' 'IPR019145' '\

    Med10 is one of the protein subunits of the Mediator complex, tethered to Med14 (Rgr1) protein. Med10 specifically mediates basal-level HIS4 transcription via Gcn4. In addition, there is a putative requirement for Med10 in Bas2-mediated transcription PUBMED:9891034.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '9581' 'IPR019146' '\

    This entry represents proteins of approximately 300 residues conserved from plants to humans. They contain two conserved motifs, HxSL and FHVSL. The function is unknown.

    \ ' '9582' 'IPR019147' '\

    This entry represents the conserved N-terminal region of SWAP (suppressor-of-white-apricot protein) splice factor proteins. This region contains two highly conserved motifs, viz: DRY and EERY, which appear to be the sites for alternative splicing of exons 2 and 3 of the SWAP mRNA PUBMED:8206918. These proteins are thus thought to be involved in auto-regulation of pre-mRNA splicing. Most family members are associated with two Surp domains and an arginine/serine-rich binding region towards the C terminus.

    \ ' '9583' 'IPR019148' '\

    This entry represents a family of proteins of approximately 500 residues with alternating regions of low complexity and conservation where the domain similarities are strong. Apart from a predicted coiled-coil domain, no other known functional domains have been characterised. The protein may be involved in pre-mRNA splicing and has been associated with the spliceosome C complex. The protein appears to be expressed in the nucleus, particularly in the pons sub-region of the brain. It is clearly necessary for normal development of the nervous system PUBMED:9499415.

    \ ' '9584' 'IPR019149' '\

    This family of proteins has no known function.

    \ ' '9585' 'IPR019150' '\

    This entry represents a family of proteins, approximately 300 residues in length, involved in vesicle transport. They have a single C-terminal transmembrane domain and a SNARE [soluble NSF (N-ethylmaleimide-sensitive fusion protein) attachment protein receptor] domain of approximately 60 residues. The SNARE domains are essential for membrane fusion and are conserved from yeasts to humans. Use1 is one of the three protein subunits that make up the SNARE complex and it is specifically required for Golgi-endoplasmic reticulum retrograde transport PUBMED:12853481.

    \ ' '9586' 'IPR019151' '\

    This family of proteins is conserved from plants to humans. The protein is predicted to be 264 amino acids in length. It is a chaperone protein which promotes assembly of the 20S proteasome as part of a heterodimer with psmg1. It is degraded by the proteasome upon completion of 20S proteasome maturation. One of the members of the entry, HCCA3, is overexpressed in hepatocellular carcinoma and is associated with the invasion of tumour capsules and the adjacent small satellite nodule lesions PUBMED:11854909.

    \ ' '9587' 'IPR019152' '\

    This is the conserved N-terminal 350 residues of a family of proteins of unknown function possibly containing a coiled-coil domain.

    \ ' '9588' 'IPR019153' '\

    This is a family of proteins of approximately 300 residues. They contain a highly conserved DDRGK motif. The function is unknown.

    \ ' '9589' 'IPR019154' '\

    The fission yeast Argonaute siRNA chaperone (ARC) complex contains the Argonaute protein Ago1 and two previously uncharacterised proteins, Arb1 and Arb2, both of which are required for histone H3 Lys9 (H3-K9) methylation, heterochromatin assembly and siRNA generation PUBMED:17310250. This entry represents a region found in both Arb2 and the Hda1 protein.

    \ ' '9590' 'IPR019155' '\

    This entry represents an N-terminal region of approximately 150 residues found in a family of proteins of unknown function. It contains a highly conserved FPL motif.

    \ ' '9591' 'IPR019156' '\

    This is the conserved C-terminal 100 residues of Ataxin-10. Ataxin-10 belongs to the family of armadillo repeat proteins and in solution it tends to form homotrimeric complexes, which associate via a tip-to-tip association in a horseshoe-shaped contact with the concave sides of the molecules facing each other. This domain may represent the homo-association site since that is located near the C terminus of Ataxin-10. The protein does not contain a signal sequence for secretion or any subcellular compartment confirming its cytoplasmic localisation, specifically to the olivocerebellar region PUBMED:15201271.

    \ ' '9594' 'IPR019159' '\

    This entry represents the conserved N-terminal 200 residues of a family of coiled-coil-containing proteins conserved from plants to vertebrates. In Drosophila it comes from the Fidipidine gene, and is of unknown function.

    \ ' '9595' 'IPR019160' '\

    The exocyst complex is composed of 8 subunits: Exoc1, Exoc2, Exoc3, Exoc4, Exoc5, Exoc6, Exoc7 and Exoc8. This entry represents the conserved middle and C terminus of the subunit Exoc1 (Sec3).

    \ \

    Sec3 binds to the C-terminal cytoplasmic domain of GLYT1 (glycine transporter protein 1). Sec3 is the exocyst component that is closest to the plasma membrane docking site and it serves as a spatial landmark in the plasma membrane for incoming secretory vesicles. Sec3 is recruited to the sites of polarised membrane growth through its interaction with Rho1p, a small GTP-binding protein.

    \ ' '9596' 'IPR019161' '\

    This protein has a region of approximately 200 residues carrying several distinctive motifs including a WDYHV motif and one of three cysteines. The function of the protein is unknown.

    \ ' '9597' 'IPR019162' '\

    This entry represents a region of approximately 100 residues, containing three WD repeats and six cysteine residues (possibly as three cysteine bridges), associated with FancL. FancL is the ubiquitin ligase protein that mediates ubiquitination of FancD2, a key step in the DNA damage pathway PUBMED:17352736, PUBMED:16860002. FancL belongs to the multisubunit Fanconi anemia (FA) complex, which is composed of subunits: FancA, FancB, FancC, FancE, FancF, FancG, FancL/PHF9 and FancM. The WD repeats are required for interaction of FancL with other subunits of the FA complex PUBMED:16474167.

    \ \

    In humans defects in FancL are a cause of Fanconi anemia (FA) [MIM:227650], and the FA complex is not found in FA patients. FA is a genetically heterogeneous, autosomal recessive disorder characterised by progressive pancytopenia, a diverse assortment of congenital malformations, and a predisposition to the development of malignancies. At the cellular level it is associated with hypersensitivity to DNA-damaging agents, chromosomal instability (increased chromosome breakage), and defective DNA repair.

    \ \ ' '9598' 'IPR019163' '\

    This entry represents Thoc5, which is one of the subunits of the THO complex, which additionally contains: HPR1, Thoc2, Thoc6 and Thoc7. The evolutionarily conserved multisubunit THO complex, which is recruited to actively transcribed genes, is required for the efficient expression of genes that have internal tandem repeats. It is suggested that the THO complex functions to rectify aberrant structures that arise during transcription PUBMED:16983072, PUBMED:15998806 and is required for cell proliferation and for proper export of heat-shock mRNAs under heat stress PUBMED:15133499.

    \ \

    This entry also identifies the crucial 144 N-terminal residues of the FmiP protein, which is essential for the binding of the protein to the cytoplasmic domain of activated Fms-molecules in M-CSF induced haematopoietic differentiation of macrophages. The C terminus contains a putative nuclear localisation sequence and a leucine zipper which suggest further, as yet unknown, nuclear functions. The level of FMIP expression might form a threshold that determines whether cells differentiate into macrophages or into granulocytes PUBMED:10597251.

    \ ' '9599' 'IPR019164' '\

    This entry is of the conserved N-terminal 150 residues of proteins conserved from plants to humans. The function is unknown although some annotation suggests that it is a transmembrane protein.

    \ ' '9600' 'IPR019165' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Mitochondrial inner membrane protease ATP23 has two roles in the assembly of mitochondrial ATPase. Firstly, it acts as a protease that removes the N-terminal 10 residues of mitochondrial ATPase CF(0) subunit 6 (ATP6) at the intermembrane space side. Secondly, it is involved in the correct assembly of the membrane-embedded ATPase CF(0) particle, probably mediating association of ATP6 with the subunit 9 ring PUBMED:17135290, PUBMED:17135288.

    \ ' '9601' 'IPR019166' '\

    Apolipoproteins are proteins that bind to lipids. Members of this family promote cholesterol efflux from macrophage cells. They are present in various lipoprotein complexes, including HDL, LDL and VLDL. Apolipoprotein O is a 198-amino-acid protein that contains a 23-amino-acid signal peptide. The apoprotein is secreted by a microsomal triglyceride transfer protein (MTTP)-dependent mechanism, probably as a VLDL-associated protein that is subsequently transferred to HDL. Apolipoprotein O is the first chondroitin sulphate chain-containing apolipoprotein PUBMED:16956892.

    \ ' '9602' 'IPR019167' '\

    Proteins in this entry are necessary for accurate chromosome transmission during cell division PUBMED:8972867.

    \ ' '9603' 'IPR019168' '\

    The function of this family of transmembrane proteins has not, as yet, been determined.

    \ ' '9604' 'IPR019169' '\

    The function of this family of transmembrane proteins has not, as yet, been determined.

    \ ' '9605' 'IPR019170' '\

    Meckelin is a 995-amino acid seven-transmembrane receptor protein of unknown function PUBMED:16415887. Members of this family are thought to be related to the ciliary basal body. Defects result in Meckel syndrome type 3, [MIM:607361], an autosomal recessive disorder characterised by a combination of renal cysts and variably associated features including developmental anomalies of the central nervous system (typically encephalocele), hepatic ductal dysplasia and cysts, and polydactyly. Joubert syndrome type 6 [MIM:610688] is also a manifestation of certain mutations; it is an autosomal recessive congenital malformation of the cerebellar vermis and brainstem with abnormalities of axonal decussation (crossing in the brain) affecting the corticospinal tract and superior cerebellar peduncles. Individuals with Joubert syndrome have motor and behavioural abnormalities, including an inability to walk due to severe clumsiness and \'mirror\' movements, and cognitive and behavioural disturbances PUBMED:16415887, PUBMED:17160906.

    \ ' '9606' 'IPR019171' '\

    This entry represents proteins that mediate the disruption of the DNA replication checkpoint (S-M checkpoint) mechanism caused by caffeine.

    \ ' '9607' 'IPR018614' '\

    Members of this family comprise various keratinocyte-associated proteins. Their exact function has not, as yet, been determined.

    \ ' '9608' 'IPR018615' '\

    Members of this family are involved in mitochondrial biogenesis and G2/M phase cell cycle progression. They form a component of the mitochondrial ribosome large subunit (39S) which comprises a 16S rRNA and about 50 distinct proteins.

    \ ' '9609' 'IPR019172' '\

    Osteopetrosis-associated transmembrane protein 1 (OSTM1) is required for osteoclast and melanocyte maturation and function. Mutations in OSTM1 give rise to autosomal recessive osteopetrosis, also called autosomal recessive Albers-Schonberg disease PUBMED:12627228, PUBMED:16813530.

    \ ' '9610' 'IPR018616' '\

    Members of this family of proteins catalyse the conversion of guanosine triphosphate (GTP) to 3\',5\'-cyclic guanosine monophosphate (cGMP) and pyrophosphate.

    \ ' '9611' 'IPR018617' '\

    Members of this family of uncharacterised novel proteins have no known function.

    \ ' '9613' 'IPR019173' '\

    Members of this family mediate the transfer of electrons from NADH to the respiratory chain. The immediate electron acceptor for the enzyme is believed to be ubiquinone, the reaction that occurs being: NADH + ubiquinone = NAD(+) + ubiquinol PUBMED:16828987, PUBMED:9425316.

    \ ' '9614' 'IPR019174' '\

    The NADH dehydrogenase [ubiquinone] complex performs the first stage of electron transfer from NADH to the respiratory chain. This entry represents an accessory subunit that is not thought to be involved in catalysis PUBMED:1518044.

    \ ' '9615' 'IPR018618' '\

    Members of this family are involved in the negative regulation of gluconeogenesis. They are required for both proteasome-dependent and vacuolar catabolite degradation of fructose-1,6-bisphosphatase (FBPase), where they probably regulate FBPase targeting from the FBPase-containing vesicles to the vacuole PUBMED:12686616, PUBMED:9508768.

    \ ' '9616' 'IPR016340' '\

    This group represents a predicted mitochondrial ribosomal protein L31, fungal type PUBMED:2666132, PUBMED:1764528.

    \ ' '9617' 'IPR019175' '\

    This is the C-terminal domain of the pre-mRNA processing factor Prp31. Prp31 is required for U4/U6*U5 tri-snRNP formation PUBMED:11867543. In humans this protein has been linked to autosomal dominant retinitis pigmentosa PUBMED:11867543, PUBMED:12444105.

    \ ' '9618' 'IPR019176' '\

    Members of this family are found at the N-terminal region of cytochrome B561, as well as in various other putative, uncharacterised proteins.

    \ ' '9619' 'IPR019177' '\

    This entry represents a family of proteins involved in maintaining Golgi structure. They stimulate the formation of Golgi stacks and ribbons, and are involved in intra-Golgi retrograde transport. Two main interactions have been characterised: one with RAB1A that has been activated by GTP-binding and another with isoform CASP of CUTL1 PUBMED:12656988.

    \ ' '9620' 'IPR019178' '\

    Members of this family catalyse the hydrolysis of the 4-position phosphate of phosphatidylinositol 4,5-bisphosphate, in the reaction: \

    \ ' '9621' 'IPR019179' '\

    Members of this family have been annotated as coiled-coil domain-containing protein 149; however, they currently have no known function.

    \ ' '9622' 'IPR018619' '\

    Members of this family of proteins may have a role in the beta-catenin-Tcf/Lef signaling pathway, as well as in the process of myelination of the central and peripheral nervous system. Defects in Hyccin are the cause of hypomyelination with congenital cataracts [MIM:610532]. This disorder is characterised by congenital cataracts, progressive neurologic impairment, and diffuse myelin deficiency. Affected individuals experience progressive pyramidal and cerebellar dysfunction, muscle weakness and wasting prevailing in the lower limbs PUBMED:16951682, PUBMED:10910037.

    \ ' '9623' 'IPR019180' '\

    This entry represents the N-terminal region of various oxidoreductase-like proteins whose exact function is, as yet, unknown.

    \ ' '9624' 'IPR018620' '\

    This family consists of proteins conserved in yeasts. They bind to Uba3 and are involved in the NEDD8 signalling pathway PUBMED:14623327.

    \ ' '9625' 'IPR019181' '\

    Sm and Sm-like proteins of the Lsm (like Sm) domain family are generally involved in essential RNA-processing tasks PUBMED:10801455. All the LSM proteins are evolutionarily conserved in eukaryotes with an N-terminal Lsm domain to bind nucleic acids, followed by an as yet uncharacterised C-terminal region, some of which have a C-terminal methyltransferase domain.

    \ \ \

    This entry represents the central region of approximately 100 residues, which is conserved from plants to humans and is frequently found in association with Lsm domain-containing proteins.

    \ \ ' '9626' 'IPR018307' '\

    This entry represents the late secretory protein Avl9, which is required for the generation of secretory vesicles as well as for actin polarization and polarized growth. Avl9 is involved in exocytic transport from the Golgi. It has been speculated that Avl9 could play a role in deforming membranes for vesicle fission and/or in recruiting cargo PUBMED:17229886.

    \ ' '9627' 'IPR018621' '\

    Autophagy is an intracellular degradation system that responds to nutrient starvation. Cis1/Atg31 has been shown to be required for autophagosome formation in Saccharomyces cerevisiae (Baker\'s yeast) PUBMED:17362880. It interacts with Atg17 PUBMED:17362880.

    \ ' '9628' 'IPR019182' '\

    This entry represents subunit 10 of the cytochrome b-c1 complex (also known as the ubiquinol-cytochrome c reductase complex or complex III). This complex is located on the inner mitochondrial membrane and it couples electron transfer from ubiquinol to cytochrome c. Subunit 10 is required for stable association of the iron-sulphur protein with the complex PUBMED:8175712.

    \ ' '9629' 'IPR019183' '\

    This is the non-catalytic subunit of the N-terminal acetyltransferase B complex (NatB). The NatB complex catalyses the acetylation of the amino-terminal methionine residue of all proteins beginning with Met-Asp or Met-Glu and of some proteins beginning with Met-Asn or Met-Met. In Saccharomyces cerevisiae (Baker\'s yeast) this subunit is called MDM20 and in Schizosaccharomyces pombe (Fission yeast) it is called Arm1. NatB acetylates the Tpm1 protein and regulates tropomyosin-actin interactions. This subunit is required by the NatB complex for the N-terminal acetylation of Tpm1 PUBMED:12808144.

    \ ' '9630' 'IPR018622' '\

    This is a family of proteins which regulate checkpoint kinases. In Schizosaccharomyces pombe (Fission yeast) this protein is called Rad26 and in Saccharomyces cerevisiae (Baker\'s yeast) it is called LCD1 PUBMED:11060031.

    \ ' '9631' 'IPR019184' '\

    This entry represents a 100 amino acid region from a family of proteins that is predicted to be a transmembrane region but its function is not known.

    \ ' '9633' 'IPR019185' '\

    Members of this family are integral membrane proteins involved in protein trafficking between the late Golgi and endosome. They may also serve as a receptor for ADP-ribosylation factor-related protein 1 (ARFRP1) PUBMED:15203023. Sys1p is a small integral membrane protein with four predicted transmembrane domains that localises to the trans-Golgi network (TGN) in yeast and human cells PUBMED:16926193.

    \ ' '9634' 'IPR018624' '\

    Members of this family of proteins are a component of the heterotetrameric Sec62/63 complex composed of SEC62, SEC63, SEC66 and SEC72. The Sec62/63 complex associates with the Sec61 complex to form the Sec complex. Sec66 is involved in SRP-independent post-translational translocation across the endoplasmic reticulum and functions together with the Sec61 complex and KAR2 in a channel-forming translocon complex. Furthermore, Sec66 is also required for growth at elevated temperatures PUBMED:8257795, PUBMED:8257794, PUBMED:2000150, PUBMED:7758110.

    \ ' '9635' 'IPR018625' '\

    Members of this family of proteins have no known function.

    \ ' '9636' 'IPR018626' '\

    Members of this family of hypothetical proteins have no known function.

    \ ' '9637' 'IPR019186' '\

    Nop12 is a novel nucleolar protein required for pre-large subunit rRNA processing and, in yeast, for normal rates of cell growth at low temperatures PUBMED:11452019.

    \ ' '9638' 'IPR019187' '\

    Members of this family of proteins are cell-growth suppressors, associating with and influencing the biological activities of important cell cycle regulators in the S phase including monomeric non-phosphorylated cyclin-dependent kinase 2 (CDK2) and DNA polymerase alpha/primase. An association between mutations in the gene coding for this protein and oral cancer has been described.

    \ ' '9639' 'IPR018627' '\

    Members of this family of putative uncharacterised proteins have no known function.

    \ ' '9640' 'IPR019188' '\

    Members of this family are part of the SNAPc complex required for the transcription of both RNA polymerase II and III small-nuclear RNA genes. They bind to the proximal sequence element (PSE), a non-TATA-box basal promoter element common to these 2 types of genes. Furthermore, they also recruit TBP and BRF2 to the U6 snRNA TATA box. SNAPc consists of at least four stably associated subunits, SNAP43, SNAP45, SNAP50, and SNAP190. None of the three small subunits can bind to the PSE on their own PUBMED:9418884.

    \ ' '9641' 'IPR019189' '\

    Proteins in this entry are components of the mitochondrial ribosome large subunit. They are also involved in apoptosis and cell cycle regulation.

    \ ' '9642' 'IPR019190' '\

    Members of this family of proteins are thought to be involved in cellular morphology, though little else is known about them. Mutation of the Saccharomyces cerevisiae (Baker\'s yeast) gene results in a number of features that include aberrant mitochondria and fragmentation of the nucleus PUBMED:10628851.

    \ ' '9643' 'IPR019191' '\

    This entry represents proteins found in the N-terminal region of the essential protein Yae1. The exact function has not been determined.

    \ ' '9644' 'IPR019192' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Members of this family are components of the mitochondrial large ribosomal subunit. Mature mitochondrial ribosomes consist of a small (37S) and a large (54S) subunit. The 37S subunit contains at least 33 different proteins and 1 molecule of RNA (15S). The 54S subunit contains at least 45 different proteins and 1 molecule of RNA (21S) PUBMED:12125055, PUBMED:2060626.

    \ ' '9645' 'IPR018628' '\

    Members of this family of proteins have no known function.

    \ ' '9646' 'IPR019193' '\

    This entry consists of E3 ubiquitin-protein ligases which accept ubiquitin from specific E2 ubiquitin-conjugating enzymes, and transfer it to substrates, generally promoting their degradation by the proteasome PUBMED:15749827.

    \ ' '9647' 'IPR018629' '\

    Members of this family comprise various XK-related proteins that are involved in sodium-dependent transport of neutral amino acids or oligopeptides. These proteins are responsible for the Kx blood group system; defects result in McLeod syndrome [MIM:314850], an X-linked multi-system disorder characterised by late onset abnormalities in the neuromuscular and hematopoietic systems PUBMED:8004674, PUBMED:7737196.

    \ ' '9648' 'IPR019194' '\

    ELL is an RNA Polymerase II (Pol II) transcriptional elongation factor that interacts with ELL-Associated Factor 1 (EAF1) protein. ELL and EAF1 are components of Cajal bodies, which have a role in leukemogenesis PUBMED:12686606. EAF1 also has the capacity to interact with ELL1 and ELL2. EAF1 has a region of high serine, aspartic acid, and glutamic acid residues PUBMED:11418481.

    \ \ ' '9649' 'IPR018630' '\

    Members of this family of uncharacterised proteins have no known function.

    \ ' '9650' 'IPR019195' '\

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain PUBMED:9873074.

    \ The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site PUBMED:11421269, PUBMED:1282354, PUBMED:9640644.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis PUBMED:11988180, PUBMED:11470432, PUBMED:11402022, PUBMED:9872322, PUBMED:11080142, PUBMED:11532960.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions PUBMED:9873074. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette PUBMED:9873074, PUBMED:11421270. More than 50 subfamilies have been described based on a phylogenetic and functional classification PUBMED:9873074, PUBMED:11421269, PUBMED:11421270; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    \

    This entry consists of various predicted ABC transporter class ATPases.

    \ ' '9651' 'IPR017195' '\

    This group represents a predicted ABC-type thiamin-related transport system, permease component 1. It is probably part of the ABC transporter complex ykoCDEF that could transport hydroxymethylpyrimidine (HMP) and/or thiamine. It could also transport other HMP-containing products. The complex is composed of two ATP-binding proteins (ykoD), two transmembrane proteins (ykoC and ykoE) and a solute-binding protein (ykoF).

    \ ' '9652' 'IPR018631' '\

    This entry contains many hypothetical bacterial proteins. This family was previously the N-terminal part of () before it was split into two. This region is predicted to be an AAA-ATPase domain PUBMED:17584917.

    \ ' '9653' 'IPR018632' '\

    Members of this family are found in various prokaryotic ABC transporters, predominantly involved in nitrate, sulphonate and bicarbonate translocation.

    \ ' '9654' 'IPR019196' '\

    This domain is found in various eukaryotic and prokaryotic intra-flagellar transport proteins involved in gliding motility, as well as in several hypothetical proteins.

    \ ' '9655' 'IPR018633' '\

    This entry was previously the N-terminal portion of DUF524 () before it was split into two. This domain has no known function. It is predicted to adopt an all beta secondary structure pattern followed by mainly alpha-helical structures PUBMED:17584917.

    \ ' '9656' 'IPR014517' '\

    Members of this family of archaeal proteins are conserved transcriptional regulators belonging to the ArsR family.

    \ ' '9657' 'IPR019197' '\

    The function of this structural domain is unknown. It is found N-terminal to the biotin protein ligase catalytic domain PUBMED:18809372. Biotin protein ligase carries out the post-translational modification of specific proteins by the attachment of biotin. It acts on various carboxylases such as acetyl-CoA-carboxylase, pyruvate carboxylase, propionyl CoA carboxylase, and 3-methylcrotonyl CoA carboxylase.

    \ ' '9658' 'IPR019198' '\

    This entry consists of predicted secreted proteins containing a C-terminal beta-propeller domain distantly related to WD-40 repeats.

    \ ' '9659' 'IPR019199' '\

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes PUBMED:17442114. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements PUBMED:17379808, PUBMED:16545108. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity.

    \

    In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci PUBMED:16292354. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.

    \ \

    Members of this family of bacterial proteins comprise various hypothetical proteins, as well as CRISPR (clustered regularly interspaced short palindromic repeats) associated proteins, conferring resistance to infection by certain bacteriophages.

    \ ' '9660' 'IPR018634' '\

    Members of this family of bacterial proteins are involved in the reduction of chromate accumulation and are essential for chromate resistance.

    \ ' '9661' 'IPR018635' '\

    This domain, found in various prokaryotic proteins, has no known function.

    \ ' '9662' 'IPR019200' '\

    Diadenosine 5\',5\'\'\'-P-1,P-4-tetraphosphate (Ap4A) and related diadenosine oligophosphates such as Ap3A are important intracellular and extracellular signalling molecules in prokaryotes and eukaryotes PUBMED:9607303. They are implicated in the regulation of many vital cellular functions including stress response, cell division and apoptosis. Synthesis primarily occurs via aminoacyl-tRNA synthetases adding the AMP moiety of an aminoacyl-AMP to an acceptor nucleotide, and is an inevitable byproduct of protein synthesis. The concentration of these compounds must thus be controlled both to ensure the proper regulation of various cellular processes and to prevent their buildup to potentially toxic levels.

    \ \

    This entry represents a group of ATP adenylyltransferases found in bacteria and lower eukaryotes which catalyse the interconversion of Ap4A to ATP and ADP PUBMED:2556364, PUBMED:7948033, PUBMED:9003364. While these enzymes are thought to act primarily to break down Ap4A, there is evidence to suggest that in some circumstances they may also act in a biosynthetic role. Some variability in substrate range is apparent, e.g. the cyanobacterial enzyme can also utilise Ap3A as a substrate, while the Saccharomyces enzymes apparently cannot.

    \ ' '9663' 'IPR018636' '\

    This domain, found in various prokaryotic proteins, has no known function.

    \ ' '9664' 'IPR018637' '\

    This domain, found in various prokaryotic proteins, has no known function.

    \ ' '9666' 'IPR018638' '\

    This domain, found in various prokaryotic proteins, has no known function.

    \ ' '9667' 'IPR018639' '\

    This domain, found in various prokaryotic proteins, has no known function.

    \ ' '9668' 'IPR018640' '\

    This domain, found in various prokaryotic proteins, has no known function.

    \ ' '9669' 'IPR018641' '\

    This domain, found in various prokaryotic proteins, has no known function.

    \ ' '9670' 'IPR019201' '\

    This entry, found in various prokaryotic proteins, has no known function.

    \ ' '9671' 'IPR018642' '\

    This domain, found in various prokaryotic proteins, has no known function.

    \ ' '9672' 'IPR019202' '\

    This family of archaeal proteins has no known function.

    \ ' '9674' 'IPR018643' '\

    This domain, found in various prokaryotes, has no known function.

    \ ' '9675' 'IPR019204' '\

    This domain of unknown function is found in various bacterial and archaeal hypothetical proteins, as well as in prokaryotic polyketide synthase.

    \ ' '9676' 'IPR018644' '\

    This conserved protein (similar to YgjF), found in various prokaryotes, has no known function.

    \ ' '9677' 'IPR018645' '\

    This archaeal protein has no known function.

    \ ' '9678' 'IPR012017' '\

    There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9679' 'IPR018646' '\

    This domain, found in various archaeal hypothetical proteins, has no known function.

    \ ' '9680' 'IPR018647' '\

    This domain, found in various prokaryotic proteins (including putative ATP/GTP binding proteins), has no known function.

    \ ' '9681' 'IPR018648' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function. The domain, however, is found in various periplasmic ligand-binding sensor proteins.

    \ ' '9682' 'IPR017732' '\

    At least two distinct groups of proteins, often encoded by adjacent genes, show sequence similarity due to homology between type IV secretion systems and type VI secretion systems. One is the IcmF family. The other group is the DotU family, defined by this N-terminal domain, which includes DotU from the Legionella pneumophila type IV secretion system. Many of the proteins in this entry from type VI secretion systems have an additional C-terminal domain with OmpA/MotB homology.

    \ ' '9683' 'IPR018649' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9684' 'IPR018650' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9685' 'IPR019205' '\

    This entry, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9686' 'IPR018651' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9687' 'IPR018652' '\

    This domain, found in various hypothetical prokaryotic proteins, as well as some Zn-ribbon nucleic-acid-binding proteins, has no known function.

    \ ' '9688' 'IPR018653' '\

    This domain is found in various prokaryotic transcriptional regulatory proteins belonging to the XRE family. Its exact function is, as yet, unknown.

    \ ' '9689' 'IPR018654' '\

    This domain, found in various hypothetical prokaryotic proteins, as well as proteins belonging to the UPF0386 family, has no known function.

    \ ' '9690' 'IPR019206' '\

    This entry, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9691' 'IPR018655' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9692' 'IPR018656' '\

    This domain, found in various hypothetical prokaryotic proteins and transcriptional activators, has no known function.

    \ ' '9693' 'IPR018657' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9694' 'IPR018658' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9695' 'IPR018659' '\

    This domain, found in various prokaryotic carbohydrate kinases, has no known function.

    \ ' '9696' 'IPR018660' '\

    This domain, found in the C-terminal end of various hypothetical prokaryotic proteins, has no known function.

    \ ' '9697' 'IPR019207' '\

    This entry represents various hypothetical prokaryotic proteins with no known function.

    \ ' '9698' 'IPR018661' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9699' 'IPR017748' '\

    These proteins are found exclusively, although not universally, in bacterial species that possess a type VI secretion system and are encoded in type VI secretion-associated gene clusters. Their specific function is not yet known. Occasionally a domain is found C-terminal to this entry, but it shows little if any conservation between sequences.

    \ ' '9700' 'IPR018662' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9701' 'IPR017098' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9702' 'IPR019208' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9703' 'IPR019209' '\

    This family of proteins has no known function.

    \ ' '9704' 'IPR009181' '\

    The exact function of this protein is unknown, but it is likely linked to methanogenesis or a process closely connected to it.

    \ ' '9705' 'IPR019210' '\

    This entry represents various hypothetical archaeal proteins with no known function.

    \ ' '9706' 'IPR018663' '\

    This domain, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9707' 'IPR012025' '\

    The exact function of this protein is unknown, but it is likely linked to methanogenesis or a process closely connected to it.

    \ ' '9708' 'IPR018664' '\

    This domain, found in various putative metal binding prokaryotic proteins, has no known function.

    \ ' '9709' 'IPR019211' '\

    This entry, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9710' 'IPR019212' '\

    This entry, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9711' 'IPR011313' '\

    Energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type) form a distinct group within the [NiFe] hydrogenase family PUBMED:15168611, PUBMED:16645307. Members of this subgroup include:

    \ \

    Energy-converting [NiFe] hydrogenases are membrane-bound enzymes with a six-subunit core: the large and small hydrogenase subunits, plus two hydrophilic proteins and two integral membrane proteins. Their large and small subunits show little sequence similarity to other [NiFe] hydrogenases, except for key conserved residues coordinating the active site and [FeS] cluster. However, they show considerable sequence similarity to the six-subunit, energy-conserving NADH:quinone oxidoreductases (complex I), which are present in cytoplasmic membranes of many bacteria and in inner mitochondrial membranes. However, the reactions they catalyse differ significantly from complex I. Energy-converting [NiFe] hydrogenases function as ion pumps.

    \ \

    Eha and Ehb hydrogenases contain extra subunits in addition to those shared by other energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type). Eha contains a 6[4Fe-4S] polyferredoxin, a 10[4Fe-4S] polyferredoxin, ten other predicted integral membrane proteins (EhaA, EhaB, EhaC, EhaD, EhaE, EhaF, EhaG, EhaI, EhaK, EhaL) and four hydrophobic subunits (EhaM, EhaR, EhaS, EhaT) PUBMED:10491142. The ten predicted integral membrane proteins are absent from Ech, Coo, Hyc and Hyf complexes, which may have simpler membrane components than Eha. Eha and Ehb catalyse the reduction of low-potential redox carriers (e.g. ferredoxins or polyferredoxins), which then might function as electron donors to oxidoreductases.

    \

    [NiFe] hydrogenases function in H2 metabolism in a variety of microorganisms, enabling them to use H2 as a source of reducing equivalents under aerobic and anaerobic conditions. [NiFe] hydrogenases consist of two subunits, hydrogenase large and hydrogenase small. The large subunit contains the binuclear [NiFe] active site, while the small subunit binds at least one [4Fe-4S] cluster PUBMED:15119826.

    \

    Based on sequence similarity and genome context analysis, other organisms such as Methanopyrus kandleri, Methanocaldococcus jannaschii, and Methanothermobacter marburgensis also encode Eha-like [NiFe]-hydrogenase-3-type complexes and have a very similar eha operon structure.

    \

    This entry represents small membrane proteins that are predicted to be the EhaF transmembrane subunits of multi-subunit membrane-bound [NiFe]-hydrogenase Eha complexes.

    \ ' '9712' 'IPR011317' '\

    Energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type) form a distinct group within the [NiFe] hydrogenase family PUBMED:15168611, PUBMED:16645307. Members of this subgroup include:

    \ \

    Energy-converting [NiFe] hydrogenases are membrane-bound enzymes with a six-subunit core: the large and small hydrogenase subunits, plus two hydrophilic proteins and two integral membrane proteins. Their large and small subunits show little sequence similarity to other [NiFe] hydrogenases, except for key conserved residues coordinating the active site and [FeS] cluster. However, they show considerable sequence similarity to the six-subunit, energy-conserving NADH:quinone oxidoreductases (complex I), which are present in cytoplasmic membranes of many bacteria and in inner mitochondrial membranes. However, the reactions they catalyse differ significantly from complex I. Energy-converting [NiFe] hydrogenases function as ion pumps.

    \ \

    Eha and Ehb hydrogenases contain extra subunits in addition to those shared by other energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type). Eha contains a 6[4Fe-4S] polyferredoxin, a 10[4Fe-4S] polyferredoxin, ten other predicted integral membrane proteins (EhaA, EhaB, EhaC, EhaD, EhaE, EhaF, EhaG, EhaI, EhaK, EhaL) and four hydrophobic subunits (EhaM, EhaR, EhaS, EhaT) PUBMED:10491142. The ten predicted integral membrane proteins are absent from Ech, Coo, Hyc and Hyf complexes, which may have simpler membrane components than Eha. Eha and Ehb catalyse the reduction of low-potential redox carriers (e.g. ferredoxins or polyferredoxins), which then might function as electron donors to oxidoreductases.

    \

    [NiFe] hydrogenases function in H2 metabolism in a variety of microorganisms, enabling them to use H2 as a source of reducing equivalents under aerobic and anaerobic conditions. [NiFe] hydrogenases consist of two subunits, hydrogenase large and hydrogenase small. The large subunit contains the binuclear [NiFe] active site, while the small subunit binds at least one [4Fe-4S] cluster PUBMED:15119826.

    \

    Based on sequence similarity and genome context analysis, other organisms such as Methanopyrus kandleri, Methanocaldococcus jannaschii, and Methanothermobacter marburgensis also encode Eha-like [NiFe]-hydrogenase-3-type complexes and have a very similar eha operon structure.

    \

    This entry represents small membrane proteins that are predicted to be the EhaE transmembrane subunits of multi-subunit membrane-bound [NiFe]-hydrogenase Eha complexes.

    \ ' '9713' 'IPR019213' '\

    This entry, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9714' 'IPR019214' '\

    This entry is found in various hypothetical archaeal proteins and has no known function.

    \ ' '9715' 'IPR016757' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9716' 'IPR012029' '\

    There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. However, members of possess a domain homologous to these proteins fused within a signal transduction sensor protein containing PAS/PAC and GAF domains. Therefore, it is possible that members of this family are involved in signal transduction (possibly as a sensor).

    \ ' '9717' 'IPR012356' '\

    The exact function of this protein is unknown, but it is likely linked to methanogenesis or a process closely connected to it.

    \ ' '9718' 'IPR016762' '\

    There is currently no experimental data for members of this group or their homologues. Based on distant sequence similarity, they may be tentatively predicted to be nucleic acid-binding proteins; they are also likely to be linked to methanogenesis or a process closely connected to it.

    \ ' '9719' 'IPR008303' '\

    There are currently no experimental data for members of this group or their homologues. The exact function of this protein is unknown, but it is likely linked to methanogenesis or a process closely connected to it.

    \ ' '9720' 'IPR019215' '\

    This entry represents various hypothetical archaeal proteins with no known function.

    \ ' '9721' 'IPR019216' '\

    This entry contains various hypothetical archaeal proteins whose functions are unknown. However, they contain a conserved zinc ribbon motif in the N-terminal part and a predicted transmembrane segment in the C-terminal part.

    \ ' '9722' 'IPR012032' '\

    There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9723' 'IPR019217' '\

    This protein is found in various hypothetical archaeal proteins, but has no known function.

    \ ' '9724' 'IPR019218' '\

    This protein is found in various hypothetical archaeal proteins, and has no known function.

    \ ' '9725' 'IPR014515' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9726' 'IPR016754' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. They do show distant similarity to NTPases and to nucleic acid binding enzymes.

    \ ' '9727' 'IPR018665' '\

    This domain, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9729' 'IPR009183' '\ There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.\ ' '9730' 'IPR018666' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9731' 'IPR018667' '\

    Members of this family of bacterial domains are predominantly found in transglutaminase and transglutaminase-like proteins. Their exact function is, as yet, unknown.

    \ ' '9732' 'IPR014591' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. However, they are predicted to be integral membrane proteins (with several transmembrane segments).

    \ ' '9733' 'IPR017154' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9734' 'IPR016979' '\

    This is a group of uncharacterised conserved proteins.

    \ ' '9735' 'IPR019219' '\

    This entry, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9736' 'IPR017162' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9737' 'IPR018668' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9738' 'IPR019220' '\

    This entry, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9739' 'IPR018669' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9741' 'IPR018671' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9742' 'IPR016675' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9743' 'IPR018672' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9744' 'IPR018673' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9745' 'IPR018674' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9748' 'IPR014547' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9749' 'IPR019223' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9750' 'IPR019224' '\

    This entry, found in various hypothetical bacterial and archaeal proteins containing a ferredoxin domain, has no known function.

    \ ' '9751' 'IPR018676' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9752' 'IPR014518' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9753' 'IPR014450' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9754' 'IPR016975' '\

    The LiaRS two-component system is part of the regulatory network orchestrating the cell-envelope stress response in Bacillus subtilis. It responds to perturbations of the cell envelope, especially antibiotics that interfere with the lipid II and undecaprenol cycle, such as bacitracin or vancomycin. LiaRS-dependent regulation is strictly repressed by the membrane protein LiaF and it integrates both positive and negative feedback loops to transduce cell envelope stress signals PUBMED:17660417, PUBMED:16816187. This group comprises cell wall-active antibiotics response proteins, including LiaF and YvqF types.

    \ ' '9755' 'IPR019225' '\

    This entry contains various hypothetical prokaryotic proteins that have no known function.

    \ ' '9756' 'IPR016732' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9757' 'IPR018677' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9758' 'IPR019226' '\

    This entry represents a family of predominantly prokaryotic proteins with no known function.

    \ ' '9759' 'IPR019227' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9760' 'IPR018678' '\

    The members of this family of hypothetical prokaryotic proteins have no known function. It is thought that they are transmembrane proteins, but their function has not been inferred yet.

    \ ' '9761' 'IPR018679' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9762' 'IPR017199' '\

    This group represents a predicted membrane transporter, MTH672 type.

    \ ' '9763' 'IPR019228' '\

    This entry describes the N-terminal region of a family of proteins found almost exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus (Rhodopseudomonas capsulata) gene transfer agent, which packages DNA. An apparent exception is Wolbachia pipientis wMel, a bacterial endosymbiont of the fruit fly, which has several candidate phage-related genes physically separate from obvious prophage regions.

    \ ' '9764' 'IPR018680' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9765' 'IPR018681' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9767' 'IPR018682' '\

    This domain, found in various hypothetical membrane-anchored prokaryotic proteins, has no known function.

    \ ' '9768' 'IPR019230' '\

    This entry, found in various hypothetical prokaryotic proteins, has no known function. It is also found in a few prokaryotic tRNA (guanine-N(1)-)-methyltransferases.

    \ ' '9769' 'IPR018683' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9770' 'IPR019231' '\

    This entry represents various hypothetical prokaryotic proteins with no known function.

    \ ' '9771' 'IPR018684' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9772' 'IPR019232' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function. An aminopeptidase domain is conserved within the family, but its relevance has not been established yet.

    \ ' '9773' 'IPR018685' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9774' 'IPR019233' '\

    This entry represents various hypothetical bacterial proteins that have no known function.

    \ ' '9775' 'IPR018686' '\

    This domain, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9777' 'IPR018687' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9778' 'IPR019235' '\

    This entry, found in various hypothetical bacterial and archaeal proteins, has no known function, but contains several predicted transmembrane helices.

    \ ' '9779' 'IPR017211' '\

    This group represents a predicted zinc finger protein, AF1427 type.

    \ ' '9780' 'IPR018688' '\

    This domain, found in various hypothetical bacterial membrane proteins having predicted metal-binding properties, has no known function.

    \ ' '9781' 'IPR019236' '\

    This entry, found in various bacterial and fungal proteins, has no known function.

    \ ' '9782' 'IPR020049' '\

    This group represents an uncharacterised conserved protein.

    \ ' '9783' 'IPR018689' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9784' 'IPR019238' '\

    This entry, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9785' 'IPR018690' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9786' 'IPR018691' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9787' 'IPR018692' '\

    Members of this family are found in various hypothetical prokaryotic proteins, as well as putative cytochrome c oxidases. Their exact function has not, as yet, been established.

    \ ' '9788' 'IPR011231' '\

    This is a family of uncharacterised conserved proteins from the Gammaproteobacteria.

    \ ' '9789' 'IPR019239' '\

    This entry, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9790' 'IPR018693' '\

    This domain, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9791' 'IPR018694' '\

    This domain, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9792' 'IPR018695' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9793' 'IPR018696' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9794' 'IPR019240' '\

    This entry, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9795' 'IPR019241' '\

    This entry, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9796' 'IPR019242' '\

    This entry, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9797' 'IPR018697' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9798' 'IPR014580' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9799' 'IPR018698' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9800' 'IPR019243' '\

    This domain, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9801' 'IPR018699' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9802' 'IPR018700' '\

    This domain, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9803' 'IPR018701' '\

    This domain, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9804' 'IPR018702' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9805' 'IPR009198' '\ There are currently no experimental data for members of this group or their homologues. However, these proteins are predicted to contain three or more transmembrane segments.\ ' '9806' 'IPR014514' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9808' 'IPR018704' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9809' 'IPR018705' '\

    This domain, found in various hypothetical prokaryotic proteins, has no known function.

    \ ' '9811' 'IPR019246' '\

    This entry is found in a family consisting mostly of bacterial and phage proteins. The exact function of these proteins has not, as yet, been determined.

    \ ' '9812' 'IPR018706' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9813' 'IPR014543' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9814' 'IPR018707' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9815' 'IPR014544' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9816' 'IPR019247' '\

    This entry is found at the N-terminus of various BarA-like signal transduction histidine kinases. These proteins are involved in the regulation of carbon metabolism via the csrA/csrB regulatory system. The role of this domain has not, as yet, been established.

    \ ' '9817' 'IPR019248' '\

    This entry represents various prokaryotic membrane-anchored proteins predicted to be involved in the regulation of amylopullulanase.

    \ ' '9818' 'IPR018708' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9819' 'IPR019249' '\

    This entry, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9820' 'IPR019250' '\

    This entry represents hypothetical bacterial proteins that possess metal binding properties; however, their exact function has not yet been determined.

    \ ' '9821' 'IPR018709' '\

    Members of this family include various bacterial hypothetical proteins, as well as CoA enzyme activases. The exact function of this domain has not, as yet, been defined.

    \ ' '9822' 'IPR019251' '\

    This entry, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9823' 'IPR018710' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9824' 'IPR018711' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9826' 'IPR018712' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9827' 'IPR018713' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9828' 'IPR018714' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9829' 'IPR014509' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. However, they are predicted to be integral membrane proteins, with several transmembrane segments.

    \ ' '9830' 'IPR018715' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9831' 'IPR018716' '\

    This domain, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9832' 'IPR018717' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9833' 'IPR018718' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9834' 'IPR018719' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9835' 'IPR019253' '\

    This entry consists of various bacterial putative membrane proteins with no known function.

    \ ' '9836' 'IPR016630' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9837' 'IPR011201' '\

    This is a family of uncharacterised bacterial proteins.

    \ ' '9838' 'IPR018720' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9839' 'IPR019254' '\

    Members of this family of hypothetical archaeal proteins have no known function.

    \ ' '9840' 'IPR014449' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9841' 'IPR018721' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9843' 'IPR018723' '\

    Members of this family of bacterial proteins comprise various hypothetical and putative membrane proteins. Their exact function has not, as yet, been defined.

    \ ' '9844' 'IPR016888' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9845' 'IPR017136' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9846' 'IPR018724' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9847' 'IPR017140' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9848' 'IPR018725' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9849' 'IPR019257' '\

    This entry, found in various hypothetical proteins, has no known function.

    \ ' '9850' 'IPR019258' '\

    Members of this family represent the Med4 subunit of the Mediator (Med) complex PUBMED:10235266, PUBMED:10235267.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves as a scaffold for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '9852' 'IPR019260' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9853' 'IPR019261' '\

    This entry, found in various hypothetical bacterial and eukaryotic proteins, has no known function.

    \ ' '9854' 'IPR016624' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9855' 'IPR014553' '\

    This group represents a predicted aminopeptidase.

    \ ' '9857' 'IPR018727' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9858' 'IPR018728' '\

    This domain, found in various hypothetical bacterial proteins, as well as predicted zinc-dependent proteases, has no known function.

    \ ' '9859' 'IPR018729' '\

    Members of this family of bacterial hypothetical integral membrane proteins have no known function.

    \ ' '9860' 'IPR014470' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9861' 'IPR014469' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9862' 'IPR019262' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9863' 'IPR018730' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9864' 'IPR019263' '\

    This entry represents proteins involved in inorganic phosphate transport, as well as telomere length regulation and maintenance PUBMED:8709965, PUBMED:14576278, PUBMED:16823961, PUBMED:16552446.

    \ ' '9865' 'IPR018731' '\

    Members of this family of phosphoproteins are involved in cytoplasm to vacuole transport (Cvt), and more specifically in Cvt vesicle formation. They are probably involved in the switching machinery regulating the conversion between the Cvt pathway and autophagy. Finally, ATG13 is also required for glycogen storage PUBMED:9224892, PUBMED:8224160, PUBMED:10837477.

    \ ' '9866' 'IPR018732' '\

    This entry represents the Dpy-19 protein from Caenorhabditis elegans and its homologues in other Metazoa, including mammals. In C. elegans, Dpy-19 is required to orient neuroblasts QL and QR correctly on the anterior/posterior (A/P) axis. These neuroblasts are born in the same A/P position, but polarise and migrate left/right asymmetrically, where QL migrates toward the posterior and QR migrates toward the anterior. After their migrations, QL (but not QR) switches on the Hox gene mab-5. Dpy-19 is required along with Unc-40 to express Mab-5 correctly in the Q cell descendants PUBMED:11023868.

    \

    A mammalian dpy-19 homologue was found to be expressed in GABAergic neurons PUBMED:15944814. The mammalian homologue of Mab-5 is the Gsh2 homeobox transcription factor, which plays a crucial role in the development of GABAergic neurons.

    \ ' '9867' 'IPR019264' '\

    This entry, found mostly in hypothetical bacterial proteins, has no known function.

    \ ' '9868' 'IPR019265' '\

    This family of proteins, conserved from nematodes to humans, is approximately 250 amino acids in length. It is purported to be a carnitine deficiency-associated protein, but this could not be confirmed. It carries a characteristic RLL sequence motif. The function is unknown.

    \ ' '9869' 'IPR019266' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry represents a family of small ribosomal proteins possessing one of three conserved sequence blocks found in proteins that stimulate the dissociation of guanine nucleotides from G-proteins. This leaves open the possibility that they may be functional partners of GTP-binding ribosomal proteins PUBMED:11344316.

    \ ' '9870' 'IPR018733' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9871' 'IPR018734' '\

    This domain, found in various hypothetical bacterial proteins and in the RNA polymerase sigma factor, has no known function.

    \ ' '9872' 'IPR019267' '\

    This entry, found in various hypothetical proteins, has no known function.

    \ ' '9873' 'IPR018735' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9874' 'IPR019268' '\

    This entry consists of hypothetical proteins with no known function.

    \ ' '9875' 'IPR018736' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9876' 'IPR018737' '\

    Rtp is a family of proteins of approximately 112 amino acids in length which is conserved from nematodes to humans. The proposed tertiary structure is almost entirely alpha-helical, interrupted only by loops located at proline residues. Three sites in the protein sequence reveal two types of possible post-translational modification. A serine residue, at position 41, is a candidate for protein kinase C phosphorylation. Glycine residues at positions 69 and 91 are probable sites for acetylation by covalent amide linkage of myristate via N-myristoyl transferase. Rtp is differentially expressed in the trout retina between parr and smolt developmental stages (smoltification). It is likely to be a house-keeping protein PUBMED:14662307.

    \ ' '9877' 'IPR018738' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9878' 'IPR019269' '\

    This entry represents a family of proteins that play a role in cellular proliferation, as well as in the biogenesis of specialised organelles of the endosomal-lysosomal system PUBMED:15102850.

    \ ' '9879' 'IPR018739' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9880' 'IPR018740' '\

    Members of this family of hypothetical bacterial proteins and putative signal peptide proteins have no known function.

    \ ' '9881' 'IPR019270' '\

    Members of this family of hypothetical proteins have no known function.

    \ ' '9882' 'IPR019271' '\

    This entry represents a family of predicted metal-binding bacterial and archaeal proteins with no known function.

    \ ' '9883' 'IPR017006' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9884' 'IPR018741' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9885' 'IPR018742' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9886' 'IPR014582' '\

    There is currently no experimental data for members of this group of predicted periplasmic lipoproteins or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9887' 'IPR018743' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9888' 'IPR018744' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9889' 'IPR018745' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9890' 'IPR019273' '\

    This domain, found in the eukaryotic lunapark proteins, has no known function PUBMED:12732147.

    \ ' '9892' 'IPR018746' '\

    Members of this highly hydrophobic probable integral membrane family belong to two classes. In one, a single copy of the region modelled by the signatures in this entry represents essentially the full length of a strongly hydrophobic protein of about 700 to 900 residues (variable because of long inserts in some). The domain architecture of the other class consists of an additional N-terminal region, two copies of the region represented by this model, and three to four repeats of TPR, or tetratricopeptide repeat. The unusual species range includes several Archaea, several Chloroflexi, and Clostridium phytofermentans. An unusual motif YYYxG is present. The function is unknown.

    \ \ ' '9893' 'IPR018747' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9894' 'IPR018748' '\

    This domain, found in various bacterial hypothetical and putative signal peptide proteins, has no known function.

    \ ' '9895' 'IPR019275' '\

    This entry, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9897' 'IPR019276' '\

    This entry represents hypothetical bacterial proteins that have no known function.

    \ ' '9898' 'IPR019277' '\

    This entry represents hypothetical archaeal and bacterial proteins that have no known function.

    \ ' '9899' 'IPR018750' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9901' 'IPR019278' '\

    This entry, found in various cyanobacterial sensor proteins that catalyse the reaction [ATP + protein L-histidine = ADP + protein N-phospho-L-histidine], has no known function.

    \ ' '9902' 'IPR018752' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9903' 'IPR016908' '\

    This group represents uncharacterised conserved proteins.

    \ ' '9905' 'IPR018753' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9906' 'IPR018754' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9907' 'IPR019280' '\

    The constitutive photomorphogenic 9 (COP9) signalosome, or CSN complex, is composed of eight subunits: Cops1/GPS1, Cops2, Cops3, Cops4, Cops5, Cops6, Cops7 (Cops7A or Cops7B) and Cops8. In the complex, Cops8, which is the smallest subunit, probably interacts directly with Cops3, Cops4 and Cops7 (Cops7A or Cops7B). This signalosome is homologous to the lid subcomplex of the 26S proteasome and regulates the ubiquitin-proteasome pathway. It functions as a structural scaffold for subunit-subunit interactions within the complex and is a key regulator of photomorphogenic development PUBMED:14636993.

    \ ' '9908' 'IPR018755' '\

    Members of this family of proteins comprise various hypothetical and putative bacteriophage tail proteins, including gp48 from Bacteriophage Mu and other Mu-like prophages such as FluMu.

    \ ' '9909' 'IPR018756' '\

    This domain is found in various bacterial hypothetical proteins, as well as putative ankyrin repeat proteins. The exact function of the domains comprising this family has not, as yet, been determined.

    \ ' '9910' 'IPR018757' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9911' 'IPR011199' '\

    This is a family of uncharacterised bacterial proteins.

    \ ' '9912' 'IPR018758' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9913' 'IPR012037' '\

    There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9914' 'IPR018759' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9915' 'IPR016891' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9916' 'IPR016755' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9918' 'IPR011397' '\

    There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. However, they are predicted to be integral membrane proteins (with several transmembrane segments).

    \ ' '9919' 'IPR016772' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9920' 'IPR018760' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9922' 'IPR018762' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9923' 'IPR019282' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9924' 'IPR019283' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9925' 'IPR016633' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9926' 'IPR011200' '\

    This is a family of uncharacterised bacterial proteins.

    \ ' '9927' 'IPR016936' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9928' 'IPR018763' '\

    This domain, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9929' 'IPR019284' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9930' 'IPR019285' '\

    Members of this family of hypothetical bacterial proteins have no known function.

    \ ' '9931' 'IPR018764' '\

    RskA (regulator of sigma K) represses the extra-cytoplasmic function (ECF) sigma factor K (sigK) by binding to it and inhibiting its activity PUBMED:18203833. This leads to a decreased expression of SigK-regulated genes, such as mpt70 and mpt83. RskA is found in various Mycobacterium species, such as Mycobacterium tuberculosis. However, in Mycobacterium bovis it is probably dysfunctional, due to at least one of the two naturally occurring polymorphisms in its encoding gene, when compared to M. tuberculosis PUBMED:17064366. This leads to an increased expression of SigK-regulated genes.

    \ \ \ ' '9932' 'IPR016935' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9933' 'IPR019286' '\

    This entry, found in various hypothetical bacterial proteins, has no known function.

    \ ' '9934' 'IPR018765' '\

    Members of this family are found in various bacterial proteins, including MotA/TolQ/ExbB proton channels and other transport proteins. The exact function of this set of domains has not, as yet, been determined.

    \ ' '9935' 'IPR018766' '\

    This entry represents a family of proteins whose function is currently unknown. These proteins are predominantly found in the actinobacteria (high GC Gram-positive bacteria), though some occur in other bacterial species and archaea.

    \ ' '9936' 'IPR018767' '\

    This entry represents the highly conserved C-terminal region of Brr6-like proteins, including Brl1, which are found in fungi. Brr6 from Saccharomyces cerevisiae (Baker\'s yeast) is an essential nuclear envelope integral membrane protein that is required for mRNA nuclear export PUBMED:11483521. Brr6 is involved in nuclear pore complex (NPC) distribution and nuclear envelope morphology. Brr6 interacts with Brl1, which is also involved in mRNA and protein export from the nucleus PUBMED:15882446.

    \

    The conserved C-terminal region carries four highly conserved cysteine residues. It is suggested that members of the family interact with each other via disulphide bridges to form a complex that is involved in nucleocytoplasmic transport.

    \ \ \ ' '9937' 'IPR018768' '\

    This domain, found in various hypothetical bacterial proteins and radical SAM domain proteins, has no known function.

    \ ' '9938' 'IPR018769' '\

    Members of this family are found in various bacterial hypothetical proteins, as well as Rhs element Vgr proteins.

    \ ' '9939' 'IPR019287' '\

    This domain is found in various predicted bacterial endonucleases which are distantly related to archaeal Holliday junction resolvases.

    \ ' '9940' 'IPR019288' '\

    This entry represents various prokaryotic 3\'-5\' exonucleases and hypothetical proteins.

    \ ' '9941' 'IPR019289' '\

    Members of this family of prokaryotic proteins include various gp41 proteins and related sequences PUBMED:9714755.

    \ ' '9942' 'IPR018476' '\

    Members of this family comprise the membrane domain of the prokaryotic enzyme glycerophosphoryl diester phosphodiesterase PUBMED:1851953.

    \ ' '9943' 'IPR019290' '\

    This entry is found in a set of prokaryotic proteins including putative glucosyltransferases, which are involved in bacterial capsule biosynthesis PUBMED:9515923, PUBMED:11953367.

    \ ' '9944' 'IPR018770' '\

    This entry consists of prokaryotic proteins that mediate the hydrolysis of 5-bromo-4-chloroindolyl phosphate bonds.

    \ ' '9945' 'IPR016760' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. Note: members of this group are not fibrillarin homologues, and should not be annotated as such.

    \ ' '9946' 'IPR018771' '\

    This entry is thought to act as a sensory domain in histidine kinases catalysing the reaction: ATP + protein L-histidine = ADP + protein N-phospho-L-histidine.

    \ ' '9947' 'IPR018772' '\

    This entry, found in various hypothetical prokaryotic proteins, has no known function. One of the proteins in this entry corresponds to the transcriptional activator HlyU, indicating a possible similar role in other members.

    \ ' '9948' 'IPR019291' '\

    Members of this family of bacterial proteins are required for the attachment of the bacterium to host cells PUBMED:10786639, PUBMED:12024217.

    \ ' '9949' 'IPR019292' '\

    Members of this family of prokaryotic proteins modify the specificity of mcrB restriction by expanding the range of modified sequences that are restricted PUBMED:2203735, PUBMED:2050643. They do not bind DNA.

    \ ' '9950' 'IPR016516' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9951' 'IPR018773' '\

    This entry represents a sequence region found in various prokaryotic methyltransferases that regulates the activity of the methyltransferase domain.

    \ ' '9952' 'IPR019293' '\

    This entry is found in a predominantly archaeal family of proteins that possess phosphomethylpyrimidine kinase activity and are involved in the thiamin biosynthetic process.

    \ ' '9954' 'IPR019294' '\

    Members of this entry belong to the Com family of proteins that act as translational regulators of mom PUBMED:6345072, PUBMED:11922669.

    \ ' '9955' 'IPR019295' '\

    Members of this family of proteins comprise various viral Mu-like prophage I proteins.

    \ ' '9956' 'IPR018774' '\

    This entry consists of various caudoviral prophage proteins, including the Mu-like prophage major head subunit gpT.

    \ ' '9957' 'IPR017059' '\

    This group contains membrane proteins that are predicted to be transmembrane subunits (EhaH) of multisubunit membrane-bound [NiFe]-hydrogenase Eha complexes.

    The energy-converting hydrogenase A (eha) operon encodes a putative multisubunit membrane-bound [NiFe]-hydrogenase Eha in Methanobacterium thermoautotrophicum (strain Marburg / DSM 2133). Sequence analysis of the eha operon indicates that it encodes at least 20 proteins, including the [NiFe]-hydrogenase large subunit, the [NiFe]-hydrogenase small subunit, and two broadly conserved integral membrane proteins (including this entry). These four proteins show high sequence similarity to subunits of the Ech hydrogenase from Methanosarcina barkeri, Escherichia coli hydrogenases 3 and 4 (Hyc and Hyf), and CO-induced hydrogenase from Rhodospirillum rubrum (Coo), all of which form a distinct group of multisubunit membrane-bound [NiFe]-hydrogenases (together called hydrogenase-3-type hydrogenases). In addition to these four subunits, the eha operon encodes a 6[4Fe-4S] polyferredoxin, a 10[4Fe-4S] polyferredoxin, ten other predicted integral membrane proteins, and four hydrophilic subunits (the latter two hydrophilic subunits are members of well-characterised enzyme families but lack the essential amino acids assumed to form the active site PUBMED:10491142). All of these proteins are expressed and therefore thought to be functional subunits of the Eha hydrogenase complex PUBMED:10491142. Note, however, that the ten additional predicted integral membrane proteins are absent from Ech, Coo, Hyc, and Hyf complexes (and therefore from corresponding organisms), indicating that those complexes have a simpler membrane component than Eha PUBMED:10491142.

    Members of this group are homologous to the N-terminal domain of other proteins (e.g., EhbF, HyfF of E. coli hydrogenase 4, amongst others). Therefore, this type of membrane subunit of the Eha complex is conserved across the various hydrogenase-3-type hydrogenases (that is, it is not limited to the Eha subgroup). A protein with sequence similarity to the C-terminal part of EhbF is not present in the Eha complex (not encoded by the eha operon).

    Based on sequence similarity and genome context analysis, other organisms such as Methanopyrus kandleri, Methanocaldococcus jannaschii, and M. thermoautotrophicum also encode Eha-like [NiFe]-hydrogenase-3-type complexes and have very similar eha operon structure.

    \ ' '9958' 'IPR019296' '\

    This domain, found in various hypothetical archaeal proteins, has no known function.

    \ ' '9959' 'IPR018775' '\

    Proteins in this entry are predicted to catalyse the transfer of nucleotide residues from nucleoside diphosphates or triphosphates into dimer or polymer forms.

    \ ' '9960' 'IPR019297' '\

    This entry is found in various prokaryotic OpcA and glucose-6-phosphate dehydrogenase proteins. Its exact function is, as yet, unknown.

    \ ' '9961' 'IPR014550' '\

    There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.

    \ ' '9962' 'IPR019298' '\

    This entry represents a set of bacterial and archaeal domains that are predicted to be RNases (from similarities to 5\'-exonucleases).

    \ ' '9963' 'IPR018776' '\

    This entry is found in various bacterial and archaeal hypothetical membrane proteins, as well as in tetratricopeptide TPR_2 repeat protein. Its function has not yet been established, though it shows similarity to 6-pyruvoyl-tetrahydropterin synthase.

    \ ' '9965' 'IPR019300' '\

    This entry represents a family of bacterial proteins predicted to have RNA-binding properties, though their exact function has not yet been defined.

    \ ' '9966' 'IPR018777' '\

    Members of this family of bacterial proteins are single-stranded DNA binding proteins that are involved in DNA replication, repair and recombination.

    \ ' '9967' 'IPR019301' '\

    The N-terminal domain of the FlgJ protein is directly involved in flagellar rod assembly, while the adjacent C-terminal domain is a flagellum-specific muramidase (peptidoglycan hydrolase) required for formation of the outer membrane L ring PUBMED:11554792.

    \ ' '9968' 'IPR011385' '\

    This group represents a site-specific recombinase Gcr. Please see the following relevant reference: PUBMED:9079926.

    \ ' '9969' 'IPR019302' '\

    This entry represents a TIR-like domain found in a family of prokaryotic predicted nucleotide-binding proteins. Their exact function has not, as yet, been defined.

    \ ' '9970' 'IPR019303' '\

    This entry is found in a family of proteins that confer resistance to the metalloid element tellurium and its salts.

    \ ' '9971' 'IPR017030' '\

    This group represents a virulence effector protein, SrfC type.

    \ ' '9972' 'IPR018778' '\

    This entry consists of prokaryotic proteins including the virulence factor essB, which is required for the synthesis and secretion of EsxA and EsxB, both ESAT-6 like proteins.

    \ ' '9973' 'IPR018779' '\

    This entry represents a domain found at the C terminus of a set of single-stranded DNA-specific exonucleases, including RecJ. Its function has not, as yet, been determined.

    \ ' '9974' 'IPR009199' '\ Proteins in this entry are believed to play a role in virulence/pathogenicity in Salmonella. Salmonella typhi PqaA has been shown to be activated by the PhoP/Q two-component regulatory system, which regulates many virulence genes PUBMED:9075219. It has also been shown to confer resistance to antimicrobial peptides (melittin) PUBMED:9075219. Members of this family are predicted to belong to the alpha/beta hydrolase domain superfamily.\ ' '9975' 'IPR019304' '\

    This entry is found in various bacterial and plant 2,3-bisphosphoglycerate-independent phosphoglycerate mutase enzymes, which catalyse the interconversion of 2-phosphoglycerate and 3-phosphoglycerate in the reaction: [2-phospho-D-glycerate + 2,3-diphosphoglycerate = 3-phospho-D-glycerate + 2,3-diphosphoglycerate].

    \ ' '9976' 'IPR019305' '\

    This entry represents a group of bacterial membrane proteins that affect the expression of haemolysin under anaerobic conditions PUBMED:10699510.

    \ ' '9977' 'IPR010090' '\

    This entry represents a reasonably well conserved core region of a family of phage tail proteins. The member from phage TP901-1 was characterised as a tail length tape measure protein in that a shortened form of the protein leads to phage with proportionately shorter tails.

    \ ' '9978' 'IPR018482' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a family of proteins which appears to have a highly conserved zinc finger domain at the C-terminal end, described as -C-X2-CH-X3-H-X5-C-X2-C-. The structure is predicted to contain a coiled coil. Members of this family are annotated as being tumour-associated antigen HCA127 in humans, but this could not be confirmed.

    \ ' '9979' 'IPR018472' '\

    Members of this family of proteins act as negative regulators of G1 to S cell cycle phase progression by inhibiting cyclin-dependent kinases. Inhibitory effects are additive with GADD45 proteins but occur also in the absence of GADD45 proteins. Furthermore, they act as a repressor of the orphan nuclear receptor NR4A1 by inhibiting AB domain-mediated transcriptional activity PUBMED:12716909. They may be involved in the hormone-mediated regulation of NR4A1 transcriptional activity.

    \ ' '9980' 'IPR015649' '\

    SCHIP-1 is a coiled-coil protein that specifically associates with schwannomin in vitro and in vivo. The product of the neurofibromatosis type 2 (NF2) tumour suppressor gene, known as schwannomin or merlin, is involved in NF2-associated and sporadic schwannomas and meningiomas. It is closely related to the ezrin-radixin-moesin family members, which link membrane proteins to the cytoskeleton. Association with SCHIP-1 can be observed only with some naturally occurring mutants of schwannomin, or a schwannomin spliced isoform lacking exons 2 and 3, but not with the schwannomin isoform exhibiting growth-suppressive activity PUBMED:10669747.

    This entry consists of mammalian SCHIP-1 proteins from Mus musculus (Mouse) and Homo sapiens (Human).

    \ ' '9981' 'IPR019306' '\

    Members of this family are involved in the biosynthesis of 6-sulpho sialyl Lewis X molecules, by catalysing the transfer of sulphate from 3\'-phosphoadenosine 5\'-phosphosulphate to position 6 of a non-reducing N-acetylglucosamine (GlcNAc) residue PUBMED:9722682.

    \ ' '9982' 'IPR019307' '\

    Ribonuclease E and Ribonuclease G are related enzymes that cleave a wide variety of RNAs PUBMED:16237448. RNA-binding protein AU-1 binds to RNA loop regions with AU-rich sequences PUBMED:12614195.

    \ ' '9983' 'IPR019308' '\

    This is a 450 amino acid region of a family of proteins conserved from insects to humans. Mouse transmembrane protein 214 is annotated as being a putative Vitamin K-dependent carboxylation gamma-carboxyglutamic (GLA) domain containing protein, but this could not be confirmed. The function is not known.

    \ ' '9984' 'IPR019309' '\

    This entry represents a 140 amino acid region. Proteins in this entry include those with a coiled-coil domain as well as Daf-16-dependent longevity protein 1 PUBMED:16380712. Their function is unknown.

    \ ' '9985' 'IPR019310' '\

    This is a region of 120 amino acids that is conserved in a family of proteins found from plants to fungi. The function is not known.

    \ ' '9986' 'IPR019311' '\

    This is a family of proteins conserved from nematodes to humans. The function is not known.

    \ ' '9987' 'IPR019312' '\

    This entry represents a region of 120 amino acids in proteins conserved from plants to humans. Their function is not known.

    \ ' '9988' 'IPR019313' '\

    This entry represents subunit Med17 of the Mediator complex.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '9989' 'IPR019314' '\

    This entry contains highly conserved proteins from nematodes to humans. They have no known function.

    \ ' '9990' 'IPR018780' '\

    This entry represents a region of 130 amino acids that is the most conserved part of some hypothetical proteins involved in loss of heterozygosity, and thus, tumour suppression PUBMED:11896457. The exact function of these proteins is not known.

    \ ' '9991' 'IPR019315' '\

    This entry represents a glycine-rich domain that is the most highly conserved region of a family of proteins that, in vertebrates, are associated with tumours in multiple myelomas. The region may contain phosphorylation sites for several protein kinases, as well as N-myristoylation sites and nuclear localisation signals, so it might act as a signal molecule in the nucleus PUBMED:12545221.

    \ ' '9992' 'IPR018781' '\

    This entry represents a 280 amino acid region found in a group of proteins conserved from plants to humans. These are predicted to be membrane proteins, but apart from that their function is unknown.

    \ ' '9993' 'IPR018782' '\

    This entry represents a family of small conserved proteins found from nematodes to humans. The C-terminal region is rich in asparagine. These proteins have been putatively designated as mitochondrial precursor proteins but this has not been confirmed.

    \ ' '9994' 'IPR019316' '\

    This entry represents a domain found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix PUBMED:16632497.

    \ ' '9995' 'IPR018783' '\

    Enhancer of yellow 2 (EnY2) is a small transcription factor which is combined in a complex with the TAFII40 protein PUBMED:11438676. This protein is conserved from protozoa to humans.

    \ ' '9996' 'IPR019317' '\

    This is a highly conserved set of proteins which contains three pairs of cysteine residues within a length of 42 amino acids and is rich in proline residues towards the N terminus. It includes a membrane protein that has been found to be highly expressed in the mouse brain and, consequently, several members have been putatively assigned as brain protein i3 (Bri3), although this has not been verified. Their function is unknown.

    \ ' '9997' 'IPR019318' '\

    Ric8 is involved in the EGL-30 neurotransmitter signalling pathway PUBMED:10985349. It is a guanine nucleotide exchange factor PUBMED:12971991 that regulates neurotransmitter secretion.

    \ ' '9998' 'IPR019319' '\

    This family is conserved from nematodes to humans. The function is not known.

    \ ' '9999' 'IPR019320' '\

    This is the N-terminal 80 residues of a set of proteins conserved from plants to humans. It contains a characteristic NEP sequence motif. Their function is not known.

    \ ' '10000' 'IPR019321' '\

    Nup88 can be divided into two structural domains. The N-terminal two-thirds of the protein have no obvious structural motifs; this is, however, the region that binds to Nup98, one of the components of the nuclear pore. The C-terminal end is a predicted coiled-coil domain PUBMED:9049309. Nup88 is overexpressed in tumour cells PUBMED:12589057.

    \ ' '10001' 'IPR018784' '\

    This is a family of 121-amino acid secretory proteins consisting of learning associated protein 18 (LAPS18) and related sequences. LAPS18 functions in the regulation of neuronal cell adhesion and/or movement and synapse attachment PUBMED:11168596. It has been shown to bind to the ApC/EBP (Aplysia CCAAT/enhancer binding protein) promoter and activate the transcription of ApC/EBP mRNA PUBMED:16504946.

    \ ' '10002' 'IPR018785' '\

    This entry represents the N-terminal approximately 100 amino acids of a family of proteins found from nematodes to humans. It contains between six and eight highly conserved cysteine residues and a characteristic DPF sequence motif. One member is putatively named as receptor for egg jelly protein but this could not be confirmed.

    \ ' '10003' 'IPR019322' '\

    This is a set of proteins conserved from nematodes to humans. The function is not known.

    \ ' '10004' 'IPR018276' '\

    This entry represents DDA1 (DET1- and DDB1-associated protein 1) ubiquitin ligase, which binds strongly with Det1 (De-etiolated 1) and DDB1 (Damaged DNA binding protein 1 associated 1). Together DDA1, DDB1 and Det1 form the DDD core complex, which recruits a specific UBE2E enzyme to form specific DDD-E2 complexes PUBMED:17452440. DDA1 is a component of the DDD-E2 complexes, which may provide a platform for interaction with cul4a and WD repeat proteins. These proteins may be involved in ubiquitination and subsequent proteasomal degradation of target proteins.

    \ ' '10005' 'IPR018786' '\

    This entry represents a family of proteins conserved from plants to humans. Their function is not known.

    \ ' '10006' 'IPR019323' '\

    This entry is found in a family of proteins that form part of the CAZ (cytomatrix at the active zone) complex which is involved in determining the site of synaptic vesicle fusion PUBMED:14723704. Located at the C terminus is a PDZ-binding motif that binds directly to RIM (a small G protein Rab-3A effector). These proteins also contain four coiled-coil domains PUBMED:12391317.

    \ ' '10007' 'IPR019324' '\

    This group of sequences describes M-phase phosphoprotein 6 (MPP6), which is necessary for generation of the 3\' end of the 5.8S rRNA precursor. MPP6 appeared to display RNA-binding activity in vitro with a preference for pyrimidine-rich sequences, and to bind to the ITS2 element of pre-rRNAs PUBMED:16396833.

    \ ' '10008' 'IPR019325' '\

    Proteins in this entry are conserved from fungi to humans. The human member is annotated as a Golgi-associated protein-Nedd4 WW domain-binding protein but this could not be confirmed.

    \ ' '10009' 'IPR018787' '\

    This entry represents a family of proteins conserved from nematodes to humans. Their function is not known.

    \ ' '10010' 'IPR018788' '\

    Proteasome assembly chaperone 3 (PSMG3) promotes assembly of the 20S proteasome PUBMED:17189198. It may cooperate with PSMG1-PSMG2 heterodimers to orchestrate the correct assembly of proteasomes.

    \ ' '10011' 'IPR019326' '\

    This is a proline-rich region of a group of proteins found from plants to fungi. The function is largely unknown, although the entry contains Fibronectin type-III domain-containing protein C4orf31, which promotes matrix assembly and cell adhesiveness.

    \ ' '10012' 'IPR019327' '\

    This is a conserved region of a family of proteins found from fungi to humans. The function is not known.

    \ ' '10013' 'IPR019328' '\

    PIG-H is a family of conserved proteins that complexes with three other proteins to form the GPI-GnT (glycosylphosphatidylinositol anchor biosynthesis transferase) complex. It appears to be a peripheral membrane protein that faces the cytoplasm and which is involved in the first step in GPI anchor formation.

    \ ' '10014' 'IPR018789' '\

    This presumed domain is found at the N terminus of the Saccharomyces cerevisiae (Baker\'s yeast) Flo11 protein.

    \ ' '10015' 'IPR019329' '\

    This entry represents the ESSS subunit from mitochondrial NADH:ubiquinone oxidoreductase (complex I). It carries mitochondrial import sequences PUBMED:12381726.

    \

    NADH:ubiquinone oxidoreductase (complex I) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '10016' 'IPR018790' '\

    This entry represents a family of conserved proteins found from plants to humans. Their function is unknown.

    \ ' '10017' 'IPR019330' '\

    Mesoderm development candidate 2 represents a set of highly conserved proteins found from nematodes to humans. The final C-terminal residues, KEDL, are the endoplasmic reticulum retention sequence as it is an ER protein specifically required for the intracellular trafficking of members of the low-density lipoprotein family of receptors (LDLRs) PUBMED:12581524. The N- and C-terminal sequences are predicted to adopt a random coil conformation, with the exception of an isolated predicted helix within the N-terminal region. The central folded domain flanked by natively unstructured regions is the necessary structure for facilitating maturation of LRP6 (Low-Density Lipoprotein Receptor-Related Protein 6 Maturation) PUBMED:17488095.

    \ ' '10018' 'IPR018791' '\

    This entry is a predicted coiled-coil-containing region of approximately 200 residues. It is found in a family of proteins of unknown function that is conserved from nematodes to humans.

    \ ' '10019' 'IPR019331' '\

    This is the N-terminal 100 amino acids of a family of proteins conserved from plants to humans. The full-length protein has putatively been called NEFA-interacting nuclear protein NIP30; however, no reference could be found to confirm this.

    \ ' '10020' 'IPR019332' '\

    Organic solute carrier protein 1, or Oscp1, is a family of proteins conserved from plants to humans. It is called organic solute transport protein or oxido-red-nitro domain-containing protein 1; however, no reference could be found to confirm the function of the protein.

    \ ' '10021' 'IPR019333' '\

    The Integrator complex is involved in small nuclear RNA (snRNA) U1 and U2 transcription, and in their 3\'-box-dependent processing. This complex associates with the C-terminal domain of the largest subunit of RNA polymerase II and is recruited to the U1 and U2 snRNA genes PUBMED:16239144.

    \ \

    This entry represents a conserved region found in subunit 3 of this complex. The function of this subunit is unknown.

    \ ' '10022' 'IPR019334' '\

    This entry represents a group of putative transmembrane proteins conserved from nematodes to humans. The protein is only approximately 130 amino acids in length. The function is unknown.

    \ ' '10023' 'IPR019335' '\

    The conserved oligomeric Golgi (COG) complex is an eight-subunit (Cog1-8) peripheral Golgi protein involved in membrane trafficking and glycoconjugate synthesis PUBMED:16051600. COG-7 is required for normal Golgi morphology and trafficking. Mutation in COG7 causes a congenital disorder of glycosylation PUBMED:15107842.

    \ ' '10024' 'IPR019336' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    This region of 270 amino acids is the seven transmembrane alpha-helical domains included within five GPCRRHODOPSN4 motifs of a G-protein-coupled-receptor (GPCR) protein, conserved from nematodes to humans PUBMED:12538434.

    \ ' '10025' 'IPR019337' '\

    This entry represents the central conserved 110 amino acid region of a group of proteins called telomere-length regulation, or clock abnormal protein-2, which are conserved from plants to humans. The full-length protein regulates telomere length and contributes to silencing of sub-telomeric regions PUBMED:11641227. In vitro the protein binds to telomeric DNA repeats.

    \ ' '10027' 'IPR018792' '\

    P8 is a short 80-82 amino acid protein that is conserved from nematodes to humans. It carries at least one protein kinase C domain suggesting a possible role in signal transduction and it is thought to be a phosphoprotein, but the sites of phosphorylation and the kinases involved remain to be determined PUBMED:10092851.

    \ ' '10029' 'IPR019339' '\

    This entry represents a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex PUBMED:9874765. It may also modulate splice site selection during alternative splicing of pre-mRNAs.

    \ ' '10030' 'IPR019340' '\

    This entry is found in Ada3 and homologous proteins which function as part of histone acetyltransferase complexes PUBMED:8413201. Ada3 is an essential component of the Ada transcriptional coactivator (alteration/deficiency in activation) complex. It plays a key role in linking histone acetyltransferase-containing complexes to p53 (tumour suppressor protein) thereby regulating p53 acetylation, stability and transcriptional activation following DNA damage PUBMED:17272277.

    \ ' '10031' 'IPR019341' '\

    p34 is a protein involved in membrane trafficking. It is known to interact with both alpha and gamma adaptin PUBMED:10477754. It has been speculated that p34 may play a chaperone role such as preventing the soluble adaptors from co-assembling with soluble clathrin, or helping to remove the adaptors from the coated vesicle. It may also aid in the recruitment of soluble adaptors onto the membrane PUBMED:10477754.

    \ ' '10032' 'IPR019342' '\

    Proteins in this entry form part of the NADH:ubiquinone oxidoreductase complex I. Complex I is the first multisubunit inner membrane protein complex of the mitochondrial electron transport chain and it transfers two electrons from NADH to ubiquinone. These proteins carry four highly conserved cysteine residues, but these do not appear to be in a configuration which would favour metal binding, so the exact function of the protein is uncertain PUBMED:10070614.

    \ ' '10035' 'IPR018793' '\

    This entry represents the conserved N-terminal region of a family of conserved proteins found from nematodes to humans. It carries six highly conserved cysteine residues. Pet191 is required for the assembly of active cytochrome c oxidase but does not form part of the final assembled complex PUBMED:8381337.

    \ ' '10036' 'IPR018469' '\

    DuoxA (Dual oxidase maturation factor) is the essential protein necessary for the final release of DUOX2 (an NADPH:O2 oxidoreductase flavoprotein) from the endoplasmic reticulum. Dual oxidases (DUOX1 and DUOX2) constitute the catalytic core of the hydrogen peroxide generator, which generates H2O2 at the apical membrane of thyroid follicular cells, essential for iodination of thyroglobulin by thyroid peroxidases. DuoxA carries five membrane-integral regions including a reverse signal-anchor with external N terminus (type III) and two N-glycosylation sites PUBMED:16651268. It is conserved from nematodes to humans.

    \ ' '10037' 'IPR019343' '\

    This entry represents the N-terminal 100 residue region, which contains a conserved KLRAQ motif. This region is found in a family of coiled-coil domain-containing proteins that are conserved from nematodes to humans. These proteins also contain a C-terminal TTKRSYEDQ motif region. The function of these proteins is not known.

    \ ' '10038' 'IPR019344' '\

    This entry represents small proteins of approximately 110 amino acids, which are highly conserved from nematodes to humans. Some have been annotated in Swiss-Prot as being the f subunit of mitochondrial F1F0-ATP synthase but this could not be confirmed. The sequence has a well-conserved WRW motif. The exact function of the protein is not known.

    \ ' '10040' 'IPR019345' '\

    This entry represents Armet proteins (aka mesencephalic astrocyte-derived neurotrophic factor or arginine-rich protein). Armet is a small protein of approximately 170 residues which contains four disulphide bridges that are highly conserved from nematodes to humans. Armet is a soluble protein resident in the endoplasmic reticulum and induced by ER stress. It appears to be involved in dealing with misfolded proteins in the ER, and thus in quality control under ER stress PUBMED:17507765. Armet from Rattus norvegicus (Rat) selectively promotes the survival of dopaminergic neurons of the ventral mid-brain. It modulates GABAergic transmission to the dopaminergic neurons of the substantia nigra, and enhances spontaneous, as well as evoked, GABAergic inhibitory postsynaptic currents in dopaminergic neurons PUBMED:16462600.

    \ ' '10041' 'IPR018794' '\

    This entry consists of small proteins of approximately 150 amino acids whose function is unknown.

    \ ' '10042' 'IPR019346' '\

    This entry represents a family of short proteins, each approximately 100 amino acid residues in length. They are identified as the mitochondrial 28S ribosomal proteins S32.

    \ ' '10043' 'IPR019347' '\

    Axonemal dynein light chain proteins play a dynamic role in flagellar and cilial motility. Eukaryotic cilia and flagella are complex organelles consisting of a core structure, the axoneme, which is composed of nine microtubule doublets forming a cylinder that surrounds a pair of central singlet microtubules. This ultra-structural arrangement seems to be one of the most stable micro-tubular assemblies known and is responsible for the flagellar and ciliary movement of a large number of organisms ranging from protozoa to mammals. This light chain interacts directly with the N-terminal half of the heavy chains PUBMED:11606062.

    \ ' '10044' 'IPR019348' '\

    This entry represents the C-terminal 500 residue region, which contains a conserved TTKRSYEDQ motif. This region is found in a family of coiled-coil domain-containing proteins that are conserved from nematodes to humans. These proteins also contain an N-terminal KLRAQ motif region. The function of these proteins is not known.

    \ ' '10045' 'IPR019349' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry represents a conserved region of approx. 125 residues of one of the proteins that makes up the small subunit of the mitochondrial ribosome. In Saccharomyces cerevisiae (Baker\'s yeast) it is mitochondrial ribosomal protein S24 whereas in humans it is S35.

    \ ' '10046' 'IPR019350' '\

    RNA polymerase I-specific transcription-initiation factor Rrn6 and Rrn7 represent components of a multisubunit transcription factor essential for the initiation of rDNA transcription by Pol I PUBMED:7958901. These proteins are found in fungi.

    \ ' '10047' 'IPR018943' '\

    Ost4 is a very short protein, of approximately 30 residues, found from fungi to vertebrates. It is a member of the ER oligosaccharyltransferase complex that catalyses the asparagine-linked glycosylation of proteins. It appears to be an integral membrane protein that mediates the en bloc transfer of a pre-assembled high-mannose oligosaccharide onto asparagine residues of nascent polypeptides as they enter the lumen of the rough endoplasmic reticulum.

    \ ' '10048' 'IPR010220' '\

    This small family of proteins includes paralogs ChpX and ChpY in Synechococcus sp. (strain PCC 7942) (Anacystis nidulans R2) and other cyanobacteria, associated with distinct NAD(P)H dehydrogenase complexes. These proteins collectively enable light-dependent CO2 hydration and CO2 uptake; loss of both blocks growth at low CO2 concentrations.

    \ ' '10049' 'IPR019351' '\

    This entry is a region of approximately 100 residues containing three pairs of cysteine residues. The region is conserved from plants to humans but its function is unknown.

    \ ' '10050' 'IPR019352' '\

    This region is found in a set of sequences conserved from nematodes to plants; it is of approximately 200 residues in length and is functionally uncharacterised. It contains 14 conserved cysteines, three of which are CC-dimers.

    \ ' '10052' 'IPR019354' '\

    This is a family of proteins conserved from plants to humans. In Dictyostelium, it is annotated as being similar to Mss11p, but this is likely a result of the tracts of asparagine and glutamine that both sequences share.

    \ ' '10053' 'IPR019355' '\

    This is a set of proteins conserved from worms to humans. The function is unknown.

    \ ' '10054' 'IPR018795' '\

    Proteins in this entry are conserved from worms to humans. Their function is unknown.

    \ ' '10055' 'IPR019356' '\

    This is a region of approximately 250 residues with no known function.

    \ ' '10056' 'IPR019357' '\

    This entry represents a highly conserved 100 residue region which is likely to have a coiled-coil structure. The exact function is unknown.

    \ ' '10057' 'IPR019358' '\

    This entry represents the central 200 residues of a family of proteins that have no known function.

    \ ' '10058' 'IPR019359' '\

    This is the conserved N-terminal half of proteins that are found in Metazoa. Some annotation suggests it might be PKR, the Hepatitis delta antigen-interacting protein A, but this could not be confirmed. It also contains a coiled-coil domain.

    \ ' '10060' 'IPR019361' '\

    This is a family of conserved proteins of approximately 700 residues found in eukaryotes.

    \ ' '10061' 'IPR019362' '\

    This is a family of proteins conserved in the metazoa but absent from fungi. They are all approximately 300 residues in length and have no known function.

    \ ' '10062' 'IPR019363' '\

    This family of proteins is conserved from plants to humans. The function is unknown.

    \ ' '10063' 'IPR018796' '\

    This entry consists of small conserved proteins found from worms to humans. Their function is not known.

    \ ' '10064' 'IPR019364' '\

    Arc32, or Med8, is one of the subunits of the Mediator complex of RNA polymerase II. The region conserved contains two alpha helices putatively necessary for binding to other subunits within the core of the Mediator complex.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '10065' 'IPR019365' '\

    This is a family of small proteins, less than 200 residues long, named CG6151-P proteins. The function is unknown.

    \ ' '10066' 'IPR019366' '\

    This protein of 413 amino acids contains a central coiled-coil domain, possibly the region that binds to clusterin. Cluap1 expression is highest in the nucleus and gradually increases during late S to G2/M phases of the cell cycle and returns to the basal level in the G0/G1 phases. In addition, it is upregulated in colon cancer tissues compared to corresponding non-cancerous mucosa. It thus plays a crucial role in the life of the cell PUBMED:15480429.

    \ ' '10067' 'IPR019367' '\

    The CRIPT protein is a cytoskeletal protein involved in microtubule production. This C-terminal domain is essential for binding to the PDZ3 domain of the SAP90 protein, one of a super-family of PDZ-containing proteins that play an important role in coupling the membrane ion channels with their signalling partners. SAP90 is concentrated in the post synaptic density of glutamatergic neurons PUBMED:16796391.

    \ ' '10068' 'IPR019368' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This entry represents a family of conserved proteins which were originally described as death-associated-protein-3 (DAP-3). The proteins carry a P-loop DNA-binding motif, and induce apoptosis PUBMED:9889192. DAP3 has been shown to be a pro-apoptotic factor in the mitochondrial matrix PUBMED:11017876 and to be crucial for mitochondrial biogenesis and so has also been designated as MRP-S29 (mitochondrial ribosomal protein subunit 29).

    \ ' '10069' 'IPR019369' '\

    This set of proteins, which are approximately 200 residues in length, contains a highly conserved Gln-Phe-Trp (QFW) motif close to the N terminus and an Asp/Asn-Pro-Pro-Tyr/Phe motif in the centre. This latter motif is characteristic of N-6 adenine-specific DNA methylases and could be involved in substrate binding or in the catalytic activity.

    \ ' '10070' 'IPR019370' '\

    This entry represents the conserved, C-terminal portion of an E2F binding protein. E2F transcription factors play an essential role in cell proliferation and apoptosis and their activity is frequently deregulated in human cancers. E2F activity is regulated by a variety of mechanisms, frequently mediated by proteins binding to individual members or a subgroup of the family. E2F-associated phosphoprotein (EAPP) interacts with a subset of E2F factors and influences E2F-dependent promoter activity. EAPP is present throughout the cell cycle but disappears during mitosis PUBMED:15716352.

    \ ' '10071' 'IPR018797' '\

    FAM98A, B and C are glycine-rich proteins found from worms to humans whose function is unknown.

    \ ' '10072' 'IPR018798' '\

    FAM125A (also known as CIN85/CD2AP family-binding protein) interacts with CD2AP and CIN85/SH3KBP1, and is thought to be involved in the ligand-mediated internalization and down-regulation of EGF receptor PUBMED:16895919.

    \ ' '10073' 'IPR019371' '\

    This entry represents a conserved region of 80 residues which defines a family of short proteins. There is a characteristic KxDL motif towards the C terminus. The function is unknown.

    \ ' '10074' 'IPR019372' '\

    This is a group of proteins expressed from a series of genes referred to as Lipoma HMGIC fusion partner-like. The proteins carry four highly conserved transmembrane domains. In certain instances, as in LHFPL5, mutations cause deafness in humans PUBMED:16752389 or hypospadias PUBMED:16395596. LHFPL1 is transcribed in six liver tumour cell lines PUBMED:15620218.

    \ ' '10075' 'IPR018799' '\

    This entry represents a protein which interacts with both microtubules and TRAF3 (tumour necrosis factor receptor-associated factor 3), and is conserved from worms to humans. The N-terminal region is the microtubule binding domain and is well-conserved; the C-terminal 100 residues, also well-conserved, constitute the coiled-coil region which binds to TRAF3. The central region of the protein is rich in lysine and glutamic acid and carries KKE motifs which may also be necessary for tubulin-binding, but this region is the least well-conserved PUBMED:10791955.

    \ ' '10076' 'IPR019373' '\

    MRP-L51 is a family of small proteins from the intact 55 S mitochondrial ribosome PUBMED:16451194. It has otherwise been referred to as bMRP-64 PUBMED:11402041. The exact function of this family is not known.

    \ ' '10077' 'IPR019374' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This is the conserved N terminus and central portion of the mitochondrial small subunit 28S ribosomal protein S22. Mammalian mitochondria carry out the synthesis of 13 polypeptides that are essential for oxidative phosphorylation and, hence, for the synthesis of the majority of the ATP used by eukaryotic organisms. The number of proteins produced by prokaryotes is smaller, reflected in the lower number of ribosomal proteins present in them PUBMED:10938081.

    \ ' '10078' 'IPR019375' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    This is a family of short mitochondrial ribosomal proteins, less than 200 amino acids long. The structure has previously been referred to as MRP-S18 but the current numbering fits the preferred nomenclature from these authors PUBMED:11279123.

    \ ' '10079' 'IPR018450' '\

    The majority of endogenous reactive oxygen species (ROS) in cells are produced by the mitochondrial respiratory chain. An increase or imbalance in ROS alters the intracellular redox homeostasis, triggers DNA damage, and may contribute to cancer development and progression.

    \ \

    This entry contains the mitochondrial protein, reactive oxygen species modulator 1 (Romo1), that is responsible for increasing the level of ROS in cells. In various cancer cell lines with elevated levels of ROS there is also an increased abundance of Romo1 PUBMED:16842742. Increased Romo1 expression can have a number of other effects, including inducing premature senescence of cultured human fibroblasts PUBMED:18836179, PUBMED:18313394 and increasing resistance to 5-fluorouracil PUBMED:17537404.

    \ ' '10080' 'IPR019376' '\

    This entry is the conserved central region of a group of proteins that are putative transcriptional repressors. The structure contains a putative 14-3-3 binding motif involved in the subcellular localisation of various regulatory molecules, and it may be that interaction with the transcription factor DREF could be regulated through this motif. DREF regulates proliferation-related genes in Drosophila PUBMED:11137299. Myelodysplasia-myeloid leukemia factor 1-interacting protein (Mlf1IP) is expressed in both the nuclei and the cytoplasm and thus may have multiple functions PUBMED:17595757.

    \ ' '10081' 'IPR019377' '\

    NADH-ubiquinone oxidoreductase subunit 10 (NDUFB10) is a member of a family of conserved proteins of up to 180 residues. It is one of the 41 protein subunits within the hydrophobic fraction of the NADH:ubiquinone oxidoreductase (complex I), a multiprotein complex located in the inner mitochondrial membrane whose main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. NDUFB10 is encoded in the nucleus.

    \ ' '10082' 'IPR019378' '\

    This is a family of conserved proteins representing the enzyme responsible for adding O-fucose to EGF (epidermal growth factor-like) repeats. Six highly conserved cysteines are present as well as a DXD-like motif (ERD), conserved in mammals, Drosophila, and Caenorhabditis elegans. Both features are characteristic of several glycosyltransferase families. The enzyme is a membrane-bound protein released by proteolysis and, as for most glycosyltransferases, is strongly activated by manganese PUBMED:11524432.

    \ ' '10083' 'IPR019379' '\

    This entry is a short, 101-residue protein that is the smallest subunit of the gamma-secretase aspartyl protease complex, which catalyses the intra-membrane cleavage of a subset of type I transmembrane proteins. The other active constituents of the complex are presenilin (PS), nicastrin and anterior pharynx defective-1 (APH-1) protein. Presenilin enhancer-2 (PEN-2) adopts a hairpin orientation in the membrane with its N- and C-terminal domains facing the luminal/extracellular space. The C-terminal domain maintains PS stability within the complex PUBMED:15953349.

    \ ' '10084' 'IPR019380' '\

    This domain is a region of 70 residues conserved in proteins from plants to humans and contains a serine/arginine rich motif. In rats the full protein is a casein kinase substrate, and this region contains phosphorylation sites for both cAMP-dependent protein kinase and casein kinase II PUBMED:8615683.

    \ ' '10085' 'IPR018800' '\

    This is the highly conserved C-terminal domain of the renal papillary carcinoma protein PRCC. The function of this domain is not known.

    \ ' '10086' 'IPR019381' '\

    PACS-1 is a cytosolic sorting protein that directs the localisation of membrane proteins in the trans-Golgi network (TGN)/endosomal system. PACS-1 connects the clathrin adaptor AP-1 to acidic cluster sorting motifs contained in the cytoplasmic domain of cargo proteins such as furin, the cation-independent mannose-6-phosphate receptor and in viral proteins such as human immunodeficiency virus type 1 Nef PUBMED:9695949.

    \ ' '10087' 'IPR019382' '\

    RNA polymerase I is a multi-subunit enzyme and its transcription competence is dependent on the presence of PAF67 PUBMED:11592397.

    \ ' '10088' 'IPR019383' '\

    Proteins in this entry include Golgin subfamily A member 7 and the Ras modification protein ERF4.

    \ ' '10089' 'IPR019384' '\

    This entry represents a conserved sequence region found in a family of proteins described as retinoic acid-induced protein 16-like proteins. These proteins are conserved from worms to humans, but their function is not known.

    \ ' '10090' 'IPR019385' '\

    The phosphorylated adaptor for RNA export (PHAX) protein transports U3 snoRNA from the nucleus after transcription PUBMED:11333016. This entry represents the highly conserved U3 snoRNA-binding domain of PHAX, which is characterised by having two pairs of adjacent glycines with the sequence motif GGx12GG.

    \ ' '10091' 'IPR019386' '\

    This is a family of conserved proteins which, it has been suggested, contain leucine-zipper domains. A leucine zipper domain is a region of 30 amino acids with leucines repeating every seven or eight residues; a pattern which these proteins match. The protein in Drosophila comes from the gene ROGDI.

    \ ' '10092' 'IPR019387' '\

    This domain of approximately 75 residues contains a highly conserved SATSv/iFN motif. The function is unknown but the domain is conserved from plants to humans.

    \ ' '10093' 'IPR019388' '\

    Members of this entry represent the fat-inducing transcript (FIT) family of genes, which play an important role in lipid droplet accumulation. They are endoplasmic reticulum resident membrane proteins that induce lipid droplet accumulation in cell culture and when expressed in mouse liver PUBMED:18160536.

    \ ' '10094' 'IPR019389' '\

    This entry is an approximately 100 residue region of selenoprotein T, conserved from plants to humans. The protein binds to UDP-glucose:glycoprotein glucosyltransferase (UGTR), the endoplasmic reticulum (ER)-resident protein, which is known to be involved in the quality control of protein folding PUBMED:11278576. Selenium (Se) plays an essential role in cell survival and most of the effects of Se are probably mediated by selenoproteins, including selenoprotein T. However, despite its binding to UGTR and the up-regulation of its mRNA in extended asphyxia, the function of the protein, and hence of this region of it, is unknown PUBMED:17034973.

    \ ' '10095' 'IPR019390' '\

    This family represents approximately 160 residues of a group of proteins, some of which have been annotated as SprT-like metallo-proteases. However, this could not be confirmed. The function is not known.

    \ ' '10096' 'IPR019391' '\

    In humans the Storkhead-box protein controls polyploidization of extravillous trophoblast and is implicated in pre-eclampsia PUBMED:15806103. This entry represents the conserved N-terminal winged-helix domain, which is likely to bind DNA.

    \ ' '10097' 'IPR019392' '\

    This is a family of conserved proteins varying in length from 500-600 residues. Its function is not known.

    \ ' '10098' 'IPR019393' '\

    Strumpellin contains one known domain called a spectrin repeat that consists of three alpha-helices of a characteristic length wrapped in a left-handed coiled coil. The spectrin proteins have multiple copies of this repeat, which can then form multimers in the cell. Spectrin associates with the cell membrane via spectrin repeats in the ankyrin protein. The spectrin repeat is a structural platform for cytoskeletal protein assemblies. Two closely situated point mutations in human strumpellin lead to the condition of hereditary spastic paraplegia.

    \ ' '10099' 'IPR019394' '\

    This family of transmembrane coiled-coil containing proteins is conserved from worms to humans. Its function is unknown.

    \ ' '10100' 'IPR019395' '\

    This entry represents a family of conserved eukaryotic proteins. Members are putative transmembrane proteins but otherwise the function is not known.

    \ ' '10101' 'IPR019396' '\

    This entry represents a family of conserved transmembrane proteins that, in humans, are expressed from a region upstream of the FragileXF site and appear to be intimately linked with Fragile-X syndrome. The absence of the human TMEM185A protein does not necessarily lead to developmental delay, but might, in combination with other, currently unknown, factors. Alternatively, the TMEM185A protein is either redundant, or its function can be complemented by the highly similar chromosome 2 retro-pseudogene product, TMEM185B PUBMED:12404111.

    \ ' '10102' 'IPR018937' '\

    This entry represents a novel family of membrane magnesium transporters (MMgT) PUBMED:18057121. The proteins, MMgT1 and MMgT2, are localised to the Golgi complex and post-Golgi vesicles, including the early endosomes, suggesting that they may provide regulated pathways for Mg2+ transport in the Golgi and post-Golgi organelles of epithelium-derived cells PUBMED:18057121.

    \ ' '10103' 'IPR019397' '\

    This is a family of putative eukaryotic transmembrane proteins; the function is unknown.

    \ ' '10104' 'IPR018801' '\

    This entry consists of proteins conserved from worms to humans. They are purported to be transmembrane protein-precursors but their function is unknown.

    \ ' '10105' 'IPR019398' '\

    The pre-rRNA-processing protein TSR2 is required for 20S pre-rRNA processing PUBMED:12837249. This entry represents a conserved region whose function is not known, though it contains a distinctive WGG motif.

    \ ' '10106' 'IPR019399' '\

    This family of proteins is transcribed anti-sense along the DNA to the Parkin gene product and the two appear to be transcribed under the same promoter. The protein has predicted alpha-helical and beta-sheet domains which suggest its function is in the ubiquitin/proteasome system PUBMED:12547187. Mutations in parkin are the genetic cause of early-onset and autosomal recessive juvenile parkinsonism.

    \ ' '10107' 'IPR019400' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This family of proteins is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryotes, as it is a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumour domain) in which there is an active cysteine protease triad, (ii) a nuclear localisation signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif.

    \ ' '10108' 'IPR019401' '\

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a short conserved zinc-finger domain. It contains the sequence motif Cx8Hx14Cx2C.

    \ ' '10109' 'IPR019402' '\

    This entry includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, CILW (x2), CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function PUBMED:10585768. DRAM is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an endoplasmic reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of DRAM is stress-induced PUBMED:16839881. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton PUBMED:12837385, PUBMED:12015967.

    \ ' '10110' 'IPR019403' '\

    Med19 represents a family of conserved proteins which are members of the multi-protein co-activator Mediator complex PUBMED:12584197.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '10111' 'IPR018802' '\

    This entry represents the precursor proteins for a number of short antimicrobial peptides called latarcins. Latarcins were discovered in the venom of the spider Lachesana tarabaevi PUBMED:16735513. They are likely to adopt an amphipathic alpha-helical structure in the plasma membrane.

    \ ' '10112' 'IPR019404' '\

    This entry represents subunit Med11 of the Mediator complex PUBMED:12584197.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '10113' 'IPR018803' '\

    This group of proteins, found primarily in fungi, consists of putative stress-responsive nuclear envelope protein Ish1 and homologues PUBMED:11859360.

    \ ' '10114' 'IPR019405' '\

    This entry represents a conserved domain found in both bacteria and fungi. Though often found in 6-phosphogluconolactonase enzymes, the function of this domain is not known.

    \ ' '10115' 'IPR019406' '\

    C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf\'s can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 PUBMED:11361095. C2H2 Znf\'s are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes PUBMED:10664601. Transcription factors usually contain several Znf\'s (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA PUBMED:10940247. C2H2 Znf\'s can also bind to RNA and protein targets PUBMED:18253864.

    \

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a C2H2-type Znf motif that in humans is part of the APLF (aprataxin- and PNK-like) forkhead-associated domain-containing protein PUBMED:17353262. The Znf is highly conserved both in primary sequence and in the spacing between the putative zinc-coordinating residues, and is configured CX5CX6HX5H. Many of the proteins containing this Znf are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism.

    \ ' '10116' 'IPR018475' '\

    This domain is found associated with the catalytic domain of dinoflagellate luciferase. Luciferase is involved in catalysing the light-emitting reaction in bioluminescence. This domain has a three-helix bundle structure that holds four important histidines that are thought to play a role in the pH regulation of the enzyme PUBMED:15665092.

    \ ' '10117' 'IPR018804' '\

    This entry represents the catalytic domain of dinoflagellate luciferase. Luciferase is involved in catalysing the light-emitting reaction in bioluminescence. The structure of this domain has been solved PUBMED:15665092. The core part of the domain is a 10-stranded beta barrel that is structurally similar to lipocalins and FABP PUBMED:15665092.

    \ ' '10119' 'IPR018805' '\

    This entry represents a family of proteins conserved primarily in fungi. One member is annotated putatively as OPEL, a house-keeping protein, but this could not be confirmed. It contains five highly conserved cysteines, two of which form a characteristic CGC sequence motif.

    \ ' '10120' 'IPR019407' '\

    Cytoplasmic thiouridylase is a highly conserved complex responsible for the 2-thiolation of cytosolic tRNAs PUBMED:18391219. Inactivation of this complex leads to a loss of thiolation on tRNAs, decreased viability and aberrant cell development. This entry represents the second subunit of this complex.

    \ ' '10122' 'IPR018807' '\

    This entry is found in the N-terminal region of proteins that contain . The function of this glycine-rich region is unknown.

    \ ' '10123' 'IPR018808' '\

    The SAFF domain is conserved in proteins from fungi through to humans.

    \ ' '10124' 'IPR019408' '\

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    Srab is part of the Sra superfamily of chemoreceptors. The expression pattern of the srab genes is biologically intriguing. Of the six promoters successfully expressed in transgenic organisms, one was exclusively expressed in the tail phasmid neurons, two were exclusively expressed in a head amphid neuron, and two were expressed both in the head and tail neurons as well as a limited number of other cells PUBMED:15618405.

    \ ' '10125' 'IPR019409' '\

    This entry represents a conserved region found within FMP27.

    \

    The function of the FMP27 protein is not known, but it is encoded by a gene that is directly involved in the activation of RNA polymerase II through gene looping PUBMED:15314641.

    \ ' '10126' 'IPR019410' '\

    There are a number of unidentified genes that have a high probability of coding for methyltransferases. They make up approximately 0.6-1.6% of the genes in the yeast, human, mouse, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Escherichia coli genomes PUBMED:12872006. This entry represents putative methyltransferases.

    \ ' '10127' 'IPR018809' '\

    This entry represents a family of small proteins conserved in fungi. The function is not known.

    \ ' '10128' 'IPR019411' '\

    This entry represents a sequence region of unknown function that is conserved from plants to humans.

    \ ' '10129' 'IPR018287' '\

    This entry represents an essential domain of the transcription activator Hap4 that allows it to associate with Hap2, Hap3 and Hap5 to form the Hap complex PUBMED:16522629, PUBMED:16278450. In Saccharomyces cerevisiae (Baker\'s yeast), the haem-activated protein complex Hap2/3/4/5 plays a major role in the transcription of genes involved in respiration PUBMED:2676721.

    \ ' '10130' 'IPR018478' '\

    This entry represents the N-terminal domain of sporulation factor WhiA PUBMED:10986251. This domain is related to the LAGLIDADG homing endonuclease domain, while the C-terminal domain of WhiA is predicted to be a DNA-binding helix-turn-helix domain PUBMED:17603302.

    \ ' '10132' 'IPR019412' '\

    This entry represents a family of proteins conserved from fungi to humans. In humans this family is expressed in primary breast carcinomas but not in normal breast tissue. There appears to be a putative eukaryotic RNP-1 motif and a candidate anchoring transmembrane domain. The human protein is coordinately regulated with oestrogen receptor, but is not necessarily oestradiol-responsive PUBMED:9461476. These proteins also carry a tetratricopeptide repeat at their C terminus.

    \ ' '10134' 'IPR019413' '\

    This entry represents a family of proteins of unknown function found in fungi. They contain a characteristic GFDRL sequence motif.

    \ ' '10135' 'IPR018810' '\

    This entry represents a family of proteins conserved in fungi whose function is unknown.

    \ ' '10136' 'IPR019414' '\

    This entry represents a 38-residue domain of unknown function that is found at the extreme C-terminal end of some HEAT repeats.

    \ ' '10137' 'IPR019415' '\

    This entry represents a conserved region within FMP27 that contains characteristic SW and GKG sequence motifs.

    \

    The function of the FMP27 protein is not known, but it is encoded by a gene that is directly involved in the activation of RNA polymerase II through gene looping PUBMED:15314641.

    \ ' '10138' 'IPR018811' '\

    This entry represents a family of conserved proteins found in fungi. They contain a characteristic FL(I)LHE(L)TA sequence motif, where the bracketed residues are I, L or V. Their function is not known.

    \ ' '10139' 'IPR018812' '\

    This entry represents a family of proteins conserved in fungi whose function is not known. There are two characteristic sequence motifs, GGWW and TGR.

    \ ' '10141' 'IPR019416' '\

    This entry is found in a family of proteins of unknown function that are conserved from fungi to mammals. Note that one mouse protein is referred to as ELG, but it is not a homologue of the human ELG protein.

    \ ' '10142' 'IPR018814' '\

    This entry represents a family of proteins conserved in fungi. Their function is not known.

    \ ' '10143' 'IPR018815' '\

    This is a family of proteins of approximately 200 residues that are conserved in fungi. Ilm1 is part of the peroxisome, a complex that is the sole site of beta-oxidation in Saccharomyces cerevisiae (Baker\'s yeast) and known to be required for optimal growth in the presence of fatty acid. Ilm1 may participate in the control of the C16/C18 ratio since it interacts strongly with Mga2p, a transcription factor that controls expression of Ole1, the sole fatty acyl desaturase in S. cerevisiae responsible for conversion of the saturated fatty acids stearate (C18) and palmitate (C16) to oleate and palmitoleate, respectively PUBMED:17151231.

    \ ' '10144' 'IPR018816' '\

    This entry represents the conserved central region of a family of proteins referred to as cactins. This region contains two of three predicted coiled-coil domains. Most proteins containing this region also have at the C-terminal end. Upstream of this region in Drosophila proteins are a serine-rich region, some non-typical RD motifs and three predicted bipartite nuclear localisation signals, none of which are well-conserved. Cactin associates with IkappaB-cactus as one of the intracellular members of the Rel (NF-kappaB) pathway which is conserved in invertebrates and vertebrates. In mammals, this pathway controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo PUBMED:10842059.

    \ ' '10145' 'IPR019417' '\

    This entry represents a short (30 residues) domain of unknown function found in a family of fungal proteins. It contains a characteristic DLL sequence motif.

    \ ' '10147' 'IPR019419' '\

    This entry represents conserved proteins of unknown function that are restricted to fungi. Some of the proteins are annotated as Loss of Respiratory Capacity protein 2 (LRC2).

    \ ' '10148' 'IPR019420' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class b (Srb) from the Sra superfamily PUBMED:15618405. Srb receptors contain 6-8 hydrophobic, putative transmembrane, regions and can be distinguished from other 7TM GPCR receptors by their own characteristic TM signatures.

    \

    Srbc is a solo family amongst the superfamilies of chemoreceptors.

    \ ' '10149' 'IPR019421' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents the chemoreceptor Srd PUBMED:18050473.

    \ ' '10150' 'IPR019422' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    Srh is part of the Str superfamily of chemoreceptors PUBMED:10673277.

    \ ' '10151' 'IPR019423' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class j (Srj) from the Str superfamily PUBMED:18050473, PUBMED:9582190. The Srj family is designated as the out-group based on its location in preliminary phylogenetic analyses of the entire superfamily PUBMED:11238245.

    \ ' '10152' 'IPR019424' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class sx (Srsx), which is a solo family amongst the superfamilies of chemoreceptors.

    \ ' '10153' 'IPR019425' '\

    Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type PUBMED:7585938. Srt is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473.

    \ ' '10154' 'IPR003839' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class u (Sru) from the Srg superfamily PUBMED:7585938.

    \ ' '10155' 'IPR019426' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class v (Srv) from the Srg superfamily PUBMED:18050473, PUBMED:9582190. Srg receptors contain seven hydrophobic, putative transmembrane, regions and can be distinguished from other 7TM GPCR receptors by their own characteristic TM signatures.

    \ ' '10156' 'IPR019427' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class w (Srw), which is a solo family amongst the superfamilies of chemoreceptors. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz PUBMED:15761060.

    \ ' '10157' 'IPR018817' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class z (Srz), a solo family amongst the superfamilies of chemoreceptors PUBMED:18050473, PUBMED:9582190. The genes encoding Srz appear to be under strong adaptive evolutionary pressure PUBMED:15761060.

    \ ' '10158' 'IPR019428' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class r (Str) from the Str superfamily PUBMED:18050473, PUBMED:9582190. Almost a quarter (22.5%) of str and srj family genes and pseudogenes in C. elegans appear to have been newly formed by gene duplications since the species split PUBMED:11238245.

    \ ' '10159' 'IPR019429' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents Sri, which is part of the Str superfamily of chemoreceptors.

    \ ' '10160' 'IPR019430' '\

    G-protein-coupled receptors, GPCRs, constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence PUBMED:8170923. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. There is a specialised database for GPCRs (http://www.gpcr.org/7tm/).

    \

    The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli PUBMED:10580986. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise \'blind\' and \'deaf\' PUBMED:18050473. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr PUBMED:7585938, PUBMED:18050473, PUBMED:15618405. Many of these proteins have homologues in Caenorhabditis briggsae.

    \

    This entry represents serpentine receptor class x (Srx) from the Srg superfamily PUBMED:18050473, PUBMED:9582190. Srg receptors contain seven hydrophobic, putative transmembrane, regions and can be distinguished from other 7TM GPCR receptors by their own characteristic TM signatures.

    \ \ ' '10161' 'IPR019431' '\

    This entry represents a conserved region of unknown function found in a family of fungal proteins. In some cases these proteins also contain an alpha/beta hydrolase fold ().

    \ ' '10162' 'IPR018818' '\

    This entry represents the conserved N-terminal end of a group of proteins conserved in fungi. They are likely to be Sin3 binding proteins. Sin3p does not bind DNA directly even though the yeast SIN3 gene functions as a transcriptional repressor. Sin3p is part of a large multiprotein complex PUBMED:9393435. Stb3 appears to bind directly to ribosomal RNA Processing Elements (RRPE) although there are no obvious domains which would accord with this, implying that Stb3 may be a novel RNA-binding protein PUBMED:17616518.

    \ ' '10163' 'IPR019432' '\

    This entry represents a conserved 45 residue region found in one of the proteins from a siderophore biosynthesis complex PUBMED:9266668. The siderophore produced by this complex varies with species, e.g. alcaligin is produced in Bordetella, aerobactin in Escherichia coli and mycobactin in mycobacteria. The protein appears to catalyse N-acylation of the hydroxylamine group in N-hydroxyputrescine with succinyl CoA - an activated mono-thioester derivative of succinic acid that is an intermediate in the Krebs cycle PUBMED:15719346.

    \ ' '10164' 'IPR018819' '\

    This entry represents the conserved 100 residue central region from a family of proteins found in fungi. It carries a characteristic EYD sequence motif. The function is not known.

    \ ' '10165' 'IPR019433' '\

    Pga1 is found only in yeasts and not in mammals. It localises in the ER as a glycosylated integral membrane protein. It binds to the GPI-mannosyltransferase II subunit of the GPI and it is responsible for the second mannose addition to GPI precursors. The GPI-anchoring complex is a glycolipid that functions as a membrane anchor for many cell-surface proteins PUBMED:17615295.

    \ ' '10166' 'IPR018820' '\

    This entry represents a family of proteins conserved in fungi. Their function is not known.

    \ ' '10167' 'IPR018821' '\

    This entry is found associated with presumed nucleotidyltransferase domains and seems to be distantly related to other helical substrate binding domains.

    \ ' '10168' 'IPR018822' '\

    This entry represents a family of proteins conserved in fungi. Their function is not known.

    \ ' '10169' 'IPR018823' '\

    This entry is found in a family of proteins conserved in fungi. Their function is not known. This entry represents the C-terminal half of some member proteins which contain at their N terminus.

    \ ' '10170' 'IPR019434' '\

    This entry is found in a family of conserved proteins whose function is not known.

    \ ' '10171' 'IPR019435' '\

    This entry represents putative velum formation proteins found in fungi. They are of unknown function but are highly induced in zinc-depleted conditions and have increased expression in NAP1 deletion mutants PUBMED:12788058.

    \ ' '10172' 'IPR019436' '\

    This is a family of proteins conserved in yeasts. The function is not known.

    \ ' '10173' 'IPR019437' '\

    EST3 is a component of the telomerase holoenzyme, involved in telomere replication. It has been demonstrated that Est3 dimerises and binds to DNA and RNA. Furthermore, Est3 stimulates the dissociation of RNA/DNA hetero-duplexes PUBMED:16418502, PUBMED:16884717.

    \ ' '10174' 'IPR018466' '\

    This family of proteins appears to be involved in both fruiting body formation and in host attack, as one member is named Hesp-379 (haustorially expressed secreted protein), the haustorium being the small sucker that penetrates host tissue PUBMED:17383119.

    \ ' '10175' 'IPR019438' '\

    This is a family of conserved proteins that have no known function.

    \ ' '10176' 'IPR019439' '\

    This entry represents a conserved region in FMP27 that contains a characteristic FAQPTW sequence motif.

    \

    The function of the FMP27 protein is not known, but it is encoded by a gene that is directly involved in the activation of RNA polymerase II through gene looping PUBMED:15314641.

    \ ' '10177' 'IPR019440' '\

    Cohesin loading factor is a conserved protein that has been characterised in fungi. It is associated with the cohesin complex and is required in G1 for cohesin binding to chromosomes, but is dispensable in G2 when cohesion has been established. It is often referred to as Ssl3 in Schizosaccharomyces pombe (Fission yeast), and Scc4 in Saccharomyces cerevisiae (Baker\'s yeast). It complexes with Mis4 PUBMED:16682348.

    \ ' '10178' 'IPR018824' '\

    This entry represents a conserved region found in fungal conidiation-specific protein 6 PUBMED:8224542. This protein is expressed approximately 6 hours after the induction of development and is induced just prior to major constriction-chain growth PUBMED:9560395.

    \ ' '10179' 'IPR019441' '\

    This entry represents a conserved sequence region in FMP27 that contains a characteristic GFWDK sequence motif.

    \

    The function of the FMP27 protein is not known, but it is encoded by a gene that is directly involved in the activation of RNA polymerase II through gene looping PUBMED:15314641.

    \ ' '10180' 'IPR018825' '\

    This entry represents the N-terminal region of a family of proteins conserved in fungi. Several of these proteins are annotated as being Ftp1 but this could not be confirmed. Their function is not known.

    \ ' '10181' 'IPR018826' '\

    This entry represents a sequence domain found in WW domain-binding protein that is characterised by several short PY and PT-like motifs of the PPPPY form. These appear to bind directly to the WW domains of WWP1 and WWP2 and other such diverse proteins as dystrophin and YAP (Yes-associated protein). The presence of a phosphotyrosine residue in the pWBP-1 peptide abolishes WW domain binding which suggests a potential regulatory role for tyrosine phosphorylation in modulating WW domain-ligand interactions. Given the likelihood that WWP1 and WWP2 function as E3 ubiquitin-protein ligases, it is possible that initial substrate-specific recognition occurs via WW domain-substrate protein interaction followed by ubiquitin transfer and subsequent proteolysis PUBMED:9169421. This domain lies just downstream of in many sequences.

    \ ' '10182' 'IPR019442' '\

    This entry is found in a family of proteins of unknown function that are conserved from plants to humans. Several of these proteins have been annotated as being HEAT repeat-containing proteins while others are designated as death-receptor interacting proteins, but neither of these has yet been confirmed. Aberrations in the genes encoding these proteins have been observed in benign thyroid adenomas PUBMED:12955091.

    \ ' '10183' 'IPR019443' '\

    This entry represents the C terminus of a family of proteins conserved from plants to humans. These include FMP27 and plant proteins which localise to the Golgi and appear to regulate membrane trafficking, as they are required for rapid vesicle accumulation at the tip of the pollen tube PUBMED:16299389. The C terminus probably contains the Golgi localisation signal and it is well-conserved.

    \

    The function of the FMP27 protein is not known, but it is encoded by a gene that is directly involved in the activation of RNA polymerase II through gene looping PUBMED:15314641.

    \ ' '10185' 'IPR019445' '\

    This is a family of short, 111 residue, proteins found in Schizosaccharomyces pombe (Fission yeast). Their function is not known.

    \ ' '10186' 'IPR019446' '\

    This entry represents the N-terminal domain of a family of proteins whose function is not known.

    \ ' '10187' 'IPR018827' '\

    This entry represents a conserved sequence region found in a family of fungal proteins. It appears to contain regions similar to mitochondrial electron transport proteins. The C-terminal domain is hydrophobic and negatively charged. There are consensus sites for both N-linked glycosylation and cAMP-dependent protein kinase phosphorylation PUBMED:8635735.

    \ ' '10188' 'IPR018828' '\

    This protein is expressed in fungi but its function is unknown.

    \ ' '10189' 'IPR019447' '\

    This entry represents the conserved central 169 residue region of the Kin17 DNA/RNA-binding proteins. The N-terminal region of Kin17 contains a zinc-finger domain, while in the human and mouse proteins there is a RecA-like domain found in the C-terminal region. In humans, Kin17 protein forms intra-nuclear foci during cell proliferation and is re-distributed in the nucleoplasm during the cell cycle PUBMED:10964102.

    \ ' '10190' 'IPR019448' '\

    This entry represents the N-terminal 150 residues of a family of conserved proteins which are induced by oestrogen PUBMED:14605097. Proteins in this entry are usually annotated as Fam102A, Fam102B, or Eeig1 (early oestrogen-responsive gene product 1).

    \ ' '10191' 'IPR019449' '\

    This entry represents a conserved sequence region within FMP27 that contains characteristic HQR and WPPW sequence motifs.

    \

    The function of the FMP27 protein is not known, but it is encoded by a gene that is directly involved in the activation of RNA polymerase II through gene looping PUBMED:15314641.

    \ ' '10192' 'IPR018829' '\

    This entry represents a conserved 120 residue region from a family of fungal proteins. Their function is not known.

    \ ' '10193' 'IPR018830' '\

    This entry represents a family of proteins conserved in fungi. Their function is not known.

    \ ' '10194' 'IPR019450' '\

    This entry represents the very short (20 residues) conserved C-terminal domain of a family of nicotinamide mononucleotide adenylyltransferase proteins. The function of this domain is unknown. In most proteins it is associated with which is found at the N-terminus.

    \ ' '10195' 'IPR019451' '\

    This is a conserved region of approximately 400 residues which is found only in vertebrates. It is associated with HEAT domains () in all members. The function is not known.

    \ ' '10196' 'IPR018831' '\

    Found predominantly in Vibrio and cyanobacterial species (also some fungi), this entry represents a conserved region of approximately 150 residues found in a family of proteins of unknown function. There is a characteristic NKWYS sequence motif.

    \ ' '10197' 'IPR018832' '\

    Gingipains R and K are endopeptidases with specificity for arginyl and lysyl bonds, respectively. Like other cysteine peptidases, they require reducing conditions for activity. They are maximally active at approximately neutral pH. Gingipains R and K are secreted by the bacterium Porphyromonas gingivalis (Bacteroides gingivalis). The bacterium is a major pathogen in periodontal disease, and the many ways in which the activities of the gingipains may contribute to the disease processes have been reviewed PUBMED:10064139. These enzymes are also involved in the hemagglutinating activity of the organisms.

    \ \ \

    This entry represents a central region found in gingipain K peptidases, active on lysyl bonds; they belong to the MEROPS peptidase family C25 (gingipain family, clan CD).

    \

    \ ' '10198' 'IPR019452' '\

    This entry represents a domain found in the vacuolar sorting protein Vps39 and transforming growth factor beta receptor-associated protein Trap1. Vps39, a component of the C-Vps complex, is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole PUBMED:9111041, PUBMED:1493335. In Saccharomyces cerevisiae (Baker\'s yeast), Vps39 has been shown to stimulate nucleotide exchange PUBMED:11062257. Trap1 plays a role in the TGF-beta/activin signaling pathway. It associates with inactive heteromeric TGF-beta and activin receptor complexes, mainly through the type II receptor, and is released upon activation of signaling PUBMED:9545258, PUBMED:11278302. The precise function of this domain has not been characterised.

    \ ' '10199' 'IPR019453' '\

    This entry represents a domain found in the vacuolar sorting protein Vps39 and transforming growth factor beta receptor-associated protein Trap1. Vps39, a component of the C-Vps complex, is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole PUBMED:9111041, PUBMED:1493335. In Saccharomyces cerevisiae (Baker\'s yeast), Vps39 has been shown to stimulate nucleotide exchange PUBMED:11062257. Trap1 plays a role in the TGF-beta/activin signaling pathway. It associates with inactive heteromeric TGF-beta and activin receptor complexes, mainly through the type II receptor, and is released upon activation of signaling PUBMED:9545258, PUBMED:11278302. The precise function of this domain has not been characterised. In Vps39 this domain is involved in localisation and in mediating the interactions with Vps11 PUBMED:11062257.

    \ ' '10200' 'IPR019454' '\

    The YkyA family of proteins contain a lipoprotein signal and a hydrolase domain. They are similar to cell wall binding proteins and might also be recognisable by a host immune defence system. It is thus likely that they function in pathways important for pathogenicity PUBMED:16684363.

    \ ' '10201' 'IPR019455' '\

    This entry represents the C-terminal half of the small subunit of acetolactate synthase. Acetolactate synthase is a tetrameric enzyme, composed of two large and two small subunits, which catalyses the first step in branched-chain amino acid biosynthesis. This reaction is sensitive to certain herbicides PUBMED:9197540.

    \ ' '10202' 'IPR018833' '\

    This entry represents the N-terminal 50 amino acids of a group of bacterial proteins often annotated as fumarylacetoacetate hydrolase-containing enzymes. In most cases these proteins also contain , which is found towards the C terminus.

    \ ' '10203' 'IPR019456' '\

    EKR is a short, 33 residue, domain found in bacterial and some lower eukaryotic species which lies between a POR (pyruvate ferredoxin/flavodoxin oxidoreductase) domain () and the 4Fe-4S binding domain (). It contains a characteristic EKR sequence motif. The exact function of this domain is not known.

    \ ' '10204' 'IPR019457' '\

    This entry is found at the N terminus of a family of putative membrane-spanning bacterial proteins. These proteins often contain towards the C terminus.

    \ ' '10205' 'IPR018834' '\

    Est1 is a protein which recruits or activates telomerase at the site of polymerisation PUBMED:12169735, PUBMED:12454059. This is the DNA/RNA binding domain of EST1 PUBMED:12676088.

    \ ' '10206' 'IPR019458' '\

    Est1 is directly involved in telomere replication. It associates with telomerase and, during its interaction with CDC13, telomerase activity is promoted PUBMED:12169735, PUBMED:12454059.

    \ ' '10207' 'IPR019459' '\

    The GRIP-related Arf-binding (GRAB) domain is located towards the C terminus of Rud3 type proteins. It is related to the GRIP domain, but the conserved tyrosine residue found at position 4 in all GRIP domains is replaced by a leucine residue. The small GTPase Arf is localised to the cis-Golgi where it recruits proteins via their GRAB domain, as part of the transport of cargo from the endoplasmic reticulum to the plasma membrane PUBMED:16338975.

    \ ' '10208' 'IPR018468' '\

    Mei5 is one of a pair of meiosis-specific proteins which facilitate the loading of Dmc1 on to Rad51 on DNA at double-strand breaks during recombination. Recombination is carried out by a large protein complex based around the two RecA homologues, Rad51 and Dmc1 PUBMED:15620352. This complex may play both a catalytic and a structural role in the interaction between homologous chromosomes during meiosis. Mei5 is seen to contain a coiled-coil region.

    \ ' '10209' 'IPR019460' '\

    This entry is found in a family of proteins involved in telomere maintenance. In Schizosaccharomyces pombe (Fission yeast) this protein is called Taf1 (taz1 interacting factor) and is part of the telomere cap complex. In Saccharomyces cerevisiae (Baker\'s yeast) this protein is called ATG11 and is known to be involved in vacuolar targeting and peroxisome degradation PUBMED:11309418, PUBMED:15659643.

    \ ' '10210' 'IPR018835' '\

    This entry represents a putative RNA-binding domain found only in fungi. It occurs in proteins annotated as Nrd1, which are known to carry RNA recognition motif (RRM) domains. It is not homologous with any of the other RRM domains, e.g. .

    \ ' '10211' 'IPR018836' '\

    This entry represents a family of virulence proteins that are found in pathogenic Streptomyces species.

    \ ' '10212' 'IPR018837' '\

    CRF1 is a transcription factor that co-represses ribosomal genes with FHL1 via the TOR signalling pathway and protein kinase A PUBMED:15620355.

    \ ' '10213' 'IPR019461' '\

    Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The small C-terminal domain is likely to be a distinct binding region for the stability of the autophagosome complex PUBMED:12672448. It carries a highly characteristic conserved FLKF sequence motif.

    \ ' '10214' 'IPR018838' '\

    Proteins in this family have been implicated in telomere maintenance in Saccharomyces cerevisiae (Baker\'s yeast) PUBMED:15161972 and in meiotic chromosome segregation in Schizosaccharomyces pombe (Fission yeast) PUBMED:16169489.

    \ ' '10215' 'IPR018839' '\

    Clr2 is a chromatin silencing protein, one of a quartet of proteins forming the core of SHREC, a multienzyme effector complex that mediates hetero-chromatic transcriptional gene silencing in fission yeast PUBMED:17289569. Clr2 does not have any obvious well-conserved domains but, along with the other core proteins, binds to the histone deacetylase Clr3, and on its own might also have a role in chromatin organisation at the cnt domain, the site of kinetochore assembly.

    \ ' '10216' 'IPR018465' '\

    The centromere protein Scm3 is a non-histone component of centromeric chromatin that binds to CenH3-H4 histones, which are required for kinetochore assembly. Scm3 is required for Cse4 localisation and is required for its centromeric association PUBMED:17574026, PUBMED:17569568. The histone H3 variant Cse4 replaces conventional histone H3 in centromeric chromatin and helps direct the assembly of the kinetochore. In addition, Scm3 has been shown in Saccharomyces cerevisiae (Baker\'s yeast) to be required for G2/M progression PUBMED:17548816. Scm3 is required to maintain kinetochore function throughout the cell cycle. Scm3 contains a nuclear export signal (NES). The N-terminal region of Scm3 is well conserved and functions as the CenH3-interacting domain, while the C-terminal region is variable in size and sometimes consists of DNA binding motifs PUBMED:17704645.

    \ ' '10217' 'IPR019462' '\

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric\ enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme PUBMED:3052291. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length PUBMED:10499798. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    \ \

    RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5\' to 3\' direction, is known as the primary transcript.\ \ Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:\ \

    \ \ Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses\ vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    \

    RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared with three in eukaryotes (not including mitochondrial or chloroplast polymerases). This entry represents a domain in prokaryotic polymerases that spans the gap between domains 4 and 5 of the protein. It is also known as the external 1 region of the polymerase and is bound in association with the external 2 region PUBMED:11313498.

    \ ' '10218' 'IPR018840' '\

    This is a family of highly conserved predicted proteins primarily found in Bacillus species. Structurally they form homo-dimers, but their function is unknown.

    \ ' '10219' 'IPR018841' '\

    Several proteins in this entry are annotated as being putative molybdopterin-guanine dinucleotide biosynthesis proteins, but this has not been confirmed. The function of these proteins is therefore not currently known.

    \ ' '10220' 'IPR018842' '\

    In most cases this highly conserved region of the YkuI protein lies immediately downstream of the EAL (diguanylate cyclase/phosphodiesterase) domain so that together they form a monomer which dimerises for its enzymatic action. This region contains three alpha helices and five beta strands and forms the C-terminal half of the structure.

    \ ' '10221' 'IPR019463' '\

    CoatB is the major coat protein of the Inoviruses (filamentous bacteriophage). It is an alpha helix of approximately 50 residues composed of three main sections: an acidic N-terminal region, a central hydrophobic region, and a basic C terminus. Despite differences in primary structure between different strains, all members assemble into a complex of 35 monomers in a Catherine-wheel like formation PUBMED:2078529.

    \ ' '10222' 'IPR019464' '\

    ELL is a family of RNA polymerase II elongation factors. It is bound stably to elongation-associated factors 1 and 2, EAFs, and together these act as a strong regulator of transcription activity by direct interaction with Pol II. ELL binds to Pol II on its own but the affinity is greatly increased by the cooperation of EAF PUBMED:17150956. Some members carry an occludin domain () just downstream. There is no Saccharomyces cerevisiae (Baker\'s yeast) member.

    \ ' '10223' 'IPR018944' '\

    DNA polymerases catalyse the addition of dNMPs onto the 3-prime ends of DNA chains. There is a general polymerase fold consisting of three subdomains that have been likened to the fingers, palm, and thumb of a right hand. This entry represents the central three-helical region of DNA polymerase lambda referred to as the F and G helices of the fingers domain. Contacts with DNA involve this conserved helix-hairpin-helix motif in the fingers region which interacts with the primer strand. This motif is common to several DNA binding proteins and confers a sequence-independent interaction with the DNA backbone PUBMED:14992725.

    \ ' '10224' 'IPR019465' '\

    The conserved oligomeric Golgi (COG) complex is a peripheral membrane complex involved in intra-Golgi protein trafficking. Subunit 5 is located in the smaller, B lobe, together with subunits 6-8, and has been shown to bind subunits 1 and 7 PUBMED:15932880.

    \ ' '10225' 'IPR019466' '\

    This entry represents a short domain found in the matrilin (cartilage matrix) proteins. It forms a coiled coil structure and contains a single cysteine residue at its start which is likely to form a disulphide bridge with a corresponding cysteine in an upstream EGF domain (), thereby spanning the VWA domain of the protein (). This domain is likely to be responsible for protein trimerisation PUBMED:9287130.

    \ ' '10226' 'IPR019467' '\

    This entry represents the N-terminal half of the structure of histone acetyl transferase HAT1. It is often found in association with the C-terminal part of . It seems to be motifs C and D of the structure. Histone acetyltransferases (HATs) catalyse the transfer of an acetyl group from acetyl-CoA to the lysine epsilon-amino groups on the N-terminal tails of histones. HATs are involved in transcription since histones tend to be hyper-acetylated in actively transcribed regions of chromatin, whereas in transcriptionally silent regions histones are hypo-acetylated PUBMED:9175471.

    \ ' '10227' 'IPR018843' '\

    Utp8 is an essential component of the nuclear tRNA export machinery in Saccharomyces cerevisiae (Baker\'s yeast). It is a tRNA binding protein that acts at a step between tRNA maturation/aminoacylation and translocation of the tRNA across the nuclear pore complex PUBMED:17634288.

    \ ' '10228' 'IPR018948' '\

    This family represents the shorter, B, chain of the homo-dimeric structure which is a guanine nucleotide-binding protein that binds and hydrolyses GTP. TrmE is homologous to the tetrahydrofolate-binding domain of N,N-dimethylglycine oxidase and indeed binds formyl-tetrahydrofolate. TrmE actively participates in the formylation reaction of uridine and regulates the ensuing hydrogenation reaction of a Schiff\'s base intermediate. This B chain is the N-terminal portion of the protein consisting of five beta-strands and three alpha helices and is necessary for mediating dimer formation within the protein PUBMED:15616586.

    \ ' '10229' 'IPR019468' '\

    Adenylosuccinate lyase catalyses two steps in the synthesis of purine nucleotides: the conversion of succinylaminoimidazole-carboxamide ribotide into aminoimidazole-carboxamide ribotide (the fifth step of de novo IMP biosynthesis); the formation of adenosine monophosphate (AMP) from adenylosuccinate (the final step in the synthesis of AMP from IMP) PUBMED:17485188. This entry represents the C-terminal, seven alpha-helical, domain of adenylosuccinate lyase PUBMED:9274883.

    \ ' '10230' 'IPR019469' '\

    This entry represents a small group of highly conserved proteins from bacteria, in particular Helicobacter species. The structure is a bundle of alpha helices. The function is not known.

    \ ' '10231' 'IPR019470' '\

    This entry represents the TAT-signal region found in the iron-sulphur subunit of Ubiquinol-cytochrome C reductase (also known as the cytochrome bc1 complex). This enzyme is an oligomeric membrane protein complex that is a component of respiratory and photosynthetic electron transfer chains. It couples the transfer of electrons from ubiquinol to cytochrome c with the generation of a proton gradient across the membrane PUBMED:10873857. This entry is associated with , and .

    \ ' '10232' 'IPR018309' '\

    Phenolic acids, also called substituted hydroxycinnamic acids, are abundant in the plant kingdom because they are involved in the structure of plant cell walls and are present in some vacuoles. In plant-soil ecosystems they are released as free acids by hemicellulases produced by several fungi and bacteria. Of these weak acids, the most abundant are p-coumaric, ferulic, and caffeic acids, considered to be natural toxins that inhibit the growth of microorganisms, especially at low pHs. In spite of this chemical stress, some bacteria can use phenolic acids as a sole source of carbon. For other microorganisms, these compounds induce a specific response by which the organism adapts to its environment. The ubiquitous lactic acid bacterium Lactobacillus plantarum exhibits an inducible phenolic acid decarboxylase (PAD) activity which converts these substrates into less-toxic vinyl phenol derivatives. PadR acts as a repressor of padA gene expression in the phenolic acid stress response PUBMED:15066807. This entry represents the C-terminal domain.

    \ ' '10233' 'IPR019471' '\

    This is the interferon-regulatory factor 3 chain of the hetero-dimeric structure which also contains the shorter chain CREB-binding protein. These two subunits make up the DRAF1 (double-stranded RNA-activated factor 1). Viral dsRNA produced during viral transcription or replication leads to the activation of DRAF1. The DNA-binding specificity of DRAF1 correlates with transcriptional induction of ISG (interferon-alpha, beta-stimulated gene). IRF-3 pre-exists in the cytoplasm of uninfected cells and translocates to the nucleus following viral infection. Translocation of IRF-3 is accompanied by an increase in serine and threonine phosphorylation, and association with the CREB coactivator occurs only after infection.

    \ ' '10235' 'IPR018326' '\

    Mutations in the nucleotide excision repair (NER) pathway can cause the xeroderma pigmentosum skin cancer predisposition syndrome. NER lesions are limited to one DNA strand, but otherwise they are chemically and structurally diverse, being caused by a wide variety of genotoxic chemicals and ultraviolet radiation. The xeroderma pigmentosum C (XPC) protein has a central role in initiating global-genome NER by recognising the lesion and recruiting downstream factors.

    \ \

    In NER in eukaryotes, DNA is incised on both sides of the lesion, resulting in the removal of a fragment ~25-30 nucleotides long. This is followed by repair synthesis and ligation. This reaction, in yeast, requires the damage binding factors Rad14, RPA, and the Rad4-Rad23 complex, the transcription factor TFIIH which contains the two DNA helicases Rad3 and Rad25, essential for creating a bubble structure, and the two endonucleases, the Rad1-Rad10 complex and Rad2, which incise the damaged DNA strand on the 5\'- and 3\'-side of the lesion, respectively PUBMED:10915862.

    \ \

    The crystal structure of the yeast XPC orthologue Rad4 bound to DNA containing a cyclobutane pyrimidine dimer lesion has been determined. The structure shows that Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. The expelled nucleotides of the undamaged strand are recognised by Rad4, whereas the two cyclobutane pyrimidine dimer-linked nucleotides become disordered. This indicates that the lesions recognised by Rad4/XPC thermodynamically destabilise the double helix in a manner that facilitates the flipping-out of two base pairs PUBMED:17882165.

    \ \

    Homologues of all the above mentioned yeast genes, except for RAD7, RAD16, and MMS19, have been identified in humans, and mutations in these human genes affect NER in a similar fashion as they do in yeast, with the exception of XPC, the human counterpart of yeast RAD4. Deletion of RAD4 causes the same high level of UV sensitivity as do mutations in the other class 1 genes, and rad4 mutants are completely defective in incision. By contrast, XPC is required for the repair of nontranscribed regions of the genome but not for the repair of the transcribed DNA strand.

    \ \

    This entry represents the DNA-binding domain of Rad4, which has a beta-hairpin structure PUBMED:17882165. Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix.

    \ ' '10236' 'IPR018327' '\

    Mutations in the nucleotide excision repair (NER) pathway can cause the xeroderma pigmentosum skin cancer predisposition syndrome. NER lesions are limited to one DNA strand, but otherwise they are chemically and structurally diverse, being caused by a wide variety of genotoxic chemicals and ultraviolet radiation. The xeroderma pigmentosum C (XPC) protein has a central role in initiating global-genome NER by recognising the lesion and recruiting downstream factors.

    \ \

    In NER in eukaryotes, DNA is incised on both sides of the lesion, resulting in the removal of a fragment ~25-30 nucleotides long. This is followed by repair synthesis and ligation. This reaction, in yeast, requires the damage binding factors Rad14, RPA, and the Rad4-Rad23 complex, the transcription factor TFIIH which contains the two DNA helicases Rad3 and Rad25, essential for creating a bubble structure, and the two endonucleases, the Rad1-Rad10 complex and Rad2, which incise the damaged DNA strand on the 5\'- and 3\'-side of the lesion, respectively PUBMED:10915862.

    \ \

    The crystal structure of the yeast XPC orthologue Rad4 bound to DNA containing a cyclobutane pyrimidine dimer lesion has been determined. The structure shows that Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. The expelled nucleotides of the undamaged strand are recognised by Rad4, whereas the two cyclobutane pyrimidine dimer-linked nucleotides become disordered. This indicates that the lesions recognised by Rad4/XPC thermodynamically destabilise the double helix in a manner that facilitates the flipping-out of two base pairs PUBMED:17882165.

    \ \

    Homologues of all the above mentioned yeast genes, except for RAD7, RAD16, and MMS19, have been identified in humans, and mutations in these human genes affect NER in a similar fashion as they do in yeast, with the exception of XPC, the human counterpart of yeast RAD4. Deletion of RAD4 causes the same high level of UV sensitivity as do mutations in the other class 1 genes, and rad4 mutants are completely defective in incision. By contrast, XPC is required for the repair of nontranscribed regions of the genome but not for the repair of the transcribed DNA strand.

    \ \

    This entry represents the DNA-binding domain of Rad4, which has a beta-hairpin structure PUBMED:17882165. Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix.

    \ ' '10237' 'IPR018328' '\

    Mutations in the nucleotide excision repair (NER) pathway can cause the xeroderma pigmentosum skin cancer predisposition syndrome. NER lesions are limited to one DNA strand, but otherwise they are chemically and structurally diverse, being caused by a wide variety of genotoxic chemicals and ultraviolet radiation. The xeroderma pigmentosum C (XPC) protein has a central role in initiating global-genome NER by recognising the lesion and recruiting downstream factors.

    \ \

    In NER in eukaryotes, DNA is incised on both sides of the lesion, resulting in the removal of a fragment ~25-30 nucleotides long. This is followed by repair synthesis and ligation. This reaction, in yeast, requires the damage binding factors Rad14, RPA, and the Rad4-Rad23 complex, the transcription factor TFIIH which contains the two DNA helicases Rad3 and Rad25, essential for creating a bubble structure, and the two endonucleases, the Rad1-Rad10 complex and Rad2, which incise the damaged DNA strand on the 5\'- and 3\'-side of the lesion, respectively PUBMED:10915862.

    \ \

    The crystal structure of the yeast XPC orthologue Rad4 bound to DNA containing a cyclobutane pyrimidine dimer lesion has been determined. The structure shows that Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. The expelled nucleotides of the undamaged strand are recognised by Rad4, whereas the two cyclobutane pyrimidine dimer-linked nucleotides become disordered. This indicates that the lesions recognised by Rad4/XPC thermodynamically destabilise the double helix in a manner that facilitates the flipping-out of two base pairs PUBMED:17882165.

    \ \

    Homologues of all the above mentioned yeast genes, except for RAD7, RAD16, and MMS19, have been identified in humans, and mutations in these human genes affect NER in a similar fashion as they do in yeast, with the exception of XPC, the human counterpart of yeast RAD4. Deletion of RAD4 causes the same high level of UV sensitivity as do mutations in the other class 1 genes, and rad4 mutants are completely defective in incision. By contrast, XPC is required for the repair of nontranscribed regions of the genome but not for the repair of the transcribed DNA strand.

    \ \

    This entry represents the DNA-binding domain of Rad4, which has a beta-hairpin structure PUBMED:17882165. Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix.

    \ ' '10238' 'IPR019473' '\

    This entry represents the C-terminal region of subunit 8 (also known as TAF8) of the transcription factor TFIID PUBMED:14580349. The adjacent N-terminal region generally contains a histone fold domain (). This subunit is one of the key subunits of TFIID, being one of several general cofactors which are typically involved in gene activation to bring about the communication between gene-specific transcription factors and components of the general transcription machinery PUBMED:16858867.

    \ ' '10239' 'IPR018844' '\

    Cytokinesis in yeasts involves a family of proteins whose essential function is to bind Cdc14-family phosphatase and prevent this from being sequestered and inhibited in the nucleolus. This is the highly conserved N terminus of a family of proteins which act as cytokinesis checkpoint controls by allowing cells to cope with cytokinesis defects. These proteins are required for rDNA silencing and mini-chromosome maintenance PUBMED:17538026.

    \ ' '10240' 'IPR019474' '\

    This entry represents the most conserved part of the core region of ubiquitin conjugation factor E4 (or Ub elongating factor, or Ufd2P), running from helix alpha-11 to alpha-38. It consists of 31 helices of variable length connected by loops of variable size forming a compact unit; the helical packing pattern of the compact unit consists of five structural repeats that resemble tandem Armadillo (ARM) repeats. This domain is involved in ubiquitination as it binds Cdc48p and escorts ubiquitinated proteins from Cdc48p to the proteasome for degradation. The core is structurally similar to the nuclear transporter protein importin-alpha. The core is associated with the U-box at the C terminus, (), which has ligase activity.

    \

    Ubiquitin conjugation factor E4 is involved in the N-terminal ubiquitin fusion degradation proteolytic pathway (UFD pathway). E4 binds to the ubiquitin moieties of preformed conjugates and catalyses ubiquitin chain assembly in conjunction with E1, E2, and E3. E4 appears to influence the formation and topology of the multi-Ub chain as it enhances ubiquitination at \'Lys-48\' but not at \'Lys-29\' of the N-terminal Ub moiety.

    \ ' '10241' 'IPR014020' '\

    Tensins constitute a eukaryotic family of lipid phosphatases that are defined by the presence of two adjacent domains: a lipid phosphatase domain and a C2-like domain. The tensin-type C2 domain has a structure similar to the classical C2 domain (see ) that mediates the Ca2+-dependent membrane recruitment of several signalling proteins. However, the tensin-type C2 domain lacks two of the three conserved loops that bind Ca2+, and in this respect it is similar to the C2 domains of PKC-type PUBMED:11395408, PUBMED:11858936. The tensin-type C2 domain can bind phospholipid membranes in a Ca2+-independent manner PUBMED:10555148. In the tumour suppressor protein PTEN, the best characterised member of the family, the lipid phosphatase domain was shown to specifically dephosphorylate the D3 position of the inositol ring of the lipid second messenger, phosphatidylinositol-3,4,5-trisphosphate (PIP3). The lipid phosphatase domain contains the signature motif HCXXGXXR present in the active sites of protein tyrosine phosphatases (PTPs) and dual specificity phosphatases (DSPs). Furthermore, two invariant lysines are found only in the tensin-type phosphatase motif (HCKXGKXR) and are suspected to interact with the phosphate groups at positions D1 and D5 of the inositol ring PUBMED:11395408, PUBMED:10555148.

    \ \

    The C2 domain is found at the C-terminus of the tumour suppressor protein PTEN (phosphatidyl-inositol triphosphate phosphatase). This domain may include a CBR3 loop, indicating a central role in membrane binding. This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc suggesting that the C2 domain productively positions the catalytic part of the protein on the membrane. The crystal structure of the PTEN tumour suppressor has been solved PUBMED:10555148. The lipid phosphatase domain has a structure similar to the dual specificity phosphatase (see ). However, PTEN has a larger active site pocket that could be important to accommodate PI(3,4,5)P3.

    \ \

    Proteins known to contain a phosphatase and a C2 tensin-type domain are listed below:

    \ \ \ \ \ ' '10242' 'IPR019475' '\

    This entry represents the C-terminal region three-helical domain of DNA primase PUBMED:10741967. Primases synthesise short RNA strands on single-stranded DNA templates, thereby generating the hybrid duplexes required for the initiation of synthesis by DNA polymerases. Primases are recruited to single-stranded DNA by helicases - this domain binds DnaB-helicase PUBMED:10873470. It is associated with , which is the central catalytic core.

    \ ' '10243' 'IPR018950' '\

    This is the N-terminal domain of the disulphide bond isomerase DsbC. The whole molecule is V-shaped, where each arm is a DsbC monomer of two domains linked by a hinge; and the N-termini of each monomer join to form the dimer interface at the base of the V, so are vital for dimerisation PUBMED:10700276. DsbC is required for disulphide bond formation and functions as a disulphide bond isomerase during oxidative protein-folding in bacterial periplasm. It also has chaperone activity PUBMED:16087673.

    \ ' '10244' 'IPR019476' '\

    The plasmid conjugative coupling protein TraD (also known as TrwB) is a basic integral inner-membrane nucleoside-triphosphate-binding protein. It is the structural prototype for the type IV secretion system coupling proteins, a family of proteins essential for macromolecular transport between cells PUBMED:11748238. This protein forms hexamers from six structurally very similar protomers PUBMED:11214325. This hexamer contains a central channel running from the cytosolic pole (formed by the all-alpha domains) to the membrane pole ending at the transmembrane pore shaped by 12 transmembrane helices, rendering an overall mushroom-like structure. The TrwB all-alpha domain appears to be the DNA-binding domain of the structure.

    \ ' '10245' 'IPR019477' '\

    Rhodopsin is the archetypal G-protein-coupled receptor. Such receptors participate in virtually all physiological processes as signalling molecules. They utilise heterotrimeric guanosine triphosphate (GTP)-binding proteins to transduce extracellular signals to intracellular events. Rhodopsin is important because of the pivotal role it plays in visual signal transduction. It is a dimeric transmembrane protein whose intradiskal surface consists of an N-terminal domain and three loops connecting six of the seven transmembrane helices. The N-terminal domain is a compact alpha-helical region with breaks and bends at proline residues outside the membrane PUBMED:10888202. This entry represents the N-terminal domain, while the transmembrane region is represented by (). The N-terminal domain is extracellular and is necessary for successful dimerisation and molecular stability PUBMED:16567090.

    \ ' '10246' 'IPR019478' '\

    Bacterial sulphur metabolism depends on the iron-containing porphinoid sirohaem. CysG is a multi-functional enzyme with S-adenosyl-L-methionine (SAM)-dependent bismethyltransferase, dehydrogenase and ferrochelatase activities. CysG synthesizes sirohaem from uroporphyrinogen III via reactions which encompass two branchpoint intermediates in tetrapyrrole biosynthesis, diverting flux first from protoporphyrin IX biosynthesis and then from cobalamin (vitamin B12) biosynthesis. CysG is a dimer. Its dimerisation region is 74 residues long, and acts to hold the two structurally similar protomers together asymmetrically through a number of salt-bridges across complementary residues within the dimerisation region PUBMED:14595395. CysG dimerisation produces a series of active sites, accounting for CysG\'s multi-functionality, catalysing four diverse reactions:

    \ \

    \ \ \ ' '10247' 'IPR018951' '\

    Fumarase C catalyses the stereo-specific interconversion of fumarate to L-malate as part of the Krebs cycle. The full-length protein forms a tetramer with visible globular shape. FumaraseC_C is the C-terminal 65 residues referred to as domain 3. The core of the molecule consists of a bundle of 20 alpha-helices from the five-helix bundle of domain 2. The projections from the core of the tetramer are generated from domains 1 and 3 of each subunit PUBMED:8909293. This entry does not appear to be part of either the active site or the activation site but is helical in structure forming a little bundle.

    \ ' '10248' 'IPR018845' '\

    In Trichomonas vaginalis, thought to be the earliest extant eukaryote, the sole initiator element for control of the start of transcription is Inr, and this is recognised by the initiator binding protein IBP39. IBP39 contains an N-terminal Inr binding domain (IBD, represented by this entry) connected via a flexible, proteolytically sensitive, linker (residues 127-145) to a C-terminal domain. The IBD structure reveals a winged-helix-wing conformation with each element binding to DNA, the central helix-turn-helix contributing the majority of the specificity-determining contacts with the Inr core motif TCAPy(T/A). The binding of IBP39 to the Inr directly recruits RNA polymerase II and in this way initiates transcription PUBMED:14622596.

    \ ' '10249' 'IPR019479' '\

    This entry represents the C-terminal domain of 1-Cys peroxiredoxin, a member of the peroxiredoxin superfamily which protects cells against membrane oxidation through glutathione (GSH)-dependent reduction of phospholipid hydroperoxides to corresponding alcohols PUBMED:9587003. The C-terminal domain is crucial for providing the extra cysteine necessary for dimerisation of the whole molecule. Loss of the enzyme\'s peroxidase activity is associated with oxidation of the catalytic cysteine found upstream of this domain. Glutathionylation, presumably through its disruption of protein structure, facilitates access for GSH, resulting in spontaneous reduction of the mixed disulphide to the sulphydryl and consequent activation of the enzyme PUBMED:15004285. The domain is associated with , which carries the catalytic cysteine.

    \ ' '10250' 'IPR019480' '\

    Lactococcus lactis is one of the few organisms with two dihydroorotate dehydrogenases (DHODs) A and B PUBMED:11188687. The B enzyme is typical of DHODs in Gram-positive bacteria that use NAD+ as the second substrate. DHODB is a heterotetramer composed of a central homodimer of PyrDB subunits resembling the DHODA structure and two PyrK subunits along with three different cofactors: FMN, FAD, and a [2Fe-2S] cluster. The [2Fe-2S] iron-sulphur cluster binds to this C-terminal domain of the PyrK subunit, which is at the interface between the flavin and NAD binding domains and contains three beta-strands. The four cysteine residues at the N-terminal part of this domain are the ones that bind, in pairs, to the iron-sulphur cluster. The conformation of the whole molecule means that the iron-sulphur cluster is localized in a well-ordered part of this domain close to the FAD binding site PUBMED:11188687. The FAD and NAD binding domains are and respectively.

    \ ' '10251' 'IPR019481' '\

    This entry represents a conserved region found in a family of proteins that function as subunits of transcription factor IIIC (TFIIIC) PUBMED:17409385. TFIIIC in yeast and humans is required for transcription of tRNA and 5S RNA genes by RNA polymerase III. The yeast proteins in this entry are fused to a phosphoglycerate mutase domain.

    \ ' '10252' 'IPR019482' '\

    This entry represents the central domain (domain 2) of the beta subunit (also known as subunit p40) of interleukin-12 (IL-12). This domain is largely beta-stranded with a fibronectin III-like structure. IL-12 is produced on stimulation by macrophage-engulfed micro-organisms and other stimuli, when it dimerises with IL-12 subunit alpha (subunit p35) to form a heterodimer which then binds to receptors on natural killer cells to activate them to destroy the micro-organisms PUBMED:10899108. This domain contains two disulphide bridges, one of which serves to bind the beta subunit to the alpha subunit, and the other to hold the beta strands within the domain together. The cupped shape of the alpha subunit binding interface matches the elbow-like bend between domains 2 and 3 in the beta subunit PUBMED:15661020. Domain 3 also has a fibronectin III-like structure.

    \ ' '10253' 'IPR018952' '\

    This is the largely alpha-helical, C-terminal half of 2\'-5\'-oligoadenylate synthetase 1, being described as domain 2 of the enzyme and homologous to a tandem ubiquitin repeat. It carries the region of enzymic activity between residues 320 and 344 at the extreme C-terminal end PUBMED:14636576. Oligoadenylate synthetases are antiviral enzymes that counteract viral attack by degrading viral RNA. The enzyme uses ATP in 2\'-specific nucleotidyl transfer reactions to synthesise 2\'-5\'-oligoadenylates, which activate latent ribonuclease, resulting in degradation of viral RNA and inhibition of virus replication PUBMED:1651324. This domain is often associated with .

    \ ' '10254' 'IPR018479' '\

    Monopolin is a protein complex, originally identified in Saccharomyces cerevisiae (Baker\'s yeast), that is required for the segregation of homologous centromeres to opposite poles of a dividing cell during meiosis I PUBMED:11163190. The orthologous complex in Schizosaccharomyces pombe (Fission yeast) is not required for meiosis I chromosome segregation, but is proposed to play a similar physiological role in clamping microtubule binding sites PUBMED:17627824. In S. cerevisiae this subunit is called LRS4, and in S. pombe it is known as Mde4 PUBMED:12689592.

    \ ' '10255' 'IPR018953' '\

    This is the N-terminal domain of bacterial AMP nucleoside phosphorylase (AMNp). The N- and C-termini form distinct domains which intertwine with each other to form a stable monomer which associates with five other monomers to yield the active hexamer. The N terminus consists of a long helix and a four-stranded sheet with a novel topology. The C terminus binds the nucleoside whereas the N terminus acts as the enzymatic regulatory domain. AMNp () catalyses the hydrolysis of AMP to form adenine and ribose 5-phosphate, thereby regulating intracellular AMP levels PUBMED:15296732.

    \ ' '10256' 'IPR019483' '\

    This entry represents the C-terminal domain of the E subunit (RFC-E) of the DNA polymerase III clamp-loader complex, one of the five RFC proteins of the clamp loader complex (replication factor-C, RFC) which binds to the DNA sliding clamp (proliferating cell nuclear antigen, PCNA). The five modules of RFC assemble into a right-handed spiral, which results in only three of the five RFC subunits (RFC-A, RFC-B and RFC-C) making contact with PCNA, leaving a wedge-shaped gap between RFC-E and the PCNA clamp-loader complex. The C-terminal domain is vital for the correct orientation of RFC-E with respect to RFC-A PUBMED:15201901.

    \ ' '10257' 'IPR011266' '\

    This entry represents the fibrinogen-binding domain from bacterial proteins such as fibrinogen-binding adhesion SdrG and clumping factor A. In both SdrG and clumping factor A, there are two fibrinogen-binding domains with similar core beta-sandwich topologies, but with different modulations in their structure. This entry represents the second domain, while represents the first domain.

    \ \

    Gram-positive pathogens, such as Staphylococci, Streptococci, and Enterococci, contain multiple cell wall-anchored proteins. Some of these proteins act as adhesins and mediate bacterial attachment to host tissues through lock-and-key interactions with host ligands, such as fibrinogen, a glycoprotein found in blood plasma that plays a key role in haemostasis and coagulation. For pathogenic bacteria that do not invade host cells, extracellular matrix proteins are preferred targets for bacterial adhesion; adhesins mediating these interactions have been termed MSCRAMMs (microbial surface components recognizing adhesive matrix molecules). A common binding domain organization found within MSCRAMMs suggests a common ancestry. Both fibrinogen-binding adhesion SdrG and clumping factor A are MSCRAMMs.

    \

    Fibrinogen-binding adhesion SdrG is a cell wall-anchored adhesin found in the Gram-positive pathogen Staphylococcus epidermidis that binds to the B-beta chain of human fibrinogen PUBMED:14567919. SdrG allows attachment of the bacterium to host tissues via specific binding to the beta-chain of human fibrinogen (Fg). SdrG binds to its ligand with a dynamic "dock, lock, and latch" mechanism which represents a general mode of ligand-binding for structurally related cell wall-anchored proteins in most Gram-positive bacteria. The C-terminal part of SdrG(276-596) is integral to the folding of the immunoglobulin-like whole to create the docking grooves necessary for Fg binding PUBMED:14567919. Clumping factor A performs a similar function in Staphylococcus aureus by binding the gamma chain of fibrinogen PUBMED:12485987.

    \ ' '10258' 'IPR019485' '\

    C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf\'s can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 PUBMED:11361095. C2H2 Znf\'s are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes PUBMED:10664601. Transcription factors usually contain several Znf\'s (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA PUBMED:10940247. C2H2 Znf\'s can also bind to RNA and protein targets PUBMED:18253864.

    \

    Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates PUBMED:10529348, PUBMED:15963892, PUBMED:15718139, PUBMED:17210253, PUBMED:12665246. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few PUBMED:11179890. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    \ \

    This entry represents a C2H2-type zinc-finger domain found in the RAG1 protein. The structure contains the characteristic two-stranded beta-sheet and alpha-helix of a classical zinc-finger. The domain binds one zinc and, in complex with an adjacent RING-type zinc finger domain, helps to stabilise the whole of the dimerisation region of recombination activating protein 1 (RAG1) PUBMED:9228952. The function of the whole is to bind double-stranded DNA.

    \

    During lymphocyte development, the genes encoding immunoglobulins and T-cell receptors are assembled from variable (V), diversity (D), and joining (J) gene segments. This combinatorial process, known as V(D)J recombination, allows the generation of an enormous range of binding specificities from a limited amount of genetic information. The V(D)J recombination-activating proteins 1 and 2 (RAG1 and RAG2) form a complex that initiates this process by binding to the conserved recombination signal sequences (RSS) and introducing a double-strand break between the RSS and the adjacent coding segment. These breaks are generated in two steps, nicking of one strand (hydrolysis), followed by hairpin formation (transesterification). RAG1/2 has also been shown to function as a transposase in vitro, and to possess RSS-independent endonuclease activity (end processing) and hairpin opening. RAG1 alone can bind to RSS but stable, efficient binding requires RAG2. All known catalytic activities require the presence of both proteins. For more information see PUBMED:18066091.

    \ ' '10259' 'IPR019486' '\

    This entry represents a region called the argonaute hook PUBMED:17891150. It has been shown to bind to the Piwi domain of argonaute proteins.

    \ ' '10260' 'IPR019487' '\

    The RAM signalling pathway regulates Ace2p transcription factor activity and cellular morphogenesis in Saccharomyces cerevisiae (Baker\'s yeast), and is thought to be conserved amongst eukaryotes PUBMED:12972564.

    \ \

    This entry is found in one of the components of this pathway, the leucine-rich repeat-containing protein SOG2.

    \ ' '10261' 'IPR019488' '\

    Mtr2 is a monomeric, dual-action, RNA-shuttle protein found in yeasts. Transport across the nuclear-cytoplasmic membrane is via the macro-molecular membrane-spanning nuclear pore complex, NPC. The pore is lined by a subset of NPC members called nucleoporins that present FG (Phe-Gly) receptors, characteristically GLFG and FXFG motifs, for shuttling RNAs and proteins. RNA cargo is bound to soluble transport proteins (nuclear export factors) such as Mex67 in yeasts, and TAP in metazoa, which pass along the pore by binding to successive FG receptors. Mtr2 when bound to Mex67 maximises this FG-binding. Mtr2 also acts independently of Mex67 in transporting the large ribosomal RNA subunit through the pore PUBMED:14504280.

    \ ' '10262' 'IPR018941' '\

    Angiogenesis is a physiological process whereby new blood vessels are formed from existing ones. It is essential for tissue repair and regeneration during wound healing but also plays important roles in many pathological processes including tumor growth and metastasis PUBMED:16849318, PUBMED:16732286. Angiogenesis is regulated in part by the receptor protein tyrosine kinase Tie2 and its ligands, the angiopoietins. The angiopoietin-binding site is harboured by the two N-terminal immunoglobulin-like (Ig-like) domains of Tie2 PUBMED:16849318.

    \ \

    The angiopoietin-1 receptor contains the Tie-2 Ig-like domain. This protein is a tyrosine-kinase transmembrane receptor for angiopoietin 1. It probably regulates endothelial cell proliferation and differentiation, and guides the proper patterning of endothelial cells during blood vessel formation.

    \ \

    Tie2 contains not two but three immunoglobulin domains. They fold together with the three epidermal growth factor domains to form a compact, arrowhead-shaped structure PUBMED:16732286.

    \ ' '10263' 'IPR019489' '\

    Most Clp ATPases form complexes with peptidase subunits and are involved in protein degradation, though some, such as ClpB, do not associate with peptidases and are involved in protein disaggregation PUBMED:16879409. This entry represents the C-terminal domain of Clp ATPases, often referred to as the D2-small domain, which forms a mixed alpha-beta structure. Compared with the adjacent AAA D1-small domain it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three-stranded beta-pleated sheet. In Thermus thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighbouring subunit, thereby providing enough binding energy to stabilise the functional assembly PUBMED:14567920.

    \ ' '10264' 'IPR019490' '\

    Phosphoglucose isomerase (PGI) catalyses the interconversion of phosphoglucose and phosphofructose, and is a component of many sugar metabolic pathways. In some archaea and bacteria PGI activity occurs via a bifunctional enzyme that also exhibits phosphomannose isomerase (PMI) activity. Though not closely related to eukaryotic PGIs, the bifunctional enzyme is similar enough that the sequence includes the cluster of threonines and serines that forms the sugar phosphate-binding site in conventional PGI. This entry represents the C-terminal half of the bifunctional PGI/PMI enzyme, which contains many of the active catalytic site residues. The enzyme is thought to use the same catalytic mechanisms for both glucose ring-opening and isomerisation for the interconversion of glucose 6-phosphate to fructose 6-phosphate PUBMED:15252053.

    \ ' '10265' 'IPR018846' '\

    Methyl methanesulphonate-sensitivity protein 1 (MMS1) protects against replication-dependent DNA damage in Saccharomyces cerevisiae (Baker\'s yeast) PUBMED:11810260.

    \ ' '10266' 'IPR018847' '\

    Monopolin is a protein complex, originally identified in Saccharomyces cerevisiae (Baker\'s yeast), that is required for the segregation of homologous centromeres to opposite poles of a dividing cell during meiosis I PUBMED:11163190, PUBMED:12689592. MAM1 is required in S. cerevisiae for monopolar attachment PUBMED:11163190.

    \ ' '10267' 'IPR018954' '\

    This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyses the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets. In Penicillium spp. this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with , which is N-terminal to it, but itself has no metazoan members.

    \ ' '10268' 'IPR018955' '\

    Catabolism and synthesis of leucine, isoleucine and valine are finely balanced, allowing the body to make the most of dietary input but removing excesses to prevent toxic build-up of their corresponding keto-acids. Regulating the activity of the branched-chain alpha-ketoacid dehydrogenase (BCDH) complex is the primary means by which these processes are coordinated. BCDH kinase regulates BCDH by phosphorylation, thereby inactivating it when synthesis is required.

    \ \

    Pyruvate dehydrogenase kinase inhibits the pyruvate dehydrogenase complex by phosphorylation of the E1 alpha subunit, thus contributing to the regulation of glucose metabolism. It is also involved in telomere maintenance.

    \ \ \

    This entry is associated with which is found towards the C-terminus.

    \ ' '10269' 'IPR019491' '\

    This is the C-terminal domain of a bacterial lipoate protein ligase. There is no conservation between this C terminus and that of vertebrate lipoate protein ligase C-termini, but both are associated with , further upstream. This C-terminal domain is more stable than and the hypothesis is that the C-terminal domain has a role in recognising the lipoyl domain and/or transferring the lipoyl group onto it from the lipoyl-AMP intermediate. C-terminal fragments of length 172 to 193 amino acid residues are observed in the eubacterial enzymes whereas in their archaeal counterparts the C-terminal segment is significantly smaller, ranging in size from 87 to 107 amino acid residues.

    \ ' '10270' 'IPR019492' '\

    This domain is at the very C terminus of cyclo-malto-dextrinase proteins and consists of 8 beta strands, is largely globular and appears to help stabilise the active sites created by upstream domains, , and . Cyclo-malto-dextrinases hydrolyse cyclodextrans to maltose and glucose and catalyse trans-glycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules.

    \ ' '10271' 'IPR019493' '\

    Bacteriocins are proteinaceous toxins produced by bacteria to inhibit the growth of similar or closely related strains. This entry represents a family of bacteriocins secreted by Streptococcal species. The precursor protein is proteolytically cleaved at the double-glycine. These proteins do not carry the YGNGVXC motif characteristic of pediocin-like bacteriocins, . The producer bacteria are protected from the effects of their own bacteriocins by production of a specific immunity protein which is co-transcribed with the genes encoding the bacteriocins, e.g. . The bacteriocins are structurally more specific than their immunity-protein counterparts. Typically, production of the bacteriocin gene is from within an operon carrying up to 6 genes including a typical two-component regulatory system (R and H), a small peptide pheromone (C), and a dedicated ABC transporter (A and -B) as well as an immunity protein PUBMED:17074857. The ABC transporter is thought to recognise the N termini of both the pheromone and the bacteriocins and to transport these peptides across the cytoplasmic membrane, concurrent with cleavage at the conserved double-glycine motif. Cleaved extracellular C can then bind to the sensor kinase, H, resulting in activation of R and up-regulation of the entire gene cluster via binding to consensus sequences within each promoter PUBMED:17298586. It seems likely that the whole regulon is carried on a transmissible plasmid which is passed between closely related Firmicute species since many clinical isolates from different Firmicutes can produce at least two bacteriocins, and the same bacteriocins can be produced by different species.

    \ \ ' '10272' 'IPR018848' '\

    This entry represents a presumed domain which has been predicted to contain three alpha helices. It was named the WIYLD domain based on the pattern of the most conserved residues PUBMED:17020925. This domain appears to be specific to plant SET-domain proteins.

    \ ' '10273' 'IPR018849' '\

    This entry includes the Urb2 protein from yeast that is involved in ribosome biogenesis PUBMED:15226434.

    \ ' '10274' 'IPR019494' '\

    This entry represents a novel sensory domain, designated FIST C (short for F-box and intracellular signal transduction, C-terminal), which is present in signal transduction proteins from bacteria, archaea and eukaryotes. The chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggests that FIST domains bind small ligands, such as amino acids PUBMED:17855421.

    \ ' '10275' 'IPR018850' '\

    Mitochondrial escape protein 2 (also known as RNA12) plays a role in maintaining the mitochondrial genome and in controlling mtDNA escape PUBMED:8514129, PUBMED:8649384. It is also involved in the regulation of mtDNA nucleotide structure and number PUBMED:16850347. Additionally, this protein has a dispensable role in the early maturation of pre-rRNA PUBMED:1557037.

    \ \

    This entry is found C-terminal to the RRM domain and includes a P-loop suggesting that this region may bind NTP.

    \ ' '10276' 'IPR018851' '\

    The function of these proteins is not known, but at least one has been shown to be upregulated during meiosis PUBMED:16303567.

    \ ' '10277' 'IPR018852' '\

    This entry represents a family of uncharacterised proteins.

    \ ' '10278' 'IPR018853' '\

    This entry represents a family of uncharacterised proteins.

    \ ' '10279' 'IPR019495' '\

    The exosome mediates degradation of unstable mRNAs that contain AU-rich elements (AREs) within their 3\' untranslated regions PUBMED:11719186. The proteins in this entry are components of the exosome 3\'->5\' exoribonuclease complex. They do not have exonuclease activity, but are required for the 3\'-processing of the 7S pre-RNA to the mature 5.8S rRNA and for mRNA decay PUBMED:10465791, PUBMED:17173052.

    \ ' '10280' 'IPR018854' '\

    This entry represents the fungal 20S proteasome chaperones 3 and 4 (also known as DMP1 and DMP2), which function in early 20S proteasome assembly. The structures of these chaperones have been solved, and they closely resemble that of the mammalian proteasome assembling chaperone PAC3, although there is little sequence similarity between them PUBMED:18278057.

    \ ' '10282' 'IPR018855' '\

    This entry represents fungal proteasome chaperone 1, a chaperone of the 20S proteasome which functions in early 20S proteasome assembly PUBMED:17707236.

    \ ' '10283' 'IPR018856' '\

    The budding yeast protein Stn1 is a DNA-binding protein which has specificity for telomeric DNA. Structural profiling has predicted an OB-fold PUBMED:17293872.

    \ ' '10284' 'IPR018857' '\

    TC089 is a component of the TORC1 complex. TORC1 is responsible for a wide range of rapamycin-sensitive cellular activities PUBMED:14736892, PUBMED:18076573.

    \ \ \ ' '10285' 'IPR019496' '\

    Nuclear fragile X mental retardation-interacting protein 1 (Nufip1) has been implicated in the assembly of the large subunit of the ribosome PUBMED:10567587 and in telomere maintenance PUBMED:16552446. It is known to bind RNA PUBMED:10556305 and is phosphorylated upon DNA damage PUBMED:17525332. This entry represents a conserved region found within Nufip1. Some proteins containing this region also contain a CCCH zinc finger.

    \ ' '10286' 'IPR018858' '\

    This entry represents a family of uncharacterised proteins.

    \ ' '10287' 'IPR018859' '\

    Endocytosis and intracellular transport involve several mechanistic steps: \

    \ Members of the Amphiphysin protein family are key regulators in the early steps of endocytosis. They are involved in the formation of clathrin-coated vesicles, promoting the assembly of a protein complex at the plasma membrane and directly assisting in the induction of the high curvature of the membrane at the neck of the vesicle. Amphiphysins contain a characteristic domain, known as the BAR (Bin-Amphiphysin-Rvs)-domain, which is required for their in vivo function and their ability to tubulate membranes PUBMED:14993925. \

    The crystal structure of these proteins suggests the domain forms a crescent-shaped dimer of a three-helix coiled coil with a characteristic set of conserved hydrophobic, aromatic and hydrophilic amino acids. Proteins containing this domain have been shown to homodimerise, heterodimerise or, in a few cases, interact with small GTPases.

    \

    This entry identifies several fungal BAR domain proteins, such as Gvp36, that are not found by PUBMED:18156177.

    \ ' '10288' 'IPR019497' '\

    The C-terminal region of the Sorting nexin group of proteins appears to carry a BAR-like (Bin/amphiphysin/Rvs) domain. This domain is very diverse and the similarities with other BAR domains are few. In the Sorting nexins it is associated with , and in combination with PX appears to be necessary to bind WASP along with p85 to form a multimeric signalling complex PUBMED:14993925.

    \ ' '10289' 'IPR019498' '\

    Human metastatic lymph node (MLN) 64 is a late endosomal membrane protein, and carries this domain (also known as MENTAL, short for MLN64 N-terminal) at its N terminus. The domain is composed of four transmembrane helices with three short intervening loops PUBMED:12393907. Its function is to capture cholesterol and pass it to the associated START domain for transfer to a cytosolic acceptor protein or membrane. In mammals, this domain is involved in the localisation of MLN64 and MENTHO in late endosomes, and also in homo- and hetero-interactions of these two proteins PUBMED:15718238.

    \ ' '10290' 'IPR019499' '\

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology PUBMED:2203971. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric PUBMED:10673435. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices PUBMED:8364025, and are mostly dimeric or multimeric, containing at least three conserved regions PUBMED:8274143, PUBMED:2053131, PUBMED:1852601. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2\'-hydroxyl of the tRNA, while, in class II reactions, the 3\'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases PUBMED:.

    \

    This entry represents the C-terminal domain of Valyl-tRNA synthetase, which consists of two helices in a long alpha-hairpin. Valyl-tRNA synthetase () is an alpha monomer that belongs to class Ia.

    \ ' '10291' 'IPR019500' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad PUBMED:11517925.

    \ \ \ \

    This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.

    \ ' '10292' 'IPR019501' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This family contains metallopeptidases belonging to MEROPS peptidase family M30 (hyicolysin family, clan MA). Hyicolysin has a zinc ion which is liganded by two histidine and one glutamate residue.

    \ ' '10293' 'IPR019502' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This entry contains serine peptidases belonging to MEROPS peptidase family S68 (PIDD auto-processing protein, clan S-). These proteins are known as Pidd (short for p53-induced protein with a death domain) proteins. Pidd forms a complex with Raidd and procaspase-2 that is known as the \'Piddosome\'. The Piddosome forms when DNA damage occurs and either activates NF-kappaB, leading to cell survival, or caspase-2, which leads to apoptosis.

    \ ' '10294' 'IPR019503' '\

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. \ \ Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as \'abXHEbbHbc\', where \'a\' is most often valine or threonine and forms part of the S1\' subsite in thermolysin and neprilysin, \'b\' is an uncharged residue, and \'c\' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This is a family of metallopeptidases belonging to MEROPS peptidase family M66 (StcE peptidase, clan MA). The StcE peptidase is a virulence factor found in Shiga toxigenic Escherichia coli strains. StcE peptidase cleaves C1 esterase inhibitor PUBMED:12123444.

    \ ' '10295' 'IPR019504' '\

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    The peptidases families associated with clan U- have an unknown catalytic mechanism as the protein fold of the active site domain and the active site residues have not been reported.

    \

    This entry contains peptidases belonging to MEROPS peptidase family U49 (Lit peptidase, clan U-). The Lit peptidase from Escherichia coli functions in bacterial cell death in response to infection by Enterobacteria phage T4. Following binding of Gol peptide to domains II and III of elongation factor Tu, the Lit peptidase cleaves domain I of the elongation factor. This prevents binding of guanine nucleotides, shuts down translation and leads to cell death.

    \ ' '10296' 'IPR019505' '\

    Proteins in this entry include P5 murein endopeptidase from Pseudomonas phage phi6. P5 murein endopeptidase has lytic activity against several Gram-negative bacteria. It is thought that the enzyme cleaves the cell wall peptide bridge formed by meso-2,6-diaminopimelic acid and D-Alanine.

    \ ' '10297' 'IPR019506' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock-and-key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    PinA inhibits the endopeptidase La. It binds to the La homotetramer but does not interfere with the ATP binding site or the active site of La.

    \ ' '10298' 'IPR019507' '\

    The saccharopepsin inhibitor is highly specific for the aspartic peptidase saccharopepsin. In the absence of saccharopepsin it is largely unstructured PUBMED:15065849, but in its presence, the inhibitor undergoes a conformational change forming an almost perfect alpha-helix from Asn2 to Met32 in the active site cleft of the peptidase.

    \ ' '10299' 'IPR019508' '\

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock-and-key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    \

    Clitocypin binds and inhibits cysteine proteinases. It has no similarity to any other known cysteine proteinase inhibitors but bears some similarity to a lectin-like family of proteins from mushrooms PUBMED:10748021.

    \ ' '10300' 'IPR019509' '\

    This entry represents a family of tick carboxypeptidase inhibitors.

    \ ' '10301' 'IPR019510' '\

    This entry represents the N-terminal nuclear localisation signal-containing domain found in the cyclic AMP-dependent protein kinase A (PKA) anchor protein, AKAP7. This protein anchors PKA for its role in regulating PKA-mediated gene transcription in both somatic cells and oocytes PUBMED:12804576. This domain carries the nuclear localisation signal (NLS) KKRKK, which indicates the cellular destination of this anchor protein PUBMED:16483255. Binding to the regulatory subunits RI and RII of PKA is mediated via the RI-RII subunit-binding domain at the C terminus.

    \ ' '10302' 'IPR019511' '\

    This entry represents the RI-RII subunit-binding domain found at the C terminus of the cyclic AMP-dependent protein kinase A (PKA) anchor protein, AKAP7. This protein anchors PKA, for its role in regulating PKA-mediated gene transcription in both somatic cells and oocytes, by binding to its regulatory subunits, RI and RII, hence being known as a dual-specific AKAP PUBMED:12804576. The 25 crucial amino acids of RII-binding domains in general form structurally conserved amphipathic helices with unrelated sequences; hydrophobic amino acid residues form the backbone of the interaction and hydrogen bond- and salt-bridge-forming amino acid residues increase the affinity of the interaction PUBMED:16483255. The nuclear localisation signal-containing domain is found at the N terminus.

    \ ' '10303' 'IPR018860' '\

    The anaphase-promoting complex (APC) or cyclosome is a cell cycle-regulated ubiquitin-protein ligase that regulates important events in mitosis such as the initiation of anaphase and exit from telophase. The APC, in conjunction with other enzymes, assembles multi-ubiquitin chains on a variety of regulatory proteins thereby targeting them for proteolysis by the 26S proteasome. CDC26 is one of the nine or so subunits identified within APC but its exact function is not known PUBMED:10922056.

    \ ' '10304' 'IPR019512' '\

    This entry represents the conserved N-terminal domain of the regulatory subunit (15B) of protein phosphatase 1 (also known as CReP, or the constitutive repressor of eIF2alpha phosphorylation). The CReP catalytic subunit functions in the dephosphorylation of eIF2-alpha under basal conditions in the absence of stress. In response to translation inhibition, there is reduced synthesis of the labile CReP that contributes to elevated levels of eIF2-alpha phosphorylation PUBMED:14638860. The C terminus, family PP1c, is shared with the apoptosis-associated protein Gadd34 and herpes simplex virus PUBMED:15355306.

    \ ' '10305' 'IPR019513' '\

    Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway PUBMED:7542657, PUBMED:14555653. There are several leucine-rich repeats along the sequence of LEK1 that are considered to be zippers, though they do not appear to be binding DNA directly in this instance PUBMED:14555653.

    \ ' '10306' 'IPR019514' '\

    This protein is found in eukaryotes but its function is not known. The C-terminal part of some members is DUF2450.

    \ ' '10307' 'IPR019515' '\

    This protein is found in eukaryotes but its function is not known. The C-terminal part of some members is DUF2451.

    \ ' '10308' 'IPR018861' '\

    This entry represents the C-terminal region; the proteins in the entry are restricted to chordates and their function is not known.

    \ ' '10309' 'IPR018862' '\

    EIF4E-T is the transporter protein for shuttling the mRNA cap-binding protein EIF4E, targeting it for nuclear import. EIF4E-T contains several key binding domains including two functional leucine-rich NESs (nuclear export signals) between residues 438-447 and 613-638 in the human protein. The other two binding domains are an EIF4E-binding site, between residues 27-42 in , and a bipartite NLS (nuclear localisation signal) between 194-211, and these lie in family EIF4E-T_N. EIF4E is the eukaryotic translation initiation factor 4E that is the rate-limiting factor for cap-dependent translation initiation.

    \ ' '10311' 'IPR018863' '\

    This is the conserved C-terminal half of the protein KIAA1109, which is the fragile site-associated protein FSA PUBMED:16545529. Genome-wide association studies showed this protein to be linked to susceptibility to coeliac disease PUBMED:17558408. The protein may also be associated with polycystic kidney disease PUBMED:16632497.

    \ ' '10312' 'IPR019517' '\

    ICAP-1 is a serine/threonine-rich protein that binds to the cytoplasmic domains of beta-1 integrins in a highly specific manner, binding to an NPXY sequence motif on the beta-1 integrin. The cytoplasmic domains of integrins are essential for cell adhesion, and the fact that ICAP-1 is phosphorylated upon interaction with the cell-matrix implies an important role of ICAP-1 during integrin-dependent cell adhesion PUBMED:9281591. Overexpression of ICAP-1 strongly reduces the integrin-mediated cell spreading on extracellular matrix and inhibits both Cdc42 and Rac1. In addition, ICAP-1 induces release of Cdc42 from cellular membranes and prevents the dissociation of GDP from this GTPase PUBMED:11807099. An additional function of ICAP-1 is to promote differentiation of osteoprogenitors by supporting their condensation through modulating the integrin high affinity state PUBMED:17567669.

    \ ' '10313' 'IPR018463' '\

    Mitosin or centromere-associated protein-F (Cenp-F) is found bound across the centromere as one of the proteins of the outer layer of the kinetochore PUBMED:15939891. Most of the kinetochore/centromere functions appear to depend upon binding of the C-terminal part of the molecule, whereas the N-terminal part, here, may be a cytoplasmic player in controlling the function of microtubules and dynein PUBMED:9704407.

    \ ' '10314' 'IPR019518' '\

    CtIP is predominantly a nuclear protein that complexes with both BRCA1 and the BRCA1-associated RING domain protein (BARD1). At the protein level, CtIP expression varies with cell cycle progression in a pattern identical to that of BRCA1. Thus, the steady-state levels of CtIP polypeptides, which remain low in resting cells and G1 cycling cells, increase dramatically as dividing cells traverse the G1/S boundary. CtIP can potentially modulate the functions ascribed to BRCA1 in transcriptional regulation, DNA repair, and/or cell cycle checkpoint control PUBMED:18007598. This N-terminal domain carries a coiled-coil region and is essential for homodimerisation of the protein PUBMED:17936710. The C-terminal domain is family CtIP_C and carries functionally important CxxC and RHR motifs, absence of which leads cells to grow slowly and show hypersensitivity to genotoxins PUBMED:17936710.

    \ ' '10315' 'IPR019519' '\

    Histone acetylation protein (Hap) 2 is one of three histone acetyltransferase proteins that, in yeasts, are found associated with elongating forms of RNA polymerase II (Elongator). The Haps can be isolated in two forms, as a six-subunit complex with Elongator, and as a complex of the three proteins on their own. The role of the Hap complex in transcription is still speculative, being possibly to keep the histone acetylation activity of free Elongator in check, allowing histone acetylation only in the presence of a transcribing polymerase, or the interaction with Haps might render Elongator susceptible to modifications thereby altering its activity PUBMED:11390369.

    \ ' '10316' 'IPR019520' '\

    The mitochondrial ribosomal protein (MRP) MRP-S23 is one of the proteins that make up the 55S ribosome in eukaryotes. It does not appear to carry any common motifs, either RNA-binding or ribosomal PUBMED:10938081. All of the mammalian MRPs are encoded in nuclear genes that are evolving more rapidly than those encoding cytoplasmic ribosomal proteins. The MRPs are imported into mitochondria where they assemble in co-ordination with mitochondrially transcribed rRNAs into ribosomes that are responsible for translating the 13 mRNAs for essential proteins of the oxidative phosphorylation system PUBMED:14658756. MRP-S23 is significantly up-regulated in uterine cancer cells PUBMED:17054779.

    \ ' '10318' 'IPR019522' '\

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific PUBMED:3291115.

    \

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation PUBMED:12368087. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    \

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved PUBMED:15078142, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases PUBMED:15320712.

    \ \

    Class I PI3Ks are dual-specific lipid and protein kinases involved in numerous intracellular signaling pathways. Class IB PI3K, p110gamma, is mainly activated by seven-transmembrane G-protein-coupled receptors (GPCRs) through its regulatory subunit p101 and G-protein beta-gamma subunits PUBMED:11395417.

    \

    PI3K is a lipid kinase and a key signalling enzyme involved in cell survival and proliferation, cell motility and adhesion, cytoskeletal rearrangement and vesicle trafficking PUBMED:10579926. The different PI3K isoforms have cell-specific functions. In yeast, VPS34 is a key enzyme required for cell division, vacuolar protein sorting, and vacuole segregation PUBMED:8385367. The major components of the yeast VPS intracellular trafficking complex are conserved in humans PUBMED:7628435.

    \ \

    There are three major classes of PI3Ks, I, II and III (Class I is also subdivided into Ia and Ib), and a more distantly related Class IV which contains Ser/Thr kinases. The different classes of PI3K catalyse phosphorylation of the 3\'-OH position of phosphatidyl myo-inositol (PtdIns) lipids, generating different 3\'-phosphorylated lipid products that act as secondary messengers. The classification of PI3Ks is based upon sequence analysis and domain architecture of the catalytic subunits, but the divisions also reflect the biochemical properties and the differential association with a variety of regulatory adaptor subunits.

    \ ' '10319' 'IPR018864' '\

    This is one of the many peptides that make up the nucleoporin complex (NPC), and is found across eukaryotes PUBMED:11029043. The Nup188 subcomplex (Nic96p-Nup188p-Nup192p-Pom152p) is one of at least six that make up the NPC, and as such is symmetrically localised on both faces of the NPC at the nuclear end, being integrally bound to the C terminus of Pom34p PUBMED:16361228.

    \ ' '10320' 'IPR019523' '\

    This entry represents the conserved C terminus of the regulatory subunit (15A and 15B) of protein phosphatase 1. This C-terminal domain appears to be a binding region for the catalytic subunit (PP1C) of protein phosphatase-1, which may in some circumstances also be retroviral in origin since it is found in both herpes simplex virus and in mouse and man. This domain is found in Gadd-34 apoptosis-associated proteins as well as the constitutive repressor of eIF2-alpha phosphorylation/protein phosphatase 1, regulatory (inhibitor) subunit 15b, otherwise known as CReP. Diverse stressful conditions are associated with phosphorylation of the alpha subunit of eukaryotic translation initiation factor 2 (eIF2-alpha) on serine 51. This signaling event, which is conserved from yeast to mammals, negatively regulates the guanine nucleotide exchange factor, eIF2-B, and inhibits the recycling of eIF2 to its active GTP-bound form. In mammalian cells eIF2-alpha phosphorylation emerges as an important event in stress signaling that impacts on gene expression at both the translational and transcriptional levels PUBMED:14638860.

    \ ' '10321' 'IPR019524' '\

    This short transcript is purported to be the antisense protein of exon 2 of the ret-finger protein-like 3 (RFPL3) gene; however, this was not confirmed. Since RFPL3 is expressed in testis, the suggestion is that it may have a role in the antisense regulation of the RFPL genes. RFPL transcripts encode proteins with tripartite structure of RING finger, coiled-coil, and B30-2 domains, which are characteristic of the RING-B30 family. Each of these domains is thought to mediate protein-protein interactions by promoting homo- or heterodimerisation PUBMED:10508838.

    \ ' '10322' 'IPR018302' '\

    This entry represents the Rb protein-binding domain from the centromere protein Cenp-F. Cenp-F is a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, that is involved in chromosome segregation during mitosis and is essential for the full functioning of the mitotic checkpoint pathway PUBMED:7542657, PUBMED:14555653. Cenp-F interacts with retinoblastoma protein (RB), CENP-E and BUBR1. This domain is at the very C-terminus of the C-terminal coiled-coil region, and binds to the Rb family of tumour suppressors PUBMED:17498689.

    \ ' '10323' 'IPR019525' '\

    Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila PUBMED:11069771.

    \ \

    In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human PUBMED:10656923.

    \ \

    The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein, , there is also an NLS domain at 88-116, and a DNA binding and dimerisation domain at 127-282. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity PUBMED:11278998.

    \ ' '10324' 'IPR019526' '\

    Nuclear respiratory factor-1 (Nrf-1) is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila PUBMED:11069771.

    \ \

    In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human PUBMED:10656923.

    \ \ \

    The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein, , there is an activation domain at 303-469, the most conserved part of which is this domain 446-469. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity PUBMED:11278998.

    \ ' '10325' 'IPR019527' '\

    Rod, the Rough deal protein (also known as Kinetochore-associated protein 1), displays a dynamic intracellular staining pattern, localising first to kinetochores in pro-metaphase, but moving to kinetochore microtubules at metaphase. Early in anaphase the protein is once again restricted to the kinetochores, where it persists until the end of telophase. This behaviour is in all respects similar to that described for ZW10 PUBMED:10523511, and indeed the two proteins function together, localisation of each depending upon the other PUBMED:14711415. These two proteins are found at the kinetochore in complex with a third, Zwilch, in both flies and humans. The C terminus is the most conserved part of the protein. During pro-metaphase, the ZW10-Rod complex, dynein/dynactin, and Mad2 all accumulate on unattached kinetochores; microtubule capture leads to Mad2 depletion as it is carried off by dynein/dynactin; ZW10-Rod complex accumulation continues, replenishing kinetochore dynein. The continuing recruitment of the ZW10-Rod complex during metaphase may serve to maintain adequate dynein/dynactin complex on kinetochores for assisting chromatid movement during anaphase PUBMED:14711415. The ZW10-Rod complex acts as a bridge whose association with Zwint-1 links Mad1 and Mad2, components that are directly responsible for generating the diffusible \'wait anaphase\' signal, to a structural, inner kinetochore complex containing Mis12 and KNL-1 (AF15q14), the last of which has been proved to be essential for kinetochore assembly in Caenorhabditis elegans. Removal of ZW10 or Rod inactivates the mitotic checkpoint PUBMED:15824131.

    \ ' '10326' 'IPR018865' '\

    This serine-threonine protein kinase number 19 is expressed from the MHC and predominantly in the nucleus. Protein kinases are involved in signal transduction pathways and play fundamental roles in the regulation of cell functions. This is a novel Ser/Thr protein kinase that has Mn2+-dependent protein kinase activity that phosphorylates alpha-casein at Ser/Thr residues and histone at Ser residues. It can be covalently modified by the reactive ATP analogue 5\'-p-fluorosulphonylbenzoyladenosine in the absence of ATP, and this modification is prevented in the presence of 1 mM ATP, indicating that the kinase domain of is capable of binding ATP PUBMED:9812991.

    \ ' '10327' 'IPR019528' '\

    This entry represents a coiled-coil region close to the C terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is a PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C terminus of coiled-coil proteins from Drosophila and Schizosaccharomyces pombe (Fission yeast), and in the Drosophila protein it is sufficient for targeting to the centrosome in mammalian cells. The function of these proteins is unknown but they seem good candidates for having a centrosomal or spindle pole body location. The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain, indicating that this protein at least is likely to contribute to centrosome assembly PUBMED:11263498.

    \ ' '10328' 'IPR019529' '\

    This is the conserved N-terminal of Syntaxin-18. Syntaxin-18 is found in the SNARE complex of the endoplasmic reticulum and functions in the trafficking between the ER intermediate compartment and the cis-Golgi vesicle. In particular, the N-terminal region is important for the formation of ER aggregates PUBMED:10788491. More specifically, syntaxin-18 is involved in endoplasmic reticulum-mediated phagocytosis, presumably by regulating the specific and direct fusion of the ER with the plasma or phagosomal membranes PUBMED:16790498.

    \ ' '10329' 'IPR018866' '\

    R1 is a transcription factor repressor that inhibits monoamine oxidase A gene expression. This domain is a four-CXXC zinc finger putative DNA-binding domain found at the C-terminal end of R1. The domain carries 12 cysteines of which four pairs are of the CXXC type PUBMED:15654081.

    \ ' '10330' 'IPR019530' '\

    Eukaryotic cilia and flagella are specialised organelles found at the periphery of cells of diverse organisms. Intra-flagellar transport (IFT) is required for the assembly and maintenance of eukaryotic cilia and flagella, and consists of the bi-directional movement of large protein particles between the base and the distal tip of the organelle. IFT particles contain multiple copies of two distinct protein complexes, A and B, which contain at least 6 and 11 protein subunits, respectively. IFT57 is part of complex B but is not, however, required for the core subunits to stay associated PUBMED:15955805. This protein is known as Huntington-interacting protein-1 in humans.

    \ ' '10331' 'IPR019531' '\

    Peroxisomes are single membrane bound organelles, present in practically all eukaryotic cells, and involved in a variety of metabolic pathways; the deduced protein is extremely basic, a characteristic of many other peroxisomal intrinsic membrane proteins. They carry two short stretches of hydrophobic residues shown to be necessary for the correct targeting of these proteins. This entry represents Pmp4.

    \ ' '10332' 'IPR019532' '\

    SR-25, otherwise known as ADP-ribosylation factor-like factor 6-interacting protein 4, is expressed in virtually all tissue types. At the N terminus there is a repeat of serine-arginine (SR repeat), and towards the middle of the protein there are clusters of both serines and of basic amino acids. The presence of many nuclear localisation signals strongly implies that this is a nuclear protein that may contribute to RNA splicing PUBMED:10708573. SR-25 is also implicated, along with heat-shock-protein-27, as a mediator in the Rac1 (GTPase ras-related C3 botulinum toxin substrate 1; also see ) signalling pathway PUBMED:17952876.

    \ ' '10333' 'IPR018305' '\

    This entry represents the L50 protein from the mitochondrial 39S ribosomal subunit. L50 appears to be a secondary RNA-binding protein PUBMED:3129699. The 39S ribosomal protein appears to be a subunit of one of the larger mitochondrial 66S or 70S units PUBMED:8947311. Under conditions of ethanol-stress in rats the larger subunit is largely dissociated into its smaller components PUBMED:15928344. In Escherichia coli, in the absence of the enzyme pseudouridine synthase (RluD), there is an accumulation of 50S and 30S subunits and the appearance of abnormal particles (62S and 39S), with concomitant loss of 70S ribosomes PUBMED:15928344.

    \ ' '10334' 'IPR019533' '\

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes PUBMED:7845208. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence PUBMED:7845208. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases PUBMED:7845208.

    \ \

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base PUBMED:7845208. The geometric orientations of the catalytic residues are similar between families, despite different protein folds PUBMED:7845208. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) PUBMED:7845208, PUBMED:8439290.

    \ \

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    \ \ \

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    \ \ \

    This entry represents a conserved region found in the S26A family of serine endopeptidases, which function in the processing of newly-synthesised secreted proteins. Peptidase S26 removes the hydrophobic, N-terminal signal peptides as proteins are translocated across membranes.

    \ ' '10335' 'IPR010126' '\

    This entry describes a group of lipases of the ab-hydrolase family. They include bacterial depolymerases for poly(3-hydroxybutyrate) (PHB) and related polyhydroxyalkanoates (PHA), as well as acetyl xylan esterases, and feruloyl esterases from fungi.

    \ ' '10336' 'IPR019534' '\

    This group of proteins is found predominantly in eukaryotes; however, their function is unknown.

    \ ' '10337' 'IPR019535' '\

    The transition of neuronal cells from precursor to mature state is regulated by the N-methyl-d-aspartate (NMDA) receptor, a glutamate-gated ion channel that is permeable to Ca2+. NMDA receptors probably mediate this activity by permitting expression of NARG2. NARG2 (NMDA receptor-regulated gene protein 2) is transiently expressed, being a regulatory protein that is present in the nucleus of dividing cells and then down-regulated as progenitors exit the cell cycle and begin to differentiate. NARG2 contains repeats of (S/T)PXX (11 in mouse, 6 in human), a putative DNA-binding motif that is found in many gene-regulatory proteins including Kruppel, Hunchback and Antennapedia PUBMED:15606750.

    \ ' '10338' 'IPR019536' '\

    This entry represents a protein that has high homology to the tumour suppressor Usher syndrome type-1C protein-binding protein 1, also known as MCC2 (mutated in colon cancer).

    \

    MCC2 protein binds the first PDZ domain of AIE-75 with its C-terminal amino acids -DTFL. A possible role of MCC2 as a tumour suppressor has been put forward. The carboxyl terminus of the predicted protein was DTFL which matched the consensus motif X-S/T-X-phi (phi: hydrophobic amino acid residue) for binding to the PDZ domain of AIE-75 PUBMED:11311560, PUBMED:15219944.

    \ ' '10339' 'IPR019537' '\

    The function of these transmembrane proteins is not known.

    \ ' '10340' 'IPR019538' '\

    The 26S proteasome, a eukaryotic ATP-dependent, dumb-bell shaped, protease complex with a molecular mass of approx 2000 kDa, consists of a central 20S proteasome, functioning as a catalytic machine, and two large V-shaped terminal modules, having possible regulatory roles, composed of multiple subunits of 25-110 kDa attached to the central portion in opposite orientations. It is responsible for degradation of abnormal intracellular proteins, including oxidatively damaged proteins, and may play a role as a component of a cellular anti-oxidative system. Expression of catalytic core subunits including PSMB5 and peptidase activities of the proteasome were elevated following incubation with 3-methylcholanthrene. The 20S proteasome comprises a cylindrical stack of four rings, two outer rings formed by seven alpha-subunits (alpha1-alpha7) and two inner rings of seven beta-subunits (beta1-beta7). Two outer rings of alpha subunits maintain structure, while the central beta rings contain the proteolytic active core subunits beta1 (PSMB6), beta2 (PSMB7), and beta5 (PSMB5). Expression of PSMB5 can be altered by chemical reactants, such as 3-methylcholanthrene PUBMED:16723119.

    \ ' '10341' 'IPR019539' '\

    This entry represents a highly conserved galactokinase signature sequence which appears to be present in all galactokinases, irrespective of how many other ATP binding sites, etc. that they carry PUBMED:10359639. The function of this domain appears to be to bind galactose PUBMED:12796487, and it is normally located at the N terminus of these enzymes PUBMED:15526155. It is associated with and . While all enzymes in this entry possess galactokinase activity, some are annotated as N-acetylgalactosamine kinases as they also possess this enzyme activity.

    \ ' '10342' 'IPR019540' '\

    Phosphatidylinositol-glycan biosynthesis class S protein (PIG-S) is one of several key, core components of the glycosylphosphatidylinositol (GPI) trans-amidase complex that mediates GPI anchoring in the endoplasmic reticulum. Anchoring occurs when a protein\'s C-terminal GPI attachment signal peptide is replaced with a pre-assembled GPI PUBMED:11483512. Mammalian GPI transamidase consists of at least five components: Gaa1, Gpi8, PIG-S, PIG-T, and PIG-U, all five of which are required for its function. It is possible that Gaa1, Gpi8, PIG-S, and PIG-T form a tightly associated core that is only weakly associated with PIG-U. The exact function of PIG-S is unclear PUBMED:14660601.

    \ ' '10343' 'IPR019541' '\

    Trappin-2, a protease inhibitor, has a unique N-terminal domain that enables it to become cross-linked to extracellular matrix proteins by transglutaminase PUBMED:10359639. This domain contains several repeated motifs (represented by this entry) with the consensus sequence Gly-Gln-Asp-Pro-Val-Lys, and these together can anchor the whole molecule to extracellular matrix proteins, such as laminin, fibronectin, beta-crystallin, collagen IV, fibrinogen, and elastin, by transglutaminase-catalysed cross-links. The whole domain is rich in glutamine and lysine, thus allowing and transglutaminase(s) to catalyse the formation of an intermolecular epsilon-(gamma-glutamyl)lysine isopeptide bond PUBMED:17964057. Cementoin is associated with the WAP family, , at the C terminus.

    \ ' '10344' 'IPR018867' '\

    The chromosomal passenger complex of Aurora B kinase, INCENP, and Survivin has essential regulatory roles at centromeres and the central spindle in mitosis. Borealin is also a member of the complex. Approximately half of Aurora B in mitotic cells is complexed with INCENP, Borealin, and Survivin. Depletion of Borealin by RNA interference delays mitotic progression and results in kinetochore-spindle mis-attachments and an increase in bipolar spindles associated with ectopic asters PUBMED:15249581.

    \ ' '10345' 'IPR019542' '\

    This entry represents EPL1 (Enhancer of polycomb-like) proteins. The EPL1 protein is a member of a histone acetyltransferase complex which is involved in transcriptional activation of selected genes PUBMED:15964809.

    \ ' '10346' 'IPR018868' '\

    BAD is a Bcl-2 homology domain 3 (BH3)-only pro-apoptotic member of the Bcl-2 protein family that is regulated by phosphorylation in response to survival factors PUBMED:9372935. Binding of BAD to mitochondria is thought to be exclusively mediated by its BH3 domain. Membrane localisation of BAD mediates membrane translocation of Bcl-XL. The C-terminal part of BAD is sufficient for membrane binding. There are two segments with differing lipid-binding preferences, LBD1 and LBD2, that are responsible for this binding: (i) LBD1 located in the proximity of the BH3 domain (amino acids 122-131) and (ii) LBD2, the putative C-terminal alpha-helix-5 PUBMED:16226704. Phosphorylation-regulated 14-3-3 protein binding may expose the cholesterol-preferring LBD1 and bury the LBD2, thereby mediating translocation of BAD to raft-like micro-domains PUBMED:16603546.

    \ ' '10347' 'IPR019543' '\

    This is the amyloid, C-terminal, protein of the beta-Amyloid precursor protein (APP) which is a conserved and ubiquitous transmembrane glycoprotein strongly implicated in the pathogenesis of Alzheimer\'s disease but whose normal biological function is unknown. The C-terminal 100 residues are released and aggregate into amyloid deposits which are strongly implicated in the pathology of Alzheimer\'s disease plaque-formation. The domain is associated with , further towards the N terminus.

    \ ' '10348' 'IPR019544' '\

    The tetratrico peptide repeat region (TPR) is a structural motif present in a wide range of proteins PUBMED:7667876, PUBMED:9482716, PUBMED:1882418. It mediates protein-protein interactions and the assembly of multiprotein complexes PUBMED:14659697. The TPR motif\ consists of 3-16 tandem-repeats of 34 amino acids residues, although individual TPR motifs can\ be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a\ consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been\ identified in various different organisms, ranging from bacteria to humans. Proteins containing\ TPRs are involved in a variety of biological processes, such as cell cycle regulation,\ transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and\ protein folding.

    The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel fashion, resulting in a spiral of repeating anti-parallel alpha-helices PUBMED:14659697. The two helices are denoted helix A and helix B. The packing angle between helix A and helix B is ~24 degrees within a single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and with helix A\' of the next TPR. Two protein surfaces are generated: the inner concave surface is contributed mainly by residues on helices A, and the other surface presents residues from both helices A and B.

    \

    This entry represents SHNi-TPR (Sim3-Hif1-NASP interrupted TPR), a sequence that is an interrupted form of the TPR repeat PUBMED:18158900.

    \ ' '10349' 'IPR019545' '\

    The DM13 domain is a component of a novel electron-transfer system potentially involved in oxidative modification of animal cell-surface proteins PUBMED:17878204. It contains a nearly absolutely conserved cysteine, which could be involved in a redox reaction, either as a naked thiol group or through binding a prosthetic group like heme PUBMED:17878204.

    \ ' '10350' 'IPR019546' '\

    The twin-arginine translocation (Tat) pathway serves the role of transporting folded proteins across energy-transducing membranes PUBMED:16322447. Homologues of the genes that encode the transport apparatus occur in archaea, bacteria, chloroplasts, and plant mitochondria PUBMED:12029389. In bacteria, the Tat pathway catalyses the export of proteins from the cytoplasm across the inner/cytoplasmic membrane. In chloroplasts, the Tat components are found in the thylakoid membrane and direct the import of proteins from the stroma. The Tat pathway acts separately from the general secretory (Sec) pathway, which transports proteins in an unfolded state PUBMED:16092521.

    \ \

    It is generally accepted that the primary role of the Tat system is to translocate fully folded proteins across membranes. Examples of proteins that need to be exported in their 3D conformation are redox proteins that have acquired complex multi-atom cofactors in the bacterial cytoplasm (or the chloroplast stroma or mitochondrial matrix). They include hydrogenases, formate dehydrogenases, nitrate reductases, trimethylamine N-oxide (TMAO) reductases and dimethyl sulphoxide (DMSO) reductases PUBMED:16756481, PUBMED:15546663. The Tat system can also export whole hetero-oligomeric complexes in which some proteins have no Tat signal. This is the case for the DMSO reductase and formate dehydrogenase complexes. There are also other cases where the physiological rationale for targeting a protein to the Tat pathway is less obvious. Indeed, there are examples of homologous proteins that are in some cases targeted to the Tat pathway and in other cases to the Sec apparatus. Some examples are: copper nitrite reductases, flavin domains of flavocytochrome c and N-acetylmuramoyl-L-alanine amidases PUBMED:15802249.

    \ \

    In halophilic archaea such as Halobacterium, almost all secreted proteins appear to be Tat targeted. This has been proposed to be a response to the difficulties these organisms would otherwise face in successfully folding proteins extracellularly at high ionic strength PUBMED:12427925.

    \ \

    The Tat signal peptide consists of three motifs: the positively charged N-terminal motif, the hydrophobic region and the C-terminal region that generally ends with a short consensus motif (A-x-A) specifying cleavage by signal peptidase. Sequence analysis revealed that signal peptides capable of targeting proteins to the Tat pathway contain the consensus sequence [ST]-R-R-x-F-L-K. The nearly invariant twin-arginine gave rise to the pathway\'s name. In addition, the h-region of Tat signal peptides is typically less hydrophobic than that of Sec-specific signal peptides PUBMED:16756481, PUBMED:15546663.\

    \ ' '10352' 'IPR019547' '\

    This entry represents part of the transcript of the fusion of two genes, UEV1 and Kua.

    \

    UEV1 is an enzymatically inactive variant of the E2 ubiquitin-conjugating enzymes that regulate non-canonical elongation of ubiquitin chains, and Kua is an otherwise unknown gene. UEV1A is a nuclear protein, whereas both Kua and Kua-UEV localise to cytoplasmic structures, indicating that the addition of a Kua domain to UEV confers new biological properties. UEV1-Kua carries the B domain with its characteristic double histidine motif, and it is probably this domain which determines the cytoplasmic localisation. It is postulated that this hybrid transcript could preferentially direct the variant polyubiquitination of substrates closely associated with the cytoplasmic face of the endoplasmic reticulum, possibly, although not necessarily, in conjunction with membrane-bound ubiquitin-conjugating enzymes PUBMED:11076860.

    \ ' '10353' 'IPR018870' '\

    A Schizosaccharomyces pombe (Fission yeast) member of this family is known to interact with Tel2. Tel2 is a component of the TOR complexes PUBMED:18076573.

    \ ' '10354' 'IPR018459' '\

    This domain is found in a wide variety of AKAPs (A kinase anchoring proteins).

    \ ' '10355' 'IPR018379' '\

    The BEN domain is an alpha-helical module. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription PUBMED:18203771. This domain is found in diverse animal proteins such as:

    \

    \ ' '10356' 'IPR019548' '\

    Nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF) PUBMED:2504497, PUBMED:2339052 (also known as TGGCA-binding proteins) are a family of vertebrate nuclear proteins which recognise and bind, as dimers, the palindromic DNA sequence 5\'-TGGCANNNTGCCA-3\'. CTF/NF-I binding sites are present in viral and cellular promoters and in the origin of DNA replication of Human adenovirus 2 (HAdV-2). The CTF/NF-I proteins were first identified as nuclear factor I, a collection of proteins that activate the replication of several Adenovirus serotypes (together with NF-II and NF-III) PUBMED:6216480. The family of proteins was also identified as the CTF transcription factors, before the NF-I and CTF families were found to be identical PUBMED:3398920. The CTF/NF-I proteins are individually capable of activating transcription and DNA replication. In a given species, there are a large number of different CTF/NF-I proteins, generated both by alternative splicing and by the occurrence of four different genes. CTF/NF-I proteins contain 400 to 600 amino acids. The N-terminal 200 amino-acid sequence, almost perfectly conserved in all species and genes sequenced, mediates site-specific DNA recognition, protein dimerisation and Adenovirus DNA replication. The C-terminal 100 amino acids contain the transcriptional activation domain. This activation domain is the target of gene expression regulatory pathways elicited by growth factors and it interacts with basal transcription factors and with histone H3 PUBMED:8543151.

    \ \

    This entry represents the N-terminal region, of which 200 residues contain the DNA-binding and dimerisation domain; it also includes a highly conserved 8-47 residue region 5\' of this, whose function is not known. Deletion of the N-terminal 200 amino acids removes the DNA-binding activity, the dimerisation ability and the stimulation of adenovirus DNA replication PUBMED:2339052.

    \ ' '10357' 'IPR019549' '\

    Homeodomain proteins are transcription factors that share a related DNA-binding homeodomain PUBMED:10377888. The homeodomain was initially identified in Drosophila melanogaster (Fruit fly) homeotic and segmentation proteins, but is well conserved throughout metazoans PUBMED:2568852, PUBMED:1357790. The homeodomain binds DNA through a helix-turn-helix (HTH) structure, consisting of approximately 20 residues PUBMED:1970866. The HTH motif is comprised of two alpha-helices that make intimate contacts with the DNA; the second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions. These interactions occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure and is joined to the second through a short turn.

    \

    Most proteins which contain a homeobox domain can be classified PUBMED:2568852, PUBMED:2884726, on the basis of their sequence characteristics, into three subfamilies: engrailed, antennapedia and paired. A number of different proteins contain homeodomains, including Drosophila engrailed, yeast mating type proteins, hepatocyte nuclear factor 1a and Hox proteins. Hox genes encode homeodomain-containing transcriptional regulators that operate differential genetic programs along the anterior-posterior axis of animal bodies PUBMED:12445403. The homeodomain motif is very similar in sequence identity and structure to domains in other DNA-binding proteins, including recombinases, GARP response regulators, human telomeric protein, AraC type transcriptional activator and tetracycline repressor PUBMED:12215502, PUBMED:9739097, PUBMED:7707374.

    \ \

    This entry represents a conserved region of some 20 amino-acid residues, located at the C terminus of the \'homeobox\' domain, that forms a kind of signature pattern for this subfamily of proteins PUBMED:2568852.

    \ ' '10358' 'IPR019550' '\

    This subunit appears to be a recent vertebrate addition to the NADH-ubiquinone reductase complex 1, acting within the membrane. Its exact function is not known, but it is highly expressed in muscle and neural tissue, suggesting a role in ATP generation PUBMED:15843018.

    \ ' '10359' 'IPR019551' '\

    This entry represents a highly conserved signature found at the N terminus of pyrrolo-quinoline quinone (PQQ)-dependent enzymes.

    \ ' '10360' 'IPR018871' '\

    This presumed domain is found in fungal adhesins and is related to the PA14 domain.

    \ ' '10361' 'IPR019552' '\

    This entry represents a histidine-rich calcium-binding repeat which appears in proteins called histidine-rich calcium-binding proteins (HRC). HRC is a high-capacity, low-affinity Ca2+-binding protein residing in the lumen of the sarcoplasmic reticulum. HRC binds directly to triadin. This binding interaction occurs between the histidine-rich region of HRC and multiple clusters of charged amino acids, named KEKE motifs, in the lumenal domain of triadin. This repeat is found in the acidic region of the protein, which can be long and variable. There is also a cysteine-rich region further towards the C terminus PUBMED:15777620. HRC is a candidate regulator of sarcoplasmic reticular calcium uptake; it may regulate sarcoplasmic reticular calcium transport and play a critical role in maintaining calcium homeostasis and function in the heart.

    \ ' '10362' 'IPR019553' '\

    Spider toxins of the CSTX family are ion channel toxins containing an inhibitor cystine knot (ICK) structural motif or Knottin scaffold. The four disulphide bonds present in the CSTX spider toxin family are arranged in the following pattern: 1-4, 2-5, 3-8 and 6-7. CSTX-1 is the most important component of Cupiennius salei (Wandering spider) venom in terms of relative abundance and toxicity and therefore is likely to contribute significantly to the overall toxicity of the whole venom. CSTX-1 blocked rat neuronal L-type, but no other types of, HVA Cav channels PUBMED:17517422. Interestingly, the omega-toxins from Phoneutria nigriventer (Brazilian armed spider) venom (another South American species also belonging to the Ctenidae family) are included as they carry the same disulphide bond arrangement. It has been suggested that CSTX-1 may interact with Cav channels. A voltage-gated calcium channel heteromultimer containing an L-type pore-forming alpha1-subunit is the most probable candidate for the molecular target of CSTX-1 PUBMED:17517422.

    \ ' '10363' 'IPR019554' '\

    The soluble ligand-binding beta-grasp domain (SLBB) contains a beta-grasp fold. SLBB domains are found in a diverse set of proteins that include the animal vitamin B12 uptake proteins transcobalamin and intrinsic factor, and the bacterial polysaccharide export proteins PUBMED:17250770. Some proteins may be part of a membrane complex involved in electron transport; others are probably involved in the export of the extracellular polysaccharide colanic acid from the cell to the medium.

    \ ' '10364' 'IPR018290' '\

    This entry represents a domain with an all-beta structure that is found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif PUBMED:15755923, PUBMED:17130173. This domain is found associated with the WRKY domain PUBMED:18523729.

    \ ' '10365' 'IPR018872' '\

    This zinc binding domain is found associated with the WRKY domain PUBMED:17130173.

    \ ' '10366' 'IPR019555' '\

    The multi-domain protein Connector enhancer of kinase suppressor of ras (Connector enhancer of KSR) (CNK) functions as a scaffold in several signal cascades and acts on proliferation, differentiation and apoptosis. CNK connects upstream activators and downstream targets of Ras- and Rho-dependent signalling pathways and may allow cross-talk between these pathways. In invertebrates, CNK is expressed as one isoform, whereas in mammals there exist CNK1, CNK2A, and its splice variant CNK2B. CNK proteins consist of one sterile alpha motif (SAM) domain (see , ), one conserved region in CNK (CRIC) domain, one PSD-95/Dlg-A/ZO-1 (PDZ) domain (see , ) and one pleckstrin homology (PH) domain (see , ). The CRIC domain is enriched in leucine residues and functions as a protein-protein interaction domain PUBMED:12028359, PUBMED:16289034.

    \ ' '10367' 'IPR019556' '\

    This entry represents a highly conserved sequence found at the C terminus of pyrrolo-quinoline quinone (PQQ)-dependent enzymes.

    \ ' '10368' 'IPR019557' '\

    This entry represents a domain found in a variety of transposases PUBMED:17130173.

    \ ' '10369' 'IPR013136' '\

    ACF (for ATP-utilising chromatin assembly and remodeling factor) is a chromatin-remodeling complex that catalyzes the ATP-dependent assembly of periodic nucleosome arrays. This reaction utilises the energy of ATP hydrolysis by ISWI, the smaller of the two subunits of ACF. Acf1, the large subunit of ACF, is essential for the full activity of the complex. The WAC (WSTF/Acf1/cbp146) domain is an ~110-residue module present at the N-termini of Acf1-related proteins in a variety of organisms. It is found in association with other domains such as the bromodomain, the PHD-type zinc finger, DDT or WAKZ. The DNA-binding region of Acf1 includes the WAC domain, which is necessary for the efficient binding of the ACF complex to DNA. It seems probable that the WAC domain is also involved in DNA binding in other related factors PUBMED:10385622, PUBMED:12192034.

    \

    \ Some proteins known to contain a WAC domain are the Drosophila melanogaster (Fruit fly) ATP-dependent chromatin assembly factor large subunit Acf1, human WSTF (Williams syndrome transcription factor), mouse cbp146, yeast imitation switch two complex protein 1 (ITC1 or YGL133w), and yeast protein YPL216w.

    \ ' '10370' 'IPR012316' '\

    Signal transduction by T and B cell antigen receptors and certain receptors for Ig Fc regions involves a conserved sequence motif, termed an immunoreceptor tyrosine-based activation motif (ITAM). It is also found in the cytoplasmic domain of the apoptosis receptor. Phosphorylation of the two ITAM tyrosines is a critical event in signal transduction. All (p)2ITAMs, but not their nonphosphorylated counterparts, induced extensive protein tyrosine phosphorylation in permeabilised cells. After binding of the ligand via an SH2 domain, phosphorylation of the two conserved tyrosines of ITAM creates binding sites for downstream signalling molecules and thus enables the initiation of signalling events. This phosphorylation was found to reflect activation of the src family kinases Lyn and Syk. Different ITAMs may preferentially activate distinct signalling pathways as a consequence of distinct SH2 effector binding preference PUBMED:7594458, PUBMED:14552840. Furthermore, in viruses, ITAMs may play key roles in viral pathogenesis by regulating viral clearance, immune cell activation and immune cell recruitment through binding of cellular kinases, thereby down-regulating their function PUBMED:12502882.

    \ \

    This motif can be found in one to three copies and in association with the Ig-like domain. Proteins currently known to contain an ITAM motif are:

    \ \

    \ ' '10371' 'IPR013989' '\

    The DCD (Development and Cell Death) domain is found in plant proteins involved in development and cell death. The DCD domain is an ~130 amino acid long stretch that contains several mostly invariable motifs. These include an FGLP and an LFL motif at the N-terminus and a PAQV and a PLxE motif towards the C-terminus of the domain. The DCD domain is present in proteins with different architectures. Some of these proteins contain additional recognizable motifs, like the KELCH repeats or the ParB domain PUBMED:16008837. Biological studies indicate a role of these proteins in phytohormone response, embryo development and programmed cell death triggered by pathogens or ozone.

    \ \

    The predicted secondary structure of the DCD domain is mostly composed of beta strands and confined by an alpha-helix at the N- and at the C-terminus PUBMED:16008837.

    \ \

    Proteins known to contain a DCD domain are listed below:\

    \

    \ \ ' '10372' 'IPR019558' '\

    Mammalian uncoordinated homology 13 (Munc13) proteins constitute a family of three highly homologous molecules (Munc13-1, Munc13-2 and Munc13-3) with homology to Caenorhabditis elegans unc-13p. Munc13 proteins contain a phorbol ester-binding C1 domain and two C2 domains, which are Ca2+/phospholipid binding domains. Sequence analyses have uncovered two regions called Munc13 homology domains 1 (MHD1) and 2 (MHD2) that are arranged between two flanking C2 domains. MHD1 and MHD2 domains are present in a wide variety of proteins from Arabidopsis thaliana (Mouse-ear cress), C. elegans, Drosophila melanogaster (Fruit fly), Mus musculus (Mouse), Rattus norvegicus (Rat) and Homo sapiens (Human), some of which may function in a Munc13-like manner to regulate membrane trafficking. The MHD1 and MHD2 domains are predicted to be alpha-helical.

    \ ' '10373' 'IPR012315' '\

    The KASH (Klarsicht/ANC-1/Syne-1 homology), or KLS, domain is a highly hydrophobic nuclear envelope localization domain of approximately 60 amino acids comprising a 20-amino-acid transmembrane region and a 30-35-residue C-terminal region that lies between the inner and the outer nuclear membranes. The KASH domain is found in association with other domains, such as spectrin repeats and CH, at the C-terminus of proteins tethered to the nuclear membrane in diverse cell types PUBMED:10556085, PUBMED:10878022, PUBMED:12169658, PUBMED:12408964, PUBMED:15579692.

    \

    Some proteins known to contain a KASH domain are listed below:\

    \ ' '10374' 'IPR013135' '\

    In Drosophila melanogaster (Fruit fly) the vitelline membrane (VM) is the first layer of the eggshell produced by the follicular epithelium. It is composed of at least four different proteins. VM proteins are similarly organised with a central highly conserved 38-amino acid domain which is flanked by unrelated regions. Since the surrounding regions have diverged significantly, it is possible that the VM domain is of key importance in VM protein structure PUBMED:3143615, PUBMED:8293994. The VM domain contains three highly conserved cysteines.

    \ ' '10375' 'IPR018873' '\

    This entry represents an N-terminal DNA-binding domain found in a wide range of proteins from bacterial and eukaryotic DNA viruses and their bacterial homologues; they include the poxvirus D6R/N1R and baculoviral Bro protein families. The KilA-N domain is considered to be homologous to the fungal DNA-binding APSES domain. Both the KilA-N and APSES domains share a common fold with the nucleic acid-binding modules of the LAGLIDADG nucleases and the amino-terminal domains of the tRNA endonuclease PUBMED:11897024.

    \ ' '10376' 'IPR018306' '\

    This entry represents the putative helicase A859L () PUBMED:11897024.

    \ ' '10377' 'IPR006578' '\

    The MADF (myb/SANT-like domain in Adf-1) domain is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain PUBMED:12459265. MADF is related to the Myb DNA-binding domain (). The retroviral oncogene v-myb, and its cellular counterpart c-myb, are nuclear DNA-binding proteins that specifically recognise the sequence YAAC(G/T)G. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. Some proteins known to contain a MADF domain are listed below:

    \

    \ \ ' '10378' 'IPR018874' '\

    This entry represents the C-terminal domain from p63 (), one of the proteins of Myxococcus phage Mx8. The function of these proteins is unknown PUBMED:11897024.

    \ ' '10379' 'IPR018875' '\

    This entry represents the amino-terminal domain of the Enterobacteria phage P22 antirepressor protein () PUBMED:11897024.

    \ ' '10380' 'IPR018876' '\

    This entry represents the carboxy-terminal domain of the Enterobacteria phage P22 antirepressor () PUBMED:11897024. It is found associated with .

    \ ' '10381' 'IPR018877' '\

    This entry represents the carboxy-terminal domain from ORF11 (), one of the proteins of Pseudomonas phage D3 (Bacteriophage D3). The function of these proteins is unknown PUBMED:11897024.

    \ ' '10382' 'IPR005918' '\

    The conantokins are a family of neuroactive peptides found in the venoms of fish-hunting cone snails. They possess a relatively high number of residues (4-5) of the non-standard amino acid gamma-carboxyglutamic acid (Gla), which is generated by the post-translational modification of glutamate (Glu) residues. Conantokins are the only naturally produced peptides known to be N-methyl-D-aspartate (NMDA) receptor antagonists and show therapeutic promise in treating conditions associated with NMDA receptor dysfunction. In animal models they have exhibited anticonvulsant and anti-Parkinsonian properties and have provided neuroprotection within therapeutically acceptable times following transient focal brain ischemia PUBMED:9398296, PUBMED:11554555, PUBMED:11096077, PUBMED:12350383.

    \ \

    Upon binding of Ca2+ to Gla, conantokin undergoes a conformational transition from a distorted curvilinear 3(10) helix to a linear alpha-helix. The binding of Ca2+ to conantokin leads to the exposure of a hydrophobic region on the opposite face of the helix PUBMED:9398296. Conantokins share relatively few sequence elements, which include sequence identity at the first four residues, homologous positioning of the two most C-terminal Gla residues, and an Arg preceding the most C-terminal Gla PUBMED:11554555.

    \ \

    The conantokin family is currently known to include:

    \ \ \ ' '10383' 'IPR018289' '\

    This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif PUBMED:15755923, PUBMED:17130173. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001 PUBMED:11751820.

    \ ' '10384' 'IPR018878' '\

    This entry represents the carboxy-terminal domain from ORF6 (), an antirepressor protein from Lactococcus phage bIL285 PUBMED:11897024.

    \ ' '10385' 'IPR018879' '\

    This entry represents ORF MSV199 (), an MTG motif gene family protein from Melanoplus sanguinipes entomopoxvirus (MsEPV) PUBMED:11897024.

    \ ' '10386' 'IPR018880' '\

    This entry represents the amino-terminal domain of the Ash protein (), from Bacteriophage P4 PUBMED:11897024.

    \ ' '10387' 'IPR018480' '\

    Phospho-N-acetylmuramoyl-pentapeptide-transferase () (MraY) is a bacterial enzyme responsible for the formation of the first lipid intermediate of the cell wall peptidoglycan synthesis PUBMED:10564498. It catalyses the formation of undecaprenyl-pyrophosphoryl-N-acetylmuramoyl-pentapeptide from UDP-MurNAc-pentapeptide and undecaprenyl-phosphate.

    \ \

    MraY is an integral membrane protein that probably has ten transmembrane domains. It belongs to family 4 of glycosyl transferases. Homologues of MraY have been found in the archaebacterium Methanobacterium thermoautotrophicum and in Arabidopsis thaliana (Mouse-ear cress).

    \ \ \ \

    This entry represents two conserved sites found in these proteins. The first site is located at the end of the first cytoplasmic loop and the beginning of the second transmembrane domain. The second site is located in the third cytoplasmic loop.

    \ ' '10389' 'IPR019559' '\

    This is the neddylation site of cullin proteins, which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae (Baker\'s yeast), and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue PUBMED:15021886.

    \ ' '10390' 'IPR019560' '\

    This family consists of mitochondrial 18 kDa proteins that are often misannotated as carbonic anhydrases. Knockdown of MTP18 protein was shown to result in cytochrome c release from mitochondria, which consequently leads to apoptosis PUBMED:15155745. Overexpression studies suggest that MTP18 is required for mitochondrial fission PUBMED:15985469.

    \ ' '10391' 'IPR019561' '\

    The Sec61/SecY translocon mediates translocation of proteins across the membrane and integration of membrane proteins into the lipid bilayer. The structure of the translocon revealed a plug domain blocking the pore on the lumenal side. The plug is unlikely to be important for sealing the translocation pore in yeast, but it plays a role in stabilising Sec61p during translocon formation. The domain spans residues 52-74 PUBMED:16822836.

    \ ' '10393' 'IPR018881' '\

    This family of proteins has no known function.

    \ ' '10394' 'IPR018882' '\

    This is a very short highly conserved domain that is C-terminal to the cytosolic transmembrane region IV of the NMDA-receptor 1. It has been shown to bind Calmodulin-Calcium with high affinity. The ionotropic N-methyl-D-aspartate receptor (NMDAR) is a major source of calcium flux into neurons in the brain and plays a critical role in learning, memory, neural development, and synaptic plasticity. Calmodulin (CaM) regulates NMDARs by binding tightly to the C0 and C1 regions of their NR1 subunit. The conserved tryptophan is considered to be the anchor residue PUBMED:18073110.

    \ ' '10395' 'IPR018883' '\

    This entry represents the cadmium-binding carbonic anhydrase of marine diatoms PUBMED:17222138. The prevalence in diatoms of carbonic anhydrases that contain Cd at their active site probably reflects the very low concentration of Zn in the marine environment and the difficulty in acquiring inorganic carbon for photosynthesis. Compared with alpha- and gamma-carbonic anhydrases, which use three histidines to coordinate the zinc atom, this beta-carbonic anhydrase has two cysteines and one histidine, and rapidly binds cadmium PUBMED:18322527.

    \ ' '10396' 'IPR019562' '\

    This entry represents a novel carbohydrate-binding domain found on micronemal proteins. Micronemal proteins (MICs) are released onto the parasite surface just before invasion of host cells and play important roles in host cell recognition, attachment and penetration. Toxoplasma gondii can infect and replicate within all nucleated cells PUBMED:17491595. This domain interacts with sialylated oligosaccharides; the protein in T. gondii is a monomer but several MAR domains are carried on the protein. Each MAR domain contains one central sialic acid-binding pocket PUBMED:18203663.

    \ ' '10397' 'IPR018884' '\

    This domain is found at the C terminus of many NMDA-receptor proteins, many of which are also associated with and . This region is predicted to be a large extra-cellular domain of the NMDA receptor proteins, being highly hydrophilic, and is thought to be integrally involved in the function of the receptor. The region also carries a number of potential N-glycosylation sites PUBMED:8428958.

    \ ' '10398' 'IPR019563' '\

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families PUBMED:7624375, PUBMED:8535779. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in clans.

    \ \

    This is the 97th family of glycosidases, in this case bacterial. The central part of the GH97 family protein sequences represents a typical and complete (beta/alpha)8-barrel or catalytic TIM-barrel type domain. The N- and C-terminal parts of the sequences, mainly consisting of beta-strands, most probably form two additional non-catalytic domains with as yet unknown functions. The non-catalytic domains of glycosidases from the alpha-galactosidase and alpha-glucosidase superfamilies are also predominantly composed of beta-strands, and at least some of these domains are involved in oligomerisation and carbohydrate binding. In all known glycosidases with the (beta-alpha)8-barrel fold, the amino acid residues at the active site are located on the C-termini of the beta-strands PUBMED:16131397.

    \ ' '10399' 'IPR018885' '\

    This conserved domain is found in fungal proteins and appears to be involved in RNA-processing. It binds to poly-adenylated RNA, interacts genetically with mRNA 3\'-end processing factors, co-purifies with the nuclear cap-binding protein Cbp20p, and is found in complexes containing other translation factors, such as EIF4G as in and .

    \ ' '10400' 'IPR019564' '\

    Metaxin is an outer membrane protein of mammalian mitochondria which is involved in transport of proteins into the mitochondrion PUBMED:11027586. Metaxin is a mitochondrial protein that extends into the cytosol while anchored into the outer membrane at its C terminus.

    \

    The TOM37 protein is one of the outer membrane proteins that make up the TOM complex for guiding mitochondrial beta-barrel proteins from the cytosol across the outer mitochondrial membrane into the intermembrane space. In conjunction with TOM70 it guides peptides without a mitochondrial targeting sequence (MTS) into TOM40, the protein that forms the passage through the outer membrane PUBMED:16890951. It has homology with Metaxin-1, also part of the outer mitochondrial membrane beta-barrel protein transport complex PUBMED:17981999.

    \

    This entry represents outer mitochondrial membrane transport complex proteins metaxin and Tom37. In its N-terminal region, metaxin shows significant sequence identity to Tom37, a component of the outer membrane portion of the mitochondrial preprotein translocation apparatus in Saccharomyces cerevisiae (Baker\'s yeast), but important structural differences, including apparently different mechanisms of targeting to membranes, also exist between the two proteins PUBMED:9045676.

    \ ' '10401' 'IPR019565' '\

    This short highly conserved region of proteinase-binding alpha-macro-globulins contains the cysteine and a glutamine of a thiol-ester bond that is cleaved at the moment of proteinase binding, and mediates the covalent binding of the alpha-macro-globulin to the proteinase. The GCGEQ motif is highly conserved.

    \

    This entry contains serum complement C3 and C4 precursors and alpha-macroglobulins.

    \ \

    The alpha-macroglobulin (aM) family of proteins includes protease inhibitors PUBMED:2473064, typified by the human tetrameric a2-macroglobulin (a2M); they belong to the MEROPS proteinase inhibitor family I39, clan IL. These protease inhibitors share several defining properties, which include (i) the ability to inhibit proteases from all catalytic classes, (ii) the presence of a \'bait region\' and a thiol ester, (iii) a similar protease inhibitory\ mechanism and (iv) the inactivation of the inhibitory capacity by reaction of the thiol ester with small primary amines. \ aM protease inhibitors inhibit by steric hindrance PUBMED:2472396. The mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates. Two additional events occur as a consequence of bait region cleavage, namely (i) the h-cysteinyl-g-glutamyl thiol ester becomes highly reactive and (ii) a major conformational change exposes a conserved COOH-terminal receptor binding domain PUBMED:2469470 (RBD). RBD exposure allows the aM protease complex to bind to clearance receptors and be removed from circulation PUBMED:2430968. Tetrameric, dimeric, and, more recently, monomeric aM protease inhibitors have been identified PUBMED:9914899, PUBMED:10426429.

    \ \ ' '10402' 'IPR019566' '\

    The myelin sheath is a multi-layered membrane, unique to the nervous system, that functions as an insulator \ to greatly increase the velocity of axonal impulse conduction. The P0 glycoprotein, absent in the central \ nervous system PUBMED:2435734, is a major component of the myelin sheath in peripheral nerves. It comprises \ a large extracellular N-terminal domain, a single transmembrane (TM) region, and a smaller positively\ charged intracellular domain. It is postulated that P0 is a structural element in the formation and \ stabilisation of peripheral nerve myelin PUBMED:2578885, holding its characteristic coil structure together \ by the interaction of its positively-charged domain with acidic lipids in the cytoplasmic face of the \ opposed bilayer, and by interaction between hydrophobic globular \'heads\' of adjacent extracellular domains \ PUBMED:2435734.

    \

    This entry is the extracellular domain found at the C-terminal end of myelin-P0.

    \ ' '10403' 'IPR018886' '\

    This domain may well be a type of zinc-finger as it carries two pairs of highly conserved cysteine residues though with no accompanying histidines. Several members are annotated as putative helicases.

    \ ' '10404' 'IPR018887' '\

    This family of proteins has no known function.

    \ ' '10405' 'IPR018888' '\

    This family of proteins has no known function.

    \ ' '10406' 'IPR018889' '\

    This family of proteins has no known function.

    \ ' '10408' 'IPR003651' '\

    Endonuclease III () is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic / apyrimidinic sites via a beta-elimination mechanism PUBMED:7773744, PUBMED:9032058. The structurally related DNA glycosylase MutY\ recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair PUBMED:1328155. The 3-D structures of Escherichia coli endonuclease III PUBMED:1411536 and catalytic domain of MutY PUBMED:9846876 have been determined. The\ structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key,\ four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the\ [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern\ Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is\ referred to as a [Fe4S4] cluster loop (FCL) PUBMED:7664751. Two DNA-binding motifs have been proposed, one at either end of the\ interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs. The primary role of the iron-sulphur cluster appears to\ involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of\ the FCL motif PUBMED:7664751, PUBMED:10900127.

    \ \ \

    The iron-sulphur cluster loop (FCL) is also found in DNA-(apurinic or apyrimidinic site) lyase, a subfamily of endonuclease III. The enzyme has both apurinic and apyrimidinic endonuclease activity and a DNA N-glycosylase activity. It cuts damaged DNA at cytosines, thymines and guanines, and acts on the damaged strand 5\' of the damaged site. The enzyme binds a 4Fe-4S cluster which is not important for the catalytic activity, but is probably involved in the alignment of the enzyme along the DNA strand.

    \ ' '10409' 'IPR018890' '\

    This family of proteins has no known function.

    \ ' '10410' 'IPR018942' '\

    This entry represents a repeat that is found in the RNA-binding protein EWS, hornerin and seminal vesicle proteins. EWS-fusion-proteins (EFPS) may play a role in the tumorigenic process.

    \ ' '10411' 'IPR019568' '\

    Neuromuscular junction formation relies upon the clustering of acetylcholine receptors and other proteins in the muscle membrane. Rapsyn is a peripheral membrane protein that is selectively concentrated at the neuromuscular junction and is essential for the formation of synaptic acetylcholine receptor aggregates. Acetylcholine receptors fail to aggregate beneath nerve terminals in mice where rapsyn has been knocked out. The N-terminal six amino acids of rapsyn are its myristoylation site, and myristoylation is necessary for the targeting of the protein to the membrane PUBMED:15730871.

    \ ' '10412' 'IPR018947' '\ Neuromodulin is a component of the\ motile growth cones. It is a membrane protein whose expression is\ widely correlated with successful axon elongation PUBMED:3272162. It is a crucial\ component of an effective regeneration response in the nervous system PUBMED:2641999.\ Although its function is uncertain, the N-terminal region is well\ conserved and contains both a calmodulin binding domain, and sites for\ acylation, membrane attachment and protein kinase C phosphorylation.\ Structure predictions suggest that the C-terminal region may exist as an extended, negatively-charged rod with some\ similarity to the side arms of neurofilaments, indicating that the\ biological role of neuromodulin may depend on its ability to form a\ dynamic membrane-cytoplasm-calmodulin complex PUBMED:2641999.\

    This entry represents the neuromodulin N-terminal domain.

    \ ' '10413' 'IPR019569' '\ Synapsins are neuronal phosphoproteins that coat synaptic vesicles, bind to several \ elements of the cytoskeleton (including actin filaments), and are believed to function in \ the regulation of neurotransmitter release PUBMED:2117454, PUBMED:10578110. The synapsin family currently \ includes the highly related synapsin I and II. Both synapsins exist in two alternatively \ spliced variants, IA and IB and IIA and IIB, that only differ at the C-terminus. \ It also includes synapsin III.\

    This highly conserved domain of synapsin proteins has a serine at position 9 or 10 which is a phosphorylation site. The domain appears to be the part of the molecule that binds to calmodulin PUBMED:15147519.

    \ ' '10414' 'IPR019570' '\

    The connexins are a family of integral membrane proteins that oligomerise to form intercellular channels that are clustered at gap junctions. These channels are specialised sites of cell-cell contact that allow the passage of ions, intracellular metabolites and messenger molecules (with molecular weight less than 1-2kDa) from the cytoplasm of one cell to its opposing neighbours. They are found in almost all vertebrate cell types, and somewhat similar proteins have been cloned from plant species. Invertebrates utilise a different family of molecules, innexins, that share a similar predicted secondary structure to the vertebrate connexins, but have no sequence identity to them PUBMED:9769729.

    \ \

    Vertebrate gap junction channels are thought to participate in diverse biological functions. For instance, in the heart they permit the rapid cell-cell transfer of action potentials, ensuring coordinated contraction of the cardiomyocytes. They are also responsible for neurotransmission at specialised \'electrical\' synapses. In non-excitable tissues, such as the liver, they may allow metabolic cooperation between cells. In the brain, glial cells are extensively-coupled by gap junctions; this allows waves of intracellular Ca2+ to propagate through nervous tissue, and may contribute to their ability to spatially-buffer local changes in extracellular K+ concentration PUBMED:7685944.

    \ \

    The connexin protein family is encoded by at least 13 genes in rodents, with many homologues cloned from other species. They show overlapping tissue expression patterns, most tissues expressing more than one connexin type. Their conductances, permeability to different molecules, phosphorylation and voltage-dependence of their gating, have been found to vary. Possible communication diversity is increased further by the fact that gap junctions may be formed by the association of different connexin isoforms from apposing cells. However, in vitro studies have shown that not all possible combinations of connexins produce active channels PUBMED:8811187, PUBMED:8608591.

    \ \

    Hydropathy analysis predicts that all cloned connexins share a common transmembrane (TM) topology. Each connexin is thought to contain 4 TM\ domains, with two extracellular and three cytoplasmic regions. This model\ has been validated for several of the family members by in vitro biochemical\ analysis. Both N- and C-termini are thought to face the cytoplasm, and the\ third TM domain has an amphipathic character, suggesting that it contributes\ to the lining of the formed-channel. Amino acid sequence identity between\ the isoforms is ~50-80%, with the TM domains being well conserved. Both\ extracellular loops contain characteristically conserved cysteine residues,\ which likely form intramolecular disulphide bonds. By contrast, the single\ putative intracellular loop (between TM domains 2 and 3) and the cytoplasmic\ C-terminus are highly variable among the family members.\ Six connexins are\ thought to associate to form a hemi-channel, or connexon. Two connexons then\ interact (likely via the extracellular loops of their connexins) to form the\ complete gap junction channel.

    \ \
     \
           NH2-***        ***        *************-COOH\
                 **     **   **      **\
                 **    **     **    **   Cytoplasmic\
              ---**----**-----**----**----------------\
                 **    **     **    **   Membrane\
                 **    **     **    **\
              ---**----**-----**----**----------------\
                 **    **     **    **   Extracellular\
                  **  **       **  **\
                    **           **\
    
    \ \

    Two sets of nomenclature have been used to identify the connexins. The\ first, and most commonly used, classifies the connexin molecules according\ to molecular weight, such as connexin43 (abbreviated to Cx43), indicating\ a connexin of molecular weight close to 43kDa. However, studies have\ revealed cases where clear functional homologues exist across species\ that have quite different molecular masses; therefore, an alternative\ nomenclature was proposed based on evolutionary considerations, which\ divides the family into two major subclasses, alpha and beta, each with a\ number of members PUBMED:1320430. Due to their ubiquity and overlapping tissue distributions, it has proved difficult to elucidate the functions of individual connexin isoforms. To circumvent this problem, particular connexin-encoding genes have been subjected to targeted-disruption in mice, and the phenotype of the resulting animals investigated. Around half the connexin isoforms have been investigated in this manner PUBMED:9861669. Further insight into the functional roles of connexins has come from the discovery that a number of human diseases are caused by mutations in connexin genes. For instance, mutations in Cx32 give rise to a form of inherited peripheral neuropathy called X-linked dominant Charcot-Marie-Tooth disease PUBMED:7570999. Similarly, mutations in Cx26 are responsible for both autosomal recessive and dominant forms of nonsyndromic deafness, a disorder characterised by hearing loss, with no apparent effects on other organ systems.

    \ \

    This entry represents the cysteine rich domain of the connexins.

    \ ' '10415' 'IPR019571' '\

    Involucrin PUBMED:1359382, PUBMED:8277848 is a highly reactive, soluble, transglutaminase substrate protein present in keratinocytes of epidermis and other stratified squamous epithelia. Involucrin first appears in the cell cytosol, but ultimately becomes cross-linked to membrane proteins by transglutaminase thus helping in the formation of an insoluble envelope beneath the plasma membrane PUBMED:8098344 functioning as a glutamyl\ donor during assembly of the cornified envelope.

    Structurally, involucrin consists of a conserved region of about 75 amino acid\ residues followed by two extremely variable length segments that contain\ glutamine-rich tandem repeats. The glutamine residues in the tandem repeats\ are the substrate for the transglutaminase in the cross-linking reaction. The\ total size of the protein varies from 285 residues (in dog) to 835 residues\ (in orangutan).

    \

    This entry represents the three N-terminal beta strands of involucrin, a protein present in keratinocytes of epidermis and other stratified squamous epithelia. Apigenin is a plant-derived flavonoid that has significant promise as a skin cancer chemopreventive agent. Apigenin has been found to suppress normal human keratinocyte differentiation, and this is associated with reduced cell proliferation without apoptosis PUBMED:16982614. The downstream part of the protein is represented by .

    \ ' '10416' 'IPR000426' '\

    The proteasome (or macropain) () PUBMED:7682410, PUBMED:2643381, PUBMED:1317508, PUBMED:7697118, PUBMED:8882582 is a eukaryotic and\ archaeal multicatalytic proteinase complex that seems to be involved in\ an ATP/ubiquitin-dependent nonlysosomal proteolytic pathway. In eukaryotes the\ proteasome is composed of about 28 distinct subunits which form a highly\ ordered ring-shaped structure (20S ring) of about 700 kDa.\ Most proteasome subunits can be classified, on the basis of sequence\ similarities, into two groups, alpha (A) and beta (B).

    \

    This family contains the alpha subunit sequences which range from 210 to 290 amino acids. These sequences are classified as non-peptidase homologues in MEROPS peptidase family T1 (clan PB(T)).

    \ ' '10417' 'IPR019572' '\

    Ubiquitin-activating enzyme (E1 enzyme) activates ubiquitin by first adenylating its C-terminal glycine residue with ATP and thereafter linking this residue to the side chain of a cysteine residue in E1, yielding a ubiquitin-E1 thiolester and free AMP. Later the ubiquitin moiety is transferred to a cysteine residue on one of the many forms of ubiquitin-conjugating enzymes (E2) PUBMED:1986373. This domain carries the last of five conserved cysteines that is part of the active site of the enzyme, responsible for ubiquitin thiolester complex formation, the active site being represented by the sequence motif PICTLKNFP PUBMED:11004499. Not all proteins in this entry contain a functional active site.

    \ ' '10419' 'IPR018940' '\

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome PUBMED:12762045, PUBMED:15922593, PUBMED:12932732. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    \

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta) PUBMED:12762045.

    \

    This domain is found in the centre of the beta subunits of Elongation factor-1. More information about these proteins can be found at Protein of the Month: Elongation Factors.

    \ \ \ ' '10420' 'IPR019574' '\

    This entry describes the G subunit (one of 14 subunits, A to N) of the NADH-quinone oxidoreductase complex I which generally couples NADH and ubiquinone oxidation/reduction in bacteria and mammalian mitochondria while translocating protons, but may act on NADPH and/or plastoquinone in cyanobacteria and plant chloroplasts. This family does not contain related subunits from formate dehydrogenase complexes.

    \ \ \

    This entry represents the iron-sulphur binding domain of the G subunit.

    \

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as an NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '10421' 'IPR019575' '\

    This entry describes the F subunit of complexes that resemble NADH-quinone oxidoreductases. The electron acceptor is a quinone, ubiquinone, in mitochondria and most bacteria, including Escherichia coli, where the recommended gene symbol is nuoF. This family does not have any members in chloroplast or cyanobacteria, where the quinone may be plastoquinone and NADH may be replaced by NADPH, nor in Methanosarcina, where NADH is replaced by F420H2.

    \

    This entry represents the iron-sulphur binding domain of the F subunit.

    \

    NADH:ubiquinone oxidoreductase (complex I) () is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) PUBMED:1470679. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as an NADH-plastoquinone oxidoreductase), archaea PUBMED:10940377, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins PUBMED:18394423. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters PUBMED:18563446. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes PUBMED:18563446, PUBMED:17854760. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I PUBMED:18982432.

    \ \ ' '10422' 'IPR019576' '\ \ Pyridoxamine 5\'-phosphate oxidase () is an FMN flavoprotein involved in the de novo synthesis\ of pyridoxine (vitamin B6) and pyridoxal phosphate. It oxidizes pyridoxamine-5-P (PMP) and pyridoxine-5-P\ (PNP) to pyridoxal-5-P. The sequences of the enzyme from bacterial (genes pdxH or fprA) PUBMED:1356963 and\ fungal (gene PDX3) PUBMED:7896706 sources show that this protein has been highly conserved throughout\ evolution. PdxH is evolutionarily related PUBMED:8586283 to one of the enzymes in the phenazine biosynthesis\ protein pathway, phzD (also known as phzG).\ \

    This entry represents one of the two dimerisation regions of the protein, located at the edge of the dimer interface at the C terminus. It comprises the last three beta strands, S6, S7 and S8, along with the last three residues to the end. In , S6 runs from residues 178-192, S7 from 200-206 and S8 from 211-215. The extended loop of residues 167-177 may well be involved in the pocket formed between the two dimers that positions the FMN molecule PUBMED:10903950.

    \ ' '10423' 'IPR019577' '\

    This entry represents the calcium-binding domain found in SPARC (Secreted Protein Acidic and Rich in Cysteine) and Testican (also known as SPOCK; or SParc/Osteonectin, Cwcv and Kazal-like domains) proteins. SPARC proteins are down-regulated in various tumours and may have a tumour-suppressor function PUBMED:18459035, PUBMED:17325739. Testican-3 appears to be a novel regulator that reduces the activity of matrix metalloproteinase (MMP) in adult T-cell leukemia (ATL) PUBMED:19144404.

    \

    This cysteine-rich domain is responsible for the anti-spreading activity of human urothelial cells. This extracellular calcium-binding domain is rich in alpha-helices and contains two EF-hands that each coordinate one Ca2+ ion, forming a helix-loop-helix structure that not only drives the conformation of the protein but is also necessary for biological activity. The anti-spreading activity was dependent on the coordination of Ca2+ by a Glu residue at the Z position of EF-hand 2 PUBMED:16121393.

    \ ' '10424' 'IPR018891' '\

    This family of proteins was identified as an abortive infection phage resistance protein often found in restriction modification system operons PUBMED:18346280.

    \ ' '10425' 'IPR018310' '\

    This entry represents the Z1 domain of unknown function that is found in a group of putative endonucleases. This domain is found associated with a helicase domain of superfamily type II PUBMED:18346280.

    \ ' '10427' 'IPR019579' '\

    This entry represents proteins with no known function. However, one of the members, , is annotated as an EF-hand family protein.

    \ ' '10428' 'IPR019580' '\

    This entry represents the interacting site for U6-snRNA, which is part of U4/U6.

    \

    The U5 tri-snRNP complex of the spliceosome is a prime candidate for the role of cofactor in the spliceosome\'s RNA core. The essential spliceosomal protein Prp8 interacts with U5 and U6 snRNAs and with specific pre-mRNA sequences that participate in catalysis. This close association with crucial RNA sequences, together with extensive genetic evidence, suggests that Prp8 could directly affect the function of the catalytic core, perhaps acting as a splicing cofactor PUBMED:16431982.

    \ ' '10429' 'IPR019581' '\

    The essential spliceosomal protein Prp8 interacts with U5 and U6 snRNAs and with specific pre-mRNA sequences that participate in catalysis PUBMED:16431982. This close association with crucial RNA sequences, together with extensive genetic evidence, suggests that Prp8 could directly affect the function of the catalytic core, perhaps acting as a splicing cofactor PUBMED:16431982.

    \ ' '10430' 'IPR019582' '\

    The large RNA-protein complex of the spliceosome catalyses pre-mRNA splicing. One of the most conserved core proteins is the pre-mRNA-processing-splicing factor 8 (Prp8), which occupies a central position in the catalytic core of the spliceosome and has been implicated in several crucial molecular rearrangements that occur there. It has recently come under the spotlight for its role in the inherited human disease retinitis pigmentosa PUBMED:15840809. The RNA-recognition motif of Prp8 is highly conserved and provides a possible RNA-binding centre for the 5-prime splice site (SS), branch point (BP), or 3-prime SS of pre-mRNA, which are known to contact Prp8. \ \ The most conserved regions of an RNA-recognition motif (RRM) are defined as the RNP1 and RNP2 sequences. Recognition of RNA targets can also be modulated by a number of other factors, most notably the two loops beta1-alpha1, beta2-beta3 and the amino acid residues C-terminal to the RNP2 domain PUBMED:16431982.

    \ ' '10431' 'IPR018892' '\

    This is the highly conserved motif GRKIxxxxxRRKx of nucleoporins that plays a critical and unique role in the nuclear import of retro-transposons in both yeasts and higher organisms. It would appear that the arginine residues at positions 2 and 9-10 constitute a bipartite nuclear localisation signal, with two basic peptide motifs separated by an interchangeable spacer sequence, that is crucial for the retro-transposon activity PUBMED:17615301.

    \ ' '10432' 'IPR019583' '\

    This domain is found in higher eukaryotes between the second and third PDZ domains, , of glutamate receptor like proteins. Its exact function is not known.

    \ ' '10433' 'IPR019584' '\

    Members of this family are usually short proteins (less than 300 residues) with the motif C-XX-C- separated from a more C-terminal cysteine-rich motif HX-C(P)X-C-X4-G-R by a variable region of usually 25-30 (hydrophobic) residues. This domain was first identified in LPS-induced tumour necrosis factor-alpha factor (LITAF), which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS) PUBMED:17408970. This particular cysteine-rich signature has also been found to be characteristic of intracellular Zn2+-binding domains, suggesting a role in the regulation of gene transcription. The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure PUBMED:11731489.

    \ ' '10434' 'IPR019585' '\

    This entry represents the regulatory subunit RPN7 (known as the non-ATPase regulatory subunit 6 in higher eukaryotes) of the 26S proteasome. This entry also matches the evolutionarily related subunit 1 of the COP9 signalosome complex (CSN) from Arabidopsis PUBMED:11742986.

    \

    The 26S proteasome plays a major role in ATP-dependent degradation of ubiquitinated proteins. Substrate specificity is conferred by the regulatory particle (RP), which can dissociate into stable lid and base subcomplexes. The regulatory subunit RPN7 is one of the lid subunits of the 26S proteasome and has been shown in Saccharomyces cerevisiae (Baker\'s yeast) to be required for structural integrity PUBMED:15102831.

    \

    The COP9 signalosome is a conserved protein complex composed of eight subunits; individual subunits of the complex have been linked to various signal transduction pathways leading to gene expression and cell cycle control PUBMED:11114242. The overall organisation and the amino acid sequences of the COP9 signalosome subunits resemble the lid subcomplex of the 19S regulatory particle for the 26S proteasome PUBMED:9741626. COP9 subunit 1 (CSN1 or GPS1) of the COP9 complex is an essential subunit of the complex with regard to both structural integrity and functionality. The N-terminal region of subunit 1 (CSN1-N) can inhibit c-fos expression from either a transfected template or a chromosomal transgene (fos-lacZ), and may contain the activity domain that confers most of the repression functions of CSN1. The C-terminal region of subunit 1 (CSN1-C) allows integration of the protein into the COP9 signalosome.

    \ \ \ \ \ ' '10436' 'IPR019587' '\

    This family contains polyketide cyclases/dehydrases, which are enzymes involved in polyketide synthesis. It also includes other proteins of the START superfamily PUBMED:11276083.

    \ ' '10437' 'IPR016582' '\

    This entry represents a group of predicted D-(-)-3-hydroxybutyrate oligomer hydrolases (also known as 3HB-oligomer hydrolase), which function in the degradation of poly-3-hydroxybutyrate (PHB). These enzymes catalyse the hydrolysis of D(-)-3-hydroxybutyrate oligomers (3HB-oligomers) into 3HB-monomers PUBMED:16233278, PUBMED:15170237.

    \ ' '10438' 'IPR019588' '\

    This entry represents the proline-rich region of metabotropic glutamate receptor proteins that bind Homer-related synaptic proteins.

    \ \

    Metabotropic glutamate receptors function as receptors for glutamate. The activity of this receptor is mediated by a G-protein that activates a phosphatidylinositol-calcium second messenger system.

    \ \ \

    The Homer proteins form a physical tether linking mGluRs with the inositol trisphosphate receptors (IP3R) that appears to be due to the proline-rich Homer ligand (PPXXFr). Activation of PI turnover triggers intracellular calcium release PUBMED:9808459. Metabotropic glutamate receptor (mGluR) function is altered in the mouse model of human Fragile X syndrome mental retardation, a disorder caused by loss-of-function mutations in the Fragile X mental retardation gene Fmr1. Homer 3 (and to a lesser extent Homer 1b/c) has been shown to form a multimeric complex with mGlu1a and the IP3 receptor, indicating that Homers may play a role in the localisation of receptors to their signalling partners PUBMED:18184796.

    \ \ ' '10439' 'IPR019589' '\

    This entry represents the CRA (or CT11-RanBPM) domain, which is a protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi) and which is found in Ran-binding proteins such as Ran-binding protein 9 (RanBP9 or RanBPM) and RanBP10. RanBPM is a scaffolding protein important in regulating cellular function in both the immune system and the nervous system, and may act as an adapter protein to couple membrane receptors to intracellular signalling pathways. This domain is at the C terminus of the proteins and is the binding domain for the CRA motif, which comprises approximately 100 amino acids at the C terminus of RanBPM. It was found to be important for the interaction of RanBPM with fragile X mental retardation protein (FMRP), but its functional significance has yet to be determined PUBMED:15381419.

    \ \ \ \ \ ' '10440' 'IPR019590' '\

    The residues upstream of this domain are the probable palmitoylation sites, particularly two cysteines. The domain has a putative PEST site at the very start that seems to be responsible for poly-ubiquitination PUBMED:8755249. PEST domains are polypeptide sequences enriched in proline (P), glutamic acid (E), serine (S) and threonine (T) that target proteins for rapid destruction. The whole domain, in conjunction with a C-terminal domain of the longer protein, is necessary for dimerisation of the whole protein PUBMED:18215622.

    \ ' '10441' 'IPR019591' '\

    This entry represents ATPases involved in plasmid partitioning PUBMED:2149583. It also contains the cytosolic Fe-S cluster assembling factors NBP35 and CFD1, which are required for biogenesis and export of both ribosomal subunits, probably through assembling the ISCs in RLI1, a protein which performs rRNA processing and ribosome export PUBMED:15728363, PUBMED:15667273, PUBMED:15660135.

    \ \ \ ' '10442' 'IPR014491' '\

    Thin aggregative fimbriae, also known as curli fibres (curli; Tafi), are cell-surface protein polymers found in Salmonella typhimurium and Escherichia coli that mediate interactions important for host and environmental persistence, development of biofilms, motility, colonisation and invasion of cells, and conjugation PUBMED:9457880. Four general assembly pathways for different fimbriae have been proposed, one of which is extracellular nucleation-precipitation (ENP), which differs from the others in that fibre growth occurs extracellularly. Thin aggregative fimbriae are the only fimbriae dependent on the ENP pathway. Tafi were first identified in Salmonella spp. and the controlling operon termed agf; however, subsequent isolation of the homologous operon in E. coli led to its being called csg. Tafi are known as curli because, in the absence of extracellular polysaccharides, their morphology appears curled; however, when expressed with such polysaccharides their morphology appears as a tangled amorphous matrix. The gene agfC is transcribed at low levels; its product is localised to the periplasm in a mature form and, in combination with AgfE, is important for AgfA extracellular assembly, which facilitates the synthesis of Tafi. The genes involved in Tafi production are organised into two adjacent divergently transcribed operons, agfBAC and agfDEFG, both of which are required for biosynthesis and assembly PUBMED:17379722.

    \ \ \ ' '10443' 'IPR019592' '\

    This entry represents a group of proteins often found in Actinomycetes species, clustered with signal peptidase and/or RNase HII.

    \ ' '10444' 'IPR019593' '\

    This entry represents spore coat proteins Z (also known as CotZ) and Y (also known as CotY). They belong to a cysteine-rich spore coat family and are necessary for the assembly of an intact exosporium.

    \ ' '10445' 'IPR019594' '\

    This entry, sometimes called the S1 domain, is the luminal domain just upstream of the first, M1, transmembrane region of transmembrane ion-channel proteins, and binds L-glutamate and glycine PUBMED:10465381, PUBMED:8428958. It is found in association with .

    \ ' '10446' 'IPR018893' '\

    Fimbriae are cell-surface protein polymers of, for example, Escherichia coli and Salmonella spp. that mediate interactions important for host and environmental persistence, development of biofilms, motility, colonisation and invasion of cells, and conjugation. Four general assembly pathways for different fimbriae have been proposed, one of which is extracellular nucleation-precipitation (ENP), which differs from the others in that fibre growth occurs extracellularly. Thin aggregative fimbriae (Tafi) are the only fimbriae dependent on the ENP pathway. Tafi were first identified in Salmonella spp. and the controlling operon termed agf; however, subsequent isolation of the homologous operon in E. coli led to its being called csg. Tafi are known as curli because, in the absence of extracellular polysaccharides, their morphology appears curled; however, when expressed with such polysaccharides their morphology appears as a tangled amorphous matrix PUBMED:17379722. CsgF is one of three putative curli assembly factors appearing to act as a nucleator protein. Unlike eukaryotic amyloid formation, curli biogenesis is a productive pathway requiring a specific assembly machinery PUBMED:11823641.

    \ ' '10447' 'IPR019595' '\

    This entry represents a putative haem-iron utilisation family, as many members are annotated as being pyridoxamine 5\'-phosphate oxidase-related, FMN-binding; however this could not be confirmed.

    \ ' '10448' 'IPR018894' '\

    The function of this family is unknown. Members all come from Burkholderia spp. A number of proteins in this entry, including , are annotated as serine/threonine-protein kinases.

    \ ' '10449' 'IPR018895' '\

    This family of short proteins has no known function.

    \ ' '10450' 'IPR019596' '\

    This entry contains phage tail tube proteins related to the Bacteriophage Mu protein PUBMED:9714755. Bacteriophage Mu has an icosahedral head and a contractile tail. The tail is composed of an outer sheath and an inner tube.

    \ ' '10452' 'IPR017557' '\

    Malonate decarboxylase, like citrate lyase, has a unique acyl carrier protein subunit with a prosthetic group derived from, and distinct from, coenzyme A. Members of this protein family are the phosphoribosyl-dephospho-CoA transferase specific to the malonate decarboxylase system. This enzyme can also be designated holo-ACP synthase (). The corresponding component of the citrate lyase system, CitX, shows little or no sequence similarity to this family.

    \ ' '10453' 'IPR018288' '\

    This entry represents the FpoO subunit of membrane-bound multi-subunit F420H2 dehydrogenase, which oxidises the reduced coenzyme F420H2 to coenzyme F420 and feeds the electrons via an FeS cluster into an energy-conserving electron transport chain PUBMED:9933933, PUBMED:15168610. This enzyme plays a role in the methanogenic pathway in methanogenic archaea. Reduced coenzyme F420H2 is the major cytoplasmic electron carrier of methanogens and a reversible hydride donor, much like NADH PUBMED:10940377. \ \

    \ \

    Where CoB-S-S-CoM (the heterosulphide of 2-mercaptoethanesulphonate and 7-mercaptoheptanoylthreonine phosphate) is the terminal electron acceptor of the methanogenic pathway, and is reduced with the concomitant generation of a transmembrane proton potential and ATP synthesis.

    \

    The FpoO subunit of F420H2 dehydrogenase probably participates in the reduction of methanophenazine, where it acts as a special mechanism for the reduction of the methanogenic cofactor PUBMED:10751389.

    \ ' '10454' 'IPR019597' '\

    Ehb (energy-converting hydrogenase B) is a methanogenic archaeal enzyme that functions in one of the metabolic pathways involved in methanol reduction to methane. This entry contains subunit P of Ehb.

    \ ' '10455' 'IPR018897' '\

    The thin pilus of plasmid R64 belongs to the type IV family and is required for liquid matings. PilI is one of 14 genes that have been identified as being involved in biogenesis of the R64 thin pilus PUBMED:9171405.

    \ ' '10456' 'IPR018898' '\

    Entry exclusion (Eex) is a process which prevents redundant transfer of DNA between donor cells. TraS is a protein involved in Eex. It blocks redundant conjugative DNA synthesis and transport between donor cells, and it is suggested that TraS interferes with a signalling pathway that is required to trigger DNA transfer PUBMED:17259615. TraS on the recipient cell is known to form an interaction with TraG on the donor cell PUBMED:17259615.

    \ ' '10457' 'IPR019598' '\

    Universal stress protein B (UspB) in Escherichia coli is a 14 kDa protein which is predicted to be an integral membrane protein. Overexpression of UspB results in cell death in stationary phase, and mutants of UspB are sensitive to ethanol exposure during stationary phase PUBMED:9829921.

    \ ' '10458' 'IPR018899' '\

    This is a family of conjugative transposon proteins.

    \ ' '10459' 'IPR018900' '\

    Curli are a class of highly aggregated surface fibres that are part of a complex extracellular matrix. They promote biofilm formation in addition to other activities. CsgE is a non-structural protein involved in curli biogenesis PUBMED:11823641. CsgE forms an outer membrane complex with the curli assembly proteins CsgG and CsgF PUBMED:16420357.

    \ ' '10460' 'IPR018901' '\

    CotE is a morphogenic protein that is required for the assembly of the outer coat of the endospore PUBMED:3139490 and spore resistance to lysozyme PUBMED:3139490. CotE also regulates the expression of cotA, cotB, cotC and other genes encoding spore outer coat proteins PUBMED:3139490. The timing of cotE expression has been shown in Bacillus subtilis to affect spore coat morphology but not lysozyme resistance PUBMED:17172339.

    \ ' '10461' 'IPR018902' '\

    This entry represents both UPF0573 and UPF0605 families. Both these families of proteins have no known function.

    \ ' '10462' 'IPR018903' '\

    This is a family of proteins of unknown function. The family is rich in proline residues.

    \ ' '10463' 'IPR018904' '\

    This is a family of proteins with no known function. The family is rich in proline residues.

    \ ' '10464' 'IPR019599' '\

    This domain has been named NEW1 but its actual function is not known. It is found on proteins which are bacterial galactosidases PUBMED:15285616. The domain is associated with , a putative Ig-containing domain.

    \ ' '10465' 'IPR018905' '\

    This domain has been named NEW3. Its function is not known, but it is found on proteins which are bacterial galactosidases PUBMED:15285616. The domain is associated with , a novel putative carbohydrate binding module found at the N terminus of glycosyl hydrolases.

    \ ' '10466' 'IPR018470' '\

    This is a bacterial family of periplasmic proteins that are thought to function in high-affinity Fe2+ transport.

    \ ' '10467' 'IPR018906' '\

    The DisA protein is a bacterial checkpoint protein that dimerises into an octameric complex. The protein consists of three distinct domains. The first, N-terminal region, from residues 1-145, is globular and is represented by ; the next region, residues 146-289, is this domain, which consists of an elongated bundle of three alpha helices (alpha-6, alpha-10 and alpha-11), one side of which carries an additional three helices (alpha-7 to alpha-9), thus forming a spine-like linker between domains 1 and 3. The C-terminal residues of domain 3 (), represent the specific DNA-binding domain. The octameric complex thus has structurally linked nucleotide-binding and DNA-binding HhH domains, and the nucleotide-binding domains are bound to a cyclic di-adenosine phosphate such that DisA is a specific di-adenylate cyclase. The di-adenylate cyclase activity is strongly suppressed by binding to branched DNA, but not to duplex or single-stranded DNA, suggesting a role for DisA as a monitor of the presence of stalled replication forks or recombination intermediates via DNA structure-modulated c-di-AMP synthesis PUBMED:18439896.

    \ ' '10468' 'IPR019600' '\

    This entry represents bacterial proteins that are involved in the uptake of the iron source hemin PUBMED:1425573.

    \ ' '10469' 'IPR019601' '\

    This entry represents the C-terminal degradation domain of oxoglutarate and iron-dependent oxygenase (Ofd1), the domain being conserved from yeasts to humans. Ofd1 is a prolyl 4-hydroxylase-like 2-oxoglutarate-Fe(II) dioxygenase that accelerates the degradation of Sre1N (the N-terminal transcription factor domain of Sre1) in the presence of oxygen PUBMED:19158663. Yeast Sre1 is the orthologue of mammalian sterol regulatory element binding protein (SREBP), and it responds to changes in oxygen-dependent sterol synthesis as an indirect measure of oxygen availability. However, unlike the prolyl 4-hydroxylases that regulate mammalian hypoxia-inducible factor, Ofd1 uses multiple domains to regulate Sre1N degradation by oxygen; the Ofd1 N-terminal dioxygenase domain is required for oxygen sensing and this Ofd1 C-terminal domain accelerates Sre1N degradation in yeasts PUBMED:18418381.

    \ ' '10470' 'IPR018907' '\

    This C-terminal domain of spindle-body-associated protein Sfi1 plays an important role in bridge splitting during bipolar spindle assembly, and this separation event possibly requires interaction with integral components of the nuclear envelope, such as the Mps2-Bbp1 complex PUBMED:17392514. Centrally to this domain is a region carrying centrin-binding repeats with repeating units containing tryptophan, .

    \ ' '10471' 'IPR018908' '\

    This family of proteins has no known function. Many members are annotated as potential transmembrane proteins.

    \ ' '10472' 'IPR019602' '\

    Viral mRNA capping enzymes catalyse the first two reactions in the mRNA cap formation pathway. They are a heterodimer consisting of a large and small subunit.

    \

    This domain is the N terminus of the large subunit of the viral mRNA capping enzyme, and carries both the ATPase and the guanylyltransferase activities of the enzyme. The guanylyltransferase enzymatic region runs from residue 242 (leucine) to residue 273 (arginine) PUBMED:8227060, the core of the active site being the lysine residue at 260 PUBMED:8662635. The ATPase activity is at the very N-terminal part of the domain PUBMED:17989694.

    \ ' '10474' 'IPR019603' '\

    This entry represents a family of short yeast proteins. Tom5 is one of three very small translocases of the mitochondrial outer membrane. Tom5 links mitochondrial preprotein receptors to the general import pore PUBMED:9217162. Although Tom5 has allegedly been identified in vertebrates, this could not be confirmed.

    \ ' '10475' 'IPR019604' '\

    A photosynthetic reaction-centre complex is found in certain green sulphur bacteria such as Chlorobium vibrioforme, which are anaerobic photoautotrophic organisms. The primary electron donor is P840, a probable B-Chl a dimer, and the primary electron acceptor is a B-Chl monomer. Also on the donor side, c-type cytochromes are known to function as electron donors to photo-oxidised P840. This family is thus the secondary endogenous donor of the photosynthetic reaction-centre complex and is a membrane-bound cytochrome containing a single haem group.

    \ ' '10476' 'IPR019605' '\

    The misato protein contains three distinct, conserved domains, segments I, II and III and is involved in the regulation of mitochondrial distribution and morphology PUBMED:17349998. This entry represents misato segment II.

    \ \

    Segments I and III are common to tubulins (), but segment II aligns with myosin heavy chain sequences from Drosophila melanogaster (Fruit fly, ), rabbit (), and human.

    \ \ \

    Segment II of misato is a major contributor to its greater length compared with the various tubulins. The most significant sequence similarities to this 54-amino acid region are from a motif found in the heavy chains of myosins from different organisms. A comparison of segment II with the vertebrate myosin heavy chains reveals that it is homologous to a myosin peptide in the hinge region linking the S2 and LMM domains. Segment II also contains heptad repeats which are characteristic of the myosin tail alpha-helical coiled-coils PUBMED:9144213.

    \ \ ' '10477' 'IPR018909' '\

    This is a carbohydrate binding domain which has been shown in Schizosaccharomyces pombe (Fission yeast) to be required for septum localisation PUBMED:18466295.

    \ ' '10478' 'IPR019606' '\

    The GerMN domain is a region of approximately 100 residues that is found, duplicated, in the Bacillus GerM protein and is implicated in both sporulation and spore germination. The domain is found in a number of different bacterial species both alone and in association with other domains such as , Gmad1 and Gmad2. It is predicted to have a novel alpha-beta fold.

    \ ' '10479' 'IPR018910' '\

    The Gmad1 domain is found associated with , in bacterial spore formation. It is predicted to have a beta-propeller fold and to have a passive binding role rather than a catalytic function owing to the low number of conserved hydrophilic residues.

    \ ' '10480' 'IPR018911' '\

    This domain is found linked to in some bacterial proteins. It is predicted to contain an immunoglobulin-like all-beta fold.

    \ ' '10481' 'IPR018912' '\

    This is a family of hypothetical bacterial proteins encoded in the vicinity of molybdenum ABC transporter ATP-binding gene products MobA, MobB and MobC. However, the function could not be confirmed.

    \ ' '10482' 'IPR019607' '\

    This domain is conserved in fungi and might be a zinc-finger domain as it contains three conserved Cs and an H in the C-x8-C-x5-C-x3-H conformation typical of a zinc-finger.

    \ ' '10483' 'IPR018913' '\

    This domain is found in phage from a number of different bacteria, including Listeria phage A118 (Bacteriophage A118). It is purported to be a long tail fibre protein, but this could not be confirmed.

    \ ' '10484' 'IPR018914' '\

    All the members of this family are uncharacterised proteins, but the environment in which they are found on the bacterial genome suggests a function as a glucose-6-phosphate isomerase (). This could not, however, be confirmed.

    \ ' '10485' 'IPR018915' '\

    This domain is found in bacteriophage and is thought to have a gp45 function within the phage tail-fibre system.

    \ ' '10486' 'IPR018916' '\

    This is a hypothetical protein family homologous to Lmo2305 in Listeria phage A118 (Bacteriophage A118) systems.

    \ ' '10487' 'IPR018917' '\

    All the members of this very small family of very short proteins are derived from bacteriophages of the SA bacteriophages 11, Mu50B system and from the Staphylococcal_phi-Mu50B-like_prophages subsystem. All members are hypothetical proteins.

    \ ' '10488' 'IPR018918' '\

    This is a family of proteins found in bacteriophages, particularly of the SA bacteriophages 11, Mu50B family, that are homologous to phi-ETA orf16.

    \ ' '10489' 'IPR019608' '\

    Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll \'a\' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.

    \ \ \

    The photosynthetic reaction centres (RCs) of aerotolerant organisms contain a heterodimeric core, built up of two strongly homologous polypeptides, each of which contributes five transmembrane peptide helices to hold a pseudo-symmetric double set of redox components. Two molecules of PscD are housed within a subunit. PscD may be involved in stabilising the PscB component since it is found to co-precipitate with FMO (Fenna-Matthews-Olson BChl a-protein) and PscB. It may also be involved in the interaction with ferredoxin PUBMED:11687219.

    \ ' '10490' 'IPR018919' '\

    A role of this family in UDP-N-acetylenolpyruvoylglucosamine reductase, as MurB, could not be confirmed.

    \ ' '10491' 'IPR019609' '\

    The trypanosome parasite expresses these proteins to evade the immune response PUBMED:9574925.

    \ ' '10492' 'IPR019610' '\

    The CDGSH iron sulphur domain proteins are a group of iron-sulphur (Fe-S) cluster proteins that contain a unique 39 amino acid CDGSH domain [C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H].

    \

    The CDGSH iron sulphur domain protein (also referred to as mitoNEET) is an integral membrane protein located in the outer mitochondrial membrane and whose function may be to transport iron into the mitochondria PUBMED:17766440. Iron in turn is essential for the function of several mitochondrial enzymes.

    \ \

    This entry represents the N-terminal region of the mitoNEET and Miner-type proteins that carry a CDGSH-type cluster-binding domain () that coordinates a redox-active 2Fe-2S cluster.

    \ \

    In the outer mitochondrial membrane (OMM), the CDGSH 2Fe-2S-containing domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by the N-terminal domain found in higher vertebrates PUBMED:17584744, PUBMED:17766440, PUBMED:17376863. The whole protein regulates oxidative capacity and may function in electron transfer, for instance in redox reactions with metabolic intermediates, cofactors and/or proteins localised at the OMM.

    \ \ ' '10493' 'IPR018920' '\

    The Wss (WXG100 protein secretion system) in Staphylococcus aureus seems to be encoded by a locus of eight ORFs, called ess (eSAT-6 secretion system) PUBMED:15657139. This locus encodes, amongst several other proteins, EssA, a protein predicted to possess one transmembrane domain. Due to its predicted membrane location and its absolute requirement for WXG100 protein secretion, it has been speculated that EssA could form a secretion apparatus in conjunction with YukC and YukAB. Proteins homologous to EssA, YukC, EsaA and YukD were absent from mycobacteria PUBMED:16911044.

    \ ' '10494' 'IPR019611' '\

    The function of this domain, from proteins alternatively referred to as EutP, is not known PUBMED:15516577.

    \ ' '10495' 'IPR018921' '\

    The human pathogen Staphylococcus aureus secretes EsxA and EsxB, ESAT-6-like proteins, across the bacterial envelope. Staphylococcal esxA and esxB are clustered with six other genes and some of these are required for synthesis or secretion of EsxA and EsxB. Mutants that failed to secrete EsxA and EsxB displayed defects in the pathogenesis of S. aureus murine abscesses, suggesting that this specialised secretion system might be a general strategy of human bacterial pathogenesis PUBMED:15657139.

    \ ' '10496' 'IPR018922' '\

    The NADH dehydrogenase I complex shuttles electrons from an unknown electron donor, via FMN and iron-sulphur (Fe-S) centres, to quinones in the respiratory and/or the photosynthetic chain. The immediate electron acceptor for the enzyme in plants is believed to be plastoquinone. The NADH dehydrogenase I complex couples the redox reaction to proton translocation, and thus conserves the redox energy in a proton gradient.

    \ \

    This entry represents subunit M of the NADH dehydrogenase I complex in cyanobacteria and plant chloroplasts PUBMED:15608332.

    \ ' '10497' 'IPR019612' '\

    This entry represents a putative tail-knob protein from Listeria phage A118.

    \ ' '10498' 'IPR018923' '\

    This phage protein family is of unknown function but is expressed from within a cluster of tail- and base plate-producing genes PUBMED:14507382.

    \ ' '10499' 'IPR018924' '\

    This family is made up of members from various Burkholderia spp. The function is unknown.

    \ ' '10500' 'IPR018925' '\

    This family of small, highly conserved proteins comes from a subset of Firmicute species. Its putative function is as a phage terminase small subunit.

    \ ' '10501' 'IPR018926' '\

    This entry is represented by the major tail subunit protein, gp23, of Bacteriophage A118.

    \ ' '10502' 'IPR019613' '\

    Proteins in this entry form part of the nickel-transport complex NikMNQO in prokaryotes. CbiMNQO (cobalt-transport) and NikMNQO are the most widespread groups of microbial transporters for nickel and cobalt ions and are unusual uptake systems as they consist of two transmembrane components (M and Q), a small membrane-bound component (N) and an ATP-binding protein (O), but no extra-cytoplasmic solute-binding protein. The cobalt small membrane-bound component CbiN is not similar to NikN or NikL at the sequence level. NikM, represented by this entry, is the substrate-specific component of the complex and is a seven-transmembrane region protein PUBMED:16352848. The CbiMNQO and NikMNQO systems form part of the coenzyme B12 biosynthesis pathway PUBMED:18174128. The CbiM protein is modelled by .

    \ ' '10503' 'IPR018927' '\

    The toxin-coregulated pilus (TCP) of Vibrio cholerae and the soluble TcpF protein that is secreted via the TCP biogenesis apparatus are essential for intestinal colonisation in cholera. TcpQ is part of an outer membrane complex of the TCP biogenesis apparatus, composed of TcpC and TcpQ, and is required for proper localisation of TcpC to the outer membrane.

    \ ' '10504' 'IPR019614' '\

    Members of this family are S-adenosylmethionine-dependent methyltransferases from gamma-proteobacterial species. The diversity in the roles of methylation is matched by the almost bewildering number of methyltransferase enzymes that catalyse the methylation reaction. Although several classes of methyltransferase enzymes are known, the great majority of methylation reactions are catalysed by the S-adenosylmethionine-dependent methyltransferases. SAM (S-adenosylmethionine, also known as AdoMet) is well known as the methyl donor for the majority of methyltransferases that modify DNA, RNA, histones and other proteins, dictating replicational, transcriptional and translational fidelity, mismatch repair, chromatin modelling, epigenetic modifications and imprinting PUBMED:16545107.

    \ ' '10505' 'IPR019615' '\

    This entry represents proteins of unknown function that appear to be restricted to Bacillus spp.

    \ ' '10506' 'IPR019616' '\

    This entry represents proteins annotated as YCF54 found in Viridiplantae. The function is not known.

    \ ' '10507' 'IPR019617' '\

    This entry represents bacterial uncharacterised proteins.

    \ ' '10508' 'IPR019618' '\

    This is a bacterial family of proteins that are required for the formation of functionally normal spores. Proteins in this family may be involved in establishing normal coat structure and/or permeability which could control the access of germinants to their receptor PUBMED:10715007.

    \ ' '10509' 'IPR019619' '\

    This entry represents a bacterial family of uncharacterised proteins.

    \ ' '10510' 'IPR019620' '\

    This entry represents uncharacterised bacterial proteins.

    \ ' '10511' 'IPR019621' '\

    This entry represents bacterial uncharacterised proteins.

    \ ' '10512' 'IPR019622' '\

    This entry represents the RNA polymerase I-specific transcription initiation factor RRN9.

    \ \

    Initiation of transcription of ribosomal DNA (rDNA) in yeast involves an interaction of upstream activation factor (UAF) with the upstream element of the promoter, to form a stable UAF-template complex. UAF, together with the TATA-binding transcription initiation factor protein (TBP), then recruits an essential core factor to the promoter, to form a stable pre-initiation complex PUBMED:9632758. \ \ This Rrn9 domain, which seems to be restricted to fungi, comprises the two highly conserved regions of the proteins that form one of the subunits of UAF, and appears to be the region responsible for the interaction with TBP. The family includes the Schizosaccharomyces pombe (Fission yeast) Arc1 protein, , which is found to be essential for the accumulation of condensin at kinetochores PUBMED:18362178.

    \ \ ' '10513' 'IPR019623' '\

    This conserved fungal family is an essential molecular chaperone in the endoplasmic reticulum. Molecular chaperones transiently interact with unfolded proteins to inhibit their self-aggregation and to support their folding and/or assembly. Rot1 is a general chaperone with some substrate specificity, its substrates being the structurally unrelated Kre5, Kre6, Big1 and Atg22, which are type I, type II, and polytopic membrane proteins. The dependencies of each on Rot1 do not share similarities. However, their folding does require BiP, and one of these proteins was simultaneously associated with both Rot1 and BiP. In addition, Rot1 may cooperate with BiP/Kar2 in the folding of Kre6 PUBMED:18508919.

    \ ' '10514' 'IPR019624' '\

    This entry represents glycoprotein UL40 from Human cytomegalovirus (HHV-5) (Human herpesvirus 5). HHV-5 has evolved mechanisms to interfere with a host\'s immune recognition, including targeting antigen presentation by MHC class I molecules. The signal sequence of the HHV-5 encoded UL40 polypeptide contains an HLA-E ligand identical with HLA-Cw*0304. The first 37 residues of UL40, including this ligand, are predicted to encode a signal peptide. The virus thus prevents the lysis by NK (natural killer) cells of the cell it has invaded PUBMED:10799855.

    \ ' '10515' 'IPR018473' '\

    This domain confers specific DNA-binding on Hermes transposase PUBMED:16041385.

    \ ' '10516' 'IPR019625' '\

    This entry represents a group of tightly conserved proteins from Enterobacteriaceae which are annotated as being biofilm-dependent modulation protein homologues. This entry includes Bdm, whose expression is reduced in biofilms and is repressed by a high salt concentration PUBMED:10498711.

    \ ' '10517' 'IPR019626' '\

    This repeat contains a highly conserved, characteristic sequence motif, KGG, that is recognised by plants and lower eukaryotes.\ \ Further downstream from this motif is a Walker A nucleotide-binding motif. YciG is expressed as part of a three-gene operon, yciGFE, and this operon is induced by stress and is regulated by RpoS, which controls the general stress response in E. coli. YciG was shown to be important for stationary-phase resistance to thermal stress and in particular to acid stress PUBMED:17293430.

    \ ' '10518' 'IPR019627' '\

    Members of this family are mainly Proteobacteria. The function is not known.

    \ ' '10520' 'IPR019629' '\

    This entry represents inner membrane proteins, many of which are YgjV proteins. The function is unknown.

    \ ' '10521' 'IPR019630' '\

    This family consists of proteins from Gammaproteobacteria species. Many members are annotated as being like the Escherichia coli protein YbaM.

    \ ' '10522' 'IPR019631' '\

    Myticin is a cysteine-rich peptide produced in three isoforms, A, B and C, by Mytilus galloprovincialis (Mediterranean mussel). Isoforms A and B show antibacterial activity against Gram-positive bacteria, while isoform B is additionally active against the fungus Fusarium oxysporum and a Gram-negative bacterium, Escherichia coli (streptomycin resistant strain D31) PUBMED:10491159. Myticin-prepro is the precursor peptide. The mature molecule, named myticin, consists of 40 residues, with four intramolecular disulphide bridges and a cysteine array in the primary structure different from that of previously characterised cysteine-rich antimicrobial peptides. In the precursor, the first 20 amino acids are a putative signal peptide, and the antimicrobial peptide sequence is followed by a 36-residue C-terminal extension. Such a structure suggests that myticins are synthesised as prepro-proteins that are then processed by various proteolytic events before storage in the haemocytes as the active peptide. Myticin precursors are expressed mainly in the haemocytes.

    \ ' '10523' 'IPR019632' '\

    Members of this family belong to the Alphaproteobacteria. The function of the family is not known.

    \ ' '10524' 'IPR019633' '\

    This entry represents proteins found in Gammaproteobacteria, including YciN from Escherichia coli. Their function is not known.

    \ ' '10525' 'IPR019634' '\

    This entry represents proteins found in plants, lower eukaryotes and bacteria, and in the chloroplast, where they are annotated as Ycf49 or Ycf49-like. The function is not known, though several members are annotated as putative membrane proteins.

    \ ' '10526' 'IPR019635' '\

    This entry represents a group of proteins that is largely confined to the Gammaproteobacteria. The function is not known.

    \ ' '10527' 'IPR019636' '\

    Members of this family are confined largely to bacterial species. Most members are annotated as being cell wall-associated hydrolases, but this could not be confirmed.

    \

    The cell wall envelope of Gram-positive bacteria is a macromolecular, exoskeletal organelle that is assembled and turned over at designated sites. The cell wall also functions as a surface organelle that allows Gram-positive pathogens to interact with their environment, in particular the tissues of the infected host. All of these functions require that surface proteins and enzymes be properly targeted to the cell wall envelope. Two basic mechanisms, cell wall sorting and targeting, have been identified. Cell wall sorting is the covalent attachment of surface proteins to the peptidoglycan via a C-terminal sorting signal that contains a consensus LPXTG sequence. More than 100 proteins that possess cell wall-sorting signals, including the M proteins of Streptococcus pyogenes, protein A of Staphylococcus aureus, and several internalins of Listeria monocytogenes, have been identified. Cell wall targeting involves the noncovalent attachment of proteins to the cell surface via specialised binding domains. Several of these wall-binding domains appear to interact with secondary wall polymers that are associated with the peptidoglycan, for example teichoic acids and polysaccharides. Proteins that are targeted to the cell surface include muralytic enzymes such as autolysins, lysostaphin, and phage lytic enzymes. Other examples of targeted proteins are the surface S-layer proteins of bacilli and clostridia, as well as virulence factors required for the pathogenesis of L. monocytogenes (internalin B) and Streptococcus pneumoniae (PspA) infections PUBMED:10066836.

    \ ' '10528' 'IPR019637' '\

    This entry represents proteins that are found in Proteobacteria. Several are annotated as being YjjA or YjjA-like, but this protein is uncharacterised.

    \ ' '10529' 'IPR019638' '\

    This entry represents proteins mainly found in Gammaproteobacteria. The function is not known.

    \ ' '10530' 'IPR019639' '\

    This entry represents proteins found in Actinobacteria and Proteobacteria. The function is not known.

    \ ' '10531' 'IPR018928' '\

    The gene encoding Arabidopsis HAP2 is allelic with GCS1 (Generative cell-specific protein 1). HAP2 is expressed only in the haploid sperm and is required for efficient guidance of the pollen tube to the ovules. In Arabidopsis the protein is a predicted membrane protein with an N-terminal secretion signal, a single transmembrane domain and a C-terminal histidine-rich domain PUBMED:17079265. HAP2-GCS1 is found from plants to lower eukaryotes and is necessary for the fusion of the gametes in fertilisation. It is involved in a novel mechanism for gamete fusion where a first species-specific protein binds male and female gamete membranes together after which a second, broadly conserved protein, either directly or indirectly, causes fusion of the two membranes together. The broadly conserved protein is represented by this HAP2-GCS1 domain, conserved from plants to lower eukaryotes PUBMED:18367645. In Plasmodium berghei the protein is expressed only in male gametocytes and gametes, having a male-specific function during the interaction with female gametes, and being indispensable for parasite fertilisation. The gene in plants and eukaryotes might well have originated from acquisition of plastids from red algae PUBMED:18403203.

    \ ' '10534' 'IPR019642' '\

    This entry represents a conserved protein found in Firmicutes. The function is not known.

    \ ' '10535' 'IPR019643' '\

    Molybdenum cofactor biosynthesis protein F (MoaF) is essential for the production of the monoamine-inducible 30 kDa protein in Klebsiella PUBMED:7590328. It is necessary for reconstituting organoautotrophic growth in Ralstonia eutropha (Alcaligenes eutrophus) PUBMED:11545279. It is conserved in Proteobacteria and some lower eukaryotes. The operon regulating the Moa genes is responsible for molybdenum cofactor biosynthesis.

    \ ' '10536' 'IPR019644' '\

    This entry represents a protein family of unknown function that is conserved in Firmicutes.

    \ ' '10537' 'IPR019645' '\

    In some species of plants the ycf15 gene is probably not a protein-coding gene because the protein in these species has premature stop codons. Most of the members of the family are hypothetical or uncharacterised PUBMED:16303753.

    \ ' '10538' 'IPR019646' '\

    Aminoglycoside-2\'\'-adenylyltransferase is conserved in Bacteria. It confers resistance to kanamycin, gentamicin, and tobramycin PUBMED:3024112. The protein is also produced by plasmids in various bacterial species and confers resistance to essentially all clinically available aminoglycosides except streptomycin, and it eliminates the synergism between aminoglycosides and cell-wall active agents PUBMED:17030911.

    \ ' '10539' 'IPR019647' '\

    This entry represents proteins that are activated by the protein PhoP. PhoP controls the expression of a large number of genes that mediate adaptation to low Mg2+ environments and/or virulence in several bacterial species. YrbL is proposed to act in a regulatory loop with PhoP and PmrA analogous to the multi-component loop in Salmonella sp., where the PhoP-dependent PmrD protein activates the regulatory protein PmrA, and the activated PmrA then represses transcription from the PmrD promoter, which harbours binding sites for both the PhoP and PmrA proteins. Expression of YrbL is induced in low Mg2+ in a PhoP-dependent fashion and repressed by Fe3+ in a PmrA-dependent manner PUBMED:15703297.

    \ ' '10540' 'IPR018929' '\

    This is a family of proteins conserved in Actinobacteria. Many members are annotated as putative membrane proteins, but this could not be confirmed.

    \ ' '10541' 'IPR019648' '\

    This family is conserved in bacteria. The function is not known.

    \ ' '10542' 'IPR019649' '\

    Proteins in this entry are predicted to be integral membrane proteins, and many of them are annotated as being YndM protein. They are all found in Firmicutes. The true function is not known.

    \ ' '10543' 'IPR019650' '\

    The function of this family is not known.

    \ ' '10544' 'IPR019651' '\

    The best-characterised protein in this entry is an NAD-specific glutamate dehydrogenase encoded in an antisense gene pair arrangement with a DnaK-J-like protein PUBMED:8308022.

    \ ' '10545' 'IPR019652' '\

    This entry represents a conserved protein found in Proteobacteria. The function is not known but many of the members are annotated as protein YgdB.

    \ ' '10546' 'IPR018930' '\

    This is a family of late embryogenesis-abundant (LEA) proteins. There is high accumulation of this protein in dry seeds, and in the roots of full-grown plants in response to dehydration and abscisic acid (ABA) treatments PUBMED:9349263. This LEA protein disappears after germination. It accumulates in growing regions of well irrigated hypocotyls and meristems, suggesting a role in seedling growth resumption on rehydration PUBMED:10318687. As a group the LEA proteins are highly hydrophilic, contain a high percentage of glycine residues, lack Cys and Trp residues and do not coagulate upon exposure to high temperature, and for these reasons are considered to be members of a group of proteins called hydrophilins PUBMED:10681550. Expression of the protein is negatively regulated during etiolating growth, particularly in roots, in contrast to its expression patterns during normal growth PUBMED:11414610.

    \ ' '10547' 'IPR019653' '\

    The RegB endoribonuclease encoded by Bacteriophage T4 is a unique sequence-specific nuclease that cleaves in the middle of GGAG or, in a few cases, GGAU tetranucleotides, preferentially those found in the Shine-Dalgarno regions of early phage mRNAs. T4 regB expression is regulated autogenously by attacking its own mRNA. The deduced primary structure of RegB proteins in many phages is almost identical to that of T4, while the sequences of RegB encoded by Enterobacteria phage RB69, Enterobacteria phage TuIa and Enterobacteria phage RB49 show substantial divergence from their T4 counterpart PUBMED:15486207. In RB49 regB expression is regulated by both RegB and Escherichia coli endoribonuclease E.

    \ ' '10548' 'IPR019654' '\

    NAD(P)H-quinone oxidoreductase subunit L (NdhL) is a component of the NDH-1L complex, which is one of the proton-pumping NADH:ubiquinone oxidoreductases that catalyse the electron transfer from NADH to ubiquinone linked with proton translocation across the membrane. NDH-1L is essential for photoheterotrophic cell growth. NdhL appears to contain two transmembrane helices and is necessary for the functioning, though not the correct assembly, of the NDH-1 complex in Synechocystis 6803. The conservation between cyanobacteria and green plants suggests that chloroplast NDH-1 complexes contain related subunits PUBMED:15548534.

    \ ' '10549' 'IPR019655' '\

    Baculovirus occlusion-derived virus (ODV) derives its envelope from an intranuclear membrane source. Occlusion-derived viral envelope proteins are detected in viral-induced intranuclear microvesicles, but not in the plasma membrane, cytoplasmic membranes, or the nuclear envelope. This entry represents the ODV-E18 protein, which is encoded by baculovirus late genes with transcription initiating from a TAAG motif. ODV-E18 exists as a dimer in the ODV envelope, which contains a hydrophobic domain that putatively acts as a target or retention signal for intranuclear microvesicles PUBMED:8091653.

    \ ' '10550' 'IPR019656' '\

    This entry represents proteins annotated as hypothetical chloroplast protein YCF34. The function is not known.

    \ ' '10551' 'IPR019657' '\

    ComFB is the second protein encoded within the late competence locus ComF. Expression of this locus depends on early regulatory competence genes and occurs only in competence medium PUBMED:8412657. The function of ComFB within late competence development is not known.

    \ ' '10552' 'IPR019658' '\

    This family is conserved in Firmicutes. Several members are annotated as YppC. The function is not known.

    \ ' '10553' 'IPR019659' '\

    This protein is conserved in bacteria and some viruses. The function is not known.

    \ ' '10554' 'IPR019660' '\

    YbjN is a putative sensory transduction regulator protein found in Proteobacteria. As it is a multi-copy suppressor of the coenzyme A-associated temperature sensitivity in temperature-sensitive mutant strains of Escherichia coli, it has been suggested that it both helps CoA-A1 and possibly works as a general stabiliser for some other unstable proteins PUBMED:16701556.

    \ ' '10555' 'IPR019661' '\

    This family of proteins regulates the replication of rolling circle replication (RCR) plasmids that have a double-strand replication origin (dso). Regulation of the replication of the RCR plasmids occurs mainly at the initiation of leading strand synthesis at the dso, such that concentration of Rep protein controls plasmid replication PUBMED:9409148.

    \ ' '10556' 'IPR019662' '\

    This entry represents a conserved protein in Actinobacteria. The function is not known.

    \ ' '10557' 'IPR019663' '\

    This entry represents proteins conserved in Proteobacteria. Several members are annotated as being protein YbfA. The function is not known.

    \ ' '10558' 'IPR019664' '\

    This protein is conserved in Cyanobacteria. Several members are annotated as the protein Ycf51. The function is not known.

    \ ' '10559' 'IPR019665' '\

    This entry represents an NAD/NADP-binding domain with a core Rossmann-type fold, found in an uncharacterised protein family thought to be putative NADP oxidoreductase coenzyme F420-dependent proteins and/or NAD-dependent glycerol-3-phosphate dehydrogenase-like proteins. This Rossmann-fold domain consists of three layers (alpha/beta/alpha), where the six beta strands are parallel in the order 321456.

    \ ' '10560' 'IPR018931' '\

    This presumed domain is found C-terminal to a Rossmann-like domain suggesting that these proteins are oxidoreductases.

    \ ' '10561' 'IPR019666' '\

    CedA is made up of four antiparallel beta-strands and an alpha-helix. It activates cell division by inhibiting chromosome over-replication. This is mediated by binding to dsDNA via the beta-sheet PUBMED:15865419, PUBMED:9427399.

    \ ' '10562' 'IPR019667' '\

    This entry represents a protein of unknown function specific to Bacillus.

    \ ' '10563' 'IPR018932' '\

    Members of this family are all inhibitors of thrombin, the peptidase that is at the end of the blood coagulation cascade and which creates the clot by cleaving fibrinogen. The interaction between thrombin and fibrinogen involves two different areas of contact - via the thrombin active site and via a second substrate-binding site known as an exosite. The inhibitor acts by blocking the exosite, rather than by interacting with the active site. The inhibitors are from mosquitoes that feed on human blood and which, by inhibiting thrombin, prevent the blood from clotting and keep it flowing.

    \ ' '10564' 'IPR019668' '\

    This entry represents proteins of unknown function that appear to be restricted to the Bacillaceae.

    \ ' '10565' 'IPR019669' '\

    This entry represents highly conserved proteins of unknown function that appear to be restricted to Enterobacteriaceae.

    \ ' '10566' 'IPR019670' '\

    This entry represents a family of phage-related proteins whose function is uncharacterised.

    \ ' '10567' 'IPR019671' '\

    This entry represents highly conserved proteins of unknown function that are restricted to Enterobacteriaceae.

    \ ' '10568' 'IPR019672' '\

    This entry represents proteins of unknown function that appear to be restricted to a family of Enterobacterial proteins. The sequence is highly conserved.

    \ ' '10569' 'IPR019673' '\

    GerPC is required for the formation of functionally normal spores. The gerP locus encodes a number of proteins which are thought to be involved in the establishment of normal spore coat structure and/or permeability, which allows the access of germinants to their receptor PUBMED:10715007.

    \ ' '10570' 'IPR019674' '\

    This protein is conserved in Mycobacteriaceae and is likely to be a lipoprotein PUBMED:15539077.

    \ ' '10571' 'IPR019675' '\

    This protein is conserved in Corynebacterineae. The function is not known though most members are annotated as either secreted, or membrane, proteins.

    \ ' '10572' 'IPR019676' '\

    This entry represents a protein conserved in the order Bacillales. The function is not known. Several members are annotated as being YWJG, a protein expressed downstream of pyrG, a gene encoding cytidine triphosphate synthetase.

    \ ' '10573' 'IPR019677' '\

    Proteins in this entry are mostly found in proteobacteria, and differ from the GspM proteins found in Vibrio spp.

    \ ' '10574' 'IPR019678' '\

    This entry represents conserved proteins found in Cyanobacteria. The function is not known.

    \ ' '10575' 'IPR019679' '\

    Phage Cox proteins are expressed by Enterobacteria phages. The Cox protein is a 79-residue basic protein with a predicted strong helix-turn-helix DNA-binding motif. It inhibits integrative recombination and it activates site-specific excision of the HP1 genome from the Haemophilus influenzae chromosome, Hp1. Cox appears to function as a tetramer. Cox binding sites consist of two direct repeats of the consensus motif 5\'-GGTMAWWWWA, one Cox tetramer binding to each motif. Cox binding interferes with the interaction of HP1 integrase with one of its binding sites, IBS5. This competition is central to directional control. Both Cox binding sites are needed for full inhibition of integration and for activating excision, because it plays a positive role in assembling the nucleoprotein complexes that produce excisive recombination, by inducing the formation of a critical conformation in those complexes PUBMED:9079698.

    \ ' '10576' 'IPR019680' '\

    This entry represents subunit Med1 of the Mediator complex. Med1 forms part of the Med9 submodule of the Srb/Med complex. It is one of three subunits essential for viability of the whole organism via its role in environmentally-directed cell-fate decisions PUBMED:12150923.

    \

    The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.

    \ \

    The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.

    \ \

    The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.\

    \ \ \

    Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.

    \ \ ' '10577' 'IPR019681' '\

    This entry represents proteins of unknown function that appear to be restricted to mycobacteria.

    \ ' '10578' 'IPR019682' '\

    This entry represents a protein conserved in Caudovirales (known as tailed bacteriophages). Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis.

    \ ' '10579' 'IPR019683' '\

    This entry represents proteins of unknown function that appear to be restricted to Bacillus spp.

    \ ' '10580' 'IPR019684' '\

    This entry represents proteins of unknown function that appear to be restricted to Enterobacteriaceae.

    \ ' '10581' 'IPR019685' '\

    This entry represents proteins of unknown function that appear to be restricted to Enterobacteriaceae.

    \ ' '10582' 'IPR019686' '\

    This entry represents proteins of unknown function that appear to be restricted to Bacillus spp.

    \ ' '10583' 'IPR019687' '\

    This entry represents proteins of unknown function that appear to be restricted to Bacillus spp.

    \ ' '10584' 'IPR019688' '\

    This entry represents proteins of unknown function that appear to be restricted to Bacillus spp.

    \ ' '10585' 'IPR019689' '\

    This entry represents conserved proteins in Enterobacteriaceae. The function is not known.

    \ ' '10586' 'IPR019690' '\

    This entry represents a protein that is conserved in bacteria. The function is not known, but several members are annotated as being YdgK or a homologue thereof and associated with the inner membrane.

    \ ' '10587' 'IPR019692' '\

    This protein is conserved in the Actinomycetales. Although several members are annotated as RbiX homologues, RbiX being a putative regulator of riboflavin biosynthesis, the function could not be confirmed.

    \ ' '10588' 'IPR019691' '\

    This family is conserved in Proteobacteria. The function is not known, but it is thought to be a transmembrane protein.

    \ ' '10589' 'IPR019693' '\

    YbaJ regulates biofilm formation. It also has an important role in the regulation of motility in the biofilm. YbaJ increases conjugation and aggregation and decreases motility, resulting in increased biofilm formation PUBMED:16317765.

    \ ' '10590' 'IPR019694' '\

    This family of proteins has no known function.

    \ ' '10591' 'IPR019695' '\

    This entry represents proteins found in Actinobacteria. The function is not known.

    \ ' '10593' 'IPR019697' '\

    The proteins in this entry have no known function.

    \ ' '10594' 'IPR019698' '\

    Some members in this entry are annotated as YchH; however, no function is currently known.

    \ ' '10595' 'IPR019699' '\

    This bacterial family of proteins has no known function.

    \ ' '10596' 'IPR019700' '\

    Gin allows sigma-F to delay late forespore transcription by preventing sigma-G from taking over before the cell has reached a critical stage of development. Gin is also known as CsfB PUBMED:18208527.

    \ ' '10597' 'IPR019701' '\

    This entry represents bacterial proteins with no known function.

    \ ' '10598' 'IPR019702' '\

    This entry represents proteins of unknown function that appear to be restricted to Enterobacteriaceae.

    \ ' '10599' 'IPR019703' '\

    This entry represents proteins that appear to be restricted to Enterobacteriaceae. Some members are annotated as YbjO; however, there is currently no known function.

    \ ' '10600' 'IPR019704' '\

    The FliX protein is possibly a transient component of the flagellum that is required for the assembly process. FliX may contribute to the targeting or assembly of the P- and L-ring protein monomers at the cell pole. The family carries a potential N-terminal signal sequence and at least one transmembrane domain indicating that it might function either in or in association with the cell membrane PUBMED:9555902.

    \ ' '10601' 'IPR019705' '\

    This entry represents proteins of unknown function that appear to be restricted to Enterobacteriaceae.

    \ ' '10602' 'IPR019706' '\

    This entry represents bacterial proteins with no known function.

    \ ' '10603' 'IPR019707' '\

    This entry represents conserved proteins found in bacteria and archaea. The function is not known.

    \ ' '10604' 'IPR019708' '\

    This entry represents proteins of unknown function that are restricted to Proteobacteria. One of the proteins is annotated as a predicted tail tube protein.

    \ ' '10606' 'IPR019710' '\

    BssS (also known as YliH) regulates Escherichia coli (strain K12) biofilm formation through quorum sensing. BssS is also involved in motility regulation, as it represses motility 7-fold by decreasing transcription of the flagella and motility loci PUBMED:16597943.

    \ ' '10607' 'IPR019711' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit H found in the F0 complex of F-ATPases from fungal mitochondria. Subunit H is homologous to the mammalian factor F6, and is essential for the correct assembly and/or functioning of F-ATPases, since yeast cells lacking it are not able to grow on non-fermentable carbon sources. Subunit H occupies a central place in the peripheral stalk between the F1 sector and the membrane PUBMED:14556635.

    \ ' '10608' 'IPR019712' '\

    This is a bacterial family of proteins. Some members in the family are annotated as YtpB; however, no function is currently known.

    \ ' '10609' 'IPR019713' '\

    The extracytoplasmic function (ECF) sigma factors are small regulatory proteins that are quite divergent in sequence relative to most other sigma factors. YlaC, regulated by YlaA, is important in oxidative stress resistance. It contributes to hydrogen peroxide resistance in Bacillus subtilis PUBMED:16728958.

    \ ' '10610' 'IPR019714' '\

    Haloacid dehalogenases catalyse the removal of halides from organic haloacids. The 2-haloacid dehalogenase DehI can process both L- and D-substrates. A crucial aspartate residue is predicted to activate a water molecule for nucleophilic attack of the substrate chiral centre resulting in an inversion of the configuration of either L- or D-substrates in contrast to D-only enzymes PUBMED:18353360.

    \ ' '10611' 'IPR019715' '\

    Haemolysin XhlA is a cell-surface associated haemolysin that lyses the two most prevalent types of insect immune cells (granulocytes and plasmatocytes) as well as rabbit and horse erythrocytes PUBMED:15659065.

    \ ' '10612' 'IPR019716' '\

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites PUBMED:11297922, PUBMED:11290319. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    \

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function \'outside\' the ribosome PUBMED:11290319, PUBMED:11114498.

    \ \ \

    Mitochondrial ribosomal protein L53 (also known as L44) is part of the 39S ribosome PUBMED:17604309.

    \ ' '10613' 'IPR019717' '\

    DSRB is a novel dextransucrase which produces a dextran different from the typical dextran, as it contains (1-6) and (1-2) linkages, when this strain is grown in the presence of sucrose PUBMED:9503626.

    \ ' '10614' 'IPR019718' '\

    This bacterial family of proteins has no known function.

    \ ' '10615' 'IPR019719' '\

    The function of the bacterial proteins in this entry is not known.

    \ ' '10616' 'IPR019720' '\

    This family is conserved in the Enterobacteriales. It is a putative plasmid stability protein in that it is expressed from the operon involved in stability. Its actual function has not yet been characterised, but it may be involved in the control of plasmid partition.

    \ ' '10617' 'IPR019721' '\

    This entry represents the 21 kDa subunit of NADH-ubiquinone oxidoreductase from fungi, including Nuo-21 from Neurospora crassa PUBMED:10477266. This mitochondrial inner membrane protein catalyses the transfer of electrons from NADH to the respiratory chain. The immediate electron acceptor for the enzyme is believed to be ubiquinone. The catalytic activity of NADH-ubiquinone oxidoreductase is:

    \ \

    \ \ ' '10618' 'IPR019722' '\

    This entry represents proteins conserved in Firmicutes and Proteobacteria. Several members are annotated as being glucose-6-phosphate 1-dehydrogenase, but this could not be confirmed.

    \ ' '10619' 'IPR019723' '\

    This entry represents proteins conserved in the Bacillus cereus group. Several members are called YfmQ but the function is not known.

    \ ' '10620' 'IPR019724' '\

    This entry represents a conserved protein in epsilon-Proteobacteria. The function is not known.

    \ ' '10621' 'IPR019725' '\

    Upon infection, the phage-encoded RpbA protein binds to the ADP-ribosylated core RNA polymerase and modulates its function to preferentially bind T4 promoters. This protein is not essential to the phage life cycle.

    \ ' '10622' 'IPR019726' '\

    This entry represents bacterial proteins with undetermined function.

    \ ' '10623' 'IPR019727' '\

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport PUBMED:15473999, PUBMED:15078220.

    \ \ \

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis PUBMED:11309608. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    \

    This entry represents subunit F found in the F0 complex of F-ATPases from fungal mitochondria. The membrane-bound F1-FO-type H+ ATP synthase of mitochondria catalyses the terminal step in oxidative respiration, using the electrochemical gradient to generate ATP for cellular biosynthesis. The general structure and the core subunits of the enzyme are highly conserved in both prokaryotic and eukaryotic organisms.

    \ ' '10624' 'IPR019728' '\

    This entry represents a protein conserved in Cyanobacteria. The function is not known.

    \ ' '10625' 'IPR019729' '\

    This entry represents proteins that are Gloverin-like. Gloverin is a 13.8 kDa inducible antibacterial insect protein that inhibits the synthesis of vital outer membrane proteins, leading to a permeable outer membrane. Gloverin contains a large number of glycine residues PUBMED:18076111.

    \ ' '10626' 'IPR019730' '\

    This entry represents bacterial proteins with unknown function.

    \ ' '10627' 'IPR019731' '\

    This entry represents a conserved protein found in Gammaproteobacteria. The function is not known.

    \ ' '10628' 'IPR019732' '\

    This entry represents a protein conserved in Enterobacteriaceae. It is one of a series of proteins, expressed by these bacteria in response to stress, that help to regulate Sigma-S, the stationary phase sigma factor of Escherichia coli and Salmonella. IraP is essential for Sigma-S stabilisation in some but not all starvation conditions PUBMED:18383615.

    \ ' '10629' 'IPR019733' '\

    This family is conserved in Firmicutes and Proteobacteria. The function is not known, but several members are annotated as homologues of Escherichia coli YhfT, a protein thought to be involved in fatty acid oxidation.

    \ '