Search and browse database

Single protein data presentation

After selecting a particular protein from the database (e.g. after browsing or searching the database, as described in the next section), users can view all information about it on the following screens:

  • Knotting data – the main part of this screen is shown in Fig. 1; it contains the matrix diagram (knotting fingerprint) of the protein (top left), a JSmol graphical representation of the protein (top right), a table listing all knots found in subchains of the protein (with detailed information about their lengths, depths, chirality, etc.), and the sequence representation of the protein with knot and slipknot elements (knot core, knot and slipknot tails and loops, etc.) highlighted in colors (bottom). The matrix diagram is interactive: when a knot type is chosen from the table (if more than one knot type is detected), the data corresponding to this knot are shown in the diagram. By default, the diagram shows the data for the knot formed by the whole chain (for knotted proteins) or for the most complicated slipknot (for proteins with slipknots).
  • Chain information summary – this screen collects basic biological information about the protein: its size, molecule tags and keys, source organism, Enzyme Classification (EC), the number of missing residues, Pfam annotations, etc.; hyperlinks to the PDB, PubMed, Pfam and DOI (if available) are also included.
  • Similar chains (by sequence) – provides two lists: the PDB codes of other chains deposited in the KnotProt database with at least 40% sequence similarity, and the PDB codes of other chains (from the full PDB, but not included in KnotProt) with at least 40% sequence similarity (those proteins should have sequences nearly identical to those deposited in KnotProt).
  • Similar chains (by structure) – lists PDB codes with the same superfamily, topology, or homology, as defined by the CATH database.

Fig. 1 An example of data presentation for the knotted protein 1yrl in KnotProt. In this example, the entire polypeptide chain of E. coli ketol-acid reductoisomerase forms a 4₁ knot and contains a subchain forming a 3₁ knot. Diagram in top left: knotting fingerprint revealing the positions of subchains forming 4₁ and 3₁ knots. Top right: graphical representation of the protein in JSmol. Table in the middle: detailed data about knots and slipknots formed by backbone subchains. Bottom: sequence representation with the knot core and knot tails highlighted in various colors.

Browsing, searching, and processing structures

There are three main options a user can choose from to view or analyze data:

  • Browse database, which lists all structures currently deposited in the database; all these structures are also hyperlinked to other databases
  • Search database, which provides classification of proteins according to their topological, biological, sequence, and geometrical properties
  • Process my structure, which allows users to upload new polymer-like structures and analyze their topology, or analyze time evolution of entangled structures.
These options are summarized below. Apart from the above options, a user can also search for proteins by their PDB code and chain notation.

Browse database. This option includes all protein chains deposited in the KnotProt database. They can be browsed in three different ways. The default screen presents a list of proteins with their PDB code (together with the chain identifier), the topological notation, and the title used in the PDB header. Proteins with incomplete chains are marked with an additional broken-chain symbol. Another option is to browse a list of names together with miniature figures of knotting fingerprints; this lets users quickly identify a particular shape of matrix diagram. The third possibility is to browse a list of raw data, which is suitable for independent analysis. Upon choosing one of the listed proteins, the full information about it is presented as described in the section “Single protein data presentation”.

Search database. A user can search the database in various ways. In the default screen for this option (“Knot type”) the following options can be chosen:

  • “Knot types”: contains four subclassifications: by type of knot, by type of slipknot, a list “Other” of proteins with knots which must be artefacts (arising from broken chains), and a list “Unknot” of all proteins in the PDB that are unknotted and contain no knotted subchains
  • “Fingerprint”: classification (separately for knots and slipknots) of proteins according to their knot notation (knotting fingerprint), such as K 5₂3₁3₁, S 3₁4₁, etc.
  • “Knot length” and “Knot/slipknot depth”: knots and slipknots are grouped according to the length of the knot core, as well as the lengths of the N-terminal and C-terminal tails; the tail lengths are used to classify knots as shallow or deep (for tails shorter or longer than 10 amino acids, respectively)
Moreover, several other classifications can be chosen, according to: “Molecule keywords”, “Molecule tags” (based on the classification from the PDB website), “Pfam family identifier”, “EC nomenclature” (numerical classification of enzymes based on the chemical reactions they catalyze), “CATH classification” (which includes class, architecture, and topology), and “Keywords cloud”. We believe these classifications will give other researchers the opportunity to make new discoveries.
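The shallow/deep rule used by the “Knot/slipknot depth” classification can be illustrated with a minimal sketch. It is not KnotProt's implementation: the record type and field names are hypothetical, and two details the text leaves open are filled in as assumptions (the shorter of the two tails decides the class, and a tail of exactly 10 residues counts as deep).

```python
from dataclasses import dataclass

# Threshold quoted in the text: tails shorter than 10 amino acids are shallow.
DEPTH_THRESHOLD = 10


@dataclass
class KnotRecord:
    """Hypothetical record; field names are illustrative, not KnotProt's schema."""
    chain_id: str   # any label for the chain, e.g. a PDB code plus chain
    n_tail: int     # length of the N-terminal tail, in residues
    c_tail: int     # length of the C-terminal tail, in residues


def depth_class(record: KnotRecord, threshold: int = DEPTH_THRESHOLD) -> str:
    """Classify a knot as 'shallow' or 'deep' from its tail lengths.

    Assumptions (not stated in the text): the shorter tail decides,
    and a tail of exactly `threshold` residues counts as deep.
    """
    shorter_tail = min(record.n_tail, record.c_tail)
    return "shallow" if shorter_tail < threshold else "deep"


def group_by_depth(records):
    """Group chain labels into 'shallow' and 'deep' lists."""
    groups = {"shallow": [], "deep": []}
    for record in records:
        groups[depth_class(record)].append(record.chain_id)
    return groups
```

With made-up tail lengths, `group_by_depth([KnotRecord("a", 5, 200), KnotRecord("b", 40, 60)])` places chain "a" in the shallow group and chain "b" in the deep group.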

Process my structure. A user can upload and analyze two types of data: either a single structure, or a whole set of structures (e.g. a folding or unfolding trajectory). The data can be submitted either in PDB format or in a simplified “x-y-z” format (containing only Cartesian coordinates of atoms, which enables users to analyze arbitrary polymers or open chains). In the case of a single structure, when knots or slipknots are detected by KnotProt, the relevant knotting fingerprint is constructed. In the case of a trajectory, a file in XTC format (typical for GROMACS software) can also be uploaded (together with a GRO or PDB file). To detect knotting in an open chain, a user can choose either our standard method or direct closure, i.e. connecting the two termini by a line segment (for more details see section “Knot detection”). It is also possible to determine the knot type associated with any subchain of the whole chain; in this case a user provides the numbers of two atoms, which are then regarded as the beginning and the end of the subchain.
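To make the subchain selection and direct closure described above concrete, here is a minimal sketch. It does not reproduce KnotProt's parser or knot detection. It assumes that each line of an “x-y-z” file ends with three whitespace-separated coordinates (any leading index column is ignored) and that atom numbers are 1-based and inclusive; all function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Segment = Tuple[Point, Point]


def parse_xyz(text: str) -> List[Point]:
    """Read coordinates from simplified x-y-z text.

    Assumption: the last three whitespace-separated fields of each
    non-empty line are x, y, z; blank lines and '#' comments are skipped.
    """
    points: List[Point] = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        x, y, z = (float(value) for value in fields[-3:])
        points.append((x, y, z))
    return points


def subchain(points: List[Point], first: int, last: int) -> List[Point]:
    """Return atoms first..last (assumed 1-based, inclusive)."""
    if not (1 <= first < last <= len(points)):
        raise ValueError("atom numbers must satisfy 1 <= first < last <= N")
    return points[first - 1:last]


def direct_closure(points: List[Point]) -> List[Segment]:
    """Close an open chain by joining its two termini with one segment.

    Returns the segments of the closed polygon; the final segment
    runs from the last atom back to the first.
    """
    if len(points) < 3:
        raise ValueError("need at least three atoms to form a closed curve")
    n = len(points)
    return [(points[i], points[(i + 1) % n]) for i in range(n)]
```

A closed polygon such as `direct_closure(subchain(parse_xyz(text), 2, 4))` is the kind of object a knot-type calculation would then take as input; that calculation is outside the scope of this sketch.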

KnotProt | Interdisciplinary Laboratory of Biological Systems Modelling