KnotProt 2.0: A database of proteins with knots and slipknots

Analyze single structure and trajectory

Analyze your data – single structure
Analyze your data – trajectory

Analyze your data – single structure

A user can upload and analyze two types of data: a single structure or a whole trajectory. This section explains how to analyze a single structure, which can be a protein or any other polymer-like chain.

Input

Data representing a structure can be provided in three ways:
- Firstly, a user can choose a structure from the Protein Data Bank, which will be automatically downloaded by the database and analyzed. The user only needs to provide the relevant PDB code (4 characters) and the chain identifier (following PDB notation). Note: in the analysis all non-standard amino acids (MSE, FGL, LLP, SAC, SER, PCA, MEN, CSB, HTR, PTR, TYR, SCE, M3L, OCS, KCX, SEB, MLY, CSW, TPO, SEP, AYA, TRN) listed as HETATM records will be automatically replaced by the corresponding standard amino acids. For NMR structures, only the first model containing the requested chain will be analyzed.
- Secondly, a user can upload a file in standard PDB format.
- Thirdly, a user can upload a file in xyz format, which contains only the atom numbers along the chain (in the first column) and the Cartesian coordinates of all atoms in the chain (in the second, third and fourth columns); such a file cannot contain any additional columns. An example of such a file:
```
        1       -9.102        -18.555          15.000
        2       -9.384        -17.556          14.080
        3       -5.661        -16.841          14.367
        4       -3.660        -15.387          11.487
        5        0.096        -15.769          11.241
        6        2.646        -13.739           9.329
        7        5.806        -15.775           8.604
```
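The four-column xyz layout described above can be read with a few lines of Python. This is a minimal sketch, not KnotProt's own parser: the function name `parse_xyz` is hypothetical, and rejecting lines with a different number of columns simply mirrors the rule that no additional columns are allowed.

```python
from typing import List, Tuple

# (atom number, x, y, z)
Atom = Tuple[int, float, float, float]


def parse_xyz(text: str) -> List[Atom]:
    """Parse the four-column xyz format: atom number, then x, y, z."""
    atoms: List[Atom] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank or whitespace-only lines
        if len(fields) != 4:
            raise ValueError(
                f"line {line_no}: expected 4 columns, got {len(fields)}"
            )
        index = int(fields[0])
        x, y, z = (float(value) for value in fields[1:])
        atoms.append((index, x, y, z))
    return atoms


# First three lines of the example file shown above.
example = """
        1       -9.102        -18.555          15.000
        2       -9.384        -17.556          14.080
        3       -5.661        -16.841          14.367
"""
chain = parse_xyz(example)
print(len(chain), chain[0])
```

Running such a check locally before uploading can catch stray columns or malformed numbers early.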
To detect a knot type, an open chain must first be transformed into a closed chain (for more details see section Knot detection). A user can choose between two closure methods:
- stochastic closure
- direct closure (connecting the two termini by a straight segment)

It is also possible to determine the knot type of any subchain of the whole chain. In this case the user should provide the numbers of the two atoms that mark the beginning and the end of the subchain.
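The subchain selection and the direct closure can be illustrated as follows. This is only a sketch under stated assumptions: the helper names `extract_subchain` and `direct_closure` are hypothetical, the coordinates are made up, and the code only builds the closed polygon; it does not determine a knot type.

```python
from typing import List, Tuple

Atom = Tuple[int, float, float, float]   # (atom number, x, y, z)
Point = Tuple[float, float, float]
Segment = Tuple[Point, Point]


def extract_subchain(chain: List[Atom], first: int, last: int) -> List[Point]:
    """Return the coordinates of atoms numbered first..last (inclusive)."""
    if first >= last:
        raise ValueError("first must be smaller than last")
    return [(x, y, z) for index, x, y, z in chain if first <= index <= last]


def direct_closure(points: List[Point]) -> List[Segment]:
    """Close an open chain by joining its two termini with one straight segment.

    Returns the segments of the resulting closed polygon; the last segment
    runs from the final point back to the first one.
    """
    if len(points) < 3:
        raise ValueError("need at least three points to form a closed polygon")
    return list(zip(points, points[1:] + points[:1]))


# Made-up coordinates, for illustration only.
chain = [
    (1, 0.0, 0.0, 0.0),
    (2, 1.0, 0.0, 0.0),
    (3, 1.0, 1.0, 0.0),
    (4, 0.0, 1.0, 0.0),
    (5, 0.0, 1.0, 1.0),
]
sub = extract_subchain(chain, 2, 4)
polygon = direct_closure(sub)
print(polygon[-1])  # the closing segment between the two termini
```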
A structure uploaded and analyzed by a user is stored for 14 days, so that it can be viewed again. A user can also choose to add a structure to the database permanently and make it available to others. New structures are added to the database after verification by an administrator.

Output for knots

The results of the analysis are presented as a matrix called a “topological fingerprint”, described in detail in section Knot plot. The locations of the knot core, knot tails, slipknot tails and loops are also given, and these elements are marked on the 3D and sequential representations of the structure.

After completing the analysis a user can download:
- pictures representing the protein topology in the following formats: svg map (vector format), svg map with arrows (rasterized), png map,
- the structure transformed into xyz format.

Output for knotoids

For the knotoid analysis, the output is provided as a simple tab-separated text file. Each row corresponds to a single subchain. There are six columns in total. The first and second columns give the starting and ending index of the subchain, respectively. The third column gives the length of the subchain, defined as the number of beads it is made of. The fifth column gives the dominant knotoid type of the subchain, and the sixth column gives the corresponding polynomial.
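A tab-separated file of this shape can be loaded as sketched below. The names `KnotoidRow` and `parse_knotoid_output` are hypothetical, the sample rows are placeholders rather than real KnotProt output, and the fourth column is kept as raw text because its meaning is not described above.

```python
from typing import List, NamedTuple


class KnotoidRow(NamedTuple):
    start: int          # column 1: starting index of the subchain
    end: int            # column 2: ending index of the subchain
    length: int         # column 3: number of beads in the subchain
    column4: str        # column 4: not described above, kept as raw text
    knotoid_type: str   # column 5: dominant knotoid type
    polynomial: str     # column 6: corresponding polynomial


def parse_knotoid_output(text: str) -> List[KnotoidRow]:
    """Parse the six-column, tab-separated knotoid output."""
    rows: List[KnotoidRow] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip empty lines
        fields = line.split("\t")
        if len(fields) != 6:
            raise ValueError(
                f"line {line_no}: expected 6 columns, got {len(fields)}"
            )
        start, end, length = (int(value) for value in fields[:3])
        rows.append(
            KnotoidRow(start, end, length, fields[3], fields[4], fields[5])
        )
    return rows


# Placeholder rows: COL4, TYPE_* and POLY_* stand in for real values.
sample = "1\t40\t40\tCOL4\tTYPE_A\tPOLY_A\n1\t41\t41\tCOL4\tTYPE_B\tPOLY_B\n"
rows = parse_knotoid_output(sample)
for row in rows:
    print(row.start, row.end, row.knotoid_type)
```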

The following parameters are used in the knotoid analysis:
- planar: analyzes the protein curve by projecting it onto a plane. If it is omitted, the protein curve is projected onto a sphere. Projections onto a plane provide a more detailed overview of the topology of the chain.
- nb-projections: the number of projections used to determine the knotoid type of the subchain. KnotProt samples N random projections with uniform distribution on the surface of the sphere.
- closure-method: knotoids do not require the protein chain to be closed into a loop. However, the user can choose one of two closure methods to obtain information on the knot produced from a particular knotoid: “direct”, which connects the endpoints with a straight line, and “rays”, which applies the probabilistic method known as infinity closure. A third option, “open”, is the default and corresponds to knotoids.
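To make the idea behind nb-projections concrete, the sketch below draws N directions uniformly from the surface of the unit sphere. It is not KnotProt's implementation: the function name `sample_directions` is hypothetical, and the technique used here (normalizing vectors of independent Gaussian components) is one standard way to obtain a uniform distribution on the sphere.

```python
import math
import random
from typing import List, Optional, Tuple

Direction = Tuple[float, float, float]


def sample_directions(n: int, seed: Optional[int] = None) -> List[Direction]:
    """Draw n unit vectors uniformly distributed on the sphere.

    A vector of three independent standard normal components has a
    rotationally symmetric distribution, so its normalization is uniform
    on the unit sphere.
    """
    rng = random.Random(seed)
    directions: List[Direction] = []
    while len(directions) < n:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        if norm < 1e-12:
            continue  # extremely unlikely; avoid dividing by ~0
        directions.append((x / norm, y / norm, z / norm))
    return directions


projections = sample_directions(100, seed=1)
print(len(projections))
```

Each sampled direction would then define one projection of the chain; a larger N gives a more reliable estimate of the dominant knotoid type at the cost of more computation.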

Analyze your data – trajectory

It is also possible to upload and analyze an entire (knotted or slipknotted) trajectory, i.e. a sequence of structures, obtained e.g. from simulations in which a protein is pulled by two amino acids in opposite directions.

Input

A trajectory can be uploaded in three different formats:
- xtc format (the typical format of the Gromacs software), which also requires uploading a *.gro or *.pdb file,
- multimodel pdb file,
- xyz format, in which consecutive time frames are separated by the letter “t” followed by a number specifying the moment of time, as in the example below:

    t  0
       1       -9.102        -18.555          15.000    
       2       -9.384        -17.556          14.080
       3       -5.661        -16.841          14.367
       4       -3.660        -15.387          11.487
       5        0.096        -15.769          11.241
       6        2.646        -13.739           9.329
       7        5.806        -15.775           8.604
    t  1 
       1       -9.102        -18.555          15.000    
       2       -9.384        -17.556          14.080
       3       -5.661        -16.841          14.367
       4       -3.660        -15.387          11.487
       5        0.096        -15.769          11.241
       6        2.646        -13.739           9.329
       7        5.806        -15.775           8.604
    t  2
       ....
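The frame-separated xyz layout shown above can be split into frames as sketched below. The function name `parse_xyz_trajectory` is hypothetical, and two details are assumptions: the time label is read as a floating-point number, and the letter “t” is expected to be separated from the number by whitespace, as in the example.

```python
from typing import List, Tuple

Atom = Tuple[int, float, float, float]   # (atom number, x, y, z)
Frame = Tuple[float, List[Atom]]         # (time label, atoms of that frame)


def parse_xyz_trajectory(text: str) -> List[Frame]:
    """Split a multi-frame xyz file into (time, atoms) pairs, in file order."""
    frames: List[Frame] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "t":
            if len(fields) != 2:
                raise ValueError(f"line {line_no}: malformed frame header")
            frames.append((float(fields[1]), []))  # start a new frame
        else:
            if not frames:
                raise ValueError(f"line {line_no}: atom before first frame header")
            if len(fields) != 4:
                raise ValueError(
                    f"line {line_no}: expected 4 columns, got {len(fields)}"
                )
            index = int(fields[0])
            x, y, z = (float(value) for value in fields[1:])
            frames[-1][1].append((index, x, y, z))
    return frames


# Two shortened frames taken from the example above.
example = """
    t  0
       1       -9.102        -18.555          15.000
       2       -9.384        -17.556          14.080
    t  1
       1       -9.102        -18.555          15.000
       2       -9.384        -17.556          14.080
"""
trajectory = parse_xyz_trajectory(example)
for time, atoms in trajectory:
    print(time, len(atoms))
```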

Analysis of a trajectory can be conducted in two ways:
- fast analysis: at each time moment a knot type of the entire chain is determined,
- detailed (but slower) analysis: in addition, at each time moment the locations of the end-points of a knot are detected (i.e. a knot core is determined), and it is determined whether the chain contains a slipknot (to get these results, all subchains of the form (1,k) and (k,N) are analyzed, where N is the length of the entire chain and 1<k<N).
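The set of subchains examined in the detailed analysis can be enumerated as in the sketch below, which also shows why this mode is slower: each frame requires 2(N-2) additional subchain analyses. The function name `terminal_subchains` is hypothetical, and the knot determination itself is not shown.

```python
from typing import Iterator, Tuple


def terminal_subchains(n: int) -> Iterator[Tuple[int, int]]:
    """Yield all subchains (1, k) and (k, N) with 1 < k < N, for N = n."""
    for k in range(2, n):
        yield (1, k)   # subchains anchored at the first atom
    for k in range(2, n):
        yield (k, n)   # subchains anchored at the last atom


subchains = list(terminal_subchains(5))
print(subchains)
print(len(subchains))  # 2 * (N - 2) subchains for a chain of length N
```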

Output

The output is provided in the form of a text file whose content depends on the type of analysis:

For detailed analysis, if a knot exists at a given time moment, the sequential location of the knot core is given in addition to the knot type. A slipknot is denoted as -1. For a trivial knot or a slipknot, the locations of the knot ends are given as 1 and N (the length of the chain). For example, for a chain of length 100, if an unknot is detected at t=1,2, a trefoil knot with a core at location (20,80) is detected at t=3, and a slipknot is detected at t=4, the output file takes the form:
```
1     01      1      100
2     01      1      100
3     31     20      80
4     -1      1      100
```
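An output file of this form can be post-processed as sketched below. The names `FrameResult` and `parse_detailed_output` are hypothetical. Following the example above, the codes 01, 31 and -1 are read as unknot, trefoil and slipknot; the time label and the knot code are kept as text, since it is an assumption that they are always integers.

```python
from typing import List, NamedTuple


class FrameResult(NamedTuple):
    time: str         # time label, kept as text
    knot_code: str    # e.g. "01" (unknot), "31" (trefoil), "-1" (slipknot)
    core_start: int   # first end of the knot core (1 if no knot core)
    core_end: int     # second end of the knot core (N if no knot core)


def parse_detailed_output(text: str) -> List[FrameResult]:
    """Parse the four-column output of the detailed trajectory analysis."""
    results: List[FrameResult] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(
                f"line {line_no}: expected 4 columns, got {len(fields)}"
            )
        time, code, start, end = fields
        results.append(FrameResult(time, code, int(start), int(end)))
    return results


# The example output shown above.
example = """\
1     01      1      100
2     01      1      100
3     31     20      80
4     -1      1      100
"""
results = parse_detailed_output(example)
slipknot_frames = [r.time for r in results if r.knot_code == "-1"]
knotted_frames = [r for r in results if r.knot_code not in ("01", "-1")]
print(slipknot_frames, knotted_frames)
```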