
Step-by-Step Homology Modeling with SWISS-MODEL: From Template to 3D Structure

Homology modeling is a computational method for predicting the three-dimensional (3D) structure of a protein from its amino acid sequence and a related template of known structure. The method rests on two main principles. First, the 3D structure of a protein is largely determined by its amino acid sequence. Second, protein structure is more conserved during evolution than sequence, so structural changes occur much more slowly than sequence changes. As a result, proteins with similar sequences fold into similar structures, and even sequences with low similarity can sometimes adopt similar structures (1).

SWISS-MODEL (https://swissmodel.expasy.org) is the first fully automated homology modeling server and has been continuously developed over the past 25 years (2). In 2014, SWISS-MODEL was producing approximately 1,500 models per day (3), and by 2018, this number had increased to 3,000 models per day (around 2 models per minute), making it one of the most widely used structure modeling servers worldwide (4).



Homology Modeling Workflow in SWISS-MODEL

In the comparative modeling process, the 3D model of the target protein is constructed using experimental information derived from an evolutionarily related protein structure. In SWISS-MODEL, the default modeling workflow consists of five main steps:


1. Input Data:

Information related to the target protein can be provided as an amino acid sequence in FASTA or Clustal format, or as plain text. Alternatively, a UniProtKB accession code (5) can be used. If the target protein is heteromeric (i.e., composed of different subunits), the amino acid sequences or UniProtKB accession codes for each subunit should be specified.
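
As a minimal sketch of preparing this input outside the web form, the Python snippet below splits FASTA text into records and flags characters that are not one of the 20 standard amino acid letters. The function names, the subunit headers, and the strict alphabet are illustrative assumptions; SWISS-MODEL's own input validation may accept more than this.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acid letters


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def invalid_residues(sequence):
    """Return the sorted characters that are not standard amino acid letters."""
    return sorted(set(sequence) - STANDARD_AA)


# Hypothetical two-subunit (heteromeric) input; the sequences are made up.
example = ">subunit_A\nMKTAYI\nAKQR\n>subunit_B\nGAVLIX\n"
for name, seq in parse_fasta(example):
    print(name, len(seq), invalid_residues(seq))
```

For a heteromeric target, each record returned here would correspond to one subunit sequence to be supplied separately.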


2. Template Search:

In the initial stage, the target sequence is used to search the SWISS-MODEL Template Library (SMTL) (3) for evolutionarily related proteins of known structure. SWISS-MODEL performs this search with two methods: BLAST (6), which rapidly and reliably detects closely related templates, and HHblits (7), which adds sensitivity for detecting distant homologs.


3. Template Selection:

Once the search is complete, the identified templates are ranked according to the Global Model Quality Estimation (GMQE) score (3) and Quaternary Structure Quality Estimate (QSQE) score (8). The top templates and alignments are evaluated to determine if they cover different regions of the target protein or represent alternative conformational states. In such cases, multiple templates may be automatically selected and used to generate different models. A table listing all templates with descriptive features is provided, giving users the opportunity to manually select different templates if desired. Interactive graphical views also allow for easy comparison and analysis of 3D structures, sequence similarities, and quaternary structure features of the templates.
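
The ranking idea can be sketched in a few lines of Python. How the server actually combines GMQE and QSQE is not stated here, so sorting by GMQE first and using QSQE as a tie-breaker is an assumption, and the template identifiers and scores are invented for illustration.

```python
def rank_templates(templates):
    """Sort template records by GMQE, then QSQE, highest first."""
    return sorted(templates, key=lambda t: (t["gmqe"], t["qsqe"]), reverse=True)


# Hypothetical search results; identifiers and scores are made up.
candidates = [
    {"id": "tpl_a", "identity": 82.0, "gmqe": 0.80, "qsqe": 0.70},
    {"id": "tpl_b", "identity": 100.0, "gmqe": 0.95, "qsqe": 0.85},
    {"id": "tpl_c", "identity": 100.0, "gmqe": 0.95, "qsqe": 0.90},
]

for t in rank_templates(candidates):
    print(t["id"], t["identity"], t["gmqe"], t["qsqe"])
```

A table like the one on the results page can be reproduced this way when comparing templates offline, but the manual choice still depends on coverage and conformational state, which a single sort key does not capture.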


4. Model Building:

For each selected template, conserved atomic coordinates are transferred according to the target-template alignment to automatically generate a 3D protein model. Residue coordinates corresponding to insertions or deletions in the alignment are built using loop modeling. In addition, non-conserved amino acid side chains are reconstructed to yield a full-atom protein model. SWISS-MODEL uses the OpenStructure computational structural biology framework (9) and the ProMod3 modeling engine to perform these tasks.
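
To illustrate the coordinate-transfer idea only, the following Python sketch walks a target-template alignment and copies one coordinate per aligned residue, leaving gaps for loop modeling. It is a deliberately simplified assumption (one point per residue, no side chains) and is not how ProMod3 is implemented; all names and values are hypothetical.

```python
def transfer_coordinates(target_aln, template_aln, template_coords):
    """Map template coordinates onto target residues using a pairwise alignment.

    target_aln, template_aln: aligned strings of equal length, '-' marks a gap.
    template_coords: one (x, y, z) tuple per template residue, in order.
    Returns a list of (target_residue, coords_or_None); None marks residues
    inserted relative to the template, which would need loop modeling.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    model = []
    tpl_index = 0  # position in template_coords
    for tgt, tpl in zip(target_aln, template_aln):
        coords = None
        if tpl != "-":
            coords = template_coords[tpl_index]
            tpl_index += 1
        if tgt == "-":
            continue  # deletion in the target: the template residue is dropped
        model.append((tgt, coords))
    return model


# Hypothetical alignment with one insertion (K) and one deletion (L).
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
for residue, xyz in transfer_coordinates("MK-TA", "M-LTA", coords):
    print(residue, xyz if xyz is not None else "needs loop modeling")
```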


5. Model Quality Estimation:

To assess the accuracy of the generated model and identify potential errors, SWISS-MODEL uses the QMEAN scoring function (10). QMEAN provides both global and per-residue quality estimates, evaluating the reliability of the model using statistical potentials. Local quality estimates are further supported by pairwise distance constraints that reflect consensus information among the template structures.


Example Application - 1


To make the concept concrete, two examples are worked through: a protein with a known 3D structure from the PDB, and a random amino acid sequence. For the known protein, the Enoyl-CoA Carboxylases/Reductases, featured as "Molecule of the Month" on the PDB website in March, are used. Specifically, the structure with PDB ID 6OWE, titled "Enoyl-CoA carboxylases/reductases", has been selected for homology modeling.

Figure 1. View of the protein structure with PDB ID 6OWE on the PDB website.

The entry page shows that the structure was determined by X-ray crystallography at a resolution of 1.72 Å, that the protein was expressed in E. coli, and that it contains no mutated amino acids. The sequence can be downloaded by clicking the FASTA Sequence option under the Download Files section, after which the homology modeling workflow can be followed step by step.
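
The same download can also be scripted. The Python sketch below assumes that RCSB serves entry sequences at the URL pattern https://www.rcsb.org/fasta/entry/<PDB ID>; that pattern and the function names are assumptions to verify against the current RCSB documentation before relying on them.

```python
from urllib.request import urlopen


def rcsb_fasta_url(pdb_id):
    """Build the FASTA download URL for a PDB entry (assumed URL pattern)."""
    return f"https://www.rcsb.org/fasta/entry/{pdb_id.upper()}"


def fetch_fasta(pdb_id, timeout=30):
    """Download the FASTA text for a PDB entry; requires network access."""
    with urlopen(rcsb_fasta_url(pdb_id), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Prints the FASTA records for the entry used in this example.
    print(fetch_fasta("6OWE"))
```

The downloaded text can then be pasted into the Target Sequence field exactly as with the manually downloaded file.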


Step 1:

After accessing the SWISS-MODEL website, click on the Start Modelling option. Then, open the downloaded FASTA file and copy the amino acid sequence into the Target Sequence field on SWISS-MODEL. Optionally, a project name can be entered in the Project Title field, and if an email address is provided, the results will also be sent to that address. Once the protein sequence is entered, click the Search for Templates button and wait for the process to complete (this typically takes around 8 minutes).

Figure 2. SWISS-MODEL Input Page.

Step 2:

Once the process is complete, the Template Results page will open. SWISS-MODEL lists 50 templates that match the target protein sequence. The top-listed template has an identity value of 100%, and its GMQE and QSQE scores are very close to 1, which indicates that it is the most suitable template for modeling. The PDB code shown for each template helps confirm this choice: the structure 6OWE, from which the target sequence was taken, appears among the templates. A total of 12 template structures were selected, with identity values ranging from 82% to 100%. After the templates have been chosen, the modeling process can be started by clicking the Build Models button in the upper right corner.
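
To make the identity value more tangible, the Python sketch below computes percent identity over the aligned (non-gap) columns of a pairwise alignment. Identity can be defined with different denominators, and the exact definition used by SWISS-MODEL is not given here, so this is one common convention rather than the server's formula; the sequences are made up.

```python
def percent_identity(aln_a, aln_b):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


# Hypothetical alignments: an exact match and one with a substitution and gaps.
print(percent_identity("MKTAY", "MKTAY"))
print(percent_identity("MK-TAY", "MRLTA-"))
```

A target modeled on its own experimental structure, as in this example, gives 100% by construction, which is why the top template stands out so clearly.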

Figure 3. SWISS-MODEL Template Results Page.

Step 3:

SWISS-MODEL generated 12 models. Among them, Model 04 and Model 01 can be considered the closest to the target sequence, each with a GMQE score of 0.95 and 100% identity. In the 3D views of the proposed models, blue and orange regions are visually prominent. The blue regions correspond to amino acid segments with high confidence (approximately 0.80 and above), while the orange regions represent parts that are relatively less reliable.
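
The colour coding can be mimicked on a list of per-residue confidence values. The Python sketch below reports the residue ranges that fall below the approximate 0.80 cut-off mentioned above; the scores are invented, and treating 0.80 as a hard boundary is a simplifying assumption, since the viewer uses a colour gradient.

```python
def low_confidence_segments(scores, threshold=0.80):
    """Return 1-based inclusive (start, end) ranges of residues below threshold."""
    segments = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score < threshold:
            if start is None:
                start = position  # a low-confidence stretch begins here
        elif start is not None:
            segments.append((start, position - 1))
            start = None
    if start is not None:
        segments.append((start, len(scores)))
    return segments


# Hypothetical per-residue confidence values for a six-residue fragment.
example_scores = [0.90, 0.85, 0.60, 0.50, 0.82, 0.70]
print(low_confidence_segments(example_scores))
```

Listing the segments this way makes it easier to check whether the less reliable (orange) parts coincide with loops or with regions of functional interest.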

Figure 4. SWISS-MODEL Model Result Page.

In Example Application - 1, homology modeling was performed for a protein whose structure had already been determined experimentally.


Example Application - 2


In this example, a random protein sequence is used to obtain the best and most reliable model possible.


For the random protein sequence, the Random Protein Sequence tool (https://www.bioinformatics.org/sms2/random_protein.html?) is used. The site generates a random protein sequence of the requested length. To keep the analysis simple, a 200-amino-acid sequence is requested: enter 200 in the "Enter the length of the sequence in the text area below" field and click the Submit button.

Figure 5. Random Protein Sequence Page.

As a result of the process, the protein sequence is ready for homology modeling.
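
A comparable sequence can be produced locally. The Python sketch below draws each residue uniformly from the 20 standard amino acids; how the website samples residues is not stated here, so uniform sampling is an assumption, and the seed parameter is added only to make the result reproducible.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acid letters


def random_protein(length, seed=None):
    """Return a random amino acid sequence of the given length."""
    rng = random.Random(seed)  # a seed makes the sequence reproducible
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


sequence = random_protein(200, seed=1)
print(">random_200")
print(sequence)
```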

Figure 6. Random Protein Sequence Result Page.

All the steps applied in the first example are followed in the same way for this random protein sequence. The sequence was entered into SWISS-MODEL, and a template search was run.


Step 1:

Once the search is complete, SWISS-MODEL returns three templates for the target protein sequence. Judging by GMQE and identity, it finds the target only slightly similar to the proteins with PDB codes 3AVR and 6FUK. Because the 200-amino-acid sequence was randomly generated, the template identity is only 19.35% and the GMQE is 0.01, which indicates a poor basis for homology modeling. In addition, because of the very low similarity, the template covers only a small portion of the target sequence. Nonetheless, this template can still be used for modeling, and the result can be examined.

Figure 7. SWISS-MODEL Template Results Page.

Step 2:

In the 3D views of the models provided by SWISS-MODEL, only orange regions are visible. The QMEAN value of the models is 0.13 on average, and the GMQE value is 0.01, which is close to 0. This indicates that the modeling was largely unsuccessful. The outcome is expected: the target sequence was randomly generated, and a random arrangement of amino acids is very unlikely to resemble any naturally occurring protein, which results in low identity to the available templates.
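
A rough baseline helps put the identity value in context. For two independent random residues, the probability of a match is the sum of the squared residue frequencies, which is 1/20 (5%) when all 20 amino acids are equally likely. The Python sketch below computes this; the uniform composition is an assumption. The 19.35% reported above is higher than this baseline, most likely because the search reports the best-matching short region rather than a full-length comparison.

```python
def expected_chance_identity(frequencies):
    """Expected percent identity between two independent random residues.

    frequencies: mapping of residue letter to relative abundance (any scale).
    """
    total = sum(frequencies.values())
    return 100.0 * sum((count / total) ** 2 for count in frequencies.values())


# Assumed uniform composition over the 20 standard amino acids.
uniform = {aa: 1 for aa in "ACDEFGHIKLMNPQRSTVWY"}
print(round(expected_chance_identity(uniform), 2))
```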

Figure 8. SWISS-MODEL Model Result Page.

In conclusion, SWISS-MODEL stands out as a reliable and accessible tool for protein structure prediction due to its user-friendly interface and powerful algorithms. The key steps and tips shared in this guide aim to make the process more understandable and easy to apply, especially for beginners in homology modeling. Considering the importance of model quality in computational biology and structural bioinformatics studies, the correct and careful use of platforms like SWISS-MODEL can directly impact the success of research. For more complex analyses, the integration of other tools can enable more advanced evaluation of SWISS-MODEL outputs.



REFERENCES

1. E. Krieger, S. D. Nabuurs, G. Vriend, in Structural Bioinformatics (Eds: P. E. Bourne, H. Weissig), Wiley-Liss, Hoboken, NJ 2012, pp. 507–520.

2. Peitsch,M.C. (1996) ProMod and Swiss-Model: Internet-based tools for automated comparative protein modelling. Biochem. Soc. Trans., 24, 274–279 https://doi.org/10.1042/bst0240274 

3. Biasini,M., Bienert,S., Waterhouse,A., Arnold,K., Studer,G., Schmidt,T., Kiefer,F., Gallo Cassarino,T., Bertoni,M., Bordoli,L. et al. (2014) SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res., 42, W252–W258 https://doi.org/10.1093/nar/gku340 

4. Waterhouse,A., Bertoni,M., Bienert,S., Studer,G., Tauriello,G., Gumienny,R., Heer,F.T., de Beer,T.A.P., Rempfer,C., Bordoli,L., Lepore,R. and Schwede,T. (2018) SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res., 46, W296–W303. https://doi.org/10.1093/nar/gky427

5. The UniProt Consortium (2017) UniProt: the universal protein knowledgebase. Nucleic Acids Res., 45, D158–D169. https://doi.org/10.1093/nar/gkw1099

6. Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402. https://doi.org/10.1093/nar/25.17.3389 

7. Remmert,M., Biegert,A., Hauser,A. and Soding,J. (2011) HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods, 9, 173–175. https://doi.org/10.1038/nmeth.1818 

8. Bertoni,M., Kiefer,F., Biasini,M., Bordoli,L. and Schwede,T. (2017) Modeling protein quaternary structure of homo- and hetero-oligomers beyond binary interactions by homology. Scientific Rep., 7, 10480. https://doi.org/10.1038/s41598-017-09654-8 

9. Biasini,M., Schmidt,T., Bienert,S., Mariani,V., Studer,G., Haas,J., Johner,N., Schenk,A.D., Philippsen,A. and Schwede,T. (2013) OpenStructure: an integrated software framework for computational structural biology. Acta Crystallogr. D, Biol. Crystallogr., 69, 701–709 https://doi.org/10.1107/S0907444913007051 

10. Benkert,P., Biasini,M. and Schwede,T. (2011) Toward the estimation of the absolute quality of individual protein structure models. Bioinformatics, 27, 343–350. https://doi.org/10.1093/bioinformatics/btq662
