Use Consensus Labels as Multiple-Sequence-Alignment (MSA)
In this notebook, we will use mdciao
and the consensus labels of the GPCRdb to structurally align four GPCR structures of four receptors with low sequence identity:
opsin,
OPS
in the rest of the notebookbeta2 adrenergic receptor,
B2AR
in the rest of the notebookmu-opioid receptor,
MUOR
in the rest of the notebookdopamine D1 receptor,
DOP
in the rest of the notebook
Whereas we use directly PDB structures, the point should be clear that the MSA works on any arbitary geometries and topologies that can be imported into the notebook, in particular user provided trajectories.
[1]:
import mdciao
import nglview
import matplotlib
Load PDB Structures into the Notebook
[2]:
pdbs = {"OPS" : mdciao.cli.pdb("3CAP"),
"B2AR" : mdciao.cli.pdb("3SN6"),
"MUOR" : mdciao.cli.pdb("6DDF"),
"DOP" : mdciao.cli.pdb("7CKW")}
Checking https://files.rcsb.org/download/3CAP.pdb ...Please cite the following 3rd party publication:
* Crystal structure of the ligand-free G-protein-coupled receptor opsin
Park, J.H. et al., Nature 2008
https://doi.org/10.1038/nature07063
Please cite the following 3rd party publication:b ...
* Crystal structure of the beta2 adrenergic receptor-Gs protein complex
Rasmussen, S.G. et al., Nature 2011
https://doi.org/10.1038/nature10361
Checking https://files.rcsb.org/download/6DDF.pdb ...
/home/guille/miniconda3/lib/python3.12/site-packages/mdtraj/formats/pdb/pdbfile.py:208: UserWarning: Unlikely unit cell vectors detected in PDB file likely resulting from a dummy CRYST1 record. Discarding unit cell vectors.
warnings.warn(
Please cite the following 3rd party publication:
* Structure of the mu-opioid receptor-Giprotein complex.
Koehl, A. et al., Nature 2018
https://doi.org/10.1038/s41586-018-0219-7
Please cite the following 3rd party publication:b ...
* Ligand recognition and allosteric regulation of DRD1-Gs signaling complexes.
Xiao, P. et al., Cell 2021
https://doi.org/10.1016/j.cell.2021.01.028
Load Consensus Labels from the GPCRdb into the Notebook
[3]:
maps = { "OPS": mdciao.nomenclature.LabelerGPCR("opsd_bovin"),
"B2AR": mdciao.nomenclature.LabelerGPCR("adrb2_human"),
"MUOR": mdciao.nomenclature.LabelerGPCR("oprm_mouse"),
"DOP" : mdciao.nomenclature.LabelerGPCR("DRD1_HUMAN")
}
No local file ./opsd_bovin.xlsx found, checking online in
done!://gpcrdb.org/services/residues/extended/opsd_bovin ...
Please cite the following reference to the GPCRdb:
* Kooistra et al, (2021) GPCRdb in 2021: Integrating GPCR sequence, structure and function
Nucleic Acids Research 49, D335--D343
https://doi.org/10.1093/nar/gkaa1080
For more information, call mdciao.nomenclature.references()
No local file ./adrb2_human.xlsx found, checking online in
done!://gpcrdb.org/services/residues/extended/adrb2_human ...
Please cite the following reference to the GPCRdb:
* Kooistra et al, (2021) GPCRdb in 2021: Integrating GPCR sequence, structure and function
Nucleic Acids Research 49, D335--D343
https://doi.org/10.1093/nar/gkaa1080
For more information, call mdciao.nomenclature.references()
No local file ./oprm_mouse.xlsx found, checking online in
done!://gpcrdb.org/services/residues/extended/oprm_mouse ...
Please cite the following reference to the GPCRdb:
* Kooistra et al, (2021) GPCRdb in 2021: Integrating GPCR sequence, structure and function
Nucleic Acids Research 49, D335--D343
https://doi.org/10.1093/nar/gkaa1080
For more information, call mdciao.nomenclature.references()
No local file ./DRD1_HUMAN.xlsx found, checking online in
done!://gpcrdb.org/services/residues/extended/drd1_human ...
Please cite the following reference to the GPCRdb:
* Kooistra et al, (2021) GPCRdb in 2021: Integrating GPCR sequence, structure and function
Nucleic Acids Research 49, D335--D343
https://doi.org/10.1093/nar/gkaa1080
For more information, call mdciao.nomenclature.references()
Use the Consensus Labels to Trim down the PDBs to the just Receptors
This can happen regardless of chain definitions and co-crystalized entities
[4]:
pdb_just_receptor = {}
for key, pdb in pdbs.items():
print(key)
receptor_residue_idxs = mdciao.nomenclature.guess_by_nomenclature(maps[key],
pdb.top,
fragments="resSeq",
return_residue_idxs=True,
accept_guess=True,
return_str=False)
pdb_just_receptor[key] = mdciao.fragments.fragment_slice(pdb, [receptor_residue_idxs])
print()
OPS
Auto-detected fragments with method 'resSeq'
fragment 0 with 326 AAs MET1 ( 0) - ASN326 (325 ) (0)
fragment 1 with 326 AAs MET1 ( 326) - ASN326 (651 ) (1)
fragment 2 with 4 AAs NAG1 ( 652) - BMA4 (655 ) (2)
fragment 3 with 6 AAs NAG1 ( 656) - BMA4 (661 ) (3) resSeq jumps
fragment 4 with 2 AAs NAG1 ( 662) - NAG2 (663 ) (4)
fragment 5 with 4 AAs BGL801 ( 664) - BGL804 (667 ) (5)
fragment 6 with 1 AAs PLM901 ( 668) - PLM901 (668 ) (6)
fragment 7 with 2 AAs BGL805 ( 669) - BGL806 (670 ) (7)
fragment 8 with 1 AAs PLM902 ( 671) - PLM902 (671 ) (8)
fragment 9 with 10 AAs HOH902 ( 672) - HOH907 (681 ) (9) resSeq jumps
The GPCR-labels align best with fragments: [0] (first-last: MET1-ASN326).
B2AR
Auto-detected fragments with method 'resSeq'
fragment 0 with 51 AAs THR9 ( 0) - GLN59 (50 ) (0)
fragment 1 with 115 AAs LYS88 ( 51) - VAL202 (165 ) (1)
fragment 2 with 51 AAs SER205 ( 166) - MET255 (216 ) (2)
fragment 3 with 132 AAs THR263 ( 217) - LEU394 (348 ) (3)
fragment 4 with 340 AAs GLN1 ( 349) - ASN340 (688 ) (4)
fragment 5 with 58 AAs ASN5 ( 689) - ARG62 (746 ) (5)
fragment 6 with 159 AAs ASN1002 ( 747) - ALA1160 (905 ) (6)
fragment 7 with 146 AAs GLU30 ( 906) - ARG175 (1051) (7)
fragment 8 with 61 AAs GLN179 (1052) - ARG239 (1112) (8)
fragment 9 with 77 AAs CYS265 (1113) - CYS341 (1189) (9)
fragment 10 with 128 AAs GLN1 (1190) - SER128 (1317) (10)
fragment 11 with 1 AAs P0G1601 (1318) - P0G1601 (1318) (11)
The GPCR-labels align best with fragments: [7, 8, 9] (first-last: GLU30-CYS341).
MUOR
Auto-detected fragments with method 'resSeq'
fragment 0 with 51 AAs LEU5 ( 0) - ILE55 (50 ) (0)
fragment 1 with 52 AAs THR182 ( 51) - VAL233 (102 ) (1)
fragment 2 with 114 AAs ASN241 ( 103) - PHE354 (216 ) (2)
fragment 3 with 336 AAs ASP5 ( 217) - ASN340 (552 ) (3)
fragment 4 with 53 AAs ILE9 ( 553) - PHE61 (605 ) (4)
fragment 5 with 281 AAs MET65 ( 606) - ARG345 (886 ) (5)
fragment 6 with 5 AAs TYR1 ( 887) - ETA5 (891 ) (6)
The GPCR-labels align best with fragments: [5] (first-last: MET65-ARG345).
DOP
Auto-detected fragments with method 'resSeq'
fragment 0 with 53 AAs ASP11 ( 0) - LEU63 (52 ) (0)
fragment 1 with 51 AAs THR205 ( 53) - MET255 (103 ) (1)
fragment 2 with 132 AAs THR263 ( 104) - LEU394 (235 ) (2)
fragment 3 with 340 AAs GLY1 ( 236) - ASN340 (575 ) (3)
fragment 4 with 58 AAs ASN5 ( 576) - ARG62 (633 ) (4)
fragment 5 with 128 AAs GLN1 ( 634) - SER128 (761 ) (5)
fragment 6 with 146 AAs PHE20 ( 762) - LYS165 (907 ) (6)
fragment 7 with 53 AAs ASN185 ( 908) - HIS237 (960 ) (7)
fragment 8 with 36 AAs SER263 ( 961) - CYS298 (996 ) (8)
fragment 9 with 38 AAs CYS307 ( 997) - LEU344 (1034) (9)
fragment 10 with 2 AAs G3C501 (1035) - CLR502 (1036) (10)
The GPCR-labels align best with fragments: [6, 7, 8, 9] (first-last: PHE20-LEU344).
Receptors are not 3D Aligned
This is a bit obvious because they come from different PDBs, but helps highlight the point
[5]:
colors = {"MUOR":"tab:red",
"OPS":"tab:blue",
"B2AR":"tab:green",
"DOP": "tab:orange"}
iwd = nglview.NGLWidget()
for ii, (key, geom) in enumerate(pdb_just_receptor.items()):
iwd.add_trajectory(geom)
iwd.clear_representations(component=ii)
iwd.add_cartoon(color=matplotlib.colors.to_hex(colors[key]), component=ii)
iwd
Use the Consensus Labels to generate an AlignerConsensus for MSA
[6]:
AC = mdciao.nomenclature.AlignerConsensus(maps,
tops={key : geom.top for key, geom in pdb_just_receptor.items()})
Sequence Identity within Residues with Consensus Labels
[7]:
AC.sequence_match()
[7]:
OPS | B2AR | MUOR | DOP | |
---|---|---|---|---|
OPS | 100% | 22% | 24% | 21% |
B2AR | 22% | 100% | 28% | 42% |
MUOR | 24% | 28% | 100% | 28% |
DOP | 21% | 42% | 28% | 100% |
Pick a Reference Structure (e.g. Opsin) and 3D-align all Receptors on It
We do this using the CAidxs_match method of the AlignerConsensus that will generate pairs of indices matching one another via their consensus labels. For brevity, here we show the example in the “3.50…3.59” region of TM3, but for the 3D alignment, we take all consensus labels
[8]:
AC.CAidxs_match("3.5*", keys=["OPS","B2AR"])
[8]:
consensus | OPS | B2AR | |
---|---|---|---|
115 | 3.50x50 | 1065 | 746 |
116 | 3.51x51 | 1076 | 757 |
117 | 3.52x52 | 1088 | 769 |
118 | 3.53x53 | 1095 | 780 |
119 | 3.54x54 | 1102 | 785 |
120 | 3.55x55 | 1109 | 793 |
121 | 3.56x56 | 1115 | 800 |
[9]:
ref_key = "OPS"
ref_geom = pdb_just_receptor[ref_key]
for key, geom in pdb_just_receptor.items():
if key!=ref_key:
ref_CAs, key_CAs = AC.CAidxs_match(keys=[ref_key, key])[[ref_key, key]].values.T
geom.superpose(ref_geom, atom_indices=key_CAs, ref_atom_indices=ref_CAs)
Receptors are now 3D-aligned
[10]:
iwd = nglview.NGLWidget()
for ii, (key, geom) in enumerate(pdb_just_receptor.items()):
iwd.add_trajectory(geom)
iwd.clear_representations(component=ii)
iwd.add_cartoon(color=matplotlib.colors.to_hex(colors[key]), component=ii)
iwd
[ ]: