Interface of β2 Adrenergic Receptor in Complex with Empty Gs-Protein

[1]:
import mdtraj as md
import mdciao

Download example data and load into the namespace

[2]:
import numpy as np
import os
if not os.path.exists("mdciao_example"):
    mdciao.examples.fetch_example_data("b2ar@Gs")
traj = md.load("mdciao_example/traj.xtc",top="mdciao_example/prot.pdb")
Unzipping to 'mdciao_example'

Create consensus labeler objects

[3]:
GPCR = mdciao.nomenclature.LabelerGPCR("adrb2_human")
CGN = mdciao.nomenclature.LabelerCGN("3SN6")
No local file ./adrb2_human.xlsx found, checking online in
https://gpcrdb.org/services/residues/extended/adrb2_human ...done!
Please cite the following reference to the GPCRdb:
 * Kooistra et al, (2021) GPCRdb in 2021: Integrating GPCR sequence, structure and function
   Nucleic Acids Research 49, D335--D343
   https://doi.org/10.1093/nar/gkaa1080
For more information, call mdciao.nomenclature.references()
done without 404, continuing.
Using CGN-nomenclature, please cite
 * Flock et al, (2015) Universal allosteric mechanism for G$\alpha$ activation by GPCRs
   Nature 2015 524:7564 524, 173--179
   https://doi.org/10.1038/nature14663
No local file ./CGN_3SN6.txt found, checking online in
https://www.mrc-lmb.cam.ac.uk/CGN/lookup_results/3SN6.txt ...done without 404, continuing.
No local PDB file for 3SN6 found in directory '.', checking online in
https://files.rcsb.org/download/3SN6.pdb ...found! Continuing normally

Guess molecular fragments

This would be done anyway by the mdciao.cli.interface call in the cell below, here we do it have the fragments defined in the namespace

[4]:
fragments = mdciao.fragments.get_fragments(traj.top);
fragment_names = ["Galpha", "Gbeta", "Ggamma","B2AR","P0G"]
Auto-detected fragments with method 'lig_resSeq+'
fragment      0 with    354 AAs     LEU4 (     0) -   LEU394 (353   ) (0)  resSeq jumps
fragment      1 with    340 AAs     GLN1 (   354) -   ASN340 (693   ) (1)
fragment      2 with     66 AAs     ALA2 (   694) -    PHE67 (759   ) (2)
fragment      3 with    283 AAs    GLU30 (   760) -   LEU340 (1042  ) (3)  resSeq jumps
fragment      4 with      1 AAs   P0G395 (  1043) -   P0G395 (1043  ) (4)

Compute G\(\alpha\)-B2AR interface

Using the above fragment definitions

[5]:
intf = mdciao.cli.interface(traj,
                            title="3SN6 beta2AR-Galpha interface",
                            fragments=fragments, fragment_names = fragment_names,
                            frag_idxs_group_1=[0],
                            frag_idxs_group_2=[3],
                            GPCR_uniprot=GPCR, CGN_PDB=CGN,
                            accept_guess=True, no_disk=True, figures=False)
Will compute contact frequencies for trajectories:
<mdtraj.Trajectory with 280 frames, 8384 atoms, 1044 residues, and unitcells>
 with a stride of 1 frames
Using method 'user input by residue array or range' these fragments were found
fragment      0 with    354 AAs     LEU4 (     0) -   LEU394 (353   ) (0)  resSeq jumps
fragment      1 with    340 AAs     GLN1 (   354) -   ASN340 (693   ) (1)
fragment      2 with     66 AAs     ALA2 (   694) -    PHE67 (759   ) (2)
fragment      3 with    283 AAs    GLU30 (   760) -   LEU340 (1042  ) (3)  resSeq jumps
fragment      4 with      1 AAs   P0G395 (  1043) -   P0G395 (1043  ) (4)
GPCR-labels align best with fragments: [3] (first-last: GLU30-LEU340).

These are the GPCR fragments mapped onto your topology:
    TM1 with     32 AAs    GLU30@1.29x29   (   760) -    PHE61@1.60x60   (791   ) (TM1)
   ICL1 with      4 AAs    GLU62@12.48x48  (   792) -    GLN65@12.51x51  (795   ) (ICL1)
    TM2 with     32 AAs    THR66@2.37x37   (   796) -    LYS97@2.68x67   (827   ) (TM2)
   ECL1 with      4 AAs    MET98@23.49x49  (   828) -   PHE101@23.52x52  (831   ) (ECL1)
    TM3 with     36 AAs   GLY102@3.21x21   (   832) -   SER137@3.56x56   (867   ) (TM3)
   ICL2 with      8 AAs   PRO138@34.50x50  (   868) -   LEU145@34.57x57  (875   ) (ICL2)
    TM4 with     27 AAs   THR146@4.38x38   (   876) -   HIS172@4.64x64   (902   ) (TM4)
   ECL2 with     20 AAs   TRP173           (   903) -   THR195           (922   ) (ECL2)  resSeq jumps
    TM5 with     42 AAs   ASN196@5.35x36   (   923) -   GLU237@5.76x76   (964   ) (TM5)
   ICL3 with      2 AAs   GLY238           (   965) -   ARG239           (966   ) (ICL3)
    TM6 with     35 AAs   CYS265@6.27x27   (   967) -   GLN299@6.61x61   (1001  ) (TM6)
   ECL3 with      4 AAs   ASP300           (  1002) -   ILE303           (1005  ) (ECL3)
    TM7 with     25 AAs   ARG304@7.31x30   (  1006) -   ARG328@7.55x55   (1030  ) (TM7)
     H8 with     12 AAs   SER329@8.47x47   (  1031) -   LEU340@8.58x58   (1042  ) (H8)
CGN-labels align best with fragments: [0] (first-last: LEU4-LEU394).

These are the CGN fragments mapped onto your topology:
   G.HN with     28 AAs     THR9@G.HN.26   (     5) -    VAL36@G.HN.53   (32    ) (G.HN)
 G.hns1 with      3 AAs    TYR37@G.hns1.1  (    33) -    ALA39@G.hns1.3  (35    ) (G.hns1)
   G.S1 with      7 AAs    THR40@G.S1.1    (    36) -    LEU46@G.S1.7    (42    ) (G.S1)
 G.s1h1 with      6 AAs    GLY47@G.s1h1.1  (    43) -    GLY52@G.s1h1.6  (48    ) (G.s1h1)
   G.H1 with      7 AAs    LYS53@G.H1.1    (    49) -    GLN59@G.H1.7    (55    ) (G.H1)
   H.HA with     26 AAs    LYS88@H.HA.4    (    56) -   LEU113@H.HA.29   (81    ) (H.HA)
 H.hahb with      9 AAs   VAL114@H.hahb.1  (    82) -   PRO122@H.hahb.9  (90    ) (H.hahb)
   H.HB with     14 AAs   GLU123@H.HB.1    (    91) -   ASN136@H.HB.14   (104   ) (H.HB)
 H.hbhc with      7 AAs   VAL137@H.hbhc.1  (   105) -   PRO143@H.hbhc.15 (111   ) (H.hbhc)
   H.HC with     12 AAs   PRO144@H.HC.1    (   112) -   GLU155@H.HC.12   (123   ) (H.HC)
 H.hchd with      1 AAs   ASP156@H.hchd.1  (   124) -   ASP156@H.hchd.1  (124   ) (H.hchd)
   H.HD with     12 AAs   GLU157@H.HD.1    (   125) -   GLU168@H.HD.12   (136   ) (H.HD)
 H.hdhe with      5 AAs   TYR169@H.hdhe.1  (   137) -   ASP173@H.hdhe.5  (141   ) (H.hdhe)
   H.HE with     13 AAs   CYS174@H.HE.1    (   142) -   LYS186@H.HE.13   (154   ) (H.HE)
 H.hehf with      7 AAs   GLN187@H.hehf.1  (   155) -   SER193@H.hehf.7  (161   ) (H.hehf)
   H.HF with      6 AAs   ASP194@H.HF.1    (   162) -   ARG199@H.HF.6    (167   ) (H.HF)
 G.hfs2 with      5 AAs   CYS200@G.hfs2.1  (   168) -   GLY206@G.hfs2.7  (172   ) (G.hfs2)  resSeq jumps
   G.S2 with      8 AAs   ILE207@G.S2.1    (   173) -   VAL214@G.S2.8    (180   ) (G.S2)
 G.s2s3 with      2 AAs   ASP215@G.s2s3.1  (   181) -   LYS216@G.s2s3.2  (182   ) (G.s2s3)
   G.S3 with      8 AAs   VAL217@G.S3.1    (   183) -   VAL224@G.S3.8    (190   ) (G.S3)
 G.s3h2 with      3 AAs   GLY225@G.s3h2.1  (   191) -   GLN227@G.s3h2.3  (193   ) (G.s3h2)
   G.H2 with     10 AAs   ARG228@G.H2.1    (   194) -   CYS237@G.H2.10   (203   ) (G.H2)
 G.h2s4 with      5 AAs   PHE238@G.h2s4.1  (   204) -   THR242@G.h2s4.5  (208   ) (G.h2s4)
   G.S4 with      7 AAs   ALA243@G.S4.1    (   209) -   ALA249@G.S4.7    (215   ) (G.S4)
 G.s4h3 with      8 AAs   SER250@G.s4h3.1  (   216) -   ASN264@G.s4h3.15 (223   ) (G.s4h3)  resSeq jumps
   G.H3 with     18 AAs   ARG265@G.H3.1    (   224) -   LEU282@G.H3.18   (241   ) (G.H3)
 G.h3s5 with      3 AAs   ARG283@G.h3s5.1  (   242) -   ILE285@G.h3s5.3  (244   ) (G.h3s5)
   G.S5 with      7 AAs   SER286@G.S5.1    (   245) -   ASN292@G.S5.7    (251   ) (G.S5)
 G.s5hg with      1 AAs   LYS293@G.s5hg.1  (   252) -   LYS293@G.s5hg.1  (252   ) (G.s5hg)
   G.HG with     17 AAs   GLN294@G.HG.1    (   253) -   ASP310@G.HG.17   (269   ) (G.HG)
 G.hgh4 with     10 AAs   TYR311@G.hgh4.1  (   270) -   THR320@G.hgh4.10 (279   ) (G.hgh4)
   G.H4 with     27 AAs   PRO321@G.H4.1    (   280) -   ARG347@G.H4.27   (306   ) (G.H4)
 G.h4s6 with     11 AAs   ILE348@G.h4s6.1  (   307) -   TYR358@G.h4s6.20 (317   ) (G.h4s6)
   G.S6 with      5 AAs   CYS359@G.S6.1    (   318) -   PHE363@G.S6.5    (322   ) (G.S6)
 G.s6h5 with      5 AAs   THR364@G.s6h5.1  (   323) -   ASP368@G.s6h5.5  (327   ) (G.s6h5)
   G.H5 with     26 AAs   THR369@G.H5.1    (   328) -   LEU394@G.H5.26   (353   ) (G.H5)

Computing distances in the interface between fragments
0
and
3
The interface is restricted to the residues within 35.0 Angstrom of each other in the reference topology.
Computing interface...done!

From 100182 potential group_1-group_2 distances, the interface was reduced to only 20307 potential contacts.
If this number is still too high (i.e. the computation is too slow) consider using a smaller interface cutoff

  0%|                                                                                                                                                               | 0/1 [00:00<?, ?it/s]
Streaming over trajectory object nr.   0 (   280 frames,    280 with stride  1) in chunks of 10000 frames. Now at chunk nr    0, frames so far    280
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:04<00:00,  4.98s/it]
These 20 contacts capture 12.03 (~77%) of the total frequency 15.54 (over 20307 input contacts)
As orientation value, the first 31 ctcs already capture 90.0% of 15.54.
The 31-th contact has a frequency of 0.14


    freq                            label residue idxs    sum
0   1.00   D381@G.H5.13    - Q229@5.68x68      340 956   1.00
1   1.00   R385@G.H5.17    - Q229@5.68x68      344 956   2.00
2   1.00   D381@G.H5.13    - K232@5.71x71      340 959   3.00
3   0.98   Q384@G.H5.16    - I135@3.54x54      343 865   3.98
4   0.96      T350@G.h4s6.3   - R239@ICL3      309 966   4.93
5   0.85   E392@G.H5.24    - T274@6.36x36      351 976   5.79
6   0.68   Q384@G.H5.16    - Q229@5.68x68      343 956   6.47
7   0.64   H387@G.H5.19    - A134@3.53x53      346 864   7.10
8   0.57   Y391@G.H5.23    - T274@6.36x36      350 976   7.67
9   0.55   Y358@G.h4s6.20  - E237@5.76x76      317 964   8.23
10  0.49   R385@G.H5.17    - K232@5.71x71      344 959   8.71
11  0.47   L394@G.H5.26    - K270@6.32x32      353 972   9.19
12  0.47      L346@G.H4.26    - R239@ICL3      305 966   9.65
13  0.45   Y358@G.h4s6.20  - S236@5.75x75      317 963  10.11
14  0.39   Y360@G.S6.2     - S236@5.75x75      319 963  10.50
15  0.35   Q384@G.H5.16    - E225@5.64x64      343 952  10.85
16  0.34  H41@G.S1.2      - F139@34.51x51       37 869  11.19
17  0.31   Q384@G.H5.16    - T136@3.55x55      343 866  11.49
18  0.28  D139@H.hbhc.3   - K140@34.52x52      107 870  11.77
19  0.26   Y358@G.h4s6.20  - I233@5.72x72      317 960  12.03

             label  freq
0     Q384@G.H5.16  2.32
1     D381@G.H5.13  2.00
2     R385@G.H5.17  1.49
3   Y358@G.h4s6.20  1.26
4    T350@G.h4s6.3  0.96
5     E392@G.H5.24  0.85
6     H387@G.H5.19  0.64
7     Y391@G.H5.23  0.57
8     L394@G.H5.26  0.47
9     L346@G.H4.26  0.47
10     Y360@G.S6.2  0.39
11      H41@G.S1.2  0.34
12   D139@H.hbhc.3  0.28

            label  freq
0    Q229@5.68x68  2.68
1    K232@5.71x71  1.48
2       R239@ICL3  1.42
3    T274@6.36x36  1.42
4    I135@3.54x54  0.98
5    S236@5.75x75  0.85
6    A134@3.53x53  0.64
7    E237@5.76x76  0.55
8    K270@6.32x32  0.47
9    E225@5.64x64  0.35
10  F139@34.51x51  0.34
11   T136@3.55x55  0.31
12  K140@34.52x52  0.28
13   I233@5.72x72  0.26

Plot each residues’s participation in the interface

[6]:
ifig = intf.plot_frequency_sums_as_bars(3.5, title_str = intf.name,
                                 list_by_interface=True,
                                 interface_vline=True);
ifig.figure.savefig("intf.svg")
../_images/notebooks_Manuscript_11_0.png

Plot contact matrix

[7]:
ifig, iax = intf.plot_interface_frequency_matrix(3.5, grid=True, pixelsize=.5);
ifig.savefig("matrix.svg")
../_images/notebooks_Manuscript_13_0.png

Flareplot

We combine a lot of information into one single flareplot:

  • the molecular topology with sub-fragments and consensus labels,

  • the secondary structure,

  • the individual contact-pairs

  • the participation of each residue in the interface.

[8]:
ifig, iax = intf.plot_freqs_as_flareplot(3.5,
                                         fragments=fragments, fragment_names = fragment_names,
                                         scheme="consensus_sparse", consensus_maps=[GPCR, CGN],
                                         aura=intf.frequency_sum_per_residue_idx_dict(3.5,return_array=True),
                                         SS=True)
ifig.figure.savefig("flare.svg")
Drawing this many dots (206 residues + 13 padding spaces) in a panel 10.0 inches wide/high
forces too small dotsizes and fontsizes. If crowding effects occur, either reduce the
number of residues or increase the panel size
../_images/notebooks_Manuscript_15_1.png

Coarse-Grained Frequencies and Flareplots

[9]:
ifig, iax = intf.plot_freqs_as_flareplot(3.5,
                                         fragments=fragments, fragment_names = fragment_names,
                                         consensus_maps=[GPCR, CGN],
                                         coarse_grain=True,
                                        )
intf.frequency_as_contact_matrix_CG(3.5, fragments=fragments, fragment_names = fragment_names,
                                    consensus_labelers=[GPCR, CGN],
                                    interface=True).round(1).replace(0,"")

[9]:
TM3 ICL2 TM5 ICL3 TM6
G.S1 0.3
H.hbhc 0.3
G.H4 0.5
G.h4s6 1.3 1.0
G.S6 0.4
G.H5 1.9 4.5 1.9
../_images/notebooks_Manuscript_17_1.png

Grab a representative frame

This frame will be used to plot the interface frequencies as a 3D heatmap (see frequency_to_bfactor below).

[10]:
repframe = intf.repframes(return_traj=True)[-1][0]
Returning frame 118 of traj nr. 0: <mdtraj.Trajectory with 280 frames, 8384 atoms, 1044 residues, and unitcells>

Save the interface as a heatmap and view externally

[11]:
intf.frequency_to_bfactor(3.5, pdbfile="interface_heatmap.pdb",
                          geom=repframe,
                          interface_sign=True
                         );
Contact frequencies stored as signed bfactor in 'interface_heatmap.pdb'

Save all mdciao objects for later reuse

We can save all mdciao objects to numpy .npy (pickle) files and later reload them without having to compute everything again.

[12]:
import numpy as np
np.save("GPCR.npy", GPCR)
np.save("CGN.npy",CGN)
np.save("intf.npy",intf)