Binding-Pocket Interactions of Four EGFR Inhibitors

In this notebook, we use mdciao to visualize the binding-pocket interactions of four Epidermal Growth Factor Receptor (EGFR) inhibitors. EGFR is an important drug target with implications in cancer and inflammation (Wikipedia). It is a transmembrane protein with an extracellular receptor domain and an intracellular kinase domain.

The molecular dynamics (MD) data used here was generated by slightly modifying the notebook

which is part of the impressive TeachOpenCADD collection, made available as a teaching platform for computer-aided drug design by the Volkamer Lab at Saarland University, Saarbrücken.

The four inhibitors and structures are chosen from the following RCSB entries: 3POZ, 3W32, 6LUB, and 7VRE.

Please see the references at the bottom of the notebook for more information.

[1]:
import mdciao
import os
import matplotlib
import nglview
from glob import glob

Consensus labeler object for KLIFS nomenclature

Because the labeler is used more than once, we instantiate it a single time and reuse it. All it needs is the UniProt accession code of EGFR, P00533.

[2]:
KLIFS = mdciao.nomenclature.LabelerKLIFS("UniProtAC:P00533")
No local file ./KLIFS_UniProtAC:P00533.xlsx found, checking online in
https://klifs.net/api/kinase_ID?kinase_name=UniProtAC:P00533 ...https://klifs.net/api/structures_list?kinase_ID=406
done!
Please cite the following reference to the KLIF structural database:
 * Kanev et al, (2021) KLIFS: an overhaul after the first 5 years of supporting kinase research
   Nucleic Acids Research 49, D562--D569
   https://doi.org/10.1093/NAR/GKAA895
For more information, call mdciao.nomenclature.references()
/home/perezheg/miniconda3/lib/python3.11/site-packages/mdtraj/formats/pdb/pdbfile.py:208: UserWarning: Unlikely unit cell vectors detected in PDB file likely resulting from a dummy CRYST1 record. Discarding unit cell vectors.
  warnings.warn(

Download example data

[3]:
if not os.path.exists("example_kinases"):
    mdciao.examples.fetch_example_data("EGFR");
Unzipping to 'example_kinases'

Guess molecular fragments

[4]:
for pdb in sorted(glob("example_kinases/*.pdb")):
    print(pdb)
    mdciao.fragments.get_fragments(pdb)
    print()
example_kinases/topology.3POZ.pdb
Auto-detected fragments with method 'lig_resSeq+'
fragment      0 with    317 AAs   GLN701 (   0) -  LEU1017 (316 ) (0)
fragment      1 with      1 AAs     03P1 ( 317) -     03P1 (317 ) (1)

example_kinases/topology.3W32.pdb
Auto-detected fragments with method 'lig_resSeq+'
fragment      0 with    317 AAs   GLN701 (   0) -  LEU1017 (316 ) (0)
fragment      1 with      1 AAs     W321 ( 317) -     W321 (317 ) (1)

example_kinases/topology.6LUB.pdb
Auto-detected fragments with method 'lig_resSeq+'
fragment      0 with    323 AAs   GLY696 (   0) -  ILE1018 (322 ) (0)
fragment      1 with      1 AAs     EUX1 ( 323) -     EUX1 (323 ) (1)

example_kinases/topology.7VRE.pdb
Auto-detected fragments with method 'lig_resSeq+'
fragment      0 with    323 AAs   GLY696 (   0) -  ILE1018 (322 ) (0)
fragment      1 with      1 AAs     7VH1 ( 323) -     7VH1 (323 ) (1)

All four setups share an equivalent topology, consisting of the kinase (fragment 0) and the ligand (fragment 1):

  • from PDB ID 3POZ ligand 03P1

  • from PDB ID 3W32 ligand W321

  • from PDB ID 6LUB ligand EUX1

  • from PDB ID 7VRE ligand 7VH1

For labelling purposes, create a mapping between PDB IDs and ligand names:

[5]:
pdb2lig = {"3POZ" : "03P1",
           "3W32" : "W321",
           "6LUB" : "EUX1",
           "7VRE" : "7VH1"
          }
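
As a minimal sketch of how this mapping can be used, the snippet below turns a topology file name into a label of the form ligand@PDB and derives the matching trajectory file name. It relies only on the standard library and assumes the file-naming pattern topology.<PDBID>.pdb seen in the example data; the helper name label_from_path is hypothetical and not part of mdciao.

```python
import os

pdb2lig = {"3POZ": "03P1", "3W32": "W321", "6LUB": "EUX1", "7VRE": "7VH1"}


def label_from_path(pdb_path, mapping):
    """Return (label, xtc_path) for a file named 'topology.<PDBID>.pdb'.

    Hypothetical helper for illustration only.
    """
    # 'topology.3POZ.pdb' -> ['topology', '3POZ', 'pdb'] -> '3POZ'
    pdb_id = os.path.basename(pdb_path).split(".")[1]
    label = f"{mapping[pdb_id]}@{pdb_id}"
    # The trajectory file is assumed to sit next to the topology file
    xtc_path = pdb_path.replace(".pdb", ".xtc").replace("topology", "trajectory")
    return label, xtc_path


label, xtc = label_from_path("example_kinases/topology.3POZ.pdb", pdb2lig)
print(label)
print(xtc)
```

Keeping the ligand name in the label makes the later plots and tables readable without looking up which compound belongs to which PDB entry.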

Compute the ligand-kinase interactions for the four inhibitors

[6]:
binding_pocket = {}
for pdb in sorted(glob("example_kinases/*.pdb")):
    key = os.path.basename(pdb).split(".")[1]
    key="%s@%s"%(pdb2lig[key], key)
    xtc = pdb.replace(".pdb",".xtc").replace("topology","trajectory")
    binding_pocket[key]=mdciao.cli.interface(xtc,
                                             pdb,
                                             fragment_names=["EGFR", "ligand"],
                                             KLIFS_string=KLIFS,
                                             ctc_control=1.0,
                                             interface_selection_1=[0],
                                             interface_selection_2=[1],
                                             accept_guess=True, figures=False, no_disk=True)
Will compute contact frequencies for trajectories:
example_kinases/trajectory.3POZ.xtc
 with a stride of 1 frames
Using method 'lig_resSeq+' these fragments were found
fragment EGFR   with    317 AAs   GLN701 (   0) -  LEU1017 (316 ) (EGFR)
fragment ligand with      1 AAs     03P1 ( 317) -     03P1 (317 ) (ligand)
The KLIFS-labels align best with fragments: [0] (first-last: GLN701-LEU1017).

Mapping the KLIFS fragments onto your topology:
 I      with      3 AAs   LYS716@I.1       (  15) -   LEU718@I.3       (17  ) (I)
 g.l    with      6 AAs   GLY719@g.l.4     (  18) -   GLY724@g.l.9     (23  ) (g.l)
 II     with      4 AAs   THR725@II.10     (  24) -   LYS728@II.13     (27  ) (II)
 III    with      6 AAs   VAL742@III.14    (  41) -   LEU747@III.19    (46  ) (III)
 αC     with     11 AAs   GLU758@αC.20     (  57) -   SER768@αC.30     (67  ) (αC)
 b.l    with      7 AAs   VAL769@b.l.31    (  68) -   ARG776@b.l.37    (75  ) (b.l) resSeq jumps
 IV     with      4 AAs   LEU777@IV.38     (  76) -   ILE780@IV.41     (79  ) (IV)
 V      with      3 AAs   GLN787@V.42      (  86) -   ILE789@V.44      (88  ) (V)
 GK     with      1 AAs   THR790@GK.45     (  89) -   THR790@GK.45     (89  ) (GK)
 hinge  with      3 AAs   GLN791@hinge.46  (  90) -   MET793@hinge.48  (92  ) (hinge)
 linker with      4 AAs   PRO794@linker.49 (  93) -   CYS797@linker.52 (96  ) (linker)
 αD     with      7 AAs   LEU798@αD.53     (  97) -   GLU804@αD.59     (103 ) (αD)
 αE     with      5 AAs   TYR827@αE.60     ( 126) -   ARG831@αE.64     (130 ) (αE)
 VI     with      3 AAs   ARG832@VI.65     ( 131) -   VAL834@VI.67     (133 ) (VI)
 c.l    with      8 AAs   HIS835@c.l.68    ( 134) -   ASN842@c.l.75    (141 ) (c.l)
 VII    with      3 AAs   VAL843@VII.76    ( 142) -   VAL845@VII.78    (144 ) (VII)
 VIII   with      1 AAs   ILE853@VIII.79   ( 152) -   ILE853@VIII.79   (152 ) (VIII)
 xDFG   with      4 AAs   THR854@xDFG.80   ( 153) -   GLY857@xDFG.83   (156 ) (xDFG)
 a.l    with      2 AAs   LEU858@a.l.84    ( 157) -   ALA859@a.l.85    (158 ) (a.l)
Select group 1: 0
Select group 2: 1

Will look for contacts in the interface between fragments
0
and
1.

Performing a first pass on the 317 group_1-group_2 residue pairs to compute lower bounds on residue-residue distances via residue-COM distances.
Reduced to only 220 (from 317) residue pairs for the computation of actual residue-residue distances:
The following 30 contacts capture 25.13 (~99%) of the total frequency 25.28 (over 36 contacts with nonzero frequency at 4.50 Angstrom).
As orientation value, the first 25 ctcs already capture 90.0% of 25.28.
The 25-th contact has a frequency of 0.55.

    freq             label              residues  fragments   sum
1   1.00  L792@hinge.47  - 03P1@ligand   91 - 317    0 - 1    1.00
2   1.00  M793@hinge.48  - 03P1@ligand   92 - 317    0 - 1    2.00
3   1.00  L777@IV.38     - 03P1@ligand   76 - 317    0 - 1    3.00
4   1.00  L844@VII.77    - 03P1@ligand  143 - 317    0 - 1    4.00
5   1.00  T854@xDFG.80   - 03P1@ligand  153 - 317    0 - 1    5.00
6   1.00  D855@xDFG.81   - 03P1@ligand  154 - 317    0 - 1    6.00
7   1.00  F856@xDFG.82   - 03P1@ligand  155 - 317    0 - 1    7.00
8   1.00  T790@GK.45     - 03P1@ligand   89 - 317    0 - 1    8.00
9   1.00  K745@III.17    - 03P1@ligand   44 - 317    0 - 1    9.00
10  1.00  C775@b.l.36    - 03P1@ligand   74 - 317    0 - 1   10.00
11  1.00  Q791@hinge.46  - 03P1@ligand   90 - 317    0 - 1   11.00
12  1.00  A743@III.15    - 03P1@ligand   42 - 317    0 - 1   12.00
13  1.00  L788@V.43      - 03P1@ligand   87 - 317    0 - 1   13.00
14  0.99  V726@II.11     - 03P1@ligand   25 - 317    0 - 1   13.99
15  0.99  R776@b.l.37    - 03P1@ligand   75 - 317    0 - 1   14.99
16  0.99  M766@αC.28     - 03P1@ligand   65 - 317    0 - 1   15.98
17  0.98  L718@I.3       - 03P1@ligand   17 - 317    0 - 1   16.95
18  0.91  I744@III.16    - 03P1@ligand   43 - 317    0 - 1   17.87
19  0.86  S720@g.l.5     - 03P1@ligand   19 - 317    0 - 1   18.72
20  0.85  R841@c.l.74    - 03P1@ligand  140 - 317    0 - 1   19.57
21  0.82  L858@a.l.84    - 03P1@ligand  157 - 317    0 - 1   20.39
22  0.81  G796@linker.51 - 03P1@ligand   95 - 317    0 - 1   21.20
23  0.80  F997@EGFR      - 03P1@ligand  296 - 317    0 - 1   22.00
24  0.68  I789@V.44      - 03P1@ligand   88 - 317    0 - 1   22.68
25  0.55  G719@g.l.4     - 03P1@ligand   18 - 317    0 - 1   23.22
26  0.50  G721@g.l.6     - 03P1@ligand   20 - 317    0 - 1   23.72
27  0.47  C797@linker.52 - 03P1@ligand   96 - 317    0 - 1   24.19
28  0.44  N842@c.l.75    - 03P1@ligand  141 - 317    0 - 1   24.63
29  0.36  I853@VIII.79   - 03P1@ligand  152 - 317    0 - 1   24.99
30  0.14  D800@αD.55     - 03P1@ligand   99 - 317    0 - 1   25.13

             label  freq
1    L792@hinge.47  1.00
2    M793@hinge.48  1.00
3       L777@IV.38  1.00
4      L844@VII.77  1.00
5     T854@xDFG.80  1.00
6     D855@xDFG.81  1.00
7     F856@xDFG.82  1.00
8       T790@GK.45  1.00
9      K745@III.17  1.00
10     C775@b.l.36  1.00
11   Q791@hinge.46  1.00
12     A743@III.15  1.00
13       L788@V.43  1.00
14      V726@II.11  0.99
15     R776@b.l.37  0.99
16      M766@αC.28  0.99
17        L718@I.3  0.98
18     I744@III.16  0.91
19      S720@g.l.5  0.86
20     R841@c.l.74  0.85
21     L858@a.l.84  0.82
22  G796@linker.51  0.81
23       F997@EGFR  0.80
24       I789@V.44  0.68
25      G719@g.l.4  0.55
26      G721@g.l.6  0.50
27  C797@linker.52  0.47
28     N842@c.l.75  0.44
29    I853@VIII.79  0.36
30      D800@αD.55  0.14

         label   freq
1  03P1@ligand  25.13

Will compute contact frequencies for trajectories:
example_kinases/trajectory.3W32.xtc
 with a stride of 1 frames
Using method 'lig_resSeq+' these fragments were found
fragment EGFR   with    317 AAs   GLN701 (   0) -  LEU1017 (316 ) (EGFR)
fragment ligand with      1 AAs     W321 ( 317) -     W321 (317 ) (ligand)
The KLIFS-labels align best with fragments: [0] (first-last: GLN701-LEU1017).

Mapping the KLIFS fragments onto your topology:
 I      with      3 AAs   LYS716@I.1       (  15) -   LEU718@I.3       (17  ) (I)
 g.l    with      6 AAs   GLY719@g.l.4     (  18) -   GLY724@g.l.9     (23  ) (g.l)
 II     with      4 AAs   THR725@II.10     (  24) -   LYS728@II.13     (27  ) (II)
 III    with      6 AAs   VAL742@III.14    (  41) -   LEU747@III.19    (46  ) (III)
 αC     with     11 AAs   GLU758@αC.20     (  57) -   SER768@αC.30     (67  ) (αC)
 b.l    with      7 AAs   VAL769@b.l.31    (  68) -   ARG776@b.l.37    (75  ) (b.l) resSeq jumps
 IV     with      4 AAs   LEU777@IV.38     (  76) -   ILE780@IV.41     (79  ) (IV)
 V      with      3 AAs   GLN787@V.42      (  86) -   ILE789@V.44      (88  ) (V)
 GK     with      1 AAs   THR790@GK.45     (  89) -   THR790@GK.45     (89  ) (GK)
 hinge  with      3 AAs   GLN791@hinge.46  (  90) -   MET793@hinge.48  (92  ) (hinge)
 linker with      4 AAs   PRO794@linker.49 (  93) -   CYS797@linker.52 (96  ) (linker)
 αD     with      7 AAs   LEU798@αD.53     (  97) -   GLU804@αD.59     (103 ) (αD)
 αE     with      5 AAs   TYR827@αE.60     ( 126) -   ARG831@αE.64     (130 ) (αE)
 VI     with      3 AAs   ARG832@VI.65     ( 131) -   VAL834@VI.67     (133 ) (VI)
 c.l    with      8 AAs   HIS835@c.l.68    ( 134) -   ASN842@c.l.75    (141 ) (c.l)
 VII    with      3 AAs   VAL843@VII.76    ( 142) -   VAL845@VII.78    (144 ) (VII)
 VIII   with      1 AAs   ILE853@VIII.79   ( 152) -   ILE853@VIII.79   (152 ) (VIII)
 xDFG   with      4 AAs   THR854@xDFG.80   ( 153) -   GLY857@xDFG.83   (156 ) (xDFG)
 a.l    with      2 AAs   LEU858@a.l.84    ( 157) -   ALA859@a.l.85    (158 ) (a.l)
Select group 1: 0
Select group 2: 1

Will look for contacts in the interface between fragments
0
and
1.

Performing a first pass on the 317 group_1-group_2 residue pairs to compute lower bounds on residue-residue distances via residue-COM distances.
Reduced to only 200 (from 317) residue pairs for the computation of actual residue-residue distances:
The following 31 contacts capture 26.59 (~100%) of the total frequency 26.67 (over 34 contacts with nonzero frequency at 4.50 Angstrom).
As orientation value, the first 25 ctcs already capture 90.0% of 26.67.
The 25-th contact has a frequency of 0.81.

    freq             label              residues  fragments   sum
1   1.00  T790@GK.45     - W321@ligand   89 - 317    0 - 1    1.00
2   1.00  L792@hinge.47  - W321@ligand   91 - 317    0 - 1    2.00
3   1.00  C775@b.l.36    - W321@ligand   74 - 317    0 - 1    3.00
4   1.00  L788@V.43      - W321@ligand   87 - 317    0 - 1    4.00
5   1.00  T854@xDFG.80   - W321@ligand  153 - 317    0 - 1    5.00
6   1.00  D855@xDFG.81   - W321@ligand  154 - 317    0 - 1    6.00
7   1.00  F856@xDFG.82   - W321@ligand  155 - 317    0 - 1    7.00
8   1.00  K745@III.17    - W321@ligand   44 - 317    0 - 1    8.00
9   1.00  L777@IV.38     - W321@ligand   76 - 317    0 - 1    9.00
10  1.00  Q791@hinge.46  - W321@ligand   90 - 317    0 - 1   10.00
11  1.00  A743@III.15    - W321@ligand   42 - 317    0 - 1   11.00
12  1.00  M793@hinge.48  - W321@ligand   92 - 317    0 - 1   12.00
13  1.00  V726@II.11     - W321@ligand   25 - 317    0 - 1   13.00
14  1.00  R776@b.l.37    - W321@ligand   75 - 317    0 - 1   14.00
15  1.00  M766@αC.28     - W321@ligand   65 - 317    0 - 1   15.00
16  1.00  L844@VII.77    - W321@ligand  143 - 317    0 - 1   16.00
17  1.00  G719@g.l.4     - W321@ligand   18 - 317    0 - 1   16.99
18  0.99  L718@I.3       - W321@ligand   17 - 317    0 - 1   17.99
19  0.98  G796@linker.51 - W321@ligand   95 - 317    0 - 1   18.97
20  0.97  S720@g.l.5     - W321@ligand   19 - 317    0 - 1   19.94
21  0.88  I744@III.16    - W321@ligand   43 - 317    0 - 1   20.82
22  0.86  I789@V.44      - W321@ligand   88 - 317    0 - 1   21.68
23  0.83  L858@a.l.84    - W321@ligand  157 - 317    0 - 1   22.51
24  0.82  C797@linker.52 - W321@ligand   96 - 317    0 - 1   23.33
25  0.81  L1001@EGFR     - W321@ligand  300 - 317    0 - 1   24.14
26  0.74  G721@g.l.6     - W321@ligand   20 - 317    0 - 1   24.89
27  0.54  I853@VIII.79   - W321@ligand  152 - 317    0 - 1   25.42
28  0.47  R841@c.l.74    - W321@ligand  140 - 317    0 - 1   25.90
29  0.24  F997@EGFR      - W321@ligand  296 - 317    0 - 1   26.14
30  0.23  D800@αD.55     - W321@ligand   99 - 317    0 - 1   26.38
31  0.22  G724@g.l.9     - W321@ligand   23 - 317    0 - 1   26.59

             label  freq
1       T790@GK.45  1.00
2    L792@hinge.47  1.00
3      C775@b.l.36  1.00
4        L788@V.43  1.00
5     T854@xDFG.80  1.00
6     D855@xDFG.81  1.00
7     F856@xDFG.82  1.00
8      K745@III.17  1.00
9       L777@IV.38  1.00
10   Q791@hinge.46  1.00
11     A743@III.15  1.00
12   M793@hinge.48  1.00
13      V726@II.11  1.00
14     R776@b.l.37  1.00
15      M766@αC.28  1.00
16     L844@VII.77  1.00
17      G719@g.l.4  1.00
18        L718@I.3  0.99
19  G796@linker.51  0.98
20      S720@g.l.5  0.97
21     I744@III.16  0.88
22       I789@V.44  0.86
23     L858@a.l.84  0.83
24  C797@linker.52  0.82
25      L1001@EGFR  0.81
26      G721@g.l.6  0.74
27    I853@VIII.79  0.54
28     R841@c.l.74  0.47
29       F997@EGFR  0.24
30      D800@αD.55  0.23
31      G724@g.l.9  0.22

         label   freq
1  W321@ligand  26.59

Will compute contact frequencies for trajectories:
example_kinases/trajectory.6LUB.xtc
 with a stride of 1 frames
Using method 'lig_resSeq+' these fragments were found
fragment EGFR   with    323 AAs   GLY696 (   0) -  ILE1018 (322 ) (EGFR)
fragment ligand with      1 AAs     EUX1 ( 323) -     EUX1 (323 ) (ligand)
The KLIFS-labels align best with fragments: [0] (first-last: GLY696-ILE1018).

Mapping the KLIFS fragments onto your topology:
 I      with      3 AAs   LYS716@I.1       (  20) -   LEU718@I.3       (22  ) (I)
 g.l    with      6 AAs   GLY719@g.l.4     (  23) -   GLY724@g.l.9     (28  ) (g.l)
 II     with      4 AAs   THR725@II.10     (  29) -   LYS728@II.13     (32  ) (II)
 III    with      6 AAs   VAL742@III.14    (  46) -   LEU747@III.19    (51  ) (III)
 αC     with     11 AAs   GLU758@αC.20     (  62) -   SER768@αC.30     (72  ) (αC)
 b.l    with      7 AAs   VAL769@b.l.31    (  73) -   ARG776@b.l.37    (80  ) (b.l) resSeq jumps
 IV     with      4 AAs   LEU777@IV.38     (  81) -   ILE780@IV.41     (84  ) (IV)
 V      with      3 AAs   GLN787@V.42      (  91) -   ILE789@V.44      (93  ) (V)
 GK     with      1 AAs   MET790@GK.45     (  94) -   MET790@GK.45     (94  ) (GK)
 hinge  with      3 AAs   GLN791@hinge.46  (  95) -   MET793@hinge.48  (97  ) (hinge)
 linker with      4 AAs   PRO794@linker.49 (  98) -   SER797@linker.52 (101 ) (linker)
 αD     with      7 AAs   LEU798@αD.53     ( 102) -   GLU804@αD.59     (108 ) (αD)
 αE     with      5 AAs   TYR827@αE.60     ( 131) -   ARG831@αE.64     (135 ) (αE)
 VI     with      3 AAs   ARG832@VI.65     ( 136) -   VAL834@VI.67     (138 ) (VI)
 c.l    with      8 AAs   HIS835@c.l.68    ( 139) -   ASN842@c.l.75    (146 ) (c.l)
 VII    with      3 AAs   VAL843@VII.76    ( 147) -   VAL845@VII.78    (149 ) (VII)
 VIII   with      1 AAs   ILE853@VIII.79   ( 157) -   ILE853@VIII.79   (157 ) (VIII)
 xDFG   with      4 AAs   THR854@xDFG.80   ( 158) -   GLY857@xDFG.83   (161 ) (xDFG)
 a.l    with      2 AAs   ARG858@a.l.84    ( 162) -   ALA859@a.l.85    (163 ) (a.l)
Select group 1: 0
Select group 2: 1

Will look for contacts in the interface between fragments
0
and
1.

Performing a first pass on the 323 group_1-group_2 residue pairs to compute lower bounds on residue-residue distances via residue-COM distances.
Reduced to only 190 (from 323) residue pairs for the computation of actual residue-residue distances:
The following 25 contacts capture 20.50 (~98%) of the total frequency 20.83 (over 39 contacts with nonzero frequency at 4.50 Angstrom).
As orientation value, the first 20 ctcs already capture 90.0% of 20.83.
The 20-th contact has a frequency of 0.64.

    freq             label              residues  fragments   sum
1   1.00  G796@linker.51 - EUX1@ligand  100 - 323    0 - 1    1.00
2   1.00  V726@II.11     - EUX1@ligand   30 - 323    0 - 1    2.00
3   1.00  P794@linker.49 - EUX1@ligand   98 - 323    0 - 1    3.00
4   1.00  L718@I.3       - EUX1@ligand   22 - 323    0 - 1    4.00
5   1.00  M793@hinge.48  - EUX1@ligand   97 - 323    0 - 1    5.00
6   1.00  L844@VII.77    - EUX1@ligand  148 - 323    0 - 1    6.00
7   1.00  L792@hinge.47  - EUX1@ligand   96 - 323    0 - 1    7.00
8   1.00  Q791@hinge.46  - EUX1@ligand   95 - 323    0 - 1    8.00
9   1.00  M790@GK.45     - EUX1@ligand   94 - 323    0 - 1    9.00
10  1.00  A743@III.15    - EUX1@ligand   47 - 323    0 - 1   10.00
11  0.98  G724@g.l.9     - EUX1@ligand   28 - 323    0 - 1   10.97
12  0.97  T854@xDFG.80   - EUX1@ligand  158 - 323    0 - 1   11.94
13  0.96  K745@III.17    - EUX1@ligand   49 - 323    0 - 1   12.90
14  0.95  S797@linker.52 - EUX1@ligand  101 - 323    0 - 1   13.85
15  0.92  G721@g.l.6     - EUX1@ligand   25 - 323    0 - 1   14.78
16  0.92  K728@II.13     - EUX1@ligand   32 - 323    0 - 1   15.70
17  0.92  G719@g.l.4     - EUX1@ligand   23 - 323    0 - 1   16.61
18  0.88  T725@II.10     - EUX1@ligand   29 - 323    0 - 1   17.49
19  0.82  C775@b.l.36    - EUX1@ligand   79 - 323    0 - 1   18.31
20  0.64  F795@linker.50 - EUX1@ligand   99 - 323    0 - 1   18.95
21  0.48  D855@xDFG.81   - EUX1@ligand  159 - 323    0 - 1   19.43
22  0.43  S720@g.l.5     - EUX1@ligand   24 - 323    0 - 1   19.86
23  0.32  D800@αD.55     - EUX1@ligand  104 - 323    0 - 1   20.17
24  0.17  R841@c.l.74    - EUX1@ligand  145 - 323    0 - 1   20.34
25  0.15  L1001@EGFR     - EUX1@ligand  305 - 323    0 - 1   20.50

             label  freq
1   G796@linker.51  1.00
2       V726@II.11  1.00
3   P794@linker.49  1.00
4         L718@I.3  1.00
5    M793@hinge.48  1.00
6      L844@VII.77  1.00
7    L792@hinge.47  1.00
8    Q791@hinge.46  1.00
9       M790@GK.45  1.00
10     A743@III.15  1.00
11      G724@g.l.9  0.98
12    T854@xDFG.80  0.97
13     K745@III.17  0.96
14  S797@linker.52  0.95
15      G721@g.l.6  0.92
16      K728@II.13  0.92
17      G719@g.l.4  0.92
18      T725@II.10  0.88
19     C775@b.l.36  0.82
20  F795@linker.50  0.64
21    D855@xDFG.81  0.48
22      S720@g.l.5  0.43
23      D800@αD.55  0.32
24     R841@c.l.74  0.17
25      L1001@EGFR  0.15

         label  freq
1  EUX1@ligand  20.5

Will compute contact frequencies for trajectories:
example_kinases/trajectory.7VRE.xtc
 with a stride of 1 frames
Using method 'lig_resSeq+' these fragments were found
fragment EGFR   with    323 AAs   GLY696 (   0) -  ILE1018 (322 ) (EGFR)
fragment ligand with      1 AAs     7VH1 ( 323) -     7VH1 (323 ) (ligand)
The KLIFS-labels align best with fragments: [0] (first-last: GLY696-ILE1018).

Mapping the KLIFS fragments onto your topology:
 I      with      3 AAs   LYS716@I.1       (  20) -   LEU718@I.3       (22  ) (I)
 g.l    with      6 AAs   GLY719@g.l.4     (  23) -   GLY724@g.l.9     (28  ) (g.l)
 II     with      4 AAs   THR725@II.10     (  29) -   LYS728@II.13     (32  ) (II)
 III    with      6 AAs   VAL742@III.14    (  46) -   LEU747@III.19    (51  ) (III)
 αC     with     11 AAs   GLU758@αC.20     (  62) -   SER768@αC.30     (72  ) (αC)
 b.l    with      7 AAs   VAL769@b.l.31    (  73) -   ARG776@b.l.37    (80  ) (b.l) resSeq jumps
 IV     with      4 AAs   LEU777@IV.38     (  81) -   ILE780@IV.41     (84  ) (IV)
 V      with      3 AAs   GLN787@V.42      (  91) -   ILE789@V.44      (93  ) (V)
 GK     with      1 AAs   MET790@GK.45     (  94) -   MET790@GK.45     (94  ) (GK)
 hinge  with      3 AAs   GLN791@hinge.46  (  95) -   MET793@hinge.48  (97  ) (hinge)
 linker with      4 AAs   PRO794@linker.49 (  98) -   SER797@linker.52 (101 ) (linker)
 αD     with      7 AAs   LEU798@αD.53     ( 102) -   GLU804@αD.59     (108 ) (αD)
 αE     with      5 AAs   TYR827@αE.60     ( 131) -   ARG831@αE.64     (135 ) (αE)
 VI     with      3 AAs   ARG832@VI.65     ( 136) -   VAL834@VI.67     (138 ) (VI)
 c.l    with      8 AAs   HIS835@c.l.68    ( 139) -   ASN842@c.l.75    (146 ) (c.l)
 VII    with      3 AAs   VAL843@VII.76    ( 147) -   VAL845@VII.78    (149 ) (VII)
 VIII   with      1 AAs   ILE853@VIII.79   ( 157) -   ILE853@VIII.79   (157 ) (VIII)
 xDFG   with      4 AAs   THR854@xDFG.80   ( 158) -   GLY857@xDFG.83   (161 ) (xDFG)
 a.l    with      2 AAs   LEU858@a.l.84    ( 162) -   ALA859@a.l.85    (163 ) (a.l)
Select group 1: 0
Select group 2: 1

Will look for contacts in the interface between fragments
0
and
1.

Performing a first pass on the 323 group_1-group_2 residue pairs to compute lower bounds on residue-residue distances via residue-COM distances.
Reduced to only 212 (from 323) residue pairs for the computation of actual residue-residue distances:
The following 23 contacts capture 16.70 (~99%) of the total frequency 16.79 (over 30 contacts with nonzero frequency at 4.50 Angstrom).
As orientation value, the first 17 ctcs already capture 90.0% of 16.79.
The 17-th contact has a frequency of 0.53.

    freq             label              residues  fragments   sum
1   1.00  L718@I.3       - 7VH1@ligand   22 - 323    0 - 1    1.00
2   1.00  M793@hinge.48  - 7VH1@ligand   97 - 323    0 - 1    2.00
3   1.00  L792@hinge.47  - 7VH1@ligand   96 - 323    0 - 1    3.00
4   1.00  V726@II.11     - 7VH1@ligand   30 - 323    0 - 1    4.00
5   1.00  A743@III.15    - 7VH1@ligand   47 - 323    0 - 1    5.00
6   1.00  P794@linker.49 - 7VH1@ligand   98 - 323    0 - 1    6.00
7   1.00  G796@linker.51 - 7VH1@ligand  100 - 323    0 - 1    6.99
8   1.00  Q791@hinge.46  - 7VH1@ligand   95 - 323    0 - 1    7.99
9   0.99  L844@VII.77    - 7VH1@ligand  148 - 323    0 - 1    8.98
10  0.96  F723@g.l.8     - 7VH1@ligand   27 - 323    0 - 1    9.93
11  0.93  G719@g.l.4     - 7VH1@ligand   23 - 323    0 - 1   10.86
12  0.90  M790@GK.45     - 7VH1@ligand   94 - 323    0 - 1   11.76
13  0.88  T854@xDFG.80   - 7VH1@ligand  158 - 323    0 - 1   12.64
14  0.79  D855@xDFG.81   - 7VH1@ligand  159 - 323    0 - 1   13.44
15  0.76  K745@III.17    - 7VH1@ligand   49 - 323    0 - 1   14.20
16  0.68  L1001@EGFR     - 7VH1@ligand  305 - 323    0 - 1   14.88
17  0.53  R841@c.l.74    - 7VH1@ligand  145 - 323    0 - 1   15.41
18  0.44  F795@linker.50 - 7VH1@ligand   99 - 323    0 - 1   15.85
19  0.20  C775@b.l.36    - 7VH1@ligand   79 - 323    0 - 1   16.05
20  0.19  N842@c.l.75    - 7VH1@ligand  146 - 323    0 - 1   16.24
21  0.18  K728@II.13     - 7VH1@ligand   32 - 323    0 - 1   16.41
22  0.16  S797@linker.52 - 7VH1@ligand  101 - 323    0 - 1   16.58
23  0.12  D800@αD.55     - 7VH1@ligand  104 - 323    0 - 1   16.70

             label  freq
1         L718@I.3  1.00
2    M793@hinge.48  1.00
3    L792@hinge.47  1.00
4       V726@II.11  1.00
5      A743@III.15  1.00
6   P794@linker.49  1.00
7   G796@linker.51  1.00
8    Q791@hinge.46  1.00
9      L844@VII.77  0.99
10      F723@g.l.8  0.96
11      G719@g.l.4  0.93
12      M790@GK.45  0.90
13    T854@xDFG.80  0.88
14    D855@xDFG.81  0.79
15     K745@III.17  0.76
16      L1001@EGFR  0.68
17     R841@c.l.74  0.53
18  F795@linker.50  0.44
19     C775@b.l.36  0.20
20     N842@c.l.75  0.19
21      K728@II.13  0.18
22  S797@linker.52  0.16
23      D800@αD.55  0.12

         label  freq
1  7VH1@ligand  16.7
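
The frequencies reported above can be read as the fraction of trajectory frames in which a residue pair lies within the cutoff (4.5 Angstrom in this output), and the sum column as a running total over the sorted contacts. The sketch below illustrates that bookkeeping with made-up distances and hypothetical residue names. It is not mdciao's implementation, and treating the cutoff as inclusive is an assumption.

```python
def contact_frequency(distances_ang, cutoff_ang=4.5):
    """Fraction of frames whose distance is at or below the cutoff."""
    in_contact = [d <= cutoff_ang for d in distances_ang]
    return sum(in_contact) / len(in_contact)


# Made-up per-frame distances (Angstrom) for two hypothetical residue pairs
toy_distances = {
    "res_A - ligand": [3.1, 3.4, 3.0, 3.8],
    "res_B - ligand": [4.0, 5.2, 4.4, 6.1],
}

freqs = {pair: contact_frequency(d) for pair, d in toy_distances.items()}

# Sort by decreasing frequency and accumulate, mimicking the 'sum' column
running = 0.0
for pair, freq in sorted(freqs.items(), key=lambda kv: -kv[1]):
    running += freq
    print(f"{freq:.2f}  {pair}  {running:.2f}")
```

Because each frequency is at most 1, the running total can be loosely read as an average number of residues in contact with the ligand.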

Compare interactions across the four compounds in a violin plot

Additionally, we will display representative geometries directly on the violin plots via their residue-residue distance values. Subsequently, we will view these geometries in 3D.

[7]:
colors = mdciao.plots.color_dict_guesser("tab10", binding_pocket.keys())
myfig, myax, keys, representatives = mdciao.plots.compare_violins(binding_pocket,
                                                                  colors=colors,
                                                                  anchor="ligand",
                                                                  ctc_cutoff_Ang=4.5,
                                                                  mutations_dict={
                                                                      "EUX1": "ligand",
                                                                      "7VH1": "ligand",
                                                                      "W321": "ligand",
                                                                      "03P1": "ligand"
                                                                  },
                                                                  defrag=None,
                                                                  sort_by="residue",
                                                                  legend_rows=2,
                                                                  representatives=True,
                                                                  figsize=(20,5)
                                                         )
myax.set_title("binding pocket interactions"
               "\nfor 4 different EGFR inhibitors")
myfig.tight_layout()
#myfig.savefig("EGFR.png", bbox_inches="tight")
Returning frame 83 of traj nr. 0: example_kinases/trajectory.3POZ.xtc
Returning frame 237 of traj nr. 0: example_kinases/trajectory.3W32.xtc
Returning frame 369 of traj nr. 0: example_kinases/trajectory.6LUB.xtc
Returning frame 304 of traj nr. 0: example_kinases/trajectory.7VRE.xtc
[Figure: violin plots of binding pocket interactions for 4 different EGFR inhibitors]

Show the representative geometries

The representatives object is a dictionary holding the geometries behind the small dots inside the violins of the previous figure, obtained with the repframes method. In the next cells, we first superpose them using the KLIFS nomenclature and then view them overlaid.
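
To give an intuition for what a representative frame is, the sketch below picks the frame whose contact distances lie closest to the typical values across the trajectory. It uses the per-contact mean as the typical value, which is an assumption made for simplicity; the actual repframes method may use a different criterion (for example, the mode of each distribution). The function name and the data are made up.

```python
def representative_frame(distances_per_frame):
    """Index of the frame closest to the per-contact mean distances.

    distances_per_frame: list of frames, each a list of contact distances.
    Hypothetical helper for illustration; not mdciao's algorithm.
    """
    n_frames = len(distances_per_frame)
    n_ctcs = len(distances_per_frame[0])
    # Mean distance of each contact over all frames
    means = [
        sum(frame[j] for frame in distances_per_frame) / n_frames
        for j in range(n_ctcs)
    ]

    def squared_deviation(frame):
        return sum((d - m) ** 2 for d, m in zip(frame, means))

    # The frame with the smallest deviation from the means wins
    return min(range(n_frames), key=lambda i: squared_deviation(distances_per_frame[i]))


# Three made-up frames with two contact distances each (Angstrom)
toy = [
    [3.0, 4.0],
    [3.5, 4.5],
    [5.0, 6.5],
]
best = representative_frame(toy)
print(best)
```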

Superpose structures using the KLIFS alignment labels

This way, the alignment will be particularly good in the binding pocket.

[8]:
KLIFS_alignment = mdciao.nomenclature.AlignerConsensus({key : KLIFS for key in binding_pocket.keys()},
                                                       tops={key : bp.top for key, bp in binding_pocket.items()})

KLIFS_alignment.AAresSeq
[8]:
    consensus  03P1@3POZ  W321@3W32  EUX1@6LUB  7VH1@7VRE
0         I.1       K716       K716       K716       K716
1         I.2       V717       V717       V717       V717
2         I.3       L718       L718       L718       L718
3       g.l.4       G719       G719       G719       G719
4       g.l.5       S720       S720       S720       S720
...       ...        ...        ...        ...        ...
80    xDFG.81       D855       D855       D855       D855
81    xDFG.82       F856       F856       F856       F856
82    xDFG.83       G857       G857       G857       G857
83     a.l.84       L858       L858       R858       L858
84     a.l.85       A859       A859       A859       A859

85 rows × 5 columns

[9]:
# We can directly get CA indices to map atoms
KLIFS_alignment.CAidxs
[9]:
    consensus  03P1@3POZ  W321@3W32  EUX1@6LUB  7VH1@7VRE
0         I.1        280        280        340        340
1         I.2        302        302        362        362
2         I.3        318        318        378        378
3       g.l.4        337        337        397        397
4       g.l.5        344        344        404        404
...       ...        ...        ...        ...        ...
80    xDFG.81       2515       2515       2578       2578
81    xDFG.82       2527       2527       2590       2590
82    xDFG.83       2547       2547       2610       2610
83     a.l.84       2554       2554       2617       2617
84     a.l.85       2573       2573       2641       2636

85 rows × 5 columns

[10]:
ref_key = "W321@3W32" # We take this one but could be any one
ref_geom = representatives[ref_key]
for key, geom in representatives.items():
    if key!=ref_key:
        ref_CAs, key_CAs = KLIFS_alignment.CAidxs[[ref_key, key]].values.T.astype(int)
        geom.superpose(ref_geom, atom_indices=key_CAs, ref_atom_indices=ref_CAs)
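
The key point of the cell above is that atoms are paired by consensus label rather than by atom index, so structures with different numbering can still be superposed on equivalent positions. The sketch below shows that pairing with plain dictionaries, using the first five rows of the CA-index table shown earlier. The helper name paired_ca_indices is hypothetical, and skipping labels missing from either structure is an assumption about how gaps might be handled.

```python
# CA atom indices per consensus label, copied from the first rows of the table
ca_idxs = {
    "W321@3W32": {"I.1": 280, "I.2": 302, "I.3": 318, "g.l.4": 337, "g.l.5": 344},
    "EUX1@6LUB": {"I.1": 340, "I.2": 362, "I.3": 378, "g.l.4": 397, "g.l.5": 404},
}


def paired_ca_indices(table, ref_key, key):
    """Return two equally long index lists for labels present in both systems."""
    shared = [label for label in table[ref_key] if label in table[key]]
    ref_idxs = [table[ref_key][label] for label in shared]
    key_idxs = [table[key][label] for label in shared]
    return ref_idxs, key_idxs


ref_CAs, key_CAs = paired_ca_indices(ca_idxs, "W321@3W32", "EUX1@6LUB")
print(ref_CAs)
print(key_CAs)
```

The two lists have the same length and the same label order, which is what a pairwise superposition routine needs for its reference and mobile atom indices.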

Visualize residues with different behaviors in each compound

For example, residues

  • 775@b.l.36

  • 841@c.l.74

  • 855@xDFG.81

  • 997@EGFR (doesn’t have a KLIFS label)

[12]:
# Convert the colors to hex strings, which nglview accepts
colors = {key: matplotlib.colors.to_hex(col) for key, col in colors.items()}
iwd = nglview.NGLWidget()
for ii, (key, rep) in enumerate(representatives.items()):
    iwd.add_trajectory(rep)
    iwd.clear_representations(component=ii)
    # Protein as a neutral white cartoon
    iwd.add_cartoon(color="white", component=ii)
    # Residues of interest, colored per compound
    iwd.add_licorice(color=colors[key], component=ii, selection="(775 841 855 997) and not Hydrogen", radius=.1)
    # The ligand (everything that is not protein), in the same color
    iwd.add_ball_and_stick(color=colors[key], component=ii,
                           selection="not protein and not Hydrogen",
                           radius=.1,
                           )
iwd

References