TeachOpenCADD

About this resource

TeachOpenCADD is a resource for teaching computer-aided drug design (cheminformatics and structural bioinformatics). It is organized into modules called talktorials, each of which is a Jupyter notebook focusing on a single task.

**Tutorial Type**: code-based (Python)

**Authors & Contributors**: TeachOpenCADD was initiated by members of the Volkamer Lab, Charité - Universitätsmedizin Berlin. Many contributors have participated in its development.

**Prerequisite**: Python knowledge (if you have any programming background, you should be able to understand the Python code)

**TeachOpenCADD Website**: Link

**GitHub Link**: Link

**Howto**: The easiest way to start learning without installing anything is to open a Jupyter notebook on GitHub (click the link in the table below). GitHub renders the notebook with its saved results inline, without executing the code. If you want to run the code locally, you will need to follow these instructions.

**Platform**: Jupyter Notebook (a shareable document that contains live Python code, visualizations, and narrative text), RDKit (an open-source software toolkit for cheminformatics and computational chemistry), Conda & mamba (mamba is a CLI tool to manage conda environments)

| Modules | Description | Python package |
|---|---|---|
| T001 · Compound data acquisition (ChEMBL) | Extract data (compounds and activity) related to the EGFR kinase from the ChEMBL database and display the compounds' 2D structures | |
| T002 · Molecular filtering: ADME and lead-likeness criteria | Remove compounds with low oral bioavailability from the result of the previous task | |
| T003 · Molecular filtering: unwanted substructures | Remove toxic, reactive, and false-positive compounds from the result of the previous task | |
| T004 · Ligand-based screening: compound similarity | Draw 2D molecules, generate molecular descriptors, compare molecules based on these descriptors, then search a library to identify similar compounds (virtual screening) | |
| T005 · Compound clustering | From the virtual screening result (T004), use a clustering algorithm to select 1000 diverse compounds in order to maximize the chances of finding a hit | |
| T006 · Maximum common substructure | Visualize the common scaffolds (MCS) of a set of molecules (T005) | |
| T007 · Ligand-based screening: machine learning | Use supervised ML algorithms to predict the activity of compounds against the EGFR kinase | |
| T008 · Protein data acquisition: Protein Data Bank (PDB) | Superimpose ligands from many high-resolution EGFR PDB complexes | biotite & pypdb |
| T009 · Ligand-based pharmacophores | Identify pharmacophore features from the ligand set generated in the previous tasks | |
| T010 · Binding site similarity and off-target prediction | Compare the binding sites of PDB complexes that have imatinib as ligand | biotite & pypdb |
| T011 · Querying online API webservices | Query remote bioinformatics API services using Python | requests |
| T012 · Data acquisition from KLIFS | | |
| T013 · Data acquisition from PubChem | Cheminformatics | |
| T014 · Binding site detection | Structural-Bioinformatics | |
| T015 · Protein ligand docking | Structural-Bioinformatics | |
| T016 · Protein-ligand interactions | Structural-Bioinformatics | |
| T017 · Advanced NGLview usage | Structural-Bioinformatics | |
| T018 · Automated pipeline for lead optimization | Structural-Bioinformatics | |
| T019 · Molecular dynamics simulation | Structural-Bioinformatics | |
| T020 · Analyzing molecular dynamics simulations | Structural-Bioinformatics | |
| T021 · One-Hot Encoding | Cheminformatics | |
| T022 · Ligand-based screening: neural networks | Cheminformatics | |
| T023 · What is a kinase? | Kinase Similarity | |
| T024 · Kinase similarity: Sequence | Kinase Similarity | |
| T025 · Kinase similarity: Kinase pocket (KiSSim fingerprint) | Kinase Similarity | |
| T026 · Kinase similarity: Interaction fingerprints | Kinase Similarity | |
| T027 · Kinase similarity: Ligand profile | Kinase Similarity | |
| T028 · Kinase similarity: Compare different perspectives | Kinase Similarity | |
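
To give a feel for the kind of filtering T002 describes (removing compounds with low oral bioavailability), here is a minimal sketch based on Lipinski's rule of five. It assumes the four descriptors have already been computed (in a talktorial setting they would presumably come from RDKit). The `Descriptors` class, the compound names, the descriptor values, and the "at most one violation" tolerance are all illustrative assumptions, not taken from the notebook itself.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one compound (hypothetical container)."""
    name: str
    mol_weight: float   # molecular weight in Da
    h_acceptors: int    # number of hydrogen bond acceptors
    h_donors: int       # number of hydrogen bond donors
    logp: float         # octanol-water partition coefficient


def ro5_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five conditions are violated."""
    checks = [
        d.mol_weight > 500,
        d.h_acceptors > 10,
        d.h_donors > 5,
        d.logp > 5,
    ]
    return sum(checks)


def passes_ro5(d: Descriptors, max_violations: int = 1) -> bool:
    """Keep a compound if it violates at most `max_violations` conditions."""
    return ro5_violations(d) <= max_violations


# Made-up descriptor values, for illustration only.
compounds = [
    Descriptors("cpd_A", 350.0, 5, 2, 3.1),
    Descriptors("cpd_B", 720.5, 12, 6, 6.2),
    Descriptors("cpd_C", 510.0, 8, 3, 4.0),
]

kept = [c.name for c in compounds if passes_ro5(c)]
print(kept)
```

With these values, `cpd_B` violates all four conditions and is dropped, while `cpd_C` exceeds only the molecular weight limit and is kept under the one-violation tolerance.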
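
The similarity search described for T004 (compare molecules by their descriptors, then rank a library against a query) can be sketched with the Tanimoto coefficient, a common similarity measure for binary fingerprints. This sketch assumes each fingerprint is represented as a Python set of "on" bit positions. The function names and the tiny fingerprints are hypothetical, and pure Python is used here only to keep the example dependency-free.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient: shared on-bits divided by on-bits in either set."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_by_similarity(query: set, library: dict) -> list:
    """Return (name, similarity) pairs, most similar to the query first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up fingerprints (sets of on-bit indices), for illustration only.
query_fp = {1, 4, 7, 9}
library = {
    "lib_1": {1, 4, 7, 9},
    "lib_2": {1, 4, 8},
    "lib_3": {2, 3, 5},
}

ranking = rank_by_similarity(query_fp, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A virtual screening run would apply the same ranking to a much larger library and keep the top-scoring compounds, or those above a chosen similarity threshold.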
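
For T021 (one-hot encoding), the following sketch shows one way a SMILES string can be turned into a binary matrix for machine learning. It is a simplified, character-level version: multi-character atoms such as `Cl` or `Br` would be split into two tokens, which a real pipeline would need to handle. The function names, the padding scheme, and the matrix orientation (one row per string position, one column per vocabulary character) are assumptions made for this example.

```python
def build_vocabulary(smiles_list: list) -> list:
    """Collect every distinct character in the SMILES strings, sorted."""
    return sorted({ch for smiles in smiles_list for ch in smiles})


def one_hot_encode(smiles: str, vocabulary: list, max_length: int) -> list:
    """Encode a SMILES string as a max_length x len(vocabulary) 0/1 matrix.

    Positions beyond the end of the string stay all-zero (padding).
    """
    if len(smiles) > max_length:
        raise ValueError("SMILES string is longer than max_length")
    index = {ch: i for i, ch in enumerate(vocabulary)}
    matrix = [[0] * len(vocabulary) for _ in range(max_length)]
    for position, ch in enumerate(smiles):
        matrix[position][index[ch]] = 1
    return matrix


smiles_list = ["CCO", "C=O", "CN"]  # ethanol, formaldehyde, methylamine
vocabulary = build_vocabulary(smiles_list)
encoded = one_hot_encode("CCO", vocabulary, max_length=4)

print(vocabulary)
for row in encoded:
    print(row)
```

Padding every string to the same `max_length` gives all molecules a matrix of identical shape, which is what most neural network inputs require.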