Bio Utilities#
Functions#
- mDeepFRI.bio_utils.build_align_contact_map(alignment: AlignmentResult, threshold: float = 6, generated_contacts: int = 2) → Tuple[AlignmentResult, ndarray]#
Retrieve contact map for aligned sequences.
- Parameters:
alignment (AlignmentResult) – Alignment of query and target sequences.
database (str) – Path to FoldComp database. If empty, the structure will be retrieved from the PDB.
threshold (float) – Distance threshold for the contact map, in Angstroms.
generated_contacts (int) – Number of generated contacts to add for gapped regions in the query alignment.
- Returns:
Tuple[AlignmentResult, np.ndarray] – Tuple of alignment and contact map.
- mDeepFRI.bio_utils.calculate_contact_map(coordinates: ndarray, threshold=6.0, distance='sqeuclidean', mode='matrix') → ndarray#
Calculate a contact map from an array of atomic coordinates.
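To make the idea of a thresholded contact map concrete, here is a minimal NumPy sketch. It is not mDeepFRI's implementation: the function name `contact_map_sketch` is hypothetical, and it assumes an (N, 3) coordinate array, a threshold in Angstroms, and a binary matrix as output. How the library's `distance` and `mode` arguments change the result is not covered here.

```python
import numpy as np


def contact_map_sketch(coordinates: np.ndarray, threshold: float = 6.0) -> np.ndarray:
    """Return a binary (N, N) contact map from an (N, 3) coordinate array.

    Hypothetical helper for illustration only; not part of mDeepFRI.
    """
    coordinates = np.asarray(coordinates, dtype=float)
    # Pairwise difference vectors via broadcasting: shape (N, N, 3).
    diff = coordinates[:, None, :] - coordinates[None, :, :]
    # Squared Euclidean distances; comparing with threshold**2 avoids a sqrt.
    sq_dist = np.einsum("ijk,ijk->ij", diff, diff)
    return (sq_dist < threshold ** 2).astype(np.uint8)


# Three pseudo-residues on a line, spaced 4 Angstroms apart.
coords = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [8.0, 0.0, 0.0]])
cmap = contact_map_sketch(coords, threshold=6.0)
```

In this toy input, neighbouring residues (4 Angstroms apart) are in contact, while the first and last residues (8 Angstroms apart) are not.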
Description#
The bio_utils module provides utilities for processing protein structures and generating contact maps. The contact maps it produces are inputs to structure-based protein function prediction with DeepFRI.
Key Features#
Structure Loading: Extract structures from specific database formats like FoldComp or standard PDB/CIF files
Coordinate Extraction: Isolate C-alpha coordinates crucial for backbone representation
Contact Map Generation: Create adjacency matrices representing protein residue interactions
Contact Alignment: Map structural contact information onto query sequences guided by alignments
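The contact alignment step can be sketched as follows. This is an illustration under stated assumptions, not the code behind `build_align_contact_map`: the function `map_contacts_to_query` is hypothetical, the alignment is assumed to be a pair of equal-length gapped strings, and `generated_contacts` is interpreted here as the number of sequence neighbours on each side that receive a contact when a query residue has no aligned target residue.

```python
import numpy as np


def map_contacts_to_query(query_aln: str, target_aln: str,
                          target_cmap: np.ndarray,
                          generated_contacts: int = 2) -> np.ndarray:
    """Project a target contact map onto the query through a gapped alignment.

    Hypothetical helper for illustration only; not part of mDeepFRI.
    """
    if len(query_aln) != len(target_aln):
        raise ValueError("aligned sequences must have equal length")

    # For each query residue, store the aligned target index (-1 if none).
    q_to_t = []
    t_idx = 0
    for q_char, t_char in zip(query_aln, target_aln):
        if q_char != "-":
            q_to_t.append(t_idx if t_char != "-" else -1)
        if t_char != "-":
            t_idx += 1
    q_to_t = np.array(q_to_t, dtype=int)

    n = len(q_to_t)
    query_cmap = np.zeros((n, n), dtype=target_cmap.dtype)

    # Copy contacts between query residues that both have a target counterpart.
    aligned = np.flatnonzero(q_to_t >= 0)
    query_cmap[np.ix_(aligned, aligned)] = target_cmap[
        np.ix_(q_to_t[aligned], q_to_t[aligned])
    ]

    # Assumption: unaligned query residues get contacts to nearby neighbours.
    for i in np.flatnonzero(q_to_t < 0):
        lo = max(0, i - generated_contacts)
        hi = min(n, i + generated_contacts + 1)
        query_cmap[i, lo:hi] = 1
        query_cmap[lo:hi, i] = 1
    return query_cmap


# Query "ACDE" vs. target "ADGE": query C has no counterpart, target G is skipped.
target_cmap = np.array([[1, 1, 0, 0],
                        [1, 1, 1, 0],
                        [0, 1, 1, 1],
                        [0, 0, 1, 1]], dtype=np.uint8)
query_cmap = map_contacts_to_query("ACD-E", "A-DGE", target_cmap,
                                   generated_contacts=1)
```

The result has one row and column per query residue, so it can be paired directly with the query sequence regardless of the target's length.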
Usage#
This module is primarily used internally by the pipeline to process structural data, but its functions can also be called independently for structural analysis tasks.
Example#
from mDeepFRI.bio_utils import get_calpha_coordinates, calculate_contact_map

# Assuming 'structure' is a loaded Biotite AtomArray
coords = get_calpha_coordinates(structure)
if coords is not None:
    # Calculate contact map with 6 Angstrom threshold
    cmap = calculate_contact_map(coords, threshold=6.0)