Local molecular determinants of cysteine-associated osteogenesis imperfecta severity in human type I collagen
Type I collagen, the most abundant protein in the human skeleton, comprises two α1 chains and one α2 chain arranged in a triple helix of repeating Gly-X-Y triplets. Glycine substitutions in this highly ordered structure cause osteogenesis imperfecta (OI), a brittle bone disease whose clinical severity is heterogeneous even among mutations of the same chemical type. This work focuses on Gly→Cys substitutions and uses lethality as a proxy phenotype to dissect how local sequence and structural context modulate functional impairment. Fully atomistic molecular dynamics simulations of human type I collagen were used to extract microscopic structural descriptors around each mutation, including the dynamic unit height of the nine triplets centered on the mutated site. These mechanics-informed features were combined with sequence-based descriptors, such as mutation position and the physicochemical properties of flanking residues, to train an interpretable XGBoost model linking the local molecular environment to mutation severity. Feature-importance analysis reveals that specific patterns of local sequence-structure coupling, rather than cysteine substitution alone, govern the likelihood of lethal versus nonlethal OI outcomes, providing mechanistic insight into the molecular origins of severe Gly→Cys-associated OI phenotypes.
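As an illustration of the sequence-based side of such a feature set, the following minimal Python sketch extracts the nine triplets centered on a mutated glycine and computes a few flanking-residue descriptors. It is not the authors' pipeline: the function names, the toy sequence, the choice of Kyte-Doolittle hydropathy as the physicochemical property, and the proline-fraction feature are all assumptions made for illustration. The simulation-derived unit height and the XGBoost model itself are not reproduced here.

```python
# Kyte-Doolittle hydropathy scale (an assumed choice of physicochemical property).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def triplet_window(seq, gly_index, n_triplets=9):
    """Return n_triplets Gly-X-Y triplets centered on the triplet at gly_index."""
    if seq[gly_index] != "G":
        raise ValueError("gly_index must point at a glycine")
    half = n_triplets // 2
    start = gly_index - 3 * half
    end = gly_index + 3 * (half + 1)
    if start < 0 or end > len(seq):
        raise ValueError("window extends past the ends of the sequence")
    return [seq[i:i + 3] for i in range(start, end, 3)]


def sequence_features(seq, gly_index):
    """Hypothetical sequence descriptors for one Gly->Cys site."""
    window = "".join(triplet_window(seq, gly_index))
    return {
        "position": gly_index,
        # Residues in the X and Y positions of the mutated triplet.
        "x_hydropathy": KD[seq[gly_index + 1]],
        "y_hydropathy": KD[seq[gly_index + 2]],
        # Y residue of the preceding triplet, directly N-terminal to the glycine.
        "prev_y_hydropathy": KD[seq[gly_index - 1]],
        # Fraction of proline in the nine-triplet window.
        "window_pro_fraction": window.count("P") / len(window),
    }


# Toy helical fragment (not a real collagen sequence): mutated glycine at index 15.
toy_seq = "GPP" * 5 + "GAK" + "GPP" * 5
features = sequence_features(toy_seq, 15)
```

In a full workflow, a dictionary like `features` would be computed per mutation, joined with the simulation-derived descriptors, and passed as one row of the feature matrix to a gradient-boosted classifier.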
