The role of clinical and genomic characteristics in lung tumor dynamics
Mathematical models of tumor growth and treatment response have seen limited clinical translation because of a fundamental challenge: accurate parameter estimation requires dense longitudinal tumor measurements, which are rarely available in routine clinical practice. Furthermore, most approaches overlook baseline biological features that may already encode information about tumor dynamics. We investigated whether baseline clinical and genomic features could predict tumor dynamics in 3,200 patients with non-small cell lung cancer (NSCLC) receiving osimertinib or immunotherapy. First, we developed a natural language processing (NLP) pipeline to extract tumor sizes from radiology reports. Second, we stratified patients by genomic alterations and clinical-demographic features. Finally, we applied hierarchical Bayesian inference to estimate the parameters of a two-compartment ODE model (sensitive vs. resistant subpopulations), which allowed parameter estimation for data-sparse patients by borrowing information from biologically similar subgroups. The NLP pipeline achieved 97% accuracy for tumor size extraction. Critically, the inferred parameters revealed distinct dynamic patterns: in osimertinib-treated patients, TP53/RB1 alterations were associated with a rapid initial response followed by fast relapse, whereas lower disease stage and prior local therapy were associated with more durable responses. Across both the osimertinib and immunotherapy cohorts, higher tumor growth rates correlated with increased metastatic burden and shorter progression-free survival. This integrated framework, which combines scalable NLP-based data extraction with biologically stratified Bayesian modeling, enables prediction of tumor evolution from readily available baseline features, reducing dependence on intensive longitudinal monitoring and potentially enabling earlier, risk-adapted therapeutic intervention.
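To make the two-compartment idea concrete, the following is a minimal sketch, not the authors' model. The abstract does not give the equations, so the form used here is an assumption: a drug-sensitive compartment that decays exponentially and a resistant compartment that grows exponentially. The function names, parameter names, and units (millimeters, days) are all hypothetical and chosen only for illustration.

```python
import math


def tumor_size(t, v0, f_sensitive, kill_rate, growth_rate):
    """Total tumor burden at time t under an assumed two-compartment model.

    Assumed ODEs (illustrative; not taken from the abstract):
        dS/dt = -kill_rate * S      # sensitive cells shrink on treatment
        dR/dt =  growth_rate * R    # resistant cells keep growing
    with S(0) = f_sensitive * v0 and R(0) = (1 - f_sensitive) * v0.
    Both equations are linear, so the closed-form solution is used directly.
    """
    sensitive = f_sensitive * v0 * math.exp(-kill_rate * t)
    resistant = (1.0 - f_sensitive) * v0 * math.exp(growth_rate * t)
    return sensitive + resistant


def time_of_nadir(f_sensitive, kill_rate, growth_rate):
    """Time at which total burden is smallest (0.0 if it never shrinks).

    Requires 0 < f_sensitive < 1 and positive rates. Setting dV/dt = 0 gives
    exp((kill_rate + growth_rate) * t) = ratio, with ratio defined below.
    """
    ratio = (kill_rate * f_sensitive) / (growth_rate * (1.0 - f_sensitive))
    if ratio <= 1.0:
        return 0.0  # resistant growth outweighs sensitive kill from the start
    return math.log(ratio) / (kill_rate + growth_rate)


# Hypothetical patient: 30 mm lesion, 90% sensitive, rates per day.
nadir_day = time_of_nadir(f_sensitive=0.9, kill_rate=0.05, growth_rate=0.01)
nadir_size = tumor_size(nadir_day, v0=30.0, f_sensitive=0.9,
                        kill_rate=0.05, growth_rate=0.01)
print(f"nadir at day {nadir_day:.1f}, size {nadir_size:.1f} mm")
```

In this sketch, a "rapid initial response followed by fast relapse" would correspond to a large kill rate together with a large growth rate or a small sensitive fraction, while a durable response would correspond to a slow-growing or small resistant compartment. In a hierarchical Bayesian setting, patient-level values of these parameters would be drawn from subgroup-level distributions, which is what lets sparsely measured patients borrow strength from similar ones.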
