MOJITO: Multi-Output Joint Information-Theoretic Optimization for Scientific Discovery

  • Cabral, Manuel (TU Delft)
  • Font, Bernat (TU Delft)
  • Weymouth, Gabriel (TU Delft)

Dimensional analysis identifies invariant structure in physical systems but leaves substantial freedom in how that structure is represented. Physical equivalence alone does not guarantee learnability, and some non-dimensional representations fail to reveal the intrinsic manifold. Information-theoretic approaches exploit this freedom by maximizing mutual information to identify invariants with high predictive content. In multi-output problems, however, these methods can be unstable: small input perturbations may lead to large changes in the learned representations, masking dimensionality collapse. We introduce a model for learning optimal input-output representations in multi-output systems that addresses these limitations. The approach removes the need for human expert intervention across the pipeline, from dimensional group construction and coordinate selection to intrinsic dimensionality estimation, representation conditioning, and hyperparameter selection. Stability is achieved by coupling mutual information with a geometry-aware component based on Sobolev regularity. The method is evaluated on a canonical test suite designed to examine different aspects of the model, including multiple physical scales, redundant inputs leading to many admissible initial invariants, and systems with local versus global behavior, and on a real-world case where the appropriate non-dimensional groups are not known. The resulting representations support models that generalize more reliably, require lower capacity, and learn effectively from less data, contributing to a broader and more robust toolbox for scientific discovery.
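
As a point of reference for the "dimensional group construction" step mentioned above, the following is a minimal sketch of the classical Buckingham Pi construction, in which dimensionless groups are read off from the null space of a dimension matrix. It is not the MOJITO procedure, which the abstract does not specify. The pendulum example, the variable names, and the `nullspace` helper are all illustrative assumptions.

```python
from fractions import Fraction


def nullspace(matrix):
    """Return a rational basis for the null space of `matrix` (a list of rows).

    Hypothetical helper for illustration: Gauss-Jordan elimination in exact
    rational arithmetic, then one basis vector per free column.
    """
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []  # pivots[i] is the pivot column of reduced row i
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        # Clear column c in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for fc in free:
        v = [Fraction(0)] * n_cols
        v[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -rows[row_idx][fc]
        basis.append(v)
    return basis


# Assumed example: a simple pendulum with period T, length L, gravity g, mass m.
variables = ["T", "L", "g", "m"]
# Rows are the base dimensions (mass, length, time); columns follow `variables`.
dimension_matrix = [
    [0, 0, 0, 1],   # mass
    [0, 1, 1, 0],   # length
    [1, 0, -2, 0],  # time
]

# Each null-space vector holds the exponents of one dimensionless group.
groups = nullspace(dimension_matrix)
for exponents in groups:
    print(" * ".join(f"{name}^{e}" for name, e in zip(variables, exponents) if e != 0))
# Exponents are [2, -1, 1, 0], i.e. the single group T^2 g / L.
```

The basis returned here is only one admissible choice. Any power of this group, or in larger problems any invertible recombination of the basis vectors, is equally valid on dimensional grounds, which is the representational freedom the abstract refers to.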