|
Datasets in the physical sciences are often very challenging to learn from: (1) experimental data is expensive to acquire and therefore scarce; (2) the existing data is often clustered, since experiments tend to be repeated on related materials and systems; and (3) the data is often imbalanced, since only positive results tend to be reported in the literature. Frequently, the only datasets available mix experimental and computational data of very different levels of fidelity. We seek to address the challenge of learning from these datasets by making maximal use of the information available from known physics.
|