DNA-Encoded Libraries

Uncertainty quantification and applicability domain analysis for machine learning on DNA-encoded libraries (Google Accelerated Sciences)

As an AI resident at Google Accelerated Sciences (GAS), I worked on a team applying machine learning to drug discovery. We aimed to use molecular data from DNA-encoded libraries to predict protein-ligand binding for pharmaceutically relevant targets (a previous paper summarizing the work can be found here). My work focused on uncertainty quantification for these models: I developed methods to estimate the number of hits in a selected list of molecules and incorporated applicability domain modeling into our pipeline to improve overall performance.