Doctoral defence: Karl Marti Toots „Cheminformatics Approaches for Analyzing and Modeling the Gas-Ionic Liquid Distribution of Organic Solutes“

  • 05 Sep 2025
  • 10:15–12:00
  • Ravila 14a–1020
Doctoral defence

On 5 September 2025 at 10:15, Karl Marti Toots will defend his doctoral thesis „Cheminformatics Approaches for Analyzing and Modeling the Gas-Ionic Liquid Distribution of Organic Solutes“.

Supervisors:
Prof. Uko Maran, PhD, University of Tartu
Associate Prof. Sulev Sild, PhD, University of Tartu
Associate Prof. Jaan Leis, PhD, University of Tartu

Opponent:
Igor Tetko, PhD, Helmholtz Zentrum München, Molecular Targets and Therapeutics Center, Institute of Structural Biology (Germany)

Summary:
Applying artificial intelligence and machine learning (ML) within the framework of quantitative structure-property relationships makes it possible to study and understand the physicochemical properties of materials and substances. One such class of substances is ionic liquids (ILs): understanding and evaluating how organic solutes partition in them provides a basis for studying and developing these applied chemical environments. As potentially environmentally friendly alternatives to organic solvents with numerous applications, ILs are an important research subject. However, few systematic studies have been conducted on the structure-property relationships of ILs in terms of partitioning properties. The aim of the thesis was to investigate the relationships between gas-IL partition coefficients (log K) and the structures of the organic solutes and/or the ionic components of the IL using cheminformatics approaches. This involved using theoretical molecular descriptors and advanced ML methods to model the interaction mechanisms between solute and IL structures in a multicomponent system. Modeling and analysis of data sets corresponding to the structures of organic solutes, cations and anions showed that the Random Forest, Support Vector Regression and Gaussian Process Regression ML methods represent the solute-IL relationships encoded in molecular descriptors more effectively than conventional Multiple Linear Regression, although the latter is easier to interpret. Both linear and nonlinear models emphasize the critical influence of cation and anion composition on solute distribution. The results also show that modeling the entire solute-IL system by combining solute, cation and anion descriptors improves predictive power for large and chemically diverse data sets, which underlines the importance of multicomponent approaches.
The molecular features included in the derived models explain possible interactions based on dispersion forces, Coulomb-dipolar interactions and hydrogen bonding. In addition to mechanistic insights, the derived dependencies allow for the design of more selective and efficient IL environments for targeted industrial, environmental, and scientific applications.
