Mattson Thieme: About

Hi there, I’m Mattson.

I spend my time building at the intersection of AI and the physical sciences.

Currently, I’m working on a startup that’s modernizing chemical hazard assessment.

Stay tuned.

I received my PhD from Northwestern University in 2024, where I worked on machine learning for scientific discovery and split my time between projects in drug discovery and particle physics.

Dissertation: AI for Science: Graph Machine Learning as an Instrument for Understanding, Controlling, and Creating Physical Systems

Select projects in drug discovery, manufacturing, organic materials research, particle physics, medical imaging, quantitative finance, and social good are highlighted below. If any of these interest you as well, I’d love to chat. I’m easiest to reach on LinkedIn.

Select Projects

Graph Generation via Adaptation

Navigating local chemical space with competing objectives. Because the structure of a chemical compound dictates its properties, any structural change to a compound affects every one of its properties. When optimizing a compound for multiple properties at once (the rule, not the exception), it isn’t clear a priori which structural changes will yield a more favorable property profile. As a result, multi-property optimization requires a guess-and-check approach, typically guided by a chemist’s intuition. In this work, we take a first step towards automating this task by introducing a novel methodology for transforming molecular property predictors into molecular manipulators. In collaboration with AbbVie, we deployed this method on an active drug discovery project, yielding novel compounds that were physically synthesized and tested. [Dissertation Chapter]

Graph Partition Learning

We can have our graph and pool it too. In molecules and proteins, discrete substructures affect high-level properties and behavior in distinct ways. As such, when learning representations of these objects, explicitly locating and accounting for these substructures is a central problem. However, this poses a challenge where differentiability is concerned, and each of the learnable graph pooling methods proposed to date must make strong a priori assumptions regarding the number or size of the learned pools. In this work, we introduce the first differentiable, hierarchical graph pooling algorithm that learns an arbitrary number of variably sized pools without making any a priori assumptions about their number or size. Published in the NeurIPS 2023 Workshop on New Frontiers of AI for Drug Discovery and Development (AI4D3 2023). [Paper]

Graph Structure Learning

The demands of structure learning conflict with those of node embedding. To simultaneously learn interaction structures and node embeddings, these mechanisms should be distinct. During my internship at the MIT-IBM Watson AI Lab, I developed a novel method for learning task-informed graph structure that achieves state-of-the-art performance on benchmark graph datasets without relying on exogenous heuristics to train the structure learner. In preparation. [Poster]

Manufacturing and Physical Control

Adapting biomedical segmentation models for accelerator loss deblending. Building on our previous work, we introduce a novel adaptation of the popular UNet architecture (primarily used in biomedical imaging applications) to disentangle the relative contributions of neighboring accelerators to the ionizing beam losses observed during periods of joint operation. Published in NAPAC’22. [PDF]
Language models as PID controllers. While simple, effective, and well understood, PID controllers are linear and symmetric heuristic control systems with constant parameters, meaning they implicitly assume the response profile is invariant across all operating regions. In this work, we show that standard neural language models are able to learn to regulate non-linear systems (in our case, resonant extraction systems) and outperform PID controllers in all tested configurations. Published in NAPAC’22. [Paper]
Particle accelerator loss de-blending. The Fermilab Main Injector enclosure houses two accelerators, each of which produces ionizing beam losses during operation. To distinguish the losses originating in each accelerator, we built an ML model that disentangles each machine’s contribution. Published in IPAC’21. [Paper]
High frequency particle extraction. Once particles have been accelerated in Fermilab’s Delivery Ring, they must be extracted and sent to experiments. The uniformity of the extracted beam intensity will determine the ultimate sensitivity of the experiment. We developed a differentiable spill simulator that allowed us to train an ML system to automatically tune existing PID controller parameters and reduce extraction variability by 66%. Published in IPAC’21. [Paper]
Predicting plasma etch rate variability with inferred plasma composition. Plasma is used to etch patterns into chips and photomasks, but the etch rate depends on properties of the plasma that we cannot measure directly. The system I built learned a representation of this complex plasma state and improved etch rate forecasting accuracy by 32% over existing heuristic methods. See here for more info on the challenges and significance of maintaining uniform etch rates. [Source code is Intel IP]
High-dimensional machine-state visualization and intervention. The performance of modern manufacturing tools depends non-linearly on hundreds of internal parameters and exogenous factors. This system used advanced analytics to present operators with only the most relevant info and reduced downtime by 25%. [Source code is Intel IP]
Correcting astigmatism in scanning electron microscopes. Improperly calibrated electrostatic lenses can yield an e-beam with multiple focal planes and sub-optimal, elliptical beam profiles. The system I developed used CV techniques to quantify the degree of astigmatism and recommend tool adjustments to the operator. [Source code is Intel IP]

Optical and Electronic Properties of Organic Materials

Sub-nanosecond charge carrier dynamics in organic small-molecules. Organic semiconductors are interesting because they are lightweight, low cost, and solution processable. However, to operate effectively in solar cells, photoinduced charge carriers need to separate efficiently, a challenge in these small-molecule bulk-heterojunction materials. In this work, we show that a four-fold increase in ultrafast charge carrier separation occurs with certain fullerene acceptors, and that this effect scales with donor crystallinity. Published in Applied Physics Letters. [PDF]
Disentangling the influence of energy offsets from molecular packing. One advantage of small-molecule bulk-heterojunction semiconductor materials is that their photoluminescent and photoconductive properties are tunable. However, these properties are influenced by both the electronic and spatial qualities of the material, and the effects of, for example, LUMO offsets vs. molecular packing are not always clear. In this work, we show how LUMO offsets, physical donor/acceptor separation, and acceptor domain structure impact time-resolved photocurrent in the material. Published in The Journal of Physical Chemistry C. [Paper]

Quantitative Finance

Forecasting cryptocurrency price movements with graph neural networks. Because cryptocurrencies are routinely traded directly against one another, there exist complex relationships between their prices (in contrast to equities, where we cannot buy MSFT with AAPL). Using GNN methods, I built a model that learns a representation of these relationships and exploits them to make more informed trading decisions. All data was acquired and all trades were executed through the ccxt library. [Repo]
Deep RL for algorithmic cryptocurrency trading. Built and deployed a novel deep Q-learning architecture to trade cryptocurrencies. All data was acquired and trades executed through the ccxt library. [Repo]

Large-Scale ML Runtime Optimization

Accelerating memory-bound meta-learning models. ML models consume huge amounts of data. If retrieving that data from memory becomes a bottleneck, we say that the training time is memory-bound. In this paper, we use existing threading and memory placement libraries to realize a 100x speedup when training memory-bound meta-learning models. [White Paper]
Deployed the first CPU-optimized ML VMs on AWS and Azure. The speed with which we can train and deploy ML models depends on a complex software/hardware interface. In this work, we built orchestration software to ensure that ML frameworks deployed on the AWS and Azure clouds are built and configured optimally for the hardware they’re running on. [Blog]

High Performance Computing in Healthcare

Increasing the efficiency of biomedical image segmentation. Simple changes can go a long way at scale. In this white paper, we present a more computationally efficient upsampling scheme that dramatically improves training time. [White Paper]

Social Good

ML data pipelines to identify trafficked children. The first step in rescuing exploited children is recognizing that they are children: a straightforward, albeit difficult, classification problem. By scraping public data from Instagram, I built an image dataset that reduced our ML age-estimation model error by 3 years (from roughly ±5 years to roughly ±2 years). This allowed local and federal law enforcement to respond with greater confidence. See Thorn for more.

Deep Learning Seminars

Deep RL Seminar (2020, Northwestern University) [slides]
Graph Neural Networks Seminar (2020, Northwestern University) [slides]

Publications

* Equal contribution.

Thieme, M., Hassan, M., Rupakheti, C., Thiagarajan, K. B., Pandey, A., and Liu, H. TopoPool: An adaptive graph pooling layer for extracting molecular and protein substructures. In NeurIPS 2023 Workshop on New Frontiers of AI for Drug Discovery and Development (2023).
M. Thieme*, K.J. Hazelwood*, M.A. Ibrahim, H. Liu, S. Memik, V. P. Nagaslaev, A. Narayanan, G. Pradhan, K. Seiya, R. Shi, B.A. Schupbach, and N.V. Tran. “Semantic Regression for Disentangling Beam Losses in the Fermilab Main Injector and Recycler”. In Conference NAPAC’22.
A. Narayanan*, M. Thieme*, J. Jiang*, K.J. Hazelwood, M.A. Ibrahim, H. Liu, S. Memik, V. P. Nagaslaev, K. Seiya, R. Shi, B.A. Schupbach, and N.V. Tran. “Machine Learning for Slow Spill Regulation in Mu2e”. In Conference NAPAC’22.
K.J. Hazelwood*, M.A. Ibrahim, H. Liu, S. Memik, V. P. Nagaslaev, A. Narayanan, D.J. Nicklaus, P.S. Prieto, K. Seiya, R. Shi, B.A. Schupbach*, M. Thieme*, R.M. Thurman-Keup, and N.V. Tran. “Real-time Edge AI For Distributed Systems (READS): Progress On Beam Loss De-blending for the Fermilab Main Injector and Recycler.” In Conference IPAC’21.
A. Narayanan*, K.J. Hazelwood, M.A. Ibrahim, H. Liu, S. Memik, V. P. Nagaslaev, D.J. Nicklaus, P.S. Prieto, K. Seiya, R. Shi, B.A. Schupbach, M. Thieme*, R.M. Thurman-Keup, and N.V. Tran. “Optimizing Mu2e Spill Regulation System Algorithms”. In Conference IPAC’21.
K. Paudel, B. Johnson, M. Thieme, M. Haley, M. M. Payne, J. E. Anthony, and O. Ostroverkhova, “Enhanced charge photogeneration promoted by crystallinity in small-molecule donor-acceptor bulk heterojunctions” Applied Physics Letters 105, 043301 (2014).
K. Paudel, B. Johnson, M. Thieme, J. Anthony, O. Ostroverkhova, “Charge carrier dynamics in small-molecule and polymer-based donor-acceptor blends” MRS Proceedings, v. 1733, DOI: https://dx.doi.org/10.1557/opl.2014.956 (2014).
K. Paudel, B. Johnson, A. Neunzert, M. Thieme, B. Purushothaman, M. M. Payne, J. E. Anthony, and O. Ostroverkhova, “Small-Molecule Bulk Heterojunctions: Distinguishing Between Effects of Energy Offsets and Molecular Packing on Optoelectronic Properties” Journal of Physical Chemistry C 117, 24752-24760 (2013).
K. Paudel, B. Johnson, A. Neunzert, M. Thieme, J. Anthony, O. Ostroverkhova “Effects of energy offsets and molecular packing on exciton and charge carrier dynamics in small-molecule donor-acceptor composites” Proc. of SPIE, v. 8827, 88270Q, 2013.

Industrial White Papers

Thieme, Mattson, et al. “Intel Optimized Data Science Virtual Machine on Microsoft Azure*.” Intel AI, 8 Mar. 2019, Link to article.
Thieme, M., et al. (2018) Training Deep Convolutional Neural Networks with Horovod on Intel High Performance Computing Architecture. [White paper] Intel AI: Link to pdf.
Thieme, M., et al. (2018) Accelerating Memory-Bound Machine Learning models on Intel Xeon Processors. [White paper] Intel AI: Link to pdf.
Thieme, Mattson, et al. “Amazon Web Services Works with Intel to Enable Optimized Deep Learning Frameworks on Amazon* EC2 CPU Instances.” Intel AI, 26 Apr. 2018, Link to article.
Thieme, Mattson, and Anthony Reina. “Biomedical Image Segmentation with U-Net.” Intel AI, 23 Jan. 2018, Link to article.
