News
- (Sept 2025) "Emergence and Evolution of Interpretable Concepts in Diffusion Models" was accepted as a Spotlight paper at NeurIPS 2025!
- (Jun 2025) I gave a talk at the 2nd Workshop on Visual Concepts at CVPR 2025 on our new work "Emergence and Evolution of Interpretable Concepts in Diffusion Models Through the Lens of Sparse Autoencoders". Our paper was selected for the Best Paper Honorable Mention award!
- (May 2025) Proud to be recognized as a "top reviewer" for ICML 2025 (and for NeurIPS 2024 last year).
- (May 2025) Returning to Amazon in their Seattle office as an Applied Science Intern for the summer of 2025!
- (Apr 2025) Passed my qualifying exam! I am now a PhD candidate!
- (Sep 2024) Visiting the Simons Institute for the semester as part of the "Modern Paradigms in Generalization" and "Special Year on Large Language Models and Transformers" long programs!
- (Aug 2024) Wrapped up my internship at Amazon.
- (May 2024) DiracDiffusion (Poster) and Adapt-and-Diffuse (Spotlight) were accepted to ICML 2024!
- (Jan 2024) Will be joining Amazon in the San Diego office as an Applied Science Intern for the summer of 2024!
- (Dec 2022) Obtained my M.Sc. degree in EE!
- (Apr 2022) Will be attending the Princeton ML Theory Summer School organized by Boris Hanin this June. Excited to visit the beautiful campus of Princeton University and the IAS!
- (Dec 2021) Passed the SIPI screening exam (ranked 1st in the department)!
- (May 2021) Will be attending CIFAR's Deep Learning + Reinforcement Learning (DLRL) and MLSS summer schools.
- (Apr 2020) Website is live! Received offers from UCLA, USC, and UBC. Very excited to join USC for my PhD studies next fall.
Research
I am currently passionate about post-training strategies to improve visual reasoning in Multimodal Large Language Models (MLLMs), with a focus on enhancing how models process and reason over complex visual data. A central goal of my work is to address limitations such as the lack of mental visualization, which we revealed in our recent benchmark Hyperphantasia (NeurIPS 2025).
Previously, my research spanned mechanistic interpretability, inverse problems, large language models, and learning theory, approaching machine learning problems from both practical and theoretical perspectives. Using sparse autoencoders (SAEs), I studied how human-interpretable concepts emerge and evolve across reverse diffusion steps in diffusion models (NeurIPS Spotlight 2025). I developed diffusion-based methods for inverse problems—including denoising and deblurring—in our works Adapt-and-Diffuse (ICML Spotlight 2024) and DiracDiffusion (ICML 2024). I also worked on a transformer–convolution hybrid architecture achieving state-of-the-art performance on the fastMRI dataset: HUMUS-Net (NeurIPS 2022). On the theory side, I analyzed the gradient descent dynamics of learning linear target functions with shallow ReLU networks (in submission).
I also have extensive experience working with large language models (LLMs), including fine-tuning, continual pretraining, post-training, prompting, and evaluation. During my 2024 internship at Amazon, I worked on knowledge injection into LLMs through continual pretraining with DoRA adapters and retrieval-augmented generation. In my 2025 Amazon internship, I developed a novel approach for extracting use-case–adaptive embeddings from LLM hidden states for downstream applications. I have additionally explored improving mathematical reasoning in LLMs via self-feedback and self-revision loops, without relying on external verifiers.
Selected papers are shown below.
Emergence and Evolution of Interpretable Concepts in Diffusion Models
Berk Tinaz*,
Zalan Fabian*,
Mahdi Soltanolkotabi
(* denotes equal contribution)
NeurIPS (Spotlight), 2025
CVPR VisCon (Spotlight + Best Paper Honorable Mention) and GMCV Workshops, 2025
GitHub / Paper Link
Using sparse autoencoders (SAEs), we investigate how human-interpretable concepts evolve in diffusion models through the generative process. Furthermore, we demonstrate that these concepts can be manipulated to steer image generation.
Hyperphantasia: A Benchmark for Evaluating the Mental Visualization Capabilities of Multimodal LLMs
Mohammad Shahab Sepehri,
Berk Tinaz,
Zalan Fabian,
Mahdi Soltanolkotabi
NeurIPS, 2025
GitHub / Paper Link
We introduce Hyperphantasia, a benchmark to evaluate the mental visualization capabilities of multimodal large language models (MLLMs). We demonstrate a substantial gap between the performance of humans and state-of-the-art MLLMs.
Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models
Zalan Fabian*,
Berk Tinaz*,
Mahdi Soltanolkotabi
(* denotes equal contribution)
ICML (Spotlight), 2024
NeurIPS Deep Inverse Workshop, 2023
GitHub / Paper Link
Latent diffusion-based reconstruction of degraded images that estimates the severity of degradation and initiates reverse diffusion sampling accordingly, achieving sample-adaptive inference times.
DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency
Zalan Fabian,
Berk Tinaz,
Mahdi Soltanolkotabi
ICML, 2024
GitHub / Paper Link
A novel framework for solving inverse problems that maintains consistency with the original measurement throughout the reverse process, offering great flexibility in trading off perceptual quality for improved distortion metrics and sampling speedup via early stopping.
HUMUS-Net: Hybrid Unrolled Multi-scale Network Architecture for Accelerated MRI Reconstruction
Zalan Fabian,
Berk Tinaz,
Mahdi Soltanolkotabi
NeurIPS, 2022
GitHub / Paper Link
A hybrid architecture that combines the implicit bias and efficiency of convolutions with the power of Transformer blocks in an unrolled, multi-scale network, establishing state-of-the-art performance on the fastMRI dataset.