|
Syamantak Kumar
I am a third-year PhD student in computer science at UT Austin, advised by Prof. Purnamrita Sarkar and Prof. Kevin Tian. Broadly, my research interests lie at the intersection of statistics and machine learning.
I am interested in designing machine learning algorithms with provable guarantees of convergence and correctness. I am also interested in quantifying the computational and statistical aspects of state-of-the-art deep learning methods.
I deeply enjoy studying the applications of high-dimensional statistics, optimization, and probability theory to real-world problems.
Previously, I was an undergraduate student in the Computer Science and Engineering Department at IIT Bombay.
At IIT Bombay, I worked with Prof. Suyash Awate on Medical Imaging and with Prof. Preethi Jyothi on NLP for Code-switching. Earlier, in 2018, I worked with Prof. Thomas Deserno on Medical Informatics and Imaging.
Email  / 
CV  / 
Github  / 
Google Scholar  / 
LinkedIn
|
|
|
Updates
- [September 2025] Selected for the Amazon AI PhD Fellowship!
- [Summer 2024] Worked with Dheeraj on score matching for diffusion models as a research intern at Google DeepMind, Bengaluru.
- [Sept 2023] Our paper "Streaming PCA for Markovian Data" accepted as a Spotlight (Top ~3%) at NeurIPS 2023.
- [Aug 2023] Started my CS PhD at UT Austin.
- [Aug 2021] Started my CS MS at UT Austin.
- [Jan 2021] Our paper "From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text" accepted at ACL-IJCNLP 2021.
- [July 2020] Joined Google, Bangalore as a software engineer on the Google Maps team.
- [Jun 2019] Worked at Google, Bangalore as a software engineering intern.
- [Feb 2019] Our paper "A comparison of open source libraries ready for 3D reconstruction of wounds" accepted at SPIE Medical Imaging 2019.
|
|
Low-Precision Streaming PCA
Syamantak Kumar, Shourya Pandey, Purnamrita Sarkar
NeurIPS, 2025
Paper link
We study Oja's algorithm for streaming PCA under linear and nonlinear stochastic quantization and show that a batched version achieves the lower bound up to logarithmic factors under both schemes.
|
|
Dimension-free Score Matching and Time Bootstrapping for Diffusion Models
Syamantak Kumar, Dheeraj Nagaraj, Purnamrita Sarkar
NeurIPS, 2025
Paper link
We establish the first (nearly) dimension-free sample complexity bounds for learning the score functions of diffusion models, achieving a double-exponential improvement in dimension over prior results. Building on these insights, we propose Bootstrapped Score Matching (BSM), a variance-reduction technique that uses previously learned scores to improve accuracy at higher noise levels.
|
|
Private Geometric Median in Nearly-Linear Time
Syamantak Kumar, Daogao Liu, Kevin Tian, Chutong Yang
NeurIPS, 2025
Paper link
We give a nearly-linear time differentially private algorithm for computing the geometric median.
|
|
Spike-and-Slab Posterior Sampling in High Dimensions
Syamantak Kumar, Purnamrita Sarkar, Kevin Tian, Yusong Zhu
COLT, 2025
Paper link
We give the first provable algorithms for spike-and-slab posterior sampling that apply at any SNR and use a measurement count sublinear in the problem dimension.
|
|
Beyond sin-squared error: linear time entrywise uncertainty quantification for streaming PCA
Syamantak Kumar, Shourya Pandey, Purnamrita Sarkar
UAI, 2025
Paper link
We propose a novel statistical inference framework for streaming principal component analysis (PCA) using Oja's algorithm, enabling the construction of confidence intervals for individual entries of the estimated eigenvector.
|
|
Oja's Algorithm for Streaming Sparse PCA
Syamantak Kumar, Purnamrita Sarkar
NeurIPS, 2024
Paper link
We analyze Oja's algorithm with thresholding for streaming sparse PCA, achieving near-optimal rates of convergence.
|
|
Nonparametric Evaluation of Noisy ICA Solutions
Syamantak Kumar, Purnamrita Sarkar, Peter Bickel, Derek Bean
NeurIPS, 2024
Paper link
We develop a nonparametric score to adaptively pick the right algorithm for Independent Component Analysis (ICA) with arbitrary Gaussian noise.
|
|
Black-Box k-to-1-PCA Reductions: Theory and Applications
Arun Jambulapati, Syamantak Kumar, Jerry Li, Shourya Pandey, Ankit Pensia, Kevin Tian
COLT, 2024
Paper link
We theoretically analyze deflation-based algorithms for k-PCA using 1-PCA oracles.
|
|
Streaming PCA for Markovian Data
Syamantak Kumar, Purnamrita Sarkar
NeurIPS, 2023 (Spotlight)
Paper link
We propose a nearly-optimal streaming algorithm for performing PCA under Markovian dependence among the samples.
|
|
From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text
Ishan Tarunesh, Syamantak Kumar, Preethi Jyothi
ACL-IJCNLP, 2021
Paper link
In this work, we adapt a state-of-the-art neural machine translation model to generate Hindi-English code-switched sentences starting from monolingual Hindi sentences.
|
|
A comparison of open source libraries ready for 3D reconstruction of wounds
Syamantak Kumar*, Dhruv Jaglan*, Nagarajan Ganapathy, Thomas Deserno
SPIE Medical Imaging Conference, 2019
Paper link
We propose an Android application for real-time three-dimensional scanning of surface wounds to aid effective remote diagnosis, and present a comparative study of open-source libraries available for this task.
|
*: Equal contribution
For a complete list of projects, please refer to my CV.
|
Generative Model for User Contributions
I worked at Google, Bangalore with the Maps team on a model for predicting the correctness of user-submitted edits on Maps. This model is used to ensure that the data displayed on Google Maps meets its claimed accuracy.
|
|
Adversarial Examples for Keyword Spotting
code  / 
report
Used Generative Adversarial Networks (GANs) to generate adversarial examples for keyword spotting systems, improving their robustness.
|
|
Tournament Ranking given pairwise preferences
code  / 
report
Designed an algorithm for fully-sequential sampling in a Probably-Approximately-Correct (PAC) setting to determine the top-K players in a tournament, given pairwise preferences.
|
|
Chinese Checkers AI
code
Implemented an AI for playing Chinese Checkers in Racket, using the Minimax algorithm with alpha-beta pruning.
|
Undergraduate Teaching Assistant, PH107, Quantum Physics and its Applications, Fall 2017
|
Graduate Teaching Assistant, CS361S, Network Security and Privacy, Fall 2021
|
Graduate Teaching Assistant, EE 461P, Data Science Principles, Spring 2022
|
|
I stole this awesome template from Jon Barron. Thanks a lot for sharing!
|
|