My failure log
2020-10~
- studied GAN
- started David Silver’s lectures, the Sutton & Barto book, and PRML in a study club. Younghoon is such a brilliant guy
- multi-agent snake game
- celery, dockerized agent, redis, mongodb
- ranking system (trueskill), belief network
- came up with the idea of using DFS to initialize the snakes’ positions (see the sketch after this list)
- imitation learning, implemented GAIL, we built custom expert trajectories, partial observation
- fixed discriminator collapsing
- felt happy when it worked
- tried to abstract RL and implement an RL library, RL2
- tried to observe emergent behavior from multi-agent RL, game theory
- studied QMix, COMA
- deployed alpha version of snake battle, ~10 people participated, Azure
- worked with intern on decision tree + deep model; fell into “rabbit hole” of fractal math
- helped an outside collaborator dockerize an R + Python environment for genomics
- implemented VAIL
- another rabbit hole into instrumental variable, graphical model, GLS (Generalized Least Squares)
- a little interest in accelerating RL: GPU simulators, FPGA
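Roughly, the DFS initialization idea above looked like this (a minimal sketch from memory, not the original code; the grid size, snake length, and function names are made up):

```python
# Minimal sketch: place each snake's body with DFS/backtracking so that segments
# stay on the grid, stay connected, and never overlap another snake.
import random

def place_snakes(width, height, n_snakes, length, seed=0):
    rng = random.Random(seed)
    occupied = set()

    def dfs(body):
        if len(body) == length:
            return list(body)
        x, y = body[-1]
        moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
        rng.shuffle(moves)
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in occupied and nxt not in body):
                body.append(nxt)
                done = dfs(body)
                if done:
                    return done
                body.pop()  # backtrack and try another direction
        return None

    snakes = []
    cells = [(x, y) for x in range(width) for y in range(height)]
    rng.shuffle(cells)
    for _ in range(n_snakes):
        for head in cells:
            if head in occupied:
                continue
            body = dfs([head])
            if body:
                occupied.update(body)
                snakes.append(body)
                break
        # if no valid placement exists for this snake, it is silently skipped
    return snakes

print(place_snakes(10, 10, n_snakes=4, length=5))
```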
2021-09~
- started working with Wontae on Rescaled Representation Space for Robust RL (r2d2); overfitting and catastrophic forgetting
- rabbit hole of smoothness (Lipschitz), functional analysis of DQN, algorithmic error
- gained new perspective on RL from value improvement path, value polytope
- pondering about weight dynamics of DQN, regularization
- AI for music, ISMIR. thought about EGG, what is emotion? it should be more complex than a discrete variable on the valence/arousal plane. met Keunwoo Choi there! another rabbit hole into the definition of emotion.
- scratched the surface of signal processing: librosa, mel spectrograms, etc.
- rabbit hole into complex-valued NN
- weight’s magnitude, direction (angle) of the representation in RL
- interpretability! distill.pub, Christopher Olah. he is a genius. later he co-founded Anthropic, and I think the company will pioneer AI
- started compiler optimization
- a little interest in continual learning
- rabbit hole into group theory and manifolds, from the NeurIPS 2021 tutorial
- spent the entire night of 12/31 running experiments for r2d2, no sleep
- observed hopeful experimental results, failed to justify the work and failed to submit to ICML
- server maintenance
- graph convolutional RL for MARL
- traffic routing, Flatland
- a little interest in optimizing the replay buffer in C++, pre-allocation
- started using Ray/RLlib
- started studying GNNs, the Laplacian operator, Chebyshev convolution
- had real fun with Eunki and Yongsun: stable matching, Gale-Shapley, multi-agent RL
- MARL on various envs, l2rpn, magent
- kept studying GNNs, using program graphs, CompilerGym
- MCMC, Jeremy Kun
- adopted language-encoding methods for program sequences
- made RLlib work with graphs, really hacky, dynamic observation space
- the adjacency matrix consumes huge memory → multi-GPU training (sparse matrices have their own cons, a trade-off; see the sketch after this list)
- isotropy in graph
- made RL2 work on CompilerGym. Wait, I kept using RL2 until now..?
- rabbit hole into sheaf theory, oversquashing in graph, graph rewiring, strong collapse, discrete ricci flow, TDA, persistent homology. Michael Bronstein, Aleksa Gordić
- failed to submit to ISCA compiler competition
- sped up jpeg-d by 13.6% with an RL+GNN approach
- gave feedback on their product, and the MakinaRocks CEO gave me a 100,000 KRW Starbucks card!
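The adjacency-matrix memory issue above, in rough numbers (just an illustration; the node and edge counts are made up): a dense N×N float32 adjacency grows as O(N²), while a sparse CSR matrix only stores the nonzeros, at the cost of irregular access patterns.

```python
# Rough illustration of the dense-vs-sparse adjacency trade-off.
import numpy as np
import scipy.sparse as sp

n_nodes, n_edges = 50_000, 200_000
rng = np.random.default_rng(0)
rows = rng.integers(0, n_nodes, n_edges)
cols = rng.integers(0, n_nodes, n_edges)

# Dense float32 adjacency: n_nodes * n_nodes * 4 bytes ~= 10 GB, too big to even allocate here.
dense_bytes = n_nodes * n_nodes * 4
print(f"dense adjacency: {dense_bytes / 1e9:.1f} GB")

# Sparse CSR stores only the nonzeros (data + indices + indptr).
adj = sp.csr_matrix((np.ones(n_edges, dtype=np.float32), (rows, cols)),
                    shape=(n_nodes, n_nodes))
sparse_bytes = adj.data.nbytes + adj.indices.nbytes + adj.indptr.nbytes
print(f"sparse adjacency: {sparse_bytes / 1e6:.1f} MB")
```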
2022-05~
→ sometimes life is hard; things got a little toxic
- servers kept crashing: fix, fix, fix…
- stable matching
- contributed to a FAIR repository! (the penguin)
- interest in sparse modeling
- suddenly I started gravitational wave classification…?
- thought about window function of fourier transform
- suddenly I started audio sentiment analysis, trying to distill semantic features from teacher (kogpt) to student (audio model). scraped 700,000 lyrics from Melon
- probabilistic model using pyro
- metrics in gnn, Weisfeiler-Lehman, Gromov-Wasserstein, isomorphism
- diffusion model, langevin dynamics
- after a few hours of brainstorming, Sanmun and Chaejin from KAIST asked me to collaborate
- reduced training time from 60 hours to 10 hours within a few days of joining! weird file I/O and the GPU wasn’t utilized; who coded this…
- kept trying to find topics that I could delve into, but kept diverging…
- started blockchain analysis with Jungyoon
- re-implemented everything in RLlib for JLAB
- interest in optimal transport and NTK
- gathered anomalous-transaction labels from Etherscan
- Neo4j to deal with gigantic graph, dumped the chain and loaded into DB
- studied dynamic graphs (Ethereum transactions), also referred to Uber’s dynamic graph work
- how to sample from a huge graph so that the samples represent the parent population well? (see the sketch after this list)
- skill-based RL on MuJoCo
- studied state clusters during skill-based RL training
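For the graph-sampling question above, one common option is random-walk-based sampling. A minimal sketch with networkx on a synthetic stand-in graph (not the actual Neo4j pipeline); note that random walks over-represent high-degree nodes, so it is worth comparing basic statistics of the sample against the full graph:

```python
# Minimal sketch of random-walk sampling from a large graph (one of several options).
import random
import networkx as nx

def random_walk_sample(G, n_target, walk_len=10, seed=0):
    """Grow a node sample by repeated short random walks from random seed nodes."""
    rng = random.Random(seed)
    nodes = list(G.nodes)
    sampled = set()
    while len(sampled) < n_target:
        v = rng.choice(nodes)
        for _ in range(walk_len):
            sampled.add(v)
            nbrs = list(G.neighbors(v))
            if not nbrs:
                break
            v = rng.choice(nbrs)
    return G.subgraph(sampled).copy()

G = nx.barabasi_albert_graph(100_000, 3, seed=0)   # stand-in for the real transaction graph
S = random_walk_sample(G, n_target=5_000)
# Sanity check: compare a cheap statistic (mean degree) of the sample vs. the full graph.
print(sum(d for _, d in G.degree()) / G.number_of_nodes(),
      sum(d for _, d in S.degree()) / S.number_of_nodes())
```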
2023-05~
- made physics-informed RL work for optical device design! a KAIST professor invited me to lunch and offered me co-first authorship
- rabbit hole into complex-valued autoencoders, hyperspherical representations, tried to derive a von Mises latent, ELBO
- conjugate graph (line graph)
- studying Dreamer V3, world-model
- tried to implement Dreamer V3 in RLlib, failed
- testing typical decision tree, logistic regression on blockchain anomaly detection
- Muesli, MPO
- adapt Dreamer V1 into compiler optimization
- rabbit hole into equivariant CNN
- writing physics-informed RL paper to publish at physics journal, Chaejin is such a hard-working, smart student
- eigenvalue of graph Laplacian, Fiedler eigenvalue
- started working on superoptimization with RL. MCTS, muzero, efficientzero, reproduce AlphaDev with STOKE, x86
- studying natural language processing to handle assembly language
→ Leave of absence for a month due to an anxiety attack
2024~
- physics journal paper got accepted!
- started working on neural operator
- wrote manuscript for superoptimization
- my co-worker in superoptimization project left
- made Dreamer V3 work on material design problem, that was quick!
- made FNO work on electric field prediction, that was quick!
- got productive; I hadn’t known how important getting rest is
- wrote a manuscript on scientific ML and submitted it to NeurIPS; I doubted the paper would be accepted and insisted the work should change direction, but failed to convince my manager
2024-06~
- thinking about control problem in PDE
- thinking about model collapse when training on recursively generated data; should humanity have downloaded a snapshot of the internet at some point? OpenAI probably already did
- received quite good scores from NeurIPS but got rejected, as expected
- working on the extension of physics journal paper
- a little bit of RAG
- started an AI compiler study group and led the deep-model compiler team; Triton, CUDA, but implementing kernels doesn’t really interest me…
- diffusion model (Flux.1-dev) fine-tuning in study club
- optimizing Flux.1-dev on tinygrad in study club
- diffusion for control in PDE, also offline RL
- re-submitted the SciML paper to ICLR, rejected
- my co-author left the team
- went to NeurIPS and met Will Dabney and Marc G. Bellemare, I want to work with them so bad
- met great people including Hyungwon Chung and Kyunghyun Cho and could listen to their stories; later I found Hyungwon’s “Don’t teach. Incentivize.” seminar. it gave me a new perspective.
- some progress on the extension of physics journal paper, thanks to the literature Will Dabney shared
- found interesting work: SINDy-RL, from Steve Brunton’s group; very nice literature
- tried to use a fluid dynamics (CFD) simulator, truly torturous, but finally made it work
- re-visiting weight dynamics, plasticity loss in continual RL. Nikishin, Clare Lyle and Sutton
- re-submitted the SciML paper to ICML, rejected, of course
- CoT fine-tuning DeepSeek-R1
- can’t we track the change in weight magnitude when normalizing the weights in RL? similar to the idea from r2d2 3 years ago → nice work on this, adding a new axis to the weight, came out of KAIST
- can’t we regularize the weights with orthogonality to keep the “effective rank”? (see the sketch after this list) → found the exact same idea published recently
- started working on semiconductor manufacturing data
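The orthogonality idea above, as a minimal PyTorch sketch (my own illustration, not the recently published work; the penalty coefficient and layer selection are placeholders): add a soft penalty ||WᵀW − I||²_F on each weight matrix to the task loss so the singular values stay spread out and the effective rank does not collapse.

```python
# Minimal sketch: soft orthogonality penalty on 2-D weights to discourage rank collapse.
import torch
import torch.nn as nn

def orthogonality_penalty(model: nn.Module) -> torch.Tensor:
    """Sum of ||W^T W - I||_F^2 over all 2-D weight matrices in the model."""
    terms = []
    for w in model.parameters():
        if w.ndim != 2:                       # skip biases (and conv kernels, in this sketch)
            continue
        # Use the smaller Gram matrix so the identity target has size min(out, in).
        gram = w.T @ w if w.shape[0] >= w.shape[1] else w @ w.T
        eye = torch.eye(gram.shape[0], device=w.device, dtype=w.dtype)
        terms.append(((gram - eye) ** 2).sum())
    return torch.stack(terms).sum() if terms else torch.tensor(0.0)

net = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 4))
opt = torch.optim.Adam(net.parameters(), lr=3e-4)

x, y = torch.randn(32, 64), torch.randn(32, 4)
loss = nn.functional.mse_loss(net(x), y) + 1e-4 * orthogonality_penalty(net)  # coefficient is a placeholder
opt.zero_grad()
loss.backward()
opt.step()
```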
2025-04
- I left the team