I'm a PhD candidate in Applied Mathematics, with a research focus on mechanistic interpretability techniques, specifically for improving the safety of transformer-based AI systems. I'm also interested in symmetries in the loss landscapes of neural networks and their applications to understanding the internals of those networks. I've previously worked on projects in game theory, computational neuroscience, and computer vision. For more on my research, see here.
Outside of research, I like hiking, rock climbing, and fossil hunting. I am a big fan of old sci-fi novels, particularly the works of Arthur C. Clarke. I also have a number of open-source projects, most of which are tooling for either ML research or personal knowledge management systems.
Contact me: mivanits [at] mines [dot] edu
I'll be at NeurIPS 2023! Please feel free to email or message me if you'd like to meet. I'll be presenting work on mechanistic interpretability for transformers trained on offline RL at the UniReps workshop poster session on Friday, December 15th @ 3:15pm. Poster Link.
LinkedIn | GitHub | Google Scholar | ORCID