I am a graduate student in Computer Science at the University of Alberta, supervised by Prof. Osmar Zaiane. My research focuses on multimodal generative models, including LLMs, VLMs, and diffusion models, with an emphasis on 3D spatial reasoning. My current work includes scaling 3D spatial reasoning in multimodal image generation with test-time scaling and multimodal chain-of-thought (CoT), applying post-training methods (SFT & RL fine-tuning), and curating vision-language datasets.
- Spatial & Visual Reasoning with LLMs/VLMs/MLLMs
- Vision-Language Understanding & Embodied Spatial Reasoning
- 3D Representations, Grounding, & Space Understanding
- Building Vision-Language Datasets for Embodied Multi-Agent Systems
- Visual and Geometry Retrieval Systems
- M.Sc. in CS, University of Alberta (Present)
- Ph.D. in ECE, University of Alberta (Transferred to CS)
- M.Sc. & B.Sc. in ME, Sharif University of Technology & University of Tehran

