Hi there! My name is Alessandro, and I'm an AI research engineer.
ShallowMind is my workspace for training and experimenting with language models.
The name is playful, but the goal is straightforward: to build increasingly capable models while exploring new ideas in pretraining and reasoning.
Information-theoretic pretraining
Looking at ways to identify and prioritize the most informative tokens during pretraining, to see whether doing so changes what current scaling laws predict. (Work in progress; I'll share results once experiments are further along.)
Reasoning models
Testing approaches that improve step-by-step and compositional reasoning.
Architectural variations
Extending my training pipeline to support Mixture-of-Experts (MoE) and other non-standard components.