Welcome to @naskovai’s blog

From Single-Pass RAG to Self-Editing Search Agents: Designing and Training Agentic Search

The Setup

You have a corpus, a search API, and users asking questions that no single search query can answer in one shot. Your job is to design a system that learns to search well: not just to retrieve, but to plan what to search, evaluate what came back, decide whether to search again, and manage what stays in context.

The term “multi-hop search” covers two different skills:

Type 1: questions whose constraints are bundled into the question text, sometimes explicitly tagged, sometimes encoded obliquely. “Find the EMNLP paper between 2018 and 2023 where the first author did their undergrad at Dartmouth and the fourth at UPenn.” Here the constraints are explicit and tagged with field names; the agent has to parse them, issue searches for each, and combine the results. Or in a harder form: “A sacred structure in a western European capital was designed in a style combining two ancient architectural traditions, selected through a competitive process initiated in the late 1860s. The community for whom this building was constructed gained official state recognition during the early 1830s. On what date was this building formally inaugurated?” This is the same skill (the constraints are all in the question), but the constraints are encoded obliquely: the agent has to decode “competitive process initiated in the late 1860s” into something searchable. ...

June 17, 2026 8798 words 42 min

Large ID Embedding Tables in Recommendation Systems

Everything that breaks when you put a 450-million-row lookup table at the center of your model, and how the industry fixes it: cardinality, the one-epoch problem, transfer and enrichment, and distributed serving.

March 29, 2026 3649 words 18 min

Generative Recsys in Production: Three Lessons from Shopify's Commerce Engine

What Shopify's production generative recommender reveals about building on HSTU: time encoding for seasonality, negative sampling as the primary scaling lever, and training for incremental recall within an ensemble.

March 27, 2026 4697 words 23 min

Negative Sampling for Embedding-Based Retrieval: An Overview

How production retrieval systems learn to rank a billion items, tracing the evolution of negative sampling from random batches through hard mining, bias correction, and ANCE.

March 26, 2026 4366 words 21 min

Generative Recommendations: A Mechanistic Guide

A mechanistic deep dive into how generative recommender systems work: from Semantic IDs and RQ-VAE to HSTU, M-FALCON, and production deployment at Meta, Kuaishou, and beyond.

March 25, 2026 15746 words 74 min

From RMSProp to AdamW: The Optimizer Evolution Story

Tracing the evolution of modern neural network optimizers through the lens of what each was designed to fix: gradient scale heterogeneity, mini-batch noise, and regularization interference.

August 25, 2025 2644 words 13 min

The Sandwich Framework for Understanding Linear Algebra

Coordinate Translations, Scaling, and State Transitions: A unified approach to linear algebra decompositions