Swiggy has launched Hermes V3, a generative AI-powered text-to-SQL assistant that enables employees to query data using simple English. Running within Slack, Hermes integrates vector retrieval, conversational memory, agent orchestration, and an explanation layer to accurately translate natural language inputs into executable SQL queries.
Swiggy, an Indian online food ordering and delivery platform, initially introduced Hermes as a lightweight interface that let staff ask basic questions and receive corresponding SQL queries for execution against internal data stores. Early versions struggled to derive complex metrics, lacked contextual awareness across conversations, produced inconsistent results for similar prompts, and offered no clear mechanism to validate the generated SQL. To address these shortcomings, the engineering team rebuilt the system around large language models (LLMs), combining few-shot learning, metadata retrieval, and structured workflows.
Previous overall architecture of Hermes (Source: Swiggy Tech Blog)
In its third major iteration, Hermes introduces a vector-based prompt retrieval system built from historical SQL queries executed in Snowflake. Since most production-level queries lack descriptive metadata, the team leveraged large-context language models to convert raw SQL statements into natural language explanations—effectively reconstructing missing query intents. These synthesized prompts are indexed for vector similarity search and injected as few-shot examples, enabling Hermes to ground new requests in established analytical patterns and significantly improve SQL generation accuracy.
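The blog post does not publish Hermes' code, but the general shape of such a pipeline can be sketched in a few lines of Python. In the sketch below, `describe_sql` and `embed` are placeholder stubs standing in for Swiggy's (undisclosed) LLM and embedding calls, and all other names are illustrative:

```python
# Illustrative sketch of prompt reconstruction and few-shot retrieval.
# describe_sql and embed are placeholders for LLM/embedding service calls.
import numpy as np

def describe_sql(sql: str) -> str:
    """Placeholder: LLM call that summarizes a SQL query as a natural-language prompt."""
    raise NotImplementedError

def embed(text: str) -> np.ndarray:
    """Placeholder: embedding call returning a unit-normalized vector."""
    raise NotImplementedError

def build_index(historical_queries: list[str]) -> list[tuple[str, str, np.ndarray]]:
    """Reconstruct a prompt for each historical query and index it by its embedding."""
    index = []
    for sql in historical_queries:
        prompt = describe_sql(sql)               # synthesize the missing query intent
        index.append((prompt, sql, embed(prompt)))
    return index

def retrieve_examples(question: str, index, k: int = 3) -> list[tuple[str, str]]:
    """Return the k most similar (prompt, sql) pairs for few-shot injection."""
    q = embed(question)
    scored = sorted(index, key=lambda item: float(np.dot(item[2], q)), reverse=True)
    return [(prompt, sql) for prompt, sql, _ in scored[:k]]

def few_shot_block(examples: list[tuple[str, str]]) -> str:
    """Format retrieved pairs as few-shot examples for the SQL-generation prompt."""
    return "\n\n".join(f"-- Question: {p}\n{s}" for p, s in examples)
```

In production, the dot-product loop would be replaced by a vector database query, but the flow—describe, embed, retrieve, inject—remains the same.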
As highlighted by Meghana Negi and Rutvik Reddy, engineers at Swiggy:
Hermes now leverages a curated database of previously executed queries and their associated prompts, retrieves relevant examples using vector similarity, and maintains conversational context—improving SQL generation accuracy from 54% to 93% while supporting seamless multi-turn interactions.
Hermes V3 workflow (Source: Swiggy Tech Blog)
Hermes V3 also incorporates persistent conversation memory, allowing multi-round queries to reference prior exchanges without redefining context. User interactions feel more intuitive, as the system tracks session states and evolves simple metric requests into composite queries. An orchestration agent implements a ReAct-style reasoning loop, breaking down complex inquiries into discrete tasks within a repeatable workflow: intent parsing, completeness validation, metadata lookup, example retrieval, intermediate logic construction, SQL generation, and optional clarification requests.
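The stage names below follow the workflow Swiggy describes, but the blog post does not detail the implementation; every helper in this condensed sketch (`parse_intent`, `lookup_metadata`, and so on) is hypothetical:

```python
# Condensed sketch of the ReAct-style agent loop with persistent session memory.
# All helper functions are hypothetical stand-ins for Hermes' internal steps.
from dataclasses import dataclass, field

@dataclass
class Session:
    """Persistent conversation memory carried across Slack turns."""
    history: list[dict] = field(default_factory=list)

def handle_turn(question: str, session: Session, index) -> dict:
    intent = parse_intent(question, session.history)          # intent parsing with prior context
    if not is_complete(intent):                                # completeness validation
        return {"clarification": clarifying_question(intent)}  # optional clarification request
    metadata = lookup_metadata(intent)                         # schema/table/column lookup
    examples = retrieve_examples(question, index, k=3)         # few-shot example retrieval
    plan = build_intermediate_logic(intent, metadata)          # intermediate logic construction
    sql = generate_sql(plan, metadata, examples)               # final SQL generation
    session.history.append({"question": question, "sql": sql}) # evolve context for later turns
    return {"sql": sql}
```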
Structured intelligence for query generation agent flow (Source: Swiggy Tech Blog)
A key enhancement in V3 is the addition of an interpretability layer that exposes underlying assumptions behind generated SQL and assigns confidence scores. This transparency helps non-technical stakeholders understand how queries are constructed, fostering greater trust in machine-generated insights.
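The exact schema of that explanation layer is not published; a response surfaced in Slack might plausibly look like the following, where the field names, table name, and score are purely illustrative:

```python
# Hypothetical shape of the explanation payload returned alongside a generated query.
response = {
    "sql": (
        "SELECT city, COUNT(*) AS orders "
        "FROM fact_orders "                      # illustrative table name
        "WHERE order_date >= CURRENT_DATE - 7 "
        "GROUP BY city"
    ),
    "assumptions": [
        "'orders' refers to delivered orders in fact_orders",
        "'last week' interpreted as the trailing 7 days",
    ],
    "confidence": 0.82,  # model-assigned score shown to the requester
}
```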
The Hermes V3 framework is tightly integrated with Swiggy’s security, compliance, and metadata management infrastructure. Role-based access control, single sign-on, ephemeral responses, and audit logging ensure sensitive data access adheres to internal governance policies. A hybrid metadata retrieval strategy efficiently fetches schema, table, and column details—keeping token usage within LLM service limits while maintaining high performance.
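Swiggy does not describe the metadata-trimming logic in detail; a minimal sketch of fitting retrieved schema details to a token budget, assuming a rough four-characters-per-token heuristic, might look like this:

```python
# Minimal sketch: serialize table/column metadata for the prompt while staying
# inside a token budget. The heuristic and structure are assumptions, not Swiggy's code.
def fit_metadata_to_budget(tables: list[dict], max_tokens: int = 2000) -> str:
    """Emit one line per table, stopping before the estimated token budget is exceeded."""
    lines, used = [], 0
    for table in tables:                          # tables pre-ranked by relevance to the question
        entry = f"{table['name']}: " + ", ".join(c["name"] for c in table["columns"])
        cost = len(entry) // 4 + 1                # crude tokens-from-characters estimate
        if used + cost > max_tokens:
            break
        lines.append(entry)
        used += cost
    return "\n".join(lines)
```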
Hermes' architecture relies on a combination of open-source and cloud-native technologies. The retrieval component uses vector databases and embedding models; orchestration logic employs tools like LangChain to manage structured prompting workflows; and observability frameworks provide traceability and monitoring across layers. Systems such as Snowflake for analytics, PostgreSQL or comparable transactional databases, and API gateways form integral parts of the broader ecosystem supporting Hermes’ capabilities.
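As an illustration of how such pieces fit together, a structured prompt can be assembled with LangChain's PromptTemplate (assuming a recent release where it is exposed from langchain_core.prompts); the schema snippet, example, and question below are invented for the sketch:

```python
# Example of wiring trimmed metadata and retrieved few-shot examples into a
# structured text-to-SQL prompt. Values are illustrative, not Swiggy's prompts.
from langchain_core.prompts import PromptTemplate

sql_prompt = PromptTemplate.from_template(
    "You are a text-to-SQL assistant for Snowflake.\n"
    "Relevant schema:\n{metadata}\n\n"
    "Similar past queries:\n{examples}\n\n"
    "Question: {question}\nSQL:"
)

prompt_text = sql_prompt.format(
    metadata="fact_orders: order_id, city, order_date, status",        # trimmed schema details
    examples="-- Question: delivered orders by city, last 7 days\nSELECT city, COUNT(*) ...",
    question="How many orders were delivered in Bengaluru last week?",
)
```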