Who is Yogendra Raghuvanshi?

Yogendra Raghuvanshi is an AI and data transformation leader and program manager based in Indore, India, with more than 13 years of experience delivering enterprise AI, analytics, and data platforms. He leads programs spanning Generative AI, SQLMesh pipelines, StarRocks benchmarking, Python automation, Power BI analytics, and responsible AI governance, with proven impact at Modern Data, Capgemini Invent, and GlobalLogic.

What technical skills does Yogendra Raghuvanshi have?

Yogendra Raghuvanshi specializes in ACOS Optimization, AI Agents, Amazon Marketplace, Apache Spark, Bitbucket, CI/CD Concepts, Data Benchmarking, Data Engineering, Data Quality, Databricks, Decision Intelligence, Digital Transformation, Digital Twins, Documentation, Enterprise AI, Enterprise Analytics, ERD tooling, ETL Pipelines, GCP, GenAI, and related enterprise data and AI technologies.

How can I contact Yogendra Raghuvanshi?

You can contact Yogendra Raghuvanshi via email at yogendra.raghuvanshi31@gmail.com, phone at +91-8130647994, or through LinkedIn at https://www.linkedin.com/in/yogendraraghuvanshi/.

StarRocks Benchmarking Framework: Architecture, Stack & Delivery

Introduction

In this article I break down how I designed and delivered the StarRocks Benchmarking Framework, from the original business pain point through architecture, technology choices, implementation phases, and lessons learned. This is the same project featured in my portfolio's Built Solutions section, documented here in full technical depth for engineers, architects, and hiring managers who want to understand how the work was actually done.

I led this initiative as part of my broader program delivery work across enterprise AI, data platforms, and analytics transformation. The approach reflects how I operate: start with the business outcome, choose the minimum viable architecture, instrument everything, and iterate with real users.

Business problem

There was no objective, TB-scale comparison across cloud environments to inform warehouse selection.

The solution was a set of custom StarRocks FE+CN benchmarks on GCP and OCI, provisioned with Terraform and load-tested with JMeter.

Architecture decisions

The following design choices shaped the reliability, performance, and maintainability of the solution:

Identical data volume loaded on both clouds for fair comparison
JMeter thread pools mirror concurrent analyst patterns
Results archived for audit and vendor discussions
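
To illustrate the first design choice, here is a minimal sketch of how a seeded generator can produce the same skewed data for both clouds. This is not the project's actual data generator: the function name `generate_keys`, its parameters, and the Zipf-like weighting are assumptions made for illustration only.

```python
import random
from collections import Counter


def generate_keys(n_rows: int, n_keys: int, skew: float, seed: int) -> list[int]:
    """Return n_rows key ids drawn from a Zipf-like distribution.

    skew=0 gives a uniform distribution; larger values concentrate
    rows on the lowest-numbered keys. (Hypothetical helper.)
    """
    rng = random.Random(seed)  # fixed seed, so every run yields the same rows
    weights = [1.0 / (rank ** skew) for rank in range(1, n_keys + 1)]
    return rng.choices(range(n_keys), weights=weights, k=n_rows)


# The same arguments produce the same dataset, whichever cloud it is loaded on.
gcp_keys = generate_keys(10_000, 100, skew=1.2, seed=42)
oci_keys = generate_keys(10_000, 100, skew=1.2, seed=42)
hottest_key = Counter(gcp_keys).most_common(1)[0][0]
```

Because the seed and skew are explicit inputs, they can be recorded alongside the archived results so that anyone reviewing the benchmark can regenerate the same dataset.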

Technology stack in depth

This project was built with StarRocks, Terraform, JMeter, GCP, and OCI. Each technology was selected for a specific role in the architecture, not because it was trendy but because it solved a measured bottleneck. Each component has documented integration patterns and operational runbooks.

StarRocks: the analytical engine under test, deployed as FE+CN clusters on both clouds
Terraform: provisions networking, compute, storage, and StarRocks cluster topology from reusable modules
JMeter: drives TB-scale load tests with thread pools that mirror concurrent analyst patterns
GCP: one of the two cloud environments, hosting a StarRocks deployment loaded with the benchmark dataset
OCI: the second cloud environment, loaded with the same data volume for a fair comparison

Implementation timeline

Delivery followed phased milestones with explicit deliverables at each gate. This kept stakeholders aligned and made progress auditable for program reviews.

Harness design (2 weeks): Representative query mix from production workloads.
→ Query catalog
→ Data generator
→ Methodology doc
Infra automation (3 weeks): Terraform modules for StarRocks FE+CN on GCP and OCI.
→ IaC modules
→ Sizing matrix
→ Cost model
JMeter execution (2 weeks): Load tests at TB scale with captured metrics and reports.
→ JMeter plans
→ Results dashboard
→ Executive summary

Reproducible benchmark methodology

Vendor claims rarely survive production query mixes. We built a harness using real workload patterns: concurrent analyst queries, batch loads, and aggregate-heavy reporting. Identical TB-scale datasets were loaded on StarRocks FE+CN deployments in GCP and OCI.

Query catalog derived from production slow-query logs (anonymized)
Terraform modules for FE+CN sizing matrix across both clouds
JMeter thread pools mirroring peak concurrent analyst sessions
Results archived with methodology doc for audit and vendor discussions
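
As a rough sketch of how a query mix like the one described above can be made reproducible, the snippet below draws a seeded, weighted schedule of workload classes. The class names and weights are hypothetical. The real catalog was derived from anonymized slow-query logs and is not reproduced here, and the actual load was driven by JMeter rather than Python.

```python
import random
from collections import Counter

# Hypothetical workload classes and weights, for illustration only.
QUERY_MIX = {
    "analyst_adhoc": 0.6,     # concurrent analyst queries
    "aggregate_report": 0.3,  # aggregate-heavy reporting
    "batch_load": 0.1,        # batch loads
}


def build_schedule(n_requests: int, seed: int) -> list[str]:
    """Return a reproducible sequence of workload classes to execute."""
    rng = random.Random(seed)  # same seed, same schedule on every run
    classes = list(QUERY_MIX)
    weights = list(QUERY_MIX.values())
    return rng.choices(classes, weights=weights, k=n_requests)


schedule = build_schedule(1_000, seed=7)
counts = Counter(schedule)  # how often each class appears in this schedule
```

Fixing the seed means both cloud deployments can be sent the same sequence of query classes, which keeps the comparison fair and lets a reviewer replay it later.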

Infrastructure automation

Terraform provisions networking, compute, storage, and StarRocks cluster topology. A cost model accompanies each sizing tier so finance can compare cloud TCO alongside performance metrics.

IaC modules: VPC, compute pools, object storage connectors
Data generator produces TB-scale datasets with configurable skew
Executive summary dashboard: p50/p95 latency, cost per query, load throughput
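
The dashboard metrics listed above can be derived from raw run data along these lines. This is a minimal sketch under stated assumptions: the nearest-rank percentile method, the function names, and the sample latencies and run cost are all illustrative rather than taken from the project.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(latencies_ms: list[float], run_cost_usd: float) -> dict[str, float]:
    """Collapse one benchmark run into dashboard-style metrics (hypothetical shape)."""
    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "cost_per_query_usd": run_cost_usd / len(latencies_ms),
    }


# Made-up sample: ten query latencies in milliseconds and a made-up run cost.
sample = [120, 95, 310, 150, 88, 2040, 133, 101, 97, 110]
summary = summarize(sample, run_cost_usd=2.50)
# summary == {"p50_ms": 110, "p95_ms": 2040, "cost_per_query_usd": 0.25}
```

The single slow query dominates p95 while barely moving p50, which is one reason to report both percentiles rather than an average.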

Business outcomes

The framework enabled evidence-based platform decisions for high-volume analytics estates.

Success was measured against adoption, latency and throughput targets, and stakeholder feedback, not just deployment dates. Program reviews tracked these KPIs alongside technical milestones.

Lessons learned

Publish reproducible harnesses so results survive vendor and team scrutiny.

If I were starting again, I would invest even earlier in observability and golden test sets. The cost of retrofitting guardrails after pilot launch always exceeds building them in from day one.

