Personal AI Research Initiative

Industrial Mind
& Code

Can AI agents operationalise the fundamentals of industrial engineering? I'm finding out — one micro-experiment at a time.

Researcher Siddharth Srinivasan
Domain Industrial Engineering × AI and beyond

Why this work exists

AI and LLMs have fundamentally transformed how we approach problem-solving. I bring years of hands-on shop-floor experience in Indian manufacturing, and my current work centres on managing hyperscaler relationships from a techno-strategic standpoint. I wanted to explore how these two worlds intersect and how AI can add value to the foundational concepts of Industrial Engineering. This program is how I explore that question.

The core thesis: how can agentic AI operationalise IE theory? Explored through micro-experiments.

Each experiment takes a foundational IE concept (a supply chain dynamic, a maintenance framework, an inventory model) and turns it into a controlled simulation environment where LLM agents make decisions. This goes beyond model comparisons or benchmarks: it places LLMs into full-blown Industrial Engineering environments. Results are broken down, analysed, and shared with fellow AI researchers and domain peers.

All experiments are personal, use personal compute locally or in the cloud, and involve entirely fictional scenarios.

Experiment index

Agentic Bullwhip Effect — Version 1 Published

SUPPLY CHAIN · 2×2 FACTORIAL · GPT-4.1-MINI vs O1

All four configurations amplified demand variability. Context reduced amplification for the lightweight model — and increased it for the reasoning model. The most capable configuration produced a pattern that classical bullwhip theory would not predict.

  • OVAR exceeded 1.0 at every tier in every configuration — no configuration dampened variability
  • The context effect reversed sign depending on the model tier it was applied to — improvement for lightweight, degradation for reasoning
  • context_reasoning produced a fully inverted cascade: OEM was the noisiest node, component the quietest — the opposite of what the classical model predicts
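The bullets above lean on OVAR, here read as the order variance amplification ratio: the variance of a tier's outgoing orders divided by the variance of its incoming demand, where values above 1.0 mean the tier amplifies variability. A minimal sketch of that metric, assuming this definition — the function name and the illustrative series are mine, not the experiment's code:

```python
import statistics

def ovar(orders, demand):
    """Order variance amplification ratio: Var(orders) / Var(demand).
    Values above 1.0 mean the tier amplifies demand variability."""
    return statistics.variance(orders) / statistics.variance(demand)

# Illustrative numbers only: end-customer demand and the orders one
# tier placed upstream in response.
demand = [100, 104, 98, 110, 95, 102, 108, 97]
tier_orders = [100, 108, 92, 120, 85, 105, 115, 90]

print(round(ovar(tier_orders, demand), 2))  # > 1.0 here: amplification
```

Under this reading, the classical cascade predicts OVAR rising tier by tier away from the customer, which is why an inverted cascade (noisiest at the OEM, quietest at the component supplier) is the surprising result.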

Agentic Bullwhip Effect — Version 2 In Progress

SUPPLY CHAIN · 4 EXPERIMENTS · GPT-4.1-MINI · O4-MINI · PHI-4-REASONING-PLUS

V1 compared AI configurations against each other. V2 asks a harder question: do AI agents beat simple heuristics at all — and if so, which configuration gets closest and why?

  • Ordering fully unconstrained — V1 guardrails removed to observe natural agent behaviour
  • Three heuristic baselines set the bar: exponential smoothing (OVAR 0.54), Order-Up-To (OVAR 1.71), naive passthrough (OVAR 1.0)
  • Four sub-experiments: lightweight, reasoning, synthesis, and open-source vs proprietary reasoning (Phi-4-reasoning-plus vs o4-mini)
  • 25-month demand series spanning two festive cycles · 20 runs per condition
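The three baselines are standard textbook ordering policies. A minimal sketch of each, assuming their usual definitions — the smoothing constant, base-stock target, and demand series are illustrative choices, not V2's calibrated parameters:

```python
def naive_passthrough(demand_history):
    """Order exactly the most recent demand (OVAR = 1.0 by construction)."""
    return demand_history[-1]

def exponential_smoothing(demand_history, alpha=0.3):
    """Order the exponentially smoothed demand forecast.
    Smoothing damps fluctuations, pulling OVAR below 1.0."""
    forecast = demand_history[0]
    for d in demand_history[1:]:
        forecast = alpha * d + (1 - alpha) * forecast
    return forecast

def order_up_to(inventory_position, target=120):
    """Order-Up-To: replenish the inventory position to a fixed
    base-stock level. Fully chasing each period's depletion tends to
    amplify variability, pushing OVAR above 1.0."""
    return max(0, target - inventory_position)

demand = [100, 112, 95, 130, 88]       # illustrative demand history
print(naive_passthrough(demand))        # → 88, the most recent demand
print(round(exponential_smoothing(demand), 1))
```

These policies bracket the design space — a damping baseline, an amplifying baseline, and a neutral one — so any agent configuration can be placed relative to all three.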

Total Productive Maintenance (TPM) Agent — Press Brake Maintenance Records In Progress

TPM / PREDICTIVE MAINTENANCE · FRAGMENTED RECORDS · VERNACULAR NORMALISATION

Tests whether AI agents can support TPM workflows when reasoning over fragmented, realistic maintenance records, including a condition that simulates a vernacular input normalisation layer upstream.

How experiments are designed

01

Analytical control baselines

Every experiment pairs LLM agent performance against a non-LLM analytical benchmark — not just model-vs-model. Deviation from theory is the signal.

02

Controlled simulation environments

Synthetic but calibrated parameters derived from public literature. All scenarios are entirely fictional with no proprietary data involved.

03

Multi-model comparison

Experiments compare across model tiers and reasoning architectures, with 50–100 replications per cell to support statistical inference.

Stack

Cloud
Azure AI Foundry · Azure OpenAI Service
Local
ASUS Ascent GX10 · Ollama
Code
Claude Code · Codex

Where I write

Experiment writeups and methodology notes are published on the blog. The first post, Agentic Bullwhip Effect (Version 1), is live. Code and data for each experiment are on GitHub.