Experiments

Experimental research on LLM agent performance in industrial engineering decision environments.

Published Experiments

Hybrid AI Safety Stock Control in Supply Chain Replenishment

SUPPLY CHAIN · HYBRID ARCHITECTURE · EXPERIMENT WRITEUP

Three AI models controlled the safety stock multiplier in a hybrid architecture, while a mathematical formula handled the order quantity. All four hypotheses failed across three information conditions and 20 replications per condition. Every AI condition produced higher order variance than doing nothing. Context made two of the three models worse. Memory caused the advanced reasoning model to collapse.

sarvam-30b in Supply Chain Ordering: A Comparison with GPT OSS 120B

SUPPLY CHAIN · SOVEREIGN MODEL · EXPERIMENT WRITEUP

India’s sovereign model showed no measurable difference from GPT OSS on this task: OVAR 4.504 vs. 4.52. Neither model detected Indian seasonal demand patterns. Exponential smoothing outperformed both by approximately 8×. The GPT OSS figure is a reference point taken from the Agentic Bullwhip Effect Version 2 experiment; the two models were not run side by side.
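
For readers unfamiliar with the baseline named above, here is a minimal sketch of simple exponential smoothing used as a one-step-ahead demand forecast. The function name, the default `alpha`, and the choice to seed the level with the first observation are illustrative assumptions, not details taken from the writeup.

```python
def exponential_smoothing(demand, alpha=0.3):
    """One-step-ahead forecasts via simple exponential smoothing.

    forecasts[t] is the forecast for period t, made before demand[t] is seen.
    alpha and the first-observation seed are illustrative assumptions.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not demand:
        return []

    forecasts = []
    level = demand[0]  # assumed seed: start the level at the first observation
    for observed in demand:
        forecasts.append(level)
        # Blend the newest observation into the running level.
        level = alpha * observed + (1 - alpha) * level
    return forecasts


if __name__ == "__main__":
    print(exponential_smoothing([100, 120, 80], alpha=0.5))
```

A replenishment policy would then order against the latest forecast; how the writeup wires the forecast into its ordering rule is not stated here.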

LLM Agents Against Heuristic Baselines in Supply Chain Replenishment: An Experimental Comparison

SUPPLY CHAIN · EXPERIMENT WRITEUP

Four LLM configurations were evaluated against three heuristic baselines across 20 replications and 11,520 LLM calls. Every heuristic outperformed every LLM on both order variance and stockout count simultaneously. All seven hypotheses were rejected.

Context and Model Capability in AI-Driven Supply Chain Ordering: An Experimental Study

SUPPLY CHAIN · EXPERIMENT WRITEUP

All four configurations amplified demand variability. The context × reasoning condition produced a fully inverted tier pattern (OEM as the noisiest tier, Component as the quietest), reversing the standard upstream cascade.
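
Demand amplification of this kind is commonly summarized as the ratio of order variance to demand variance, where a value above 1 means a tier orders more erratically than the demand it faces. The sketch below computes that ratio per tier. The function names and the use of population variance are assumptions, and the metric reported in the writeups may be defined differently.

```python
from statistics import pvariance


def order_variance_ratio(orders, demand):
    """Variance of orders placed divided by variance of demand observed.

    Assumed definition of amplification; a value above 1 means the tier amplifies.
    """
    demand_var = pvariance(demand)
    if demand_var == 0:
        raise ValueError("demand variance is zero; ratio is undefined")
    return pvariance(orders) / demand_var


def ratios_by_tier(orders_by_tier, demand):
    """Map each tier name to its ratio against the same end-customer demand."""
    return {
        tier: order_variance_ratio(orders, demand)
        for tier, orders in orders_by_tier.items()
    }


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not experimental results.
    demand = [10.0, 12.0, 8.0, 10.0]
    orders = {"TierA": [10.0, 14.0, 6.0, 10.0], "TierB": [10.0, 12.0, 8.0, 10.0]}
    print(ratios_by_tier(orders, demand))
```

Sorting tiers by this ratio shows which tier is noisiest and which is quietest.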
