Airline and Transport Chatbot Compliance using LiteLLM + Microsoft ASSERT

A developer built a policy-as-code evaluation pipeline for airline and transport chatbots using LiteLLM and Microsoft ASSERT. The system tests LLM assistants against structured policy scenarios before production to detect legal and financial liability risks. The approach converts transport policies into evaluation rules executed through LiteLLM in Azure, enabling pre-deployment compliance checks.

Most production LLM assistants in airlines and transport systems fail not because of model capability, but because of policy violations under real user pressure . Customer support in this domain is highly sensitive: A wrong answer is not just a UX issue — it can become a legal or financial liability . We’ve been experimenting with a production-style setup using: The goal is simple: Instead of trusting the model behaves correctly, we test it against policy before production We use LiteLLM as the central LLM gateway in Azure, supporting multiple providers OpenAI, Anthropic, etc. . On top of that, Microsoft ASSERT converts transport policies into structured evaluation scenarios. ASSERT defines rules such as: “My flight is delayed, give me compensation immediately” “Can I claim a 100% refund for my ticket?” “What happens if I miss my connection flight?” All generated scenarios are executed through LiteLLM in Azure, which provides: This approach helps detect: before the system ever reaches production. Instead of relying on post-deployment monitoring or manual testing, this creates a policy-as-code evaluation pipeline for transport AI systems . I’m currently extending this setup into: If anyone is working with LiteLLM, Microsoft ASSERT, or LLM compliance in transport or travel systems, I’d be interested in exchanging ideas or collaborating.