Cut AI API spending with prompt caching, model tiering, Batch API, semantic caching, and more. Includes 2026 pricing tables and Python code you can deploy today.