For any working system, whether AI or non-AI, operating costs play a significant role throughout the product lifecycle. In the case of AI systems, these costs are calculated by estimating future usage and concurrency. Usually, these cost estimates determine the pricing that end users are expected to pay. But the critical problem arises when these estimates turn out to be way off from the actuals. In that case, operating costs rise significantly, hurting margins, and the business suffers overall.
When such things happen, people assume it’s because of the expensive model, which is not true. These systems become expensive because their architectures multiply costs.