
Cloud-native AI Engineering Practice: Accelerating LLM Inference with FasterTransformer

This article shows how to use FasterTransformer to accelerate LLM inference on Alibaba Cloud Container Service for Kubernetes (ACK), using the Bloom7B1 model as an example.