3 comments

  • androiddrew 0 minutes ago
    Now all we need is better support for AMD GPUs, both CDNA and RDNA types
  • kingstnap 42 minutes ago
    Impressive performance work. It's interesting that you can still see 40+% perf gains like this.

    Makes you think that the cost of a fixed level of "intelligence" will keep dropping.

  • danielhanchen 40 minutes ago
    Love vLLM!