
InferX AI Function Platform (Lambda Functions for Inference)

    --   Serve tens of models in one box with ultra-fast (<2 sec) cold start (contact: support@inferx.net)



Models

| Tenant | Namespace | Model Name | GPU Count | VRAM per GPU (GB) | CPU Cores | Memory (GB) | Standby: GPU | Standby: Pageable | Standby: Pinned | State | Snapshot Nodes | Revision |
|--------|-----------|------------|-----------|-------------------|-----------|-------------|--------------|-------------------|-----------------|-------|----------------|----------|
| public | Qwen | Qwen2.5-1.5B | 1 | 8.0 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node2'] | 146 |
| public | Qwen | Qwen2.5-7B-Instruct-1M | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node2'] | 156 |
| public | Qwen | Qwen2.5-7B-Instruct-GPTQ-Int8 | 1 | 14.2 | 20.0 | 30.0 | Blob | Blob | Blob | Normal | ['node2'] | 144 |
| public | Qwen | Qwen2.5-Coder-1.5B-Instruct | 1 | 6.0 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node2'] | 1208 |
| public | Qwen | Qwen2.5-Coder-14B-Instruct-GPTQ-Int8 | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node2'] | 142 |
| public | Qwen | Qwen2.5-Coder-3B | 1 | 10.0 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node2'] | 137 |
| public | Qwen | Qwen2.5-Coder-7B-Instruct | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node2'] | 140 |
| public | Qwen | Qwen2.5-Math-1.5B | 1 | 8.0 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node2'] | 150 |
| public | Qwen | Qwen2.5-Math-1.5B-Instruct | 1 | 8.0 | 12.0 | 20.0 | Blob | Blob | Blob | Normal | ['node2'] | 825 |
| public | Qwen | Qwen2.5-Math-7B | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node2'] | 154 |
| public | Qwen | Qwen2.5-Math-7B-Instruct | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node2'] | 841 |

Summary

| Metric | Value |
|--------|-------|
| Model Count | 11 |
| Required GPU Count | 16 |
| Required VRAM (GB) | 192.2 |
| Required CPU Cores | 180.0 |
| Required Memory (GB) | 372.0 |
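The summary figures follow directly from the model table: required VRAM is the per-GPU VRAM multiplied by each model's GPU count, while CPU cores and memory are summed per model. A minimal sketch that recomputes them (plain Python; the rows are transcribed from the table above):

```python
# Each tuple: (model_name, gpu_count, vram_per_gpu_gb, cpu_cores, memory_gb)
models = [
    ("Qwen2.5-1.5B",                         1,  8.0, 12.0, 18.0),
    ("Qwen2.5-7B-Instruct-1M",               2, 13.8, 20.0, 50.0),
    ("Qwen2.5-7B-Instruct-GPTQ-Int8",        1, 14.2, 20.0, 30.0),
    ("Qwen2.5-Coder-1.5B-Instruct",          1,  6.0, 12.0, 18.0),
    ("Qwen2.5-Coder-14B-Instruct-GPTQ-Int8", 2, 13.8, 20.0, 50.0),
    ("Qwen2.5-Coder-3B",                     1, 10.0, 12.0, 18.0),
    ("Qwen2.5-Coder-7B-Instruct",            2, 13.8, 20.0, 50.0),
    ("Qwen2.5-Math-1.5B",                    1,  8.0, 12.0, 18.0),
    ("Qwen2.5-Math-1.5B-Instruct",           1,  8.0, 12.0, 20.0),
    ("Qwen2.5-Math-7B",                      2, 13.8, 20.0, 50.0),
    ("Qwen2.5-Math-7B-Instruct",             2, 13.8, 20.0, 50.0),
]

model_count = len(models)                              # 11
gpu_count   = sum(g for _, g, *_ in models)            # 16
total_vram  = sum(g * v for _, g, v, _, _ in models)   # VRAM counted once per GPU
total_cpu   = sum(c for *_, c, _ in models)            # 180.0
total_mem   = sum(m for *_, m in models)               # 372.0

print(model_count, gpu_count, round(total_vram, 1), total_cpu, total_mem)
```

Note that total VRAM (192.2 GB) only matches the table if the VRAM column is read as per-GPU and multiplied by the GPU count; a straight sum of the column gives 123.2 GB.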