
InferX AI Function Platform (Lambda Function for Inference)

    --   Serve tens of models on one box with ultra-fast (<2 sec) cold starts (contact: support@inferx.net)



Models

| tenant | namespace | model name | gpu count | vram per gpu (GB) | cpu cores | memory (GB) | standby (gpu / pageable / pinned) | state | snapshot nodes | revision |
|--------|-----------|------------|-----------|-------------------|-----------|-------------|-----------------------------------|-------|----------------|----------|
| public | Qwen | Qwen2.5-1.5B | 1 | 8.0 | 12.0 | 18.0 | Blob / Blob / Blob | Normal | ['node3'] | 128 |
| public | Qwen | Qwen2.5-7B-Instruct-1M | 2 | 13.8 | 20.0 | 50.0 | Blob / Blob / Blob | Normal | ['node3'] | 138 |
| public | Qwen | Qwen2.5-7B-Instruct-GPTQ-Int8 | 1 | 13.5 | 20.0 | 30.0 | Blob / Blob / Blob | Normal | ['node3'] | 126 |
| public | Qwen | Qwen2.5-Coder-1.5B-Instruct | 1 | 7.5 | 12.0 | 18.0 | Blob / Blob / Blob | Normal | ['node3'] | 121 |
| public | Qwen | Qwen2.5-Coder-14B-Instruct-GPTQ-Int8 | 2 | 13.8 | 20.0 | 50.0 | Blob / Blob / Blob | Normal | ['node3'] | 124 |
| public | Qwen | Qwen2.5-Coder-3B | 1 | 13.8 | 12.0 | 18.0 | Blob / Blob / Blob | Normal | ['node3'] | 119 |
| public | Qwen | Qwen2.5-Coder-7B-Instruct | 2 | 13.8 | 20.0 | 50.0 | Blob / Blob / Blob | Normal | ['node3'] | 117 |
| public | Qwen | Qwen2.5-Math-1.5B | 1 | 8.0 | 12.0 | 18.0 | Blob / Blob / Blob | Normal | ['node3'] | 132 |
| public | Qwen | Qwen2.5-Math-1.5B-Instruct | 1 | 7.0 | 12.0 | 18.0 | Blob / Blob / Blob | Normal | ['node3'] | 130 |
| public | Qwen | Qwen2.5-Math-7B | 2 | 13.8 | 20.0 | 50.0 | Blob / Blob / Blob | Normal | ['node3'] | 136 |
| public | Qwen | Qwen2.5-Math-7B-Instruct | 2 | 13.8 | 20.0 | 50.0 | Blob / Blob / Blob | Normal | ['node3'] | 134 |

Summary

| Model Count | GPU Count | VRAM (GB) | CPU Cores | Memory (GB) |
|-------------|-----------|-----------|-----------|-------------|
| 11 | 16 | 195.8 | 180.0 | 370.0 |
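The summary totals follow directly from the model table: GPU count, CPU cores, and memory are straight column sums, while the VRAM total is per-GPU VRAM multiplied by each model's GPU count. A minimal sketch reproducing the arithmetic (model names in comments are taken from the table above):

```python
# Recompute the Summary row from the model table.
# Each tuple: (gpu_count, vram_per_gpu_gb, cpu_cores, memory_gb)
models = [
    (1, 8.0, 12.0, 18.0),   # Qwen2.5-1.5B
    (2, 13.8, 20.0, 50.0),  # Qwen2.5-7B-Instruct-1M
    (1, 13.5, 20.0, 30.0),  # Qwen2.5-7B-Instruct-GPTQ-Int8
    (1, 7.5, 12.0, 18.0),   # Qwen2.5-Coder-1.5B-Instruct
    (2, 13.8, 20.0, 50.0),  # Qwen2.5-Coder-14B-Instruct-GPTQ-Int8
    (1, 13.8, 12.0, 18.0),  # Qwen2.5-Coder-3B
    (2, 13.8, 20.0, 50.0),  # Qwen2.5-Coder-7B-Instruct
    (1, 8.0, 12.0, 18.0),   # Qwen2.5-Math-1.5B
    (1, 7.0, 12.0, 18.0),   # Qwen2.5-Math-1.5B-Instruct
    (2, 13.8, 20.0, 50.0),  # Qwen2.5-Math-7B
    (2, 13.8, 20.0, 50.0),  # Qwen2.5-Math-7B-Instruct
]

model_count = len(models)                                  # 11
gpu_total = sum(g for g, _, _, _ in models)                # 16
# VRAM is listed per GPU, so weight it by the GPU count.
vram_total = round(sum(g * v for g, v, _, _ in models), 1) # 195.8
cpu_total = sum(c for _, _, c, _ in models)                # 180.0
mem_total = sum(m for _, _, _, m in models)                # 370.0

print(model_count, gpu_total, vram_total, cpu_total, mem_total)
```

Summing per-GPU VRAM weighted by GPU count is why the total (195.8 GB) exceeds the plain column sum of the VRAM values.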