InferX AI Function Platform (Lambda Function for Inference)

    --   Serve tens of models on one box with ultra-fast (<2 sec) cold start (contact: support@inferx.net)



Models

| tenant | namespace | model name | gpu count | vram (GB) | cpu | memory (GB) | standby gpu | standby pageable | standby pinned | state | snapshot nodes | revision |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| public | BAAI | Aquila-7B | 2 | 10.0 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 426 |
| public | Deci | DeciLM-7B | 2 | 13.0 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 227 |
| public | Deci | DeciLM-7B-instruct | 2 | 13.0 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 230 |
| public | EleutherAI | pythia-12b | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 243 |
| public | OpenAssistant | oasst-sft-4-pythia-12b-epoch-3.5 | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 254 |
| public | Qwen | Qwen2.5-1.5B | 1 | 8.0 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node3'] | 128 |
| public | Qwen | Qwen2.5-7B-Instruct-1M | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 138 |
| public | Qwen | Qwen2.5-7B-Instruct-GPTQ-Int8 | 1 | 13.5 | 20.0 | 30.0 | Blob | Blob | Blob | Normal | ['node3'] | 126 |
| public | Qwen | Qwen2.5-Coder-1.5B-Instruct | 1 | 7.5 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node3'] | 121 |
| public | Qwen | Qwen2.5-Coder-14B-Instruct-GPTQ-Int8 | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 124 |
| public | Qwen | Qwen2.5-Coder-3B | 1 | 13.8 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node3'] | 119 |
| public | Qwen | Qwen2.5-Coder-7B-Instruct | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 117 |
| public | Qwen | Qwen2.5-Math-1.5B | 1 | 8.0 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node3'] | 132 |
| public | Qwen | Qwen2.5-Math-1.5B-Instruct | 1 | 7.0 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node3'] | 130 |
| public | Qwen | Qwen2.5-Math-7B | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 136 |
| public | Qwen | Qwen2.5-Math-7B-Instruct | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 134 |
| public | Salesforce | codegen-2B-multi | 1 | 13.0 | 20.0 | 12.0 | Blob | Blob | Blob | Normal | ['node3'] | 278 |
| public | THUDM | chatglm3-6b | 1 | 13.8 | 12.0 | 20.0 | Blob | Blob | Blob | Normal | ['node3'] | 144 |
| public | THUDM | chatglm3-6b-128k | 1 | 13.8 | 12.0 | 20.0 | Blob | Blob | Blob | Normal | ['node3'] | 148 |
| public | THUDM | chatglm3-6b-32k | 1 | 13.8 | 12.0 | 20.0 | Blob | Blob | Blob | Normal | ['node3'] | 146 |
| public | TinyLlama | TinyLlama-1.1B-Chat-v1.0 | 1 | 6.0 | 20.0 | 18.0 | Blob | Blob | Blob | Normal | ['node3'] | 100 |
| public | TinyLlama | TinyLlama-1.1B-Chat-v1.0_13GB | 1 | 13.8 | 20.0 | 18.0 | Blob | Blob | Blob | Normal | ['node3'] | 106 |
| public | TinyLlama | TinyLlama-1.1B-Chat-v1.0_2gpu | 2 | 13.8 | 20.0 | 50.0 | Blob | Mem | Blob | Normal | ['node3'] | 109 |
| public | allenai | OLMo-1B-hf | 1 | 13.8 | 12.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 257 |
| public | allenai | OLMo-7B-hf | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 259 |
| public | baichuan-inc | Baichuan-7B | 2 | 13.8 | 20.0 | 60.0 | Blob | Blob | Blob | Normal | ['node3'] | 158 |
| public | baichuan-inc | Baichuan2-7B-Chat | 2 | 13.8 | 20.0 | 60.0 | Blob | Blob | Blob | Normal | ['node3'] | 160 |
| public | bigcode | starcoder2-3b | 1 | 13.8 | 12.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 284 |
| public | bigcode | starcoder2-7b | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 359 |
| public | databricks | dolly-v2-12b | 2 | 14.0 | 20.0 | 90.0 | Blob | Blob | Blob | Normal | ['node3'] | 251 |
| public | deepseek-ai | DeepSeek-R1-Distill-Llama-8B | 2 | 13.8 | 20.0 | 60.0 | Blob | Blob | Blob | Normal | ['node3'] | 262 |
| public | deepseek-ai | DeepSeek-R1-Distill-Qwen-1.5B | 1 | 13.0 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 264 |
| public | deepseek-ai | DeepSeek-R1-Distill-Qwen-7B | 2 | 13.8 | 20.0 | 60.0 | Blob | Blob | Blob | Normal | ['node3'] | 266 |
| public | deepseek-ai | deepseek-llm-7b-chat | 2 | 13.8 | 20.0 | 60.0 | Blob | Blob | Blob | Normal | ['node3'] | 268 |
| public | deepseek-ai | deepseek-math-7b-instruct | 2 | 13.8 | 20.0 | 60.0 | Blob | Blob | Blob | Normal | ['node3'] | 271 |
| public | facebook | opt-iml-max-1.3b | 1 | 3.8 | 12.0 | 15.0 | Mem | File | Mem | Normal | ['node3'] | 113 |
| public | llava-hf | llava-1.5-7b-hf | 1 | 14.0 | 20.0 | 12.0 | Blob | Blob | Blob | Normal | ['node3'] | 281 |
| public | microsoft | Phi-3-mini-128k-instruct | 1 | 13.0 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node3'] | 172 |
| public | microsoft | Phi-3-mini-4k-instruct | 1 | 13.0 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node3'] | 170 |
| public | mosaicml | mpt-7b | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 165 |
| public | mosaicml | mpt-7b-storywriter | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 349 |
| public | nomic-ai | gpt4all-j | 2 | 13.8 | 20.0 | 50.0 | Blob | Blob | Blob | Normal | ['node3'] | 240 |
| public | openai-community | gpt2-xl | 1 | 12.0 | 12.0 | 18.0 | Blob | Blob | Blob | Normal | ['node3'] | 237 |
| public | openbmb | MiniCPM-2B-dpo-bf16 | 1 | 13.8 | 12.0 | 28.0 | Blob | Blob | Blob | Normal | ['node3'] | 246 |
| public | openbmb | MiniCPM-2B-sft-bf16 | 1 | 9.0 | 12.0 | 24.0 | Blob | Blob | Blob | Normal | ['node3'] | 248 |
| public | tiiuae | falcon-rw-7b | 2 | 13.8 | 12.0 | 80.0 | Blob | Blob | Blob | Normal | ['node3'] | 234 |

Summary

Model Count: 46

GPU Count: 70

VRAM (GB): 903.2

CPU Cores: 784.0

Memory (GB): 1841.0
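
The Summary figures appear to be plain aggregates of the Models table. Summing gpu count, cpu, and memory over all 46 rows gives 70, 784.0, and 1841.0. The VRAM total of 903.2 matches the sum of gpu count × vram, which suggests the vram column is a per-GPU figure. The following is a minimal Python sketch of that arithmetic on three rows from the table. The names `ModelRow` and `summarize` are hypothetical and not part of InferX; the aggregation rule is inferred from the numbers on this page, not from InferX documentation.

```python
from dataclasses import dataclass


@dataclass
class ModelRow:
    # Hypothetical record type; fields mirror the Models table columns.
    model: str
    gpu_count: int
    vram_gb: float    # assumed to be per GPU (inferred from the Summary total)
    cpu: float
    memory_gb: float


def summarize(rows):
    """Aggregate rows the way the Summary section appears to."""
    return {
        "model_count": len(rows),
        "gpu_count": sum(r.gpu_count for r in rows),
        # VRAM is weighted by the number of GPUs each model reserves.
        "vram_gb": round(sum(r.gpu_count * r.vram_gb for r in rows), 1),
        "cpu_cores": sum(r.cpu for r in rows),
        "memory_gb": sum(r.memory_gb for r in rows),
    }


# Three sample rows copied from the table above.
rows = [
    ModelRow("Aquila-7B", 2, 10.0, 20.0, 50.0),
    ModelRow("Qwen2.5-1.5B", 1, 8.0, 12.0, 18.0),
    ModelRow("opt-iml-max-1.3b", 1, 3.8, 12.0, 15.0),
]

summary = summarize(rows)
print(summary)
```

For these three rows the sketch reports 4 GPUs, 31.8 GB of VRAM, 44.0 CPU cores, and 83.0 GB of memory. Feeding it all 46 rows should reproduce the Summary values.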