Model codegen-2B-multi
| namespace | model name | standby gpu | standby pageable | standby pinned memory | gpu count | vRam (MB) | cpu | memory (MB) | state | revision |
|---|---|---|---|---|---|---|---|---|---|---|
| Salesforce | codegen-2B-multi | Blob | Blob | Blob | 1 | 13000 | 20.0 | 12000 | Normal | 278 |
Pods
| tenant | namespace | pod name | state | require resource | allocated resource |
|---|---|---|---|---|---|
| public | Salesforce | public/Salesforce/codegen-2B-multi/278/931 | Standby | {'CPU': 20000, 'Mem': 12000, 'GPU': {'Type': 'Any', 'Count': 1, 'vRam': 13000}} | {'nodename': 'node3', 'CPU': 20000, 'Mem': 12000, 'GPUType': 'A4000', 'GPUs': {'vRam': 0, 'map': {}, 'slotSize': 0, 'totalSlotCnt': 0}, 'MaxContextPerGPU': 2} |
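The resource columns render as Python-style dict literals rather than JSON (single-quoted keys), so they can be parsed with `ast.literal_eval`. A minimal sketch, using the `require resource` cell copied verbatim from the row above:

```python
import ast

# Resource cell copied verbatim from the pods table above.
require = ast.literal_eval(
    "{'CPU': 20000, 'Mem': 12000, "
    "'GPU': {'Type': 'Any', 'Count': 1, 'vRam': 13000}}"
)

# Single-quoted keys make this invalid JSON; ast.literal_eval handles it safely.
print(require["GPU"]["vRam"])  # vRam requested for the pod, in MB
```

`json.loads` would raise on these cells; `ast.literal_eval` accepts any Python literal while refusing arbitrary expressions.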
Failures
| tenant | namespace | model name | revision | id | exit info | state |
|---|---|---|---|---|---|---|
| public | Salesforce | codegen-2B-multi | 278 | 299 | Error("DockerResponseServerError { status_code: 404, message: \"No such container: public.Salesforce.codegen-2B-multi.278.299\" }") | log |
| public | Salesforce | codegen-2B-multi | 278 | 459 | Error("DockerResponseServerError { status_code: 404, message: \"No such container: public.Salesforce.codegen-2B-multi.278.459\" }") | log |
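Both failures are 404s for the same container-name pattern. Judging from the error text, the runtime appears to derive the Docker container name by dot-joining tenant, namespace, model name, revision, and id. A hedged sketch of that rule (inferred from the messages above, not from the scheduler's source):

```python
def container_name(tenant, namespace, model, revision, pod_id):
    # Inferred join rule: fields concatenated with '.' in this order.
    return ".".join([tenant, namespace, model, str(revision), str(pod_id)])

name = container_name("public", "Salesforce", "codegen-2B-multi", 278, 299)
print(name)  # matches the container named in the first 404 above
```

Note the pod-name column uses `/` as the separator for the same fields, while the container name uses `.`.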
Func
```json
{
  "image": "vllm-openai-upgraded:v0.1.0",
  "commands": ["/usr/lib/run_model.py", "Salesforce/codegen-2B-multi", "200"],
  "envs": [
    ["LD_LIBRARY_PATH", "/usr/local/lib/python3.12/dist-packages/nvidia/cuda_nvrtc/lib/:$LD_LIBRARY_PATH"]
  ],
  "mounts": [
    { "hostpath": "/home/brad/cache", "mountpath": "/root/.cache/huggingface" }
  ],
  "endpoint": { "port": 8000, "schema": "Http", "probe": "/health" },
  "version": 278,
  "entrypoint": ["/usr/bin/python3"],
  "resources": {
    "CPU": 20000,
    "Mem": 12000,
    "GPU": { "Type": "Any", "Count": 1, "vRam": 13000 }
  },
  "standby": { "gpu": "Blob", "pageable": "Blob", "pinned": "Blob" },
  "probe": { "port": 80, "schema": "Http", "probe": "/health" },
  "sample_query": {
    "apiType": "standard",
    "path": "v1/completions",
    "prompt": "def hello_world():",
    "body": { "max_tokens": "200", "model": "N/A", "temperature": "0" }
  }
}
```
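The `sample_query` section maps onto the container's OpenAI-compatible completions endpoint (`v1/completions` on the spec's endpoint port 8000). A minimal sketch of building that call with the standard library, assuming the pod is reachable at the node named in the pods table (`node3` as host is an assumption, and folding `prompt` into the request body is an assumption about how `sample_query` is assembled); the request is constructed but not sent:

```python
import json
import urllib.request

# Body fields taken from "sample_query" in the Func spec above;
# merging "prompt" into the body is an assumption.
body = {
    "max_tokens": "200",
    "model": "N/A",
    "temperature": "0",
    "prompt": "def hello_world():",
}
url = "http://node3:8000/v1/completions"  # host "node3" is an assumption

req = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment once the pod is out of Standby
print(req.full_url, req.get_method())
```

The `probe` path (`/health`) is what the scheduler polls for readiness; the sample query only succeeds once that probe passes.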