Quick facts

Model ID: nvidia-nemotron-3-ultra-550b-a55b

Source: Venice AI

Context Window: 256,000 tokens

Pricing: $0.62 input / $3.12 output per 1M tokens

Capabilities: tool calling, reasoning, structured output, temperature control, open weights

Model overview

NVIDIA Nemotron 3 Ultra is an open-weights AI model offered through Venice AI, with a 256,000-token context window and text-only input and output.

Published pricing is $0.62 input and $3.12 output per 1M tokens.
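As a rough illustration of how these per-token prices translate into a per-request cost, here is a minimal Python sketch. The function name `estimate_cost_usd` and the example token counts are hypothetical; only the two prices come from this page. Actual billing may differ (for example, rounding or separately billed reasoning tokens are not modeled).

```python
# Prices in USD per 1M tokens, copied from the published figures above.
INPUT_PRICE_PER_M = 0.62
OUTPUT_PRICE_PER_M = 3.12


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper, not an official API)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 12,000-token prompt with a 1,500-token reply.
print(f"${estimate_cost_usd(12_000, 1_500):.4f}")  # prints $0.0121
```

Output tokens cost roughly five times as much as input tokens here, so long generations tend to dominate the bill even when prompts are large.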

Model ID: nvidia-nemotron-3-ultra-550b-a55b

Provider: Venice AI

Family: nemotron

Status: -

Knowledge Cutoff: -

Release Date: 2026-06-04

Input Modalities: text

Output Modalities: text

Context Window: 256,000 tokens

Input Limit: -

Output Limit: 32,768 tokens

Tool Calling: Yes

Reasoning: Yes

Structured Output: Yes

Temperature Control: Yes

Open Weights: Yes

Input Cost / 1M tokens: $0.62

Output Cost / 1M tokens: $3.12

Reasoning Cost / 1M tokens: -

Cache Read Cost / 1M tokens: $0.19

Cache Write Cost / 1M tokens: -
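The limits and cache price listed above can be combined into a quick pre-flight check. The following Python sketch rests on two assumptions that this page does not confirm: that the context window is shared between prompt and completion (no separate input limit is listed), and that cached prompt tokens are billed at the cache-read rate instead of the regular input rate. The helper names `fits_limits` and `input_cost_usd` are hypothetical.

```python
# Figures copied from the specification list above.
CONTEXT_WINDOW = 256_000      # tokens
OUTPUT_LIMIT = 32_768         # tokens
INPUT_PRICE_PER_M = 0.62      # USD per 1M uncached input tokens
CACHE_READ_PRICE_PER_M = 0.19 # USD per 1M cached input tokens


def fits_limits(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Check a request against the listed limits.

    Assumption: prompt and completion share the context window.
    """
    if max_output_tokens > OUTPUT_LIMIT:
        return False
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW


def input_cost_usd(fresh_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate input-side cost in USD.

    Assumption: cached tokens are billed at the cache-read rate only.
    """
    total = fresh_tokens * INPUT_PRICE_PER_M + cached_tokens * CACHE_READ_PRICE_PER_M
    return total / 1_000_000


print(fits_limits(200_000, 32_768))   # True: 232,768 tokens fit in 256,000
print(fits_limits(230_000, 32_768))   # False: 262,768 tokens exceed the window
```

Under these assumptions, a prompt served entirely from cache would cost about 31% of the uncached input price ($0.19 versus $0.62 per 1M tokens).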