whichllm — Browse and compare AI model specs and pricing

Venice AI

NVIDIA Nemotron 3 Ultra model ID, context window & pricing

nemotron

Quick facts

Model ID nvidia-nemotron-3-ultra-550b-a55b
Source Venice AI
Context Window 256000
Pricing $0.62 input / $3.12 output per 1M tokens
Capabilities tool calling, reasoning, structured output, temperature control, open weights

Model overview

NVIDIA Nemotron 3 Ultra is an AI model from Venice AI with 256000 token context window and text input support.

Published pricing is $0.62 input and $3.12 output per 1M tokens.

  • Workloads that use text inputs with text outputs.
  • Agent and tool workflows that need function calling.
  • Reasoning-heavy prompts where stepwise problem solving matters.
Model ID nvidia-nemotron-3-ultra-550b-a55b
Provider Venice AI
Family nemotron
Status -
Knowledge Cutoff -
Release Date 2026-06-04
Input Modalities text
Output Modalities text
Context Window 256000
Input Limit -
Output Limit 32768
Tool Calling Yes
Reasoning Yes
Structured Output Yes
Temperature Control Yes
Open Weights Yes
Input Cost / 1M tokens $0.62
Output Cost / 1M tokens $3.12
Reasoning Cost / 1M tokens -
Cache Read Cost / 1M tokens $0.19
Cache Write Cost / 1M tokens -