Exploring the Revolutionary Nemotron-4-340B-Instruct: Enhanced Instruction Following and Mathematical Reasoning
October 2, 2024 at 09:03
Model Overview

Nemotron-4-340B-Instruct is a large language model developed by NVIDIA, designed for English-language single-turn and multi-turn chat applications. It has been fine-tuned for improved instruction following and mathematical reasoning.

Key points:
- Based on the Nemotron-4 architecture
- Supports a context length of 4,096 tokens
- Pre-trained on a corpus of 9 trillion tokens
- Fine-tuned using Supervised Fine-tuning […]
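
For multi-turn chat, the 4,096-token context length noted above means older turns eventually have to be dropped. The following is a minimal sketch of one way to do that, not NVIDIA's method: `trim_history`, `approx_token_count`, and the 512-token reply reserve are hypothetical names and values chosen for illustration, and the word-based counter is only a rough stand-in for the model's real tokenizer.

```python
from typing import Callable, Dict, List

CONTEXT_LENGTH = 4096  # context window stated in the model overview


def approx_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_history(
    turns: List[Dict[str, str]],
    max_new_tokens: int = 512,  # assumed space reserved for the reply
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[Dict[str, str]]:
    """Keep the most recent turns whose combined size fits the prompt budget."""
    budget = CONTEXT_LENGTH - max_new_tokens
    kept: List[Dict[str, str]] = []
    used = 0
    # Walk backwards so the newest turns are considered first.
    for turn in reversed(turns):
        cost = count_tokens(turn["content"])
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

In practice, `count_tokens` would be replaced with a function backed by the model's actual tokenizer, since word counts usually underestimate token counts.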