majentik commited on
Commit
0240fe4
·
verified ·
1 Parent(s): 1dde0be

Upload folder using huggingface_hub

Browse files
Files changed (2) hide show
  1. LICENSE +126 -0
  2. README.md +52 -0
LICENSE ADDED
@@ -0,0 +1,126 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ NVIDIA Open Model License Agreement
2
+ This NVIDIA Open Model License Agreement (the “Agreement”) is a legal agreement between the Legal Entity You represent, or if no
3
+ entity is identified, You and NVIDIA Corporation and its Affiliates (“NVIDIA”) and governs Your use of the Models that NVIDIA
4
+ provides to You under this Agreement. NVIDIA and You are each a “party” and collectively the “parties.”
5
+ NVIDIA models released under this Agreement are intended to be used permissively and enable the further development of AI
6
+ technologies. Subject to the terms of this Agreement, NVIDIA confirms that:
7
+
8
+
9
+ Models are commercially useable.
10
+
11
+
12
+
13
+ You are free to create and distribute Derivative Models.
14
+
15
+
16
+
17
+ NVIDIA does not claim ownership to any outputs generated using the Models or Model Derivatives.
18
+
19
+ By using, reproducing, modifying, distributing, performing or displaying any portion or element of the Model or Derivative Model, or
20
+ otherwise accepting the terms of this Agreement, you agree to be bound by this Agreement.
21
+ 1.
22
+
23
+ Definitions. The following definitions apply to this Agreement:
24
+
25
+ 1.1.
26
+
27
+ “Derivative Model” means all (a) modifications to the Model, (b) works based on the Model, and (c) any other derivative
28
+ works of the Model. An output is not a Derivative Model.
29
+
30
+ 1.2.
31
+
32
+ “Legal Entity” means the union of the acting entity and all other entities that control, are controlled by, or are under common
33
+ control with that entity. For the purposes of this definition, “control” means (a) the power, direct or indirect, to cause the
34
+ direction or management of such entity, whether by contract or otherwise, or (b) ownership of fifty percent (50%) or more
35
+ of the outstanding shares, or (c) beneficial ownership of such entity.
36
+
37
+ 1.3.
38
+
39
+ “Model” means the machine learning model, software, checkpoints, learnt weights, algorithms, parameters, configuration
40
+ files and documentation shared under this Agreement.
41
+
42
+ 1.4.
43
+
44
+ “You” or “Your” means an individual or Legal Entity exercising permissions granted by this Agreement.
45
+
46
+ 2.
47
+
48
+ Conditions for Use, License Grant, AI Ethics and IP Ownership.
49
+
50
+ 2.1.
51
+ Conditions for Use. The Model and any Derivative Model are subject to additional terms as described in Section 2 and
52
+ Section 3 of this Agreement and govern Your use. If You institute copyright or patent litigation against any entity (including a crossclaim or counterclaim in a lawsuit) alleging that the Model or a Derivative Model constitutes direct or contributory copyright or
53
+ patent infringement, then any licenses granted to You under this Agreement for that Model or Derivative Model will terminate as of
54
+ the date such litigation is filed. NVIDIA may update this Agreement to comply with legal and regulatory requirements at any time
55
+ and You agree to either comply with any updated license or cease Your copying, use, and distribution of the Model and any
56
+ Derivative Model.
57
+ 2.2.
58
+ License Grant. The rights granted herein are explicitly conditioned on Your full compliance with the terms of this
59
+ Agreement. Subject to the terms and conditions of this Agreement, NVIDIA hereby grants to You a perpetual, worldwide, nonexclusive, no-charge, royalty-free, revocable (as stated in Section 2.1) license to publicly perform, publicly display, reproduce, use,
60
+ create derivative works of, make, have made, sell, offer for sale, distribute (through multiple tiers of distribution) and import the
61
+ Model.
62
+ 2.3.
63
+ AI Ethics. NVIDIA is committed to safety, trust and transparency in AI development. NVIDIA encourages You to (a) ensure
64
+ that the product or service You develop, use, offer as a service or distributes meets the legal and ethical requirements of the
65
+ relevant industry or use case, (b) take reasonable measures to address unintended bias and to mitigate harm to others, including
66
+ underrepresented or vulnerable groups, and (c) inform users of the nature and limitations of the product or service. NVIDIA
67
+ expressly prohibits the use of its products or services for any purpose in violation of applicable law or regulation, including but not
68
+ limited to (a) illegal surveillance, (b) illegal collection or processing of biometric information without the consent of the subject
69
+ where required under applicable law, or (c) illegal harassment, abuse, threatening or bullying of individuals or groups of individuals
70
+ or intentionally misleading or deceiving others.
71
+ 2.4.
72
+ NVIDIA owns the Model and any Model Derivatives created by NVIDIA. Subject to NVIDIA’s underlying ownership rights in
73
+ the Model or its Model Derivatives, You are and will be the owner of Your Model Derivatives. NVIDIA claims no ownership rights in
74
+ outputs. You are responsible for outputs and their subsequent uses. Except as expressly granted in this Agreement, (a) NVIDIA
75
+ reserves all rights, interests and remedies in connection with the Model and (b) no other license or right is granted to you by
76
+ implication, estoppel or otherwise.
77
+ 3.
78
+ Redistribution. You may reproduce and distribute copies of the Model or Derivative Models thereof in any medium, with or
79
+ without modifications, provided that You meet the following conditions:
80
+
81
+ 3.1.
82
+ If you distribute the Model, You must give any other recipients of the Model a copy of this Agreement and include the
83
+ following attribution notice within a “Notice” text file with such copies: “Licensed by NVIDIA Corporation under the NVIDIA Open
84
+ Model License”; and
85
+ 3.2.
86
+ You may add Your own copyright statement to Your modifications and may provide additional or different license terms
87
+ and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Models as a whole, provided
88
+ Your use, reproduction, and distribution of the Model otherwise complies with the conditions stated in this Agreement.
89
+ 4.
90
+ Trademarks. This Agreement does not grant permission to use the trade names, trademarks, service marks, or product
91
+ names of NVIDIA, except as required for reasonable and customary use in describing the origin of the Model and reproducing the
92
+ content of the “Notice” text file.
93
+ 5.
94
+ Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, NVIDIA provides the Model on an “AS
95
+ IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any
96
+ warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are
97
+ solely responsible for determining the appropriateness of using or redistributing the Model, Derivative Models and outputs and
98
+ assume any risks associated with Your exercise of permissions under this Agreement.
99
+ 6.
100
+ Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or
101
+ otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, will NVIDIA be
102
+ liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a
103
+ result of this Agreement or out of the use or inability to use the Model, Derivative Models or outputs (including but not limited to
104
+ damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or
105
+ losses), even if NVIDIA has been advised of the possibility of such damages.
106
+ 7.
107
+ Indemnity. You will indemnify and hold harmless NVIDIA from and against any claim by any third party arising out of or
108
+ related to your use or distribution of the Model, Model Derivatives or outputs.
109
+ 8.
110
+ You.
111
+
112
+ Feedback. NVIDIA appreciates your feedback, and You agree that NVIDIA may use it without restriction or compensation to
113
+
114
+ 9.
115
+ Governing Law. This Agreement will be governed in all respects by the laws of the United States and the laws of the State
116
+ of Delaware, without regard to conflict of laws principles or the United Nations Convention on Contracts for the International Sale of
117
+ Goods. The state and federal courts residing in Santa Clara County, California will have exclusive jurisdiction over any dispute or
118
+ claim arising out of or related to this Agreement, and the parties irrevocably consent to personal jurisdiction and venue in those
119
+ courts; except that, either party may apply for injunctive remedies or an equivalent type of urgent legal relief in any jurisdiction.
120
+ 10.
121
+ Trade and Compliance. You agree to comply with all applicable export, import, trade and economic sanctions laws and
122
+ regulations, as amended, including without limitation U.S. Export Administration Regulations and Office of Foreign Assets Control
123
+ regulations. These laws include restrictions on destinations, end-users and end-use.
124
+ Version Release Date: June 14, 2024
125
+
126
+
README.md ADDED
@@ -0,0 +1,52 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: other
3
+ license_name: nvidia-open-model-license
4
+ license_link: https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
5
+ base_model: nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16
6
+ tags: [nemotron, multimodal, turboquant, kv-cache, gguf, combo-card]
7
+ ---
8
+
9
+ # Nemotron-3-Nano-Omni-30B-A3B-Reasoning - TurboQuant GGUF IQ4_XS + TurboQuant KV-Cache (matched stack)
10
+
11
+ Documentation card for the matched TurboQuant weight + TurboQuant KV-cache stack
12
+ of `Nemotron-3-Nano-Omni-30B-A3B-Reasoning` at GGUF IQ4_XS.
13
+
14
+ **No new weights are published here.** This card describes a runtime configuration:
15
+ load the weights from [`majentik/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-TurboQuant-GGUF-IQ4_XS`](https://huggingface.co/majentik/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-TurboQuant-GGUF-IQ4_XS)
16
+ (forthcoming in Phase 2.2 of the publication plan) and apply the KV-cache modifier
17
+ documented in [`majentik/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-TurboQuant`](https://huggingface.co/majentik/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-TurboQuant).
18
+
19
+ ## Modality matrix
20
+
21
+ | Modality | Encoder | Quantization in this variant |
22
+ |---|---|---|
23
+ | Text | LLM backbone (Mamba-2 + Transformer hybrid Sparse MoE) | per the variant suffix |
24
+ | Image | CRADIO v4-H | **BF16** (kept full-precision in every non-GGUF variant; GGUF uses mmproj-F16 split file) |
25
+ | Audio | Parakeet-TDT-0.6B-v2 | **BF16** (same rationale) |
26
+ | Video | Parakeet-TDT-0.6B-v2 + frame sampler | **BF16** (≤ 2 min, 256 frames @ 2 FPS) |
27
+
28
+ NVIDIA's official FP8 / NVFP4 recipe keeps both encoders + the cross-modal
29
+ MLP projectors in BF16 to preserve multimodal accuracy. We follow that
30
+ convention in every quantized variant we ship.
31
+
32
+ ## Runtime quirks
33
+
34
+ ### llama.cpp
35
+
36
+ Use `llama-mtmd-cli` for multimodal inference; pass `--mmproj mmproj-F16.gguf`
37
+ (see `majentik/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-mmproj-F16`).
38
+
39
+ **Do NOT use CUDA 13.2** — produces gibberish. Pin CUDA 12.x or
40
+ use the Metal/CPU paths.
41
+
42
+ ### Ollama
43
+
44
+ Text-only; multimodal is blocked because Ollama doesn't yet support
45
+ the mmproj split-file pattern.
46
+
47
+ ### Reasoning mode
48
+
49
+ `enable_thinking` defaults to `True`. To disable extended reasoning
50
+ (e.g., for latency-sensitive cases), pass `enable_thinking=False`
51
+ to the chat template / generate call. No separate "no-think"
52
+ variant card exists — this is a runtime flag, not a model variant.