| model_info: | |
| name: anemll-google-gemma-3-4b-it-qat-int4-unquantized-ctx1024 | |
| version: 0.3.5 | |
| description: | | |
| Demonstrates running google-gemma-3-4b-it-qat-int4-unquantized on Apple Neural Engine | |
| Context length: 1024 | |
| Batch size: 64 | |
| Chunks: 2 | |
| license: MIT | |
| author: Anemll | |
| framework: Core ML | |
| language: Python | |
| architecture: gemma3 | |
| parameters: | |
| context_length: 1024 | |
| batch_size: 64 | |
| lut_embeddings: none | |
| lut_ffn: 4 | |
| lut_ffn_per_channel: 4 | |
| lut_lmhead: 6 | |
| lut_lmhead_per_channel: 4 | |
| num_chunks: 2 | |
| model_prefix: gemma3 | |
| embeddings: gemma3_embeddings.mlmodelc | |
| lm_head: gemma3_lm_head_lut6.mlmodelc | |
| ffn: gemma3_FFN_PF_lut4_chunk_01of02.mlmodelc | |
| split_lm_head: 16 | |
| sliding_window: 1024 | |
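
The file names in the parameters above appear to follow a pattern built from `model_prefix`, the LUT bit width, and the chunk count. The sketch below reconstructs them in Python. The helper `component_filenames` is hypothetical, and the naming rule is an assumption inferred only from the three names listed here, not taken from the Anemll loader. In particular, the second chunk name it produces is not listed in this file.

```python
def component_filenames(params: dict) -> dict:
    """Rebuild the expected .mlmodelc names from the parameters block.

    Assumption: names follow <prefix>_<part>[_lut<bits>][_chunk_XXofYY].mlmodelc,
    as suggested by the three file names in the listing.
    """
    prefix = params["model_prefix"]

    def lut_suffix(bits) -> str:
        # "none" (or a missing key) is treated as "no LUT quantization": no suffix
        return "" if bits in (None, "none") else f"_lut{bits}"

    n = int(params["num_chunks"])
    return {
        "embeddings": f"{prefix}_embeddings{lut_suffix(params.get('lut_embeddings'))}.mlmodelc",
        "lm_head": f"{prefix}_lm_head{lut_suffix(params.get('lut_lmhead'))}.mlmodelc",
        # One FFN/prefill file per chunk, numbered from 1 with two digits
        "ffn_chunks": [
            f"{prefix}_FFN_PF{lut_suffix(params.get('lut_ffn'))}_chunk_{i:02d}of{n:02d}.mlmodelc"
            for i in range(1, n + 1)
        ],
    }


# Values copied from the parameters listed above
params = {
    "model_prefix": "gemma3",
    "num_chunks": 2,
    "lut_embeddings": "none",
    "lut_ffn": 4,
    "lut_lmhead": 6,
}

names = component_filenames(params)
print(names["embeddings"])
print(names["lm_head"])
for chunk in names["ffn_chunks"]:
    print(chunk)
```

The `ffn` entry in the listing names only the first chunk, so a loader would presumably derive the remaining chunk names from `num_chunks` in a similar way.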