model_info:
  name: anemll-Qwen3-4B-MLX-dequantized-ctx1024
  version: 0.1.1
  description: |
    Demonstrates running Qwen3-4B on Apple Neural Engine (alpha release)
    Context length: 1024
    Batch size: 64
    Chunks: 2
  license: MIT
  author: Anemll
  framework: Core ML
  language: Python
  parameters:
    context_length: 1024
    batch_size: 64
    lut_embeddings: none
    lut_ffn: 4
    lut_lmhead: 8
    num_chunks: 2
    model_prefix: qwen
    embeddings: qwen_embeddings.mlmodelc
    lm_head: qwen_lm_head_lut8.mlmodelc
    ffn: qwen_FFN_PF_lut4.mlmodelc
    split_lm_head: 16