---
license: apache-2.0
base_model: NousResearch/NousCoder-14B
tags:
  - code
  - coding-assistant
  - solidity
  - typescript
  - python
  - lora
  - peft
  - fine-tuned
datasets:
  - 0xSero/sero-sft-conversations
language:
  - en
pipeline_tag: text-generation
library_name: peft
---
> [!TIP]
> Support this work: **[donate.sybilsolutions.ai](https://donate.sybilsolutions.ai)**
> 
> REAP surfaces: [GLM](https://huggingface.co/spaces/0xSero/reap-glm-family) | [MiniMax](https://huggingface.co/spaces/0xSero/reap-minimax-family) | [Qwen](https://huggingface.co/spaces/0xSero/reap-qwen-family) | [Gemma](https://huggingface.co/spaces/0xSero/reap-gemma-family) | [Paper](https://arxiv.org/abs/2510.13999) | [Code](https://github.com/CerebrasResearch/reap) | [PR17](https://github.com/CerebrasResearch/reap/pull/17) | [Cerebras Collection](https://huggingface.co/collections/cerebras/cerebras-reap)

# sero-nouscoder-14b-sft

A personal coding assistant fine-tuned on 11,711 real coding conversations from my daily development work.

## Model Details

| Property | Value |
|----------|-------|
| Base Model | [NousResearch/NousCoder-14B](https://huggingface.co/NousResearch/NousCoder-14B) |
| Parameters | 14.8B |
| Architecture | Qwen3-based decoder-only transformer |
| Training Method | QLoRA (4-bit quantization + LoRA r=64) |
| Training Tokens | ~51.75 million |
| Final Loss | 0.685 |
| Token Accuracy | 81.6% |
| License | Apache 2.0 |

## The Experiment

### Why I Did This

I've accumulated thousands of coding conversations with AI assistants over the past year. These conversations represent my actual coding style, problem-solving patterns, and domain expertise across:

- **Solidity/Web3** - Smart contracts, DeFi protocols, ethers.js
- **TypeScript/Node.js** - Backend services, API development
- **Python** - Scripts, data processing, automation
- **SQL** - Database queries, schema design
- **DevOps** - Docker, deployment, infrastructure

The goal: create a coding assistant that thinks like me and understands my codebase patterns.

### Data Extraction Pipeline

```
Raw Data Sources
β”œβ”€β”€ Claude Projects conversations (233MB)
β”œβ”€β”€ Claude chat history exports
β”œβ”€β”€ Cursor IDE conversations
└── Various AI assistant logs
         β”‚
         β–Ό
    Extraction & Parsing
    β”œβ”€β”€ Parse JSONL conversation logs
    β”œβ”€β”€ Extract message pairs (user/assistant)
    β”œβ”€β”€ Normalize formats across sources
    └── Deduplicate conversations
         β”‚
         β–Ό
    Security Scanning
    β”œβ”€β”€ Regex patterns for API keys, tokens
    β”œβ”€β”€ Private key detection
    β”œβ”€β”€ Path/username redaction
    └── Quarantine flagged entries
         β”‚
         β–Ό
    Quality Filtering
    β”œβ”€β”€ Remove empty/trivial exchanges
    β”œβ”€β”€ Filter non-code conversations
    β”œβ”€β”€ Length-based filtering
    └── Train/dev/test split (90/5/5)
         β”‚
         β–Ό
    Final Dataset
    β”œβ”€β”€ train.jsonl (11,711 conversations)
    β”œβ”€β”€ dev.jsonl (107 conversations)
    └── test.jsonl (123 conversations)
```

**Security Note:** 95,561 conversations were quarantined due to potentially sensitive content. Only verified clean data was used for training. A sketch of the scanning step is shown below.
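
For illustration, here is a minimal sketch of that scanning step. The patterns below are hypothetical stand-ins for demonstration, not the exact rules used in the pipeline:

```python
import re

# Hypothetical patterns; the real pipeline likely uses a broader set
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                   # OpenAI-style API keys
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                   # GitHub personal access tokens
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access key IDs
    re.compile(r"0x[0-9a-fA-F]{64}"),                     # 32-byte hex (possible private key)
    re.compile(r"-----BEGIN (RSA |EC )?PRIVATE KEY-----"),
]

def scan_conversation(messages: list[dict]) -> bool:
    """Return True if any message matches a secret pattern (quarantine it)."""
    for msg in messages:
        text = msg.get("content", "")
        if any(p.search(text) for p in SECRET_PATTERNS):
            return True
    return False

# Tiny sample to show the flow
conversations = [
    {"messages": [{"role": "user", "content": "How do I use ethers.js?"}]},
    {"messages": [{"role": "user", "content": "My key is 0x" + "ab" * 32}]},
]

clean, quarantined = [], []
for conv in conversations:
    if scan_conversation(conv["messages"]):
        quarantined.append(conv)
    else:
        clean.append(conv)
```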

### Dataset Statistics

| Split | Conversations | Purpose |
|-------|---------------|---------|
| Train | 11,711 | Model training |
| Validation | 107 | Hyperparameter tuning |
| Test | 123 | Final evaluation |
| Quarantined | 95,561 | Excluded (potential secrets) |

### Sample Training Conversation

```json
{
  "messages": [
    {
      "role": "user",
      "content": "For the code present, we get this error:\n```\nConversion of type 'Contract' to type 'ICoboSafeAccount' may be a mistake...\n```\nHow can I resolve this?"
    },
    {
      "role": "assistant",
      "content": "The error occurs because TypeScript doesn't automatically recognize that the Contract instance implements the ICoboSafeAccount interface. You can fix this by using a double type assertion:\n\n```typescript\nconst coboSafeAccount = new ethers.Contract(\n    coboSafeAccountAddress,\n    COBO_SAFE_ACCOUNT_ABI,\n    signer\n) as unknown as ICoboSafeAccount;\n```"
    }
  ]
}
```

## Training Details

### Configuration

```python
# Model
MODEL_ID = "NousResearch/NousCoder-14B"

# QLoRA Config
BITS = 4
LORA_R = 64
LORA_ALPHA = 128
LORA_DROPOUT = 0.05
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj"]

# Training
BATCH_SIZE = 2
GRADIENT_ACCUMULATION = 8  # Effective batch size: 16
LEARNING_RATE = 2e-5
EPOCHS = 3
MAX_LENGTH = 4096
PACKING = True  # Efficient sequence packing
```
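
For reference, these constants map onto `peft`/`bitsandbytes` objects roughly as follows. This is a sketch of the model setup, not the exact training script; the NF4 quant type and compute dtype are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit base model (the BITS = 4 setting)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# LoRA adapter over the attention and MLP projections
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```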

### Infrastructure

- **Platform:** HuggingFace Jobs
- **GPU:** NVIDIA A100 80GB
- **Training Time:** ~18 hours (timed out at 93% completion)
- **Cost:** ~$45 USD

### Training Progress

| Epoch | Step | Loss | Token Accuracy | Learning Rate |
|-------|------|------|----------------|---------------|
| 0.03 | ~10 | 1.355 | 71.2% | 2.0e-5 |
| 0.28 | ~80 | 0.920 | 77.2% | 1.9e-5 |
| 0.54 | ~160 | 0.781 | 79.5% | 1.8e-5 |
| 1.04 | ~320 | 0.743 | 80.4% | 1.5e-5 |
| 1.55 | ~480 | 0.711 | 80.8% | 1.1e-5 |
| 2.05 | ~640 | 0.722 | 80.7% | 6.5e-6 |
| 2.52 | ~800 | 0.705 | 81.2% | 1.4e-6 |

### Loss Curve

```
Loss
1.4 β”‚ ●
    β”‚  β•²
1.2 β”‚   β•²
    β”‚    β•²
1.0 β”‚     ●
    β”‚      β•²
0.8 β”‚       ●──●──●
    β”‚              β•²
0.7 β”‚               ●──●──●──●
    β”‚
0.6 β”‚
    └────────────────────────────
        0    0.5   1.0   1.5   2.0   2.5  Epoch
```

The model showed strong convergence:
- Rapid initial loss drop (1.35 β†’ 0.78 in first 0.5 epochs)
- Stable training through epochs 1-2
- Final loss plateau around 0.70

## Usage

### With Transformers + PEFT

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model + LoRA adapter
base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")
tokenizer = AutoTokenizer.from_pretrained("0xSero/sero-nouscoder-14b-sft")

# Generate
messages = [{"role": "user", "content": "Write a Solidity ERC20 token with permit"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### With vLLM (Recommended for Serving)

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="NousResearch/NousCoder-14B",
    enable_lora=True,
    max_lora_rank=64,
)

outputs = llm.generate(
    ["Explain how to deploy a contract with ethers.js v6"],
    SamplingParams(temperature=0.7, max_tokens=512),
    lora_request=LoRARequest("sero", 1, "0xSero/sero-nouscoder-14b-sft")
)
```

### Merge Adapter for Standalone Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    torch_dtype=torch.bfloat16,
    device_map="cpu",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")
merged = model.merge_and_unload()
merged.save_pretrained("./sero-nouscoder-merged")

# Save the tokenizer alongside so the merged model loads standalone
AutoTokenizer.from_pretrained("0xSero/sero-nouscoder-14b-sft").save_pretrained("./sero-nouscoder-merged")
```
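
The merged directory can then be loaded directly with `AutoModelForCausalLM.from_pretrained("./sero-nouscoder-merged")`, with no PEFT dependency.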

## VRAM Requirements

| Precision | VRAM Required |
|-----------|---------------|
| bfloat16 (full) | ~30GB |
| 8-bit (bitsandbytes) | ~16GB |
| 4-bit (GPTQ/AWQ) | ~8GB |
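
On a consumer GPU, one option is loading the base model in 4-bit with bitsandbytes and attaching the adapter on top. A minimal sketch (actual usage varies with context length and batch size):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")
tokenizer = AutoTokenizer.from_pretrained("0xSero/sero-nouscoder-14b-sft")
```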

## Limitations

- **Domain Focused:** Optimized for Solidity, TypeScript, and Python; may underperform on other languages
- **93% Trained:** Training timed out before completing epoch 3 (2.52/3.0 epochs)
- **Personal Style:** Tuned to my coding patterns, which may not generalize to all users
- **LoRA Adapter:** Requires base model + adapter loading (not standalone)

## Files

```
sero-nouscoder-14b-sft/
β”œβ”€β”€ adapter_config.json      # LoRA configuration
β”œβ”€β”€ adapter_model.safetensors # Trained LoRA weights (USE THIS)
β”œβ”€β”€ tokenizer.json           # Tokenizer
β”œβ”€β”€ tokenizer_config.json    # Tokenizer config
β”œβ”€β”€ special_tokens_map.json  # Special tokens
β”œβ”€β”€ chat_template.jinja      # Chat template
└── last-checkpoint/         # Training checkpoint (for resuming)
    β”œβ”€β”€ optimizer.pt
    β”œβ”€β”€ scheduler.pt
    β”œβ”€β”€ trainer_state.json
    └── ...
```

## Next Steps

- [ ] DPO alignment training on preference pairs
- [ ] GPTQ/AWQ quantization for consumer GPU deployment
- [ ] Evaluation on coding benchmarks
- [ ] Tool/agent fine-tuning on 136K tool trajectory events

## Citation

```bibtex
@misc{sero-nouscoder-14b-sft,
  author = {0xSero},
  title = {sero-nouscoder-14b-sft: Personal Coding Assistant},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/0xSero/sero-nouscoder-14b-sft}
}
```

## Acknowledgments

- [NousResearch](https://huggingface.co/NousResearch) for the excellent NousCoder-14B base model
- [HuggingFace](https://huggingface.co) for the Jobs compute platform
- The TRL and PEFT teams for making fine-tuning accessible

---

*Built with ~$45 of compute and 11,711 real coding conversations.*

## Support

If this work is useful, support Sybil Solutions here: [https://donate.sybilsolutions.ai](https://donate.sybilsolutions.ai)


<!-- SERO_MANAGED_TOP_LINKS_START -->
## Support and links
- Donate: https://donate.sybilsolutions.ai
- X: https://x.com/0xsero
- GitHub: https://github.com/0xsero
<!-- SERO_MANAGED_TOP_LINKS_END -->

## Sponsors

Thanks to the kind sponsors; this work wouldn't be possible without them:

- Nvidia
- TNG Technology
- Lambda
- Prime Intellect
- HotAisle