---
language:
- en
license: apache-2.0
base_model: unsloth/Qwen3.5-27B
library_name: transformers
tags:
- unsloth
- qwen3_5
- lora
- rewriting
- style-transfer
- unslop
pipeline_tag: text-generation
---

# qwen3.5-27b-unslop-good-lora-v1

A Qwen 3.5 27B fine-tune for unslop rewriting: taking AI-sounding passages and attempting to rewrite them into cleaner, more natural prose while preserving meaning.

This run is the first large dense-text Qwen 3.5 follow-up after the earlier pilot family. It is intended as a serious test of whether a newer large text model can match or beat the promising 30B-A3B pilot without depending on the VL backbone path.

## How it was trained
- Base model: `unsloth/Qwen3.5-27B`
- Training path: Unsloth fine-tuning on Hugging Face Jobs
- Dataset: `N8Programs/unslop-good`
- Rows used: 1000 (full training split)
- Objective: conversational rewrite / style cleanup

## Training shape
- hardware: A100 80GB (`a100-large`)
- max_seq_length: 6144
- num_train_epochs: 2
- batch_size: 1
- gradient_accumulation_steps: 1
- learning_rate: 1e-4
- scheduler: cosine
- warmup_steps: 50
- LoRA rank: 8
- LoRA alpha: 20
- LoRA dropout: 0.0
- 4-bit loading
- bf16 training
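
Put together, the values above correspond roughly to the following Unsloth + TRL sketch. This is a reconstruction, not the actual training script: only the hyperparameter values come from this card, while the script layout, `target_modules`, the `text` field name, and the exact trainer kwargs (which vary across TRL versions) are assumptions.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the base model in 4-bit (values from the "Training shape" list above).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3.5-27B",
    max_seq_length=6144,
    load_in_4bit=True,
)

# Attach the LoRA adapter (rank 8, alpha 20, dropout 0.0).
# target_modules is the usual Unsloth default set, assumed here.
model = FastLanguageModel.get_peft_model(
    model,
    r=8,
    lora_alpha=20,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# 1000-row training split; any chat-template formatting step and the
# "text" field name are assumptions, not documented in this card.
dataset = load_dataset("N8Programs/unslop-good", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=1,
        num_train_epochs=2,
        learning_rate=1e-4,
        lr_scheduler_type="cosine",
        warmup_steps=50,
        bf16=True,
        dataset_text_field="text",
        output_dir="outputs",
    ),
)
trainer.train()
```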

## Training outcome
This 27B run completed successfully on Hugging Face Jobs and pushed a deployable merged-model repo cleanly.

- train_runtime: 10665.33s
- train_loss: 1.916
- final step: 1000

Operator notes:
- model load path succeeded on `unsloth/Qwen3.5-27B`
- dataset formatting and tokenization completed to full planned run length
- training reached step 1000 and emitted final trainer metrics
- merged-model push completed (this repo is currently serving as a full-model artifact, not a PEFT adapter-only repo)
- GGUF export + GGUF upload completed (q2_k, q4_k_m, q6_k, q8_0)
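
For the GGUF artifacts, a minimal llama-cpp-python loading sketch; the exact `.gguf` filenames in the repo are assumptions, so check the repo's file list before running:

```python
from llama_cpp import Llama

# The filename glob is an assumption; it should match the q4_k_m quant
# if the export used conventional names.
llm = Llama.from_pretrained(
    repo_id="Oysiyl/qwen3.5-27b-unslop-good-lora-v1",
    filename="*q4_k_m.gguf",
    n_ctx=6144,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Rewrite this more naturally: ..."}]
)
print(out["choices"][0]["message"]["content"])
```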

## Intended use
Use this model as a pipeline stage for:
- rewriting AI-sounding prose into more natural text
- testing whether a large dense Qwen 3.5 model can deliver more faithful unslop behavior than the earlier dense pilots
- comparing a large text-only family directly against the stronger 30B-A3B pilot result
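
A minimal transformers usage sketch for the rewrite stage. The instruction wording and generation settings are illustrative assumptions, not a documented prompt format; the input text is the short sanity sample quoted later in this card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Oysiyl/qwen3.5-27b-unslop-good-lora-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# Illustrative rewrite instruction (assumed, not a documented template).
messages = [{
    "role": "user",
    "content": (
        "Rewrite the following so it reads naturally, without changing its "
        "meaning:\n\nThis feature saves teams hours every week, but the copy "
        "still sounds robotic and bland."
    ),
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```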

## Limitations
- still trained on the same small 1000-row dataset
- training success does not imply fidelity success
- long-form fidelity still needs stricter, repeated side-by-side judging vs the 30B reference lane
- the 27B model's rewrites can be more interventionist than ideal when a minimal-touch edit is wanted

## Training loss vs training progress

Normalized curve from the completed 27B run:

![Qwen3.5-27B training loss vs normalized progress](./training_loss_vs_progress.png)

Curve notes:
- first logged loss: 2.279
- final logged step loss: 1.864
- trainer-reported final train_loss: 1.916
- final step: 1000

Across the successful Qwen 3.5 HF Jobs runs, trainer-reported final `train_loss` values were:
- 0.8B: 2.504
- 2B: 2.241
- 4B: 1.989
- 9B: 1.854
- 27B: 1.916

![Qwen3.5 family HF Jobs final train_loss](./training_loss_qwen35_family.png)

## Recommended evaluation sample
Use this full fiction passage for held-out testing:

> The wind rose all at once, as if the mountain itself had inhaled.
> She stood at the edge of the pass, cloak snapping like a torn banner behind her, staring down at the valley where the last lights were going out one by one. Each flame extinguished felt deliberate—an answer to a question she had been too afraid to ask. Somewhere below, bells began to ring, not in warning, but in mourning. They already knew.
> “Don’t,” he said, breathless, reaching for her arm. His fingers brushed fabric, then air. “If you cross that line, there’s no coming back.”
> She didn’t turn. The cold burned her lungs, sharpened her thoughts. “There was never a way back,” she replied softly. “Only a way forward you were willing to pretend didn’t exist.”
> Thunder rolled, low and ancient. For a moment, the sky fractured into white fire, and in that instant he saw her face—resolved, terrified, incandescent with purpose. Not the girl he had sworn to protect, but the woman the world had demanded she become.
> “You’ll die,” he said.
> She smiled then, finally facing him, eyes bright with unshed tears. “So will everyone,” she answered. “The difference is what we leave burning when we’re gone.”
> She stepped forward.
> The mountain exhaled.

## Deployment-backed endpoint check (latest)

Live endpoint:
- `https://dmitriy-kisil--qwen3-5-27b-unslop-api.modal.run`

Latest `/health` (deployment-backed):
- `ok: true`
- `model_id: Oysiyl/qwen3.5-27b-unslop-good-lora-v1`
- `base_model: Qwen/Qwen3.5-27B`
- `model_family: qwen3_5`
- `adapter_mode: false`
- `artifact_mode: merged_full_model`
- `scaledown_window: 240`
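
A minimal programmatic check of that route, assuming the endpoint returns the payload above as JSON:

```python
import requests

# Query the deployment-backed /health route listed above.
resp = requests.get(
    "https://dmitriy-kisil--qwen3-5-27b-unslop-api.modal.run/health", timeout=30
)
resp.raise_for_status()
health = resp.json()
print(health)  # expected keys: ok, model_id, base_model, adapter_mode, artifact_mode
```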

Notes:
- This endpoint is currently loading the merged full-model artifact (not PEFT adapter-serving mode).
- The service is up and returning 200s, but the rewrite output is currently corrupted.

## Observed live outputs on sample prompts

Short sanity sample (input):

> This feature saves teams hours every week, but the copy still sounds robotic and bland.

Short sanity sample (observed output):

> This can... ...SPOINTF0610! $55) ( ),<#%}·'  -/   #1.  }:*

Held-out-style rewrite sample (input):

> We built this feature quickly, but now we need a cleaner version that sounds natural without losing any technical detail.

Held-out-style rewrite sample (observed output):

> The *>\n:  7%\n,  (⊻�,\n. " " \n*\n: "、\n„ " ;*\n\n\n\n;\n\n; :: 5:17-

Observed behavior note:
- output is not a coherent rewrite right now
- character-level corruption/gibberish is still present
- this is a runtime-quality issue, not just a model-card wording issue

## Judgment

Blunt judgment: this 27B run is training-successful and deployment-live, but current live rewrite quality is not production-usable.

Why:
- endpoint health is green and requests return 200
- model artifact is correctly identified as merged full model (`adapter_mode: false` is expected)
- observed outputs are still corrupted / gibberish under live inference
- therefore quality remains blocked even though infra is operational

## Comparison vs pilot series

- **0.8B / 2B / 4B / 9B lanes:** produced coherent rewrite outputs in the documented runs.
- **30B-A3B reference lane:** remains the safer quality-first benchmark for fidelity-sensitive long-form use.
- **Qwen3.5 27B (this run):** strongest training/infrastructure completion so far at this size, but currently failing the deployment-backed coherence gate.

## Conclusion

This repo is a complete 27B post-run artifact with real training metrics, normalized loss plot, and live deployment checks. Current status is explicit: infra is up, artifact type is correct (merged model, not adapter), but the latest deployment-backed rewrite outputs are still corrupted. The next milestone is not more training documentation; it is restoring coherent live generation and then re-running this evaluation block.