📚 Qwen3.6-27B Chinese Xianxia / Cultivation LoRA (v2)

🔮 Roadmap / 预告

This is one of the last one or two QLoRA fine-tunes built on Qwen3.6-27B. What comes next depends on the base-model timeline:

  • Primary plan: wait for Qwen3.7-27B. Once it is available, new releases will move to that base. This starts a v3 line with more rigorous tuning aimed at near-flawless, stable output (the realistic-crime adapter yuxinlu1/qwen3-6-27b-chinese-crime-fiction-lora-v2 is the reference for that next step), together with upgrades of several existing adapters to Qwen3.7.
  • Interim fallback. If Qwen3.7-27B does not arrive within the next week or so, a previously finished fine-tune from the backlog will be released in the meantime. Note that this one targets an uncensored base model.

Alongside the model work, the companion writing pipeline (github.com/DuckTraDo/Novel) is getting a major overhaul focused on ease of use. It is expected to ship as a packaged application, opening up features such as automatic continuation / auto-drafting.


A LoRA adapter that guides Qwen3.6-27B toward Chinese xianxia / cultivation prose (玄幻修仙), a long-form fantasy-cultivation tradition built around immortal-seeking, clan and sect politics, and the slow climb through cultivation realms.

This adapter is designed for a specific writing problem: given a short instruction or a neutral scene description, the model should directly produce Chinese cultivation-fiction prose grounded in the xianxia register, rather than analysis, outline, summary, or generic fantasy pastiche.

The core direction is intentionally narrow:

  • third-person narration that follows characters' inner judgment as they scheme, cultivate, and size each other up
  • cultivation-world settings: secluded caves and closed-door cultivation (洞府/闭关), sects and great clans (宗门/世家), mountain gates, spirit veins, market towns (坊市)
  • xianxia imagery grounded in a cultivation cosmology (灵气, 境界, 法器, 丹药, 神识, 御风/驾风, 玉简, 剑意) rather than Western fantasy tropes (no wizards, elves, or Western dragons)
  • cultivation jargon woven into the narration itself, not just the dialogue, so that realm names, technique names, and artifact lore read as native to the voice
  • direct fiction prose, not analysis, outlines, or revision advice
  • cultivation and power progression treated as a vehicle for depicting clan politics, ambition, and human nature, not as a stat sheet

The goal is for outputs to behave like usable Chinese cultivation-fiction scene drafts, and less like a generic writing assistant explaining what xianxia is.

Typical use cases:

  • drafting cultivation/xianxia scenes inside a longer novel project
  • rewriting modern-toned paragraphs into cultivation-register prose
  • sect / clan / closed-door-breakthrough fiction in a long-form writing pipeline
  • local and private creative-writing workflows
  • integration with a long-form novel pipeline that manages outline, memory, timeline, and continuity

This is an adapter only. It does not include base model weights, training data, or copyrighted source material.

🧭 About the v1 / v2 Series

This repository is part of a small series of Chinese fiction LoRAs (现代悬疑 realistic crime, 民俗志怪 folk-horror, and now 玄幻修仙 xianxia).

  • v1: Style Retraining. The v1 series focuses mainly on prose style. It uses SFT to move the base model away from generic AI prose toward a specific Chinese literary voice.
  • v2: Expanded Training & Full Release. The v2 line uses a larger, cleaner training set, longer SFT, and ships a complete release (PEFT safetensors + GGUF). Behavioral DPO refinement is applied where available.

Note on this release: this xianxia adapter is the SFT stage (QLoRA, 2 epochs on a chapter-level cultivation corpus). It is labeled v2 for its expanded training and full safetensors release. DPO behavioral refinement has not yet been applied and is planned as a follow-up iteration.

Available formats: HF PEFT safetensors (and a GGUF LoRA once converted). MLX users may also be able to use the PEFT safetensors through mlx-lm depending on their local setup.

🌱 Status

| Field | Value |
| --- | --- |
| Version | v2 (SFT stage) |
| Focus | Chinese xianxia / cultivation prose behavior |
| Format | HF PEFT safetensors (GGUF LoRA planned) |
| Base model | Qwen3.6-27B |
| Language | Chinese |
| Use case | cultivation scene drafting, sect/clan fiction, realm-progression prose |
| Training style | SFT (QLoRA, r=16 / α=32, 2 epochs, max_seq=3300); DPO planned |
| Recommended workflow | local long-form writing pipeline |

🔗 Companion Novel Pipeline

This LoRA is designed to work together with a local-first, long-form Chinese novel writing pipeline: github.com/DuckTraDo/Novel

The pipeline handles the structural side of long-form fiction:

  • outline and chapter planning
  • scene-level context assembly
  • story memory and character tracking
  • timeline and continuity checks

This LoRA handles the prose-behavior side:

  • third-person cultivation narration with character interiority
  • xianxia scene drafting (closed-door breakthrough, sect confrontation, artifact lore)
  • cultivation-register diction woven into narration
  • avoidance of modern tone and Western fantasy tropes

They are intentionally split: the pipeline owns what happens, the LoRA owns how it reads on the page. You can use either independently, but they are designed as a pair.

📦 Files

Files included: adapter_config.json, adapter_model.safetensors, tokenizer.json, tokenizer_config.json, chat_template.jinja, README.md.

A GGUF LoRA (*-f16.gguf) for llama.cpp can be generated from the adapter and will be added to this repo; see the llama.cpp section below.

⚠️ Critical Usage Note: enable_thinking=False

Qwen3.6 ships with a thinking mode enabled by default. When you call apply_chat_template(messages, add_generation_prompt=True) with default settings, the chat template leaves an unclosed <think>\n block before the assistant's turn. With this LoRA loaded, that causes outputs to begin with an English thinking-process preamble (e.g. "Here's a thinking process: 1. Analyze the user input...") instead of Chinese cultivation prose.

The training samples used a closed empty thinking block: <think>\n\n</think>\n\n followed by the actual prose.

Always pass enable_thinking=False:

text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False,
)

For llama.cpp / GGUF inference, pass the included chat_template.jinja via --chat-template-file, or manually prepend the assistant prefix <think>\n\n</think>\n\n in the prompt.

This is the single most common usage error. If outputs look like English reasoning instead of Chinese prose, this is almost certainly the cause.
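
As a guard against this error, the tail of the rendered prompt can be checked before generation. The sketch below assumes the prompt is already a plain string and that the trained assistant prefix is exactly the closed empty block quoted above; the helper name `ensure_closed_think` is hypothetical and is not part of transformers, llama.cpp, or this repository.

```python
CLOSED_THINK = "<think>\n\n</think>\n\n"  # assistant prefix seen in training (per this card)
OPEN_THINK = "<think>\n"                  # unclosed block left by the default template


def ensure_closed_think(prompt: str) -> str:
    """Return `prompt` ending in the closed, empty thinking block.

    Hypothetical helper: it only edits the tail of an already-rendered
    chat prompt and does not call any tokenizer or template code.
    """
    if prompt.endswith(CLOSED_THINK):
        return prompt  # already in the trained format
    if prompt.endswith(OPEN_THINK):
        # Swap the unclosed block for the closed, empty one.
        return prompt[: -len(OPEN_THINK)] + CLOSED_THINK
    # No thinking block at all: append the closed, empty one.
    return prompt + CLOSED_THINK
```

Calling `text = ensure_closed_think(text)` on the output of `apply_chat_template` is a no-op when `enable_thinking=False` was passed correctly, so it is cheap to leave in place as a safety net.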

🚀 Example PEFT / Transformers Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model, adapter = "Qwen/Qwen3.6-27B", "yuxinlu1/qwen3-6-27b-chinese-xianxia-lora-v2"

tokenizer = AutoTokenizer.from_pretrained(adapter, trust_remote_code=True)

base = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, adapter).eval()

messages = [{"role": "user", "content": "写一段玄幻修仙小说。"}]

text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False,
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=640,
    temperature=0.8,
    top_p=0.9,
    repetition_penalty=1.15,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

🚀 Example llama.cpp Usage

Pair this GGUF LoRA with a compatible Qwen3.6-27B GGUF base. Unsloth's Qwen3.6-27B-GGUF (Q4_K_M or Q8_0) is recommended.

llama-cli -m qwen3.6-27b-base.Q4_K_M.gguf \
  --lora qwen3-6-27b-chinese-xianxia-lora-v2-f16.gguf \
  --chat-template-file chat_template.jinja \
  -p "写一段玄幻修仙小说。" \
  -n 640 --temp 0.8 --top-p 0.9 --repeat-penalty 1.15

🧪 Example Prompts

Prompts can be short:

写一段玄幻修仙小说。
讲一个修仙者的故事。
深山一处石洞里，洞壁上的灵光忽明忽暗，映着满地碎石。
小说：少年盘膝而坐，体内那缕灵气又凝实了几分。
把下面这段改成玄幻修仙小说正文：
他打开电脑准备加班，忽然觉得胸口一热，等回过神来，发现自己躺在一间陌生的土屋里。

Suggested decoding range (tuned for this adapter):

| Setting | Range |
| --- | --- |
| temperature | 0.7 – 0.85 |
| top_p | 0.85 – 0.92 |
| repetition_penalty | 1.1 – 1.18 |
| max_new_tokens | 400 – 700 |

repetition_penalty matters here: too low (≈1.05) lets long generations fall into verbatim loops; too high (≈1.2+) can push Chinese character names toward homophone drift. 1.15 is a good default. The adapter is strongest in shorter bursts: very long single generations (>700 tokens) may drift, so generate scene-by-scene for best results.
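
To keep settings inside the suggested ranges when they come from a config file or UI, they can be clamped before being passed to the generation call. This is a minimal sketch: the `RECOMMENDED` dict and the `clamp_decoding` helper are hypothetical names, and the bounds are simply the ranges listed in this section.

```python
# Ranges copied from the suggested decoding table in this section.
RECOMMENDED = {
    "temperature": (0.7, 0.85),
    "top_p": (0.85, 0.92),
    "repetition_penalty": (1.1, 1.18),
    "max_new_tokens": (400, 700),
}


def clamp_decoding(settings: dict) -> dict:
    """Clamp known sampling settings into the recommended ranges.

    Hypothetical helper; keys outside RECOMMENDED pass through unchanged.
    """
    clamped = dict(settings)  # leave the caller's dict untouched
    for key, (low, high) in RECOMMENDED.items():
        if key in clamped:
            clamped[key] = min(max(clamped[key], low), high)
    return clamped
```

With the Transformers example above, the result could be unpacked directly, for instance `model.generate(**inputs, do_sample=True, **clamp_decoding(user_settings))`, where `user_settings` is whatever dict the caller provides.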

🧪 Internal Evaluation Snapshot

A small internal evaluation was run across four prompt tiers (original-text continuation, neutral scene description, character-grounded task, and minimal generic instruction) to verify style transfer.

Qualitative observations:

  • Original-text continuation: natural continuation in the trained cultivation voice; realm/technique/artifact vocabulary and the xianxia register carry through.
  • Neutral scene description: cultivation atmosphere and plot emerge from neutral prompts without explicit xianxia cues in the input.
  • Character-grounded task: characters behave consistently with cultivation-fiction conventions (sect etiquette, closed-door breakthrough, clan politics).
  • Minimal generic instruction (e.g. "write a piece of xianxia fiction"): the model produces cultivation prose directly without first explaining what xianxia is, which indicates the style was internalized, not merely instruction-followed.

These observations come from a small local development eval and should be treated as an internal signal, not a public benchmark.
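
For quick local spot checks of the same failure mode (English or explanatory text instead of Chinese prose), a character-ratio heuristic can help. The sketch below is not the evaluation used for this card; the function names and the 0.5 threshold are illustrative assumptions, and the check says nothing about prose quality.

```python
def cjk_ratio(text: str) -> float:
    """Share of non-whitespace characters in the CJK Unified Ideographs block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars)


def looks_like_prose(text: str, min_ratio: float = 0.5) -> bool:
    """Heuristic: mostly Chinese characters and no leaked thinking tags."""
    return "<think>" not in text and cjk_ratio(text) >= min_ratio
```

An output that fails `looks_like_prose` is a good candidate for checking whether `enable_thinking=False` was actually applied.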

🧪 Intended Use

Intended for:

  • local Chinese xianxia / cultivation drafting
  • rewriting modern-toned paragraphs into cultivation-register prose
  • testing third-person cultivation narration
  • sect / clan / breakthrough fiction in a human-in-the-loop workflow
  • offline and privacy-respecting novel drafting
  • integration with a long-form novel pipeline

Not intended for: author impersonation, defamation, harassment, factual claims, high-stakes advice, spam, or deception.

⚠️ Limitations

  • It does not guarantee full-novel plot coherence by itself. Character continuity, foreshadowing, and timeline logic should be handled by an external writing pipeline or by the author.
  • Long single generations may degrade. Past roughly 600–700 tokens, outputs can drift, loop, or destabilize character names. Generate scene-by-scene and keep repetition_penalty near 1.15.
  • Character-name stability depends on decoding settings; aggressive repetition penalties can turn a name into a homophone variant.
  • Optimized for cultivation-world settings; outputs for modern, urban, or non-cultivation scenes may revert toward base-model tone.
  • It may produce shorter-than-expected outputs if the prompt is very minimal or decoding settings are conservative.
  • If enable_thinking=False is not set, outputs will begin with an English thinking-process preamble. This is the single most common usage error.
  • Output quality depends on the base model, quantization, sampling settings, prompt design, and context quality.
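
The scene-by-scene advice can be sketched as a small loop that issues one short generation per scene instead of one long call. This is a sketch under assumptions: `draft_scenes` and the `generate(prompt, max_new_tokens)` callable are hypothetical names, with `generate` standing in for whatever wrapper the caller has around the model.

```python
from typing import Callable, List


def draft_scenes(
    generate: Callable[[str, int], str],
    scene_prompts: List[str],
    max_new_tokens: int = 640,
) -> List[str]:
    """Draft each scene with its own short generation call.

    `generate(prompt, max_new_tokens)` is a caller-supplied function
    (hypothetical here) that wraps the model; this loop never asks for
    more than 700 new tokens in a single call.
    """
    budget = min(max_new_tokens, 700)  # stay at or below the card's suggested maximum
    return [generate(prompt, budget) for prompt in scene_prompts]
```

Each call here is independent, so continuity between scenes (names, timeline, foreshadowing) still has to come from the prompts themselves, for example from context assembled by an external writing pipeline.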

🛡️ Safety and Legal Notes

  • This repository contains adapter weights and supporting tokenizer/template files, not base model weights.
  • No copyrighted novels, private manuscripts, or proprietary datasets are distributed in this repository.
  • This LoRA is not designed to imitate any specific living author; the goal is to capture a broader Chinese xianxia / cultivation prose tradition.
  • Outputs are machine-generated fiction; do not use this model for harassment, defamation, fraud, or deceptive impersonation.
  • Users are responsible for the base model license, adapter license, and applicable law.

📜 License

LoRA adapter: MIT. Base model: governed by its own license.
