📚 Qwen3.6-27B Chinese Xianxia / Cultivation LoRA — v2

🔮 Roadmap

This is expected to be one of the last QLoRA fine-tunes built on Qwen3.6-27B, with at most one more to follow. What comes next depends on the base-model timeline:

Primary plan: wait for Qwen3.7-27B. Once it is available, new releases will move to that base and start a v3 line, with more rigorous tuning aimed at stable, near-flawless output and upgrades of several existing adapters to Qwen3.7. The realistic-crime adapter yuxinlu1/qwen3-6-27b-chinese-crime-fiction-lora-v2 is the reference point for that next step.
Interim fallback: if Qwen3.7-27B does not arrive within the next week or so, a previously finished fine-tune from the backlog will be released in the meantime. Note that this one targets an uncensored base model.

Alongside the model work, the companion writing pipeline (github.com/DuckTraDo/Novel) is getting a major overhaul focused on ease of use. It is expected to ship as a packaged application, opening up features such as automatic continuation / auto-drafting.

A LoRA adapter that guides Qwen3.6-27B toward Chinese xianxia / cultivation prose (玄幻修仙) — a long-form fantasy-cultivation tradition built around immortal-seeking, clan and sect politics, and the slow climb through cultivation realms.

This adapter is designed for a specific writing problem: given a short instruction or a neutral scene description, the model should directly produce Chinese cultivation-fiction prose in the xianxia register, rather than analysis, an outline, a summary, or generic fantasy pastiche.

The core direction is intentionally narrow:

third-person narration that follows characters' inner judgment as they scheme, cultivate, and size each other up
cultivation-world settings: secluded caves and closed-door cultivation (洞府/闭关), sects and great clans (宗门/世家), mountain gates, spirit veins, market towns (坊市)
xianxia imagery grounded in a cultivation cosmology (灵气, 境界, 法器, 丹药, 神识, 御风/驾风, 玉简, 剑意) rather than Western fantasy tropes (no wizards, elves, or Western dragons)
cultivation jargon woven into the narration itself, not just the dialogue — realm names, technique names, and artifact lore read as native to the voice
direct fiction prose, not analysis, outlines, or revision advice
cultivation and power progression treated as a vehicle for depicting clan politics, ambition, and human nature, not as a stat sheet

The goal is for outputs to read like usable Chinese cultivation-fiction scene drafts, not like a generic writing assistant explaining what xianxia is.

Typical use cases:

drafting cultivation/xianxia scenes inside a longer novel project
rewriting modern-toned paragraphs into cultivation-register prose
sect / clan / closed-door-breakthrough fiction in a long-form writing pipeline
local and private creative-writing workflows
integration with a long-form novel pipeline that manages outline, memory, timeline, and continuity

This is an adapter only. It does not include base model weights, training data, or copyrighted source material.

🧭 About the v1 / v2 Series

This repository is part of a small series of Chinese fiction LoRAs (现代悬疑 realistic crime, 民俗志怪 folk-horror, and now 玄幻修仙 xianxia).

v1 — Style Retraining. The v1 series focuses mainly on prose style. It uses SFT to move the base model away from generic AI prose toward a specific Chinese literary voice.
v2 — Expanded Training & Full Release. The v2 line uses a larger, cleaner training set, longer SFT, and ships a complete release (PEFT safetensors + GGUF). Behavioral DPO refinement is applied where available.

Note on this release: this xianxia adapter is the SFT stage (QLoRA, 2 epochs on a chapter-level cultivation corpus). It is labeled v2 for its expanded training and full safetensors release. DPO behavioral refinement has not yet been applied and is planned as a follow-up iteration.

Available formats: HF PEFT safetensors (and a GGUF LoRA once converted). MLX users may also be able to use the PEFT safetensors through mlx-lm depending on their local setup.

🌱 Status

Field	Value
Version	v2 (SFT stage)
Focus	Chinese xianxia / cultivation prose behavior
Format	HF PEFT safetensors (GGUF LoRA planned)
Base model	Qwen3.6-27B
Language	Chinese
Use case	cultivation scene drafting, sect/clan fiction, realm-progression prose
Training style	SFT (QLoRA, r=16 / α=32, 2 epochs, max_seq=3300); DPO planned
Recommended workflow	local long-form writing pipeline

🔗 Companion Novel Pipeline

This LoRA is designed to work together with a local-first, long-form Chinese novel writing pipeline: github.com/DuckTraDo/Novel

The pipeline handles the structural side of long-form fiction:

outline and chapter planning
scene-level context assembly
story memory and character tracking
timeline and continuity checks

This LoRA handles the prose-behavior side:

third-person cultivation narration with character interiority
xianxia scene drafting (closed-door breakthrough, sect confrontation, artifact lore)
cultivation-register diction woven into narration
avoidance of modern tone and Western fantasy tropes

They are intentionally split: the pipeline owns what happens, the LoRA owns how it reads on the page. You can use either independently, but they are designed as a pair.

📦 Files

Files included: adapter_config.json, adapter_model.safetensors, tokenizer.json, tokenizer_config.json, chat_template.jinja, README.md.

A GGUF LoRA (*-f16.gguf) for llama.cpp can be generated from the adapter and will be added to this repo; see the llama.cpp section below.

⚠️ Critical Usage Note: `enable_thinking=False`

Qwen3.6 ships with a thinking mode enabled by default. When you call apply_chat_template(messages, add_generation_prompt=True) with default settings, the chat template leaves an unclosed <think>\n block before the assistant's turn. With this LoRA loaded, that causes outputs to begin with an English thinking-process preamble (e.g. "Here's a thinking process: 1. Analyze the user input...") instead of Chinese cultivation prose.

The training samples used a closed empty thinking block: <think>\n\n</think>\n\n followed by the actual prose.

Always pass enable_thinking=False:

text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False,
)

For llama.cpp / GGUF inference, pass the included chat_template.jinja via --chat-template-file, or manually start the assistant turn in the prompt with the closed empty block <think>\n\n</think>\n\n.

This is the single most common usage error. If outputs look like English reasoning instead of Chinese prose, this is almost certainly the cause.
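For backends that take a raw prompt string, the manual route can be sketched as below. This is a minimal sketch, not part of the adapter or of any library: the helper name ensure_closed_think is hypothetical, and it assumes you already hold the rendered prompt as a string. It only normalizes the tail of the prompt to the closed empty block described above.

```python
THINK_OPEN = "<think>\n"                  # unclosed block left by the default template
THINK_CLOSED = "<think>\n\n</think>\n\n"  # closed empty block used in training


def ensure_closed_think(prompt: str) -> str:
    """Return `prompt` ending in the closed empty thinking block.

    Hypothetical helper for illustration only.
    """
    if prompt.endswith(THINK_CLOSED):
        # Already in the trained format; leave it untouched.
        return prompt
    if prompt.endswith(THINK_OPEN):
        # Default-template case: swap the unclosed block for the closed one.
        return prompt[: -len(THINK_OPEN)] + THINK_CLOSED
    # No thinking block at the end: append the closed empty one.
    return prompt + THINK_CLOSED


if __name__ == "__main__":
    rendered = "...assistant turn starts here\n<think>\n"
    print(repr(ensure_closed_think(rendered)))
```

When the Transformers tokenizer is available, passing enable_thinking=False remains the simpler option; a helper like this is only useful when the prompt is assembled by hand.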

🚀 Example PEFT / Transformers Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "Qwen/Qwen3.6-27B"
adapter = "yuxinlu1/qwen3-6-27b-chinese-xianxia-lora-v2"

# The adapter repo ships the tokenizer files and chat_template.jinja.
tokenizer = AutoTokenizer.from_pretrained(adapter, trust_remote_code=True)

# Load the base model in bf16, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, adapter).eval()

messages = [{"role": "user", "content": "写一段玄幻修仙小说。"}]

# enable_thinking=False avoids the unclosed <think> block (see the usage note above).
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False,
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=640,
    temperature=0.8,
    top_p=0.9,
    repetition_penalty=1.15,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

🚀 Example llama.cpp Usage

Pair this GGUF LoRA with a compatible Qwen3.6-27B GGUF base. Unsloth's Qwen3.6-27B-GGUF (Q4_K_M or Q8_0) is recommended.

llama-cli -m qwen3.6-27b-base.Q4_K_M.gguf \
  --lora qwen3-6-27b-chinese-xianxia-lora-v2-f16.gguf \
  --chat-template-file chat_template.jinja \
  -p "写一段玄幻修仙小说。" \
  -n 640 --temp 0.8 --top-p 0.9 --repeat-penalty 1.15

🧪 Example Prompts

Prompts can be short:

写一段玄幻修仙小说。
讲一个修仙者的故事。
深山一处石洞里，洞壁上的灵光忽明忽暗，映着满地碎石。
小说：少年盘膝而坐，体内那缕灵气又凝实了几分。
把下面这段改成玄幻修仙小说正文：
他打开电脑准备加班，忽然觉得胸口一热，等回过神来，发现自己躺在一间陌生的土屋里。

Suggested decoding range (tuned for this adapter):

Setting	Range
temperature	0.7 – 0.85
top_p	0.85 – 0.92
repetition_penalty	1.1 – 1.18
max_new_tokens	400 – 700

repetition_penalty matters here: too low (≈1.05) lets long generations fall into verbatim loops; too high (≈1.2+) can push Chinese character names toward homophone drift. A value of about 1.15 is a good default. The adapter is strongest in shorter bursts: very long single generations (more than about 700 tokens) may drift, so generate scene by scene for best results.
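One way to keep a pipeline inside the suggested ranges is to clamp user-supplied settings before they reach generate. The sketch below is an illustration only: the names SUGGESTED_RANGES and clamp_decoding are hypothetical, and the bounds are copied from the table above.

```python
# Bounds copied from the suggested decoding table; names are hypothetical.
SUGGESTED_RANGES = {
    "temperature": (0.7, 0.85),
    "top_p": (0.85, 0.92),
    "repetition_penalty": (1.1, 1.18),
    "max_new_tokens": (400, 700),
}


def clamp_decoding(settings: dict) -> dict:
    """Return a copy of `settings` with known keys clamped into range.

    Keys that are not in SUGGESTED_RANGES pass through unchanged.
    """
    clamped = dict(settings)
    for name, (low, high) in SUGGESTED_RANGES.items():
        if name in clamped:
            clamped[name] = min(max(clamped[name], low), high)
    return clamped


if __name__ == "__main__":
    requested = {"temperature": 1.2, "repetition_penalty": 1.05, "do_sample": True}
    print(clamp_decoding(requested))
```

The returned dict can be unpacked into model.generate(**inputs, **settings). Clamping silently is a design choice; a stricter pipeline might raise an error instead.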

🧪 Internal Evaluation Snapshot

A small internal evaluation was run across four prompt tiers (original-text continuation, neutral scene description, character-grounded task, and minimal generic instruction) to verify style transfer.

Qualitative observations:

Original-text continuation: natural continuation in the trained cultivation voice; realm/technique/artifact vocabulary and the xianxia register carry through.
Neutral scene description: cultivation atmosphere and plot emerge from neutral prompts without explicit xianxia cues in the input.
Character-grounded task: characters behave consistently with cultivation-fiction conventions (sect etiquette, closed-door breakthrough, clan politics).
Minimal generic instruction (e.g. "write a piece of xianxia fiction"): the model produces cultivation prose directly without first explaining what xianxia is, which suggests the style was internalized rather than merely instruction-followed.

These observations come from a small local development eval and should be treated as an internal signal, not a public benchmark.
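For readers who want a quick automated sanity check alongside this kind of qualitative review, a rough heuristic is sketched below. It is not the evaluation used for this release. The function names, the 0.6 threshold, and the marker strings are all assumptions chosen for illustration; the markers are taken from the English preamble quoted in the usage note.

```python
# Marker strings and threshold are illustrative assumptions, not tuned values.
PREAMBLE_MARKERS = ("here's a thinking process", "analyze the user input")


def cjk_ratio(text: str) -> float:
    """Share of non-whitespace characters in the CJK Unified Ideographs block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars)


def looks_like_chinese_prose(text: str, min_ratio: float = 0.6) -> bool:
    """Heuristic: no English reasoning preamble and mostly Chinese characters."""
    head = text[:200].lower()
    if any(marker in head for marker in PREAMBLE_MARKERS):
        return False
    return cjk_ratio(text) >= min_ratio


if __name__ == "__main__":
    print(looks_like_chinese_prose("少年盘膝而坐，体内那缕灵气又凝实了几分。"))
    print(looks_like_chinese_prose("Here's a thinking process: 1. Analyze the user input..."))
```

A check like this only catches gross failures such as the thinking-mode preamble; it says nothing about register or style quality, which still needs human reading.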

🧪 Intended Use

Intended for:

local Chinese xianxia / cultivation drafting
rewriting modern-toned paragraphs into cultivation-register prose
testing third-person cultivation narration
sect / clan / breakthrough fiction in a human-in-the-loop workflow
offline and privacy-respecting novel drafting
integration with a long-form novel pipeline

Not intended for: author impersonation, defamation, harassment, factual claims, high-stakes advice, spam, or deception.

⚠️ Limitations

It does not guarantee full-novel plot coherence by itself. Character continuity, foreshadowing, and timeline logic should be handled by an external writing pipeline or by the author.
Long single generations may degrade. Past roughly 600–700 tokens, outputs can drift, loop, or destabilize character names. Generate scene by scene and keep repetition_penalty near 1.15.
Character-name stability depends on decoding settings; aggressive repetition penalties can turn a name into a homophone variant.
Optimized for cultivation-world settings; outputs for modern, urban, or non-cultivation scenes may revert toward base-model tone.
It may produce shorter-than-expected outputs if the prompt is very minimal or decoding settings are conservative.
If enable_thinking=False is not set, outputs will begin with an English thinking-process preamble. This is the single most common usage error.
Output quality depends on the base model, quantization, sampling settings, prompt design, and context quality.
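The looping failure mentioned above can be caught mechanically before a scene is accepted into a draft. The sketch below is one simple approach, assuming that a character span repeated several times signals a loop. The function name has_verbatim_loop and the defaults (window of 12 characters, 3 repeats) are hypothetical and untuned.

```python
def has_verbatim_loop(text: str, window: int = 12, min_repeats: int = 3) -> bool:
    """Return True if any `window`-character span occurs `min_repeats` times.

    Occurrences may overlap. Texts shorter than `window` never match.
    """
    counts: dict[str, int] = {}
    for start in range(len(text) - window + 1):
        chunk = text[start:start + window]
        counts[chunk] = counts.get(chunk, 0) + 1
        if counts[chunk] >= min_repeats:
            return True
    return False


if __name__ == "__main__":
    sentence = "他缓缓睁开双眼，洞府之中灵光一闪。"
    print(has_verbatim_loop(sentence * 3))  # repeated sentence
    print(has_verbatim_loop(sentence))      # single occurrence
```

In a scene-by-scene workflow, a positive result could trigger a regeneration of that scene with a slightly higher repetition_penalty, staying within the suggested range.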

🛡️ Safety and Legal Notes

This repository contains adapter weights and supporting tokenizer/template files, not base model weights.
No copyrighted novels, private manuscripts, or proprietary datasets are distributed in this repository.
This LoRA is not designed to imitate any specific living author; the goal is to capture a broader Chinese xianxia / cultivation prose tradition.
Outputs are machine-generated fiction; do not use this model for harassment, defamation, fraud, or deceptive impersonation.
Users are responsible for the base model license, adapter license, and applicable law.

📜 License

LoRA adapter: MIT. Base model: governed by its own license.
