BaGua Architecture — 0.5B Base Model (八卦架构基座模型)
A novel non-Transformer neural network architecture inspired by the Eight Trigrams (易经八卦)
始于AI,不止于AI。/ Born from AI. Not limited to AI.
What is BaGua Architecture?
BaGua Architecture is a ground-up redesign of neural network architecture — no Transformer, no fixed attention mechanism.
Core idea: Eight trigram partitions dynamically determine information flow impedance through real-time polarity vectors. Every forward pass, the network topology is completely rebuilt from scratch.
This is fundamentally different from Transformer-based models (GPT, LLaMA, Claude, Gemini), where the attention parameters, and hence the set of available pathways, are fixed after training: attention scores still vary with the input, but the network topology itself does not change.
Model Card
| Item | Detail |
|---|---|
| Architecture | BaGua Architecture (non-Transformer) |
| Parameters | 504M (0.5B) |
| Dimensions | dim=1024, 24 layers |
| Vocabulary | 119,547 (bert-base-multilingual-cased) |
| Languages | Chinese + English |
| Training Data | OpenWebText (40GB English) + Chinese Wikipedia (1.7GB) |
| Training Steps | ~170,000 steps |
| Best Val PPL | ~106 |
| Stage | Early base model — not instruction tuned |
Nine Core Modules
| Module | Function |
|---|---|
| 动态八卦阵 Dynamic BaGua Field | 8 trigram partitions; impedance matrix controls flow |
| 卦象对冲 Polarity Clash Engine | Real-time polarity vectors; opposite=low impedance |
| 淘汰审核 Elimination Auditor | Quality scoring per trigram head |
| 淘汰低效机制 Low-Efficiency Eliminator | Zero out low-value pathways |
| 算力缓冲区 Compute Buffer | Smooth signal jumps from pruning |
| 去中心化自运算 Decentralized Self-Evolution | Global survival tracking; auto pressure regulation |
| 左耳进右耳出 In-and-Out Memory | Intra-sequence memory; inter-sequence reset |
| 九州编码 Nine Zones Encoding | 3-level hierarchical position encoding; zero memory cost |
| 任务自我感知 Task Self-Awareness | First-token task detection; 23-scene soft-weight switching |
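To make the first two modules in the table concrete, here is a purely illustrative sketch of the idea: split the hidden state into eight trigram partitions, derive a polarity vector per partition, and turn polarity opposition into low impedance (strong flow) between partitions. The function name, shapes, and gating formula are assumptions for illustration, not the released implementation.

```python
import torch
import torch.nn.functional as F

def bagua_impedance(x, polarity_proj, num_trigrams=8):
    """Illustrative sketch of the Dynamic BaGua Field + Polarity Clash idea.

    x:             hidden states, shape (batch, seq, dim)
    polarity_proj: hypothetical projection, shape (dim/8, dim/8)
    """
    B, T, D = x.shape
    P = D // num_trigrams
    parts = x.view(B, T, num_trigrams, P)          # 8 trigram partitions
    polarity = torch.tanh(parts @ polarity_proj)   # per-partition polarity in [-1, 1]
    pol = polarity.mean(dim=1)                     # pool over sequence: (B, 8, P)

    # Opposition score: cosine similarity near -1 means "opposite polarity"
    sim = F.cosine_similarity(pol.unsqueeze(2), pol.unsqueeze(1), dim=-1)  # (B, 8, 8)
    impedance = (1 + sim) / 2                      # opposite polarity -> impedance 0
    flow = 1 - impedance                           # conductance in [0, 1]

    # Route information between partitions in proportion to conductance
    mixed = torch.einsum('bij,btjd->btid', flow, parts)
    return mixed.reshape(B, T, D), flow
```

Because `flow` is recomputed from the current input's polarity vectors, the effective inter-partition topology is rebuilt on every forward pass, which is the property the table describes.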
Honest Assessment
This is an early-stage base model. Current limitations:
- PPL ≈ 106 (fluent conversation typically requires PPL < 50)
- Not instruction-tuned — it continues text rather than answering questions
- Chinese output quality lower than English (data imbalance)
- Output may be incoherent at this training stage
What it demonstrates:
- A working implementation of a novel non-Transformer architecture
- Dynamic impedance mechanism functions correctly
- Anti-overfitting properties verified in classification experiments (28% lower loss than a BERT-like baseline)
- Foundation for further community training and research
Quick Start
```shell
# Install dependencies
pip install torch transformers

# Download the inference script from GitHub:
# https://github.com/123456yy384/bagua-Architecture
```

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("./tokenizer")

# Load the model (requires the architecture definition from GitHub)
# See bagua_chat_1b_v2.py for full inference code
```
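Once the model class from the repository is loaded, generation can follow a standard autoregressive sampling loop. The sketch below is generic and assumes only that the model is a callable returning logits of shape `(batch, seq_len, vocab_size)`; it is not the repository's own inference code.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, input_ids, max_new_tokens=50, temperature=0.8, top_k=40, eos_id=None):
    """Generic top-k sampling loop for any causal LM callable.

    `model` is assumed (not guaranteed) to map token ids of shape
    (batch, seq) to logits of shape (batch, seq, vocab).
    """
    ids = input_ids
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :] / temperature       # last-position logits
        if top_k is not None:
            kth = torch.topk(logits, top_k).values[:, -1, None]
            logits = logits.masked_fill(logits < kth, float('-inf'))
        probs = F.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # sample one token
        ids = torch.cat([ids, next_id], dim=1)
        if eos_id is not None and bool((next_id == eos_id).all()):
            break
    return ids
```

Usage would be `generate(model, tokenizer("你好", return_tensors="pt").input_ids)`, then `tokenizer.decode(...)` on the result.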
Training Results
Classification Experiments
| Task | BaGua | Baseline | Note |
|---|---|---|---|
| AG News (20 epochs) | 89.54% / Loss=0.31 | 91.75% / Loss=0.43 | BaGua Loss 28% lower, zero overfitting |
| SST-2 | 73.4% / stable | 79.7% / severe overfitting | BERT-like baseline overfit from epoch 8 |
| Random sequences | 0.468M params / Loss=6.3201 | 0.784M params / Loss=6.3304 | 40% fewer parameters, lower loss |
LLM Pretraining
- Started from PPL ~116,000 (random init)
- Reached PPL ~106 after ~170,000 steps
- Training on dual RTX 4090D (cloud) + Tesla V100-SXM2-32GB (local)
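For context on these numbers: perplexity is the exponential of the mean cross-entropy loss (in nats), so a randomly initialized model guessing roughly uniformly over the 119,547-token vocabulary starts near PPL = 119,547, which matches the reported ~116,000. A quick check:

```python
import math

V = 119_547                           # vocabulary size (bert-base-multilingual-cased)
loss_uniform = math.log(V)            # cross-entropy of uniform guessing: ~11.69 nats
ppl_uniform = math.exp(loss_uniform)  # equals V: the "random init" perplexity ceiling
loss_current = math.log(106)          # ~4.66 nats at the reported best PPL ~106
```

So the training run so far has reduced cross-entropy from roughly 11.7 to roughly 4.7 nats per token.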
Architecture Comparison
| Feature | Transformer | BaGua Architecture |
|---|---|---|
| Network Topology | Fixed after training | Rebuilt every forward pass |
| Attention | Independent parallel heads | Polarity-driven interaction |
| Overfitting Defense | External (dropout) | Built-in dynamic sparsity |
| Task Adaptation | Different model per task | Auto-switching (23 scenes) |
| Memory | KV Cache accumulates | Reset between sequences |
| Position Encoding | Unified external | Neuron-level 3-tier |
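To make the "Memory" row concrete, here is a minimal, purely illustrative sketch (the class name, EMA update, and decay value are all assumptions, not the repository's code) of a state that accumulates within one sequence and is discarded at sequence boundaries, in contrast to a KV cache that grows until explicitly cleared:

```python
import torch

class InAndOutMemory:
    """Illustrative sketch of the 'In-and-Out Memory' contrast with a KV cache:
    context accumulates inside a sequence and is wiped between sequences."""

    def __init__(self, decay=0.9):
        self.decay = decay
        self.state = None

    def step(self, x):
        # Accumulate intra-sequence context as an exponential moving average
        if self.state is None:
            self.state = torch.zeros_like(x)
        self.state = self.decay * self.state + (1 - self.decay) * x
        return self.state

    def reset(self):
        # Called at every sequence boundary: no cross-sequence carryover
        self.state = None
```

A caller would invoke `step` once per token position and `reset` between independent inputs, so nothing from one conversation can leak into the next.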
Citation
```bibtex
@misc{yang2026bagua,
  title  = {BaGua Architecture: Polarity-Driven Dynamic Impedance Neural Network},
  author = {Yang, Enshuo},
  year   = {2026},
  url    = {https://github.com/123456yy384/bagua-Architecture}
}
```
About the Author
Yang Enshuo (阳恩硕) — 17 years old, vocational school student, independent researcher.
No institution. No supervisor. No research funding.
Hardware: RTX 4060 laptop + second-hand Tesla V100 server + rented cloud GPU.
Contact: Oyes13619690046@outlook.com
GitHub: https://github.com/123456yy384/bagua-Architecture
CSDN: https://blog.csdn.net/2504_93363461/article/details/159346941
"Innovation has no age limit. Creativity has no institutional boundary."