BaGua Architecture — 0.5B Base Model (八卦架构基座模型)

A novel non-Transformer neural network architecture inspired by the Eight Trigrams (易经八卦)

始于AI,不止于AI。/ Born from AI. Not limited to AI.


What is BaGua Architecture?

BaGua Architecture is a ground-up redesign of neural network architecture — no Transformer, no fixed attention mechanism.

Core idea: Eight trigram partitions dynamically determine information flow impedance through real-time polarity vectors. Every forward pass, the network topology is completely rebuilt from scratch.

This is fundamentally different from Transformer-based models (GPT, LLaMA, Claude, Gemini), whose network topology is fixed after training: attention weights still vary with the input, but always within the same static connectivity.
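The polarity-clash idea can be sketched in a few lines: each partition emits a polarity vector, pairwise polarity sets an impedance, and the resulting conductances route information between partitions. Everything below (the function names, the tanh polarity map, the (1 - cos)/2 conductance formula) is an illustrative assumption, not the repository's code:

```python
import torch

def bagua_mix(x, polarity_head):
    # x: (batch, 8, dim) -- hidden state split into 8 trigram partitions
    p = torch.tanh(polarity_head(x))                 # per-partition polarity vectors
    cos = torch.nn.functional.cosine_similarity(
        p.unsqueeze(2), p.unsqueeze(1), dim=-1)      # (batch, 8, 8) pairwise polarity
    conduct = (1.0 - cos) / 2.0                      # opposite polarity -> low impedance
    conduct = conduct / conduct.sum(-1, keepdim=True).clamp_min(1e-6)
    # The routing matrix is recomputed from the current activations,
    # so the effective topology is rebuilt on every forward pass.
    return x + conduct @ x

x = torch.randn(2, 8, 64)
head = torch.nn.Linear(64, 64)
y = bagua_mix(x, head)
print(y.shape)  # torch.Size([2, 8, 64])
```

Because the conductances depend on the activations rather than on learned attention parameters alone, two different inputs flow through two different effective graphs.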


Model Card

| Item | Detail |
|---|---|
| Architecture | BaGua Architecture (non-Transformer) |
| Parameters | 504M (0.5B) |
| Dimensions | dim=1024, 24 layers |
| Vocabulary | 119,547 (bert-base-multilingual-cased) |
| Languages | Chinese + English |
| Training Data | OpenWebText (40 GB English) + Chinese Wikipedia (1.7 GB) |
| Training Steps | ~170,000 |
| Best Val PPL | ~106 |
| Stage | Early base model, not instruction-tuned |

Nine Core Modules

| Module | Function |
|---|---|
| 动态八卦阵 Dynamic BaGua Field | 8 trigram partitions; an impedance matrix controls information flow |
| 卦象对冲 Polarity Clash Engine | Real-time polarity vectors; opposite polarity means low impedance |
| 淘汰审核 Elimination Auditor | Quality scoring per trigram head |
| 淘汰低效机制 Low-Efficiency Eliminator | Zeroes out low-value pathways |
| 算力缓冲区 Compute Buffer | Smooths signal jumps caused by pruning |
| 去中心化自运算 Decentralized Self-Evolution | Global survival tracking; automatic pressure regulation |
| 左耳进右耳出 In-and-Out Memory | Intra-sequence memory; inter-sequence reset |
| 九州编码 Nine Zones Encoding | 3-level hierarchical position encoding; zero memory cost |
| 任务自我感知 Task Self-Awareness | First-token task detection; soft-weight switching across 23 scenes |
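A 3-level hierarchical position code can be computed arithmetically on the fly, which is how such a scheme avoids any stored embedding table. The base-9 decomposition below is a guess at what "zero memory cost" could look like, not the actual 九州 (Nine Zones) encoding:

```python
def nine_zones(pos: int, base: int = 9):
    """Decompose an absolute position into 3 hierarchical indices.

    Purely arithmetic, so no position-embedding table is stored
    (the "zero memory cost" property). The base-9 split is an
    illustrative assumption, not the released scheme.
    """
    zone, rem = divmod(pos, base * base)  # coarse level
    sub, off = divmod(rem, base)          # middle and fine levels
    return zone, sub, off

print(nine_zones(0))    # (0, 0, 0)
print(nine_zones(80))   # (0, 8, 8)
print(nine_zones(81))   # (1, 0, 0)
```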

Honest Assessment

This is an early-stage base model. Current limitations:

  • PPL ≈ 106 (fluent conversation typically requires PPL < 50)
  • Not instruction-tuned — responds with continuation, not answers
  • Chinese output quality lower than English (data imbalance)
  • Output may be incoherent at this training stage

What it demonstrates:

  • A working implementation of a novel non-Transformer architecture
  • Dynamic impedance mechanism functions correctly
  • Anti-overfitting properties verified in classification experiments (28% lower loss than BERT-like)
  • Foundation for further community training and research

Quick Start

```shell
# Install dependencies
pip install torch transformers
```

```python
# Download the inference script from GitHub:
# https://github.com/123456yy384/bagua-Architecture
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer

# Load the tokenizer shipped with the model
tokenizer = AutoTokenizer.from_pretrained("./tokenizer")

# Load the model (requires the architecture definition from GitHub)
# See bagua_chat_1b_v2.py for the full inference code
```

Training Results

Classification Experiments

| Task | BaGua | Baseline | Note |
|---|---|---|---|
| AG News (20 epochs) | 89.54% / loss 0.31 | 91.75% / loss 0.43 | BaGua loss 28% lower; no overfitting |
| SST-2 | 73.4% / stable | 79.7% / severe overfitting | BERT-like baseline overfits from epoch 8 |
| Random sequences | 0.468M params / loss 6.3201 | 0.784M params / loss 6.3304 | 40% fewer parameters, lower loss |

LLM Pretraining

  • Started from PPL ~116,000 (random init)
  • Reached PPL ~106 after ~170,000 steps
  • Training on dual RTX 4090D (cloud) + Tesla V100-SXM2-32GB (local)
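The starting PPL is consistent with a quick sanity check: perplexity is the exponential of the mean per-token cross-entropy, so a randomly initialized model that spreads probability roughly uniformly over the 119,547-token vocabulary starts near PPL = exp(ln 119547) ≈ 119,547, close to the reported ~116,000:

```python
import math

def perplexity(mean_nll: float) -> float:
    # Perplexity is the exponential of the mean per-token
    # negative log-likelihood (cross-entropy in nats).
    return math.exp(mean_nll)

vocab = 119_547
# A uniform model assigns probability 1/vocab to every token,
# so its mean NLL is ln(vocab) and its PPL equals the vocab size.
print(round(perplexity(math.log(vocab))))  # 119547
```

By the same identity, the reported best validation PPL of ~106 corresponds to a mean cross-entropy of ln(106) ≈ 4.66 nats per token.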

Architecture Comparison

| Feature | Transformer | BaGua Architecture |
|---|---|---|
| Network topology | Fixed after training | Rebuilt every forward pass |
| Attention | Independent parallel heads | Polarity-driven interaction |
| Overfitting defense | External (dropout) | Built-in dynamic sparsity |
| Task adaptation | Different model per task | Auto-switching across 23 scenes |
| Memory | KV cache accumulates | Reset between sequences |
| Position encoding | Unified external encoding | Neuron-level 3-tier encoding |

Citation

@misc{yang2026bagua,
  title  = {BaGua Architecture: Polarity-Driven Dynamic Impedance Neural Network},
  author = {Yang, Enshuo},
  year   = {2026},
  url    = {https://github.com/123456yy384/bagua-Architecture}
}

About the Author

Yang Enshuo (阳恩硕) — 17 years old, vocational school student, independent researcher.

No institution. No supervisor. No research funding.

Hardware: RTX 4060 laptop + second-hand Tesla V100 server + rented cloud GPU.

Contact: Oyes13619690046@outlook.com
GitHub: https://github.com/123456yy384/bagua-Architecture
CSDN: https://blog.csdn.net/2504_93363461/article/details/159346941


"Innovation has no age limit. Creativity has no institutional boundary."
