dystrio committed
Commit fa7c026 · verified · 1 Parent(s): 7f2e022

Add Key Results summary block

Files changed (1)
  1. README.md +9 -0
README.md CHANGED
@@ -40,7 +40,16 @@ Dystrio Sculpt produces dense compiled variants of existing models that:
  - require no custom kernels
  - load with standard HuggingFace Transformers
 
+ ## Key Results
+
+ Compared to **mistralai/Mistral-7B-v0.1** baseline on an **A100 80GB**:
+
+ - **Weights memory:** **-11% (Conservative)** / **-23% (Balanced)**
+ - **RAG latency (TTFT p95):** **-7% / -14%**
+ - **Decode throughput:** ~flat
+ - **No runtime changes:** no custom kernels, no new ops, standard `transformers` loading
+
+ > Notes: TTFT includes prefill + first decode step. “Weights memory” is computed from parameter sizes (GiB) and is workload-independent.
 
 
 
  ## Benchmark Results
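
Reviewer note on the added block: the Notes line says "Weights memory" is computed from parameter sizes in GiB. A minimal sketch of that kind of calculation follows. The tensor names and shapes are hypothetical toy values, and the 2 bytes per parameter assumes fp16/bf16 storage. Neither is taken from the Dystrio Sculpt models or from the benchmark harness behind these numbers.

```python
from math import prod

GIB = 1024 ** 3  # bytes per GiB


def weights_memory_gib(param_shapes, bytes_per_param=2):
    """Sum parameter sizes in GiB.

    param_shapes maps a tensor name to its shape tuple.
    bytes_per_param=2 is an assumption (fp16/bf16 weights).
    """
    total_params = sum(prod(shape) for shape in param_shapes.values())
    return total_params * bytes_per_param / GIB


def relative_change_pct(baseline_gib, variant_gib):
    """Signed percentage change of the variant against the baseline."""
    return (variant_gib - baseline_gib) / baseline_gib * 100.0


# Hypothetical toy shapes, for illustration only.
baseline_shapes = {"layer_a.weight": (1024, 1024), "layer_b.weight": (1024, 4096)}
variant_shapes = {"layer_a.weight": (1024, 1024), "layer_b.weight": (1024, 3072)}

baseline_gib = weights_memory_gib(baseline_shapes)
variant_gib = weights_memory_gib(variant_shapes)
change_pct = relative_change_pct(baseline_gib, variant_gib)
print(f"baseline={baseline_gib:.6f} GiB variant={variant_gib:.6f} GiB change={change_pct:.1f}%")
```

Because the figure depends only on parameter counts and storage width, it does not vary with batch size or sequence length, which is consistent with the note calling it workload-independent.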