jc-builds committed on
Commit 28a11e9 · verified · 1 Parent(s): cc14274

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -138,6 +138,20 @@ graph LR
 
  ---
 
  ## Quick Start
 
  <details open>
 
  ---
 
+ ## What I Learned Getting This to Work Well
+
+ Getting TripoSR to produce clean 3D meshes on a phone took more work than converting the model to ONNX. The raw model expects a very specific kind of input (a single object, centered, on a neutral background), and feeding it an unprocessed photo gives rough results.
+
+ The biggest improvement came from **stripping the background** before inference. I'm using Apple's **Vision framework** (`VNGenerateForegroundInstanceMaskRequest`, available on iOS 17+) to automatically detect and isolate the main subject. This is the same API that powers the "lift subject from background" feature in Photos: it's fast, runs on-device, and handles edges surprisingly well. The isolated subject is then composited onto a **flat gray background** (RGB 0.5, 0.5, 0.5), which matches what TripoSR was trained on.
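
The compositing step described above comes down to a per-pixel alpha blend. The sketch below is an illustration in plain Python, not the app's code (which runs on Vision and Core Image); the function names are hypothetical, and it assumes pixel values and mask values are floats in [0, 1].

```python
GRAY = 0.5  # flat background level from the text, applied to each channel


def composite_on_gray(rgb, alpha, background=GRAY):
    """Blend one foreground pixel over a flat background.

    rgb is a tuple of three floats in [0, 1]; alpha is the mask value
    for that pixel (1.0 = subject, 0.0 = background).
    """
    return tuple(c * alpha + background * (1.0 - alpha) for c in rgb)


def composite_image(pixels, mask, background=GRAY):
    """Apply the blend to a whole image.

    pixels is a list of rows of (r, g, b) tuples; mask is a list of rows
    of alpha values with the same shape.
    """
    return [
        [composite_on_gray(p, a, background) for p, a in zip(prow, mrow)]
        for prow, mrow in zip(pixels, mask)
    ]
```

Soft mask values between 0 and 1 blend the subject's edge into the gray rather than cutting it hard, which is why a good mask matters for the outline the model sees.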
+
+ The second big win was **smart cropping and centering**. After removing the background, I find the bounding box of the remaining foreground pixels, then scale and center the subject so it fills roughly **85-95% of the frame**. Too small and the model loses detail; too large and geometry gets clipped. The fill ratio adapts to the object's shape: tall or narrow objects get a bit more breathing room, while compact objects fill more of the frame. A small amount of padding (2-6%) prevents edge artifacts.
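
As a rough sketch of the crop-and-center arithmetic, the Python below finds the foreground bounding box in a mask and computes the scale and offsets that place it on a square canvas. The linear mapping from aspect ratio to fill ratio in `fill_ratio_for` is an assumption made for illustration; the text only gives the 85-95% range and the direction of the adjustment. All function names are hypothetical.

```python
def foreground_bbox(mask, threshold=0.5):
    """Return (left, top, right, bottom) of foreground pixels, or None.

    mask is a list of rows of alpha values; right and bottom are exclusive.
    """
    rows = [y for y, row in enumerate(mask) if any(a > threshold for a in row)]
    if not rows:
        return None
    cols = [x for row in mask for x, a in enumerate(row) if a > threshold]
    return min(cols), rows[0], max(cols) + 1, rows[-1] + 1


def fill_ratio_for(width, height, lo=0.85, hi=0.95):
    """Assumed rule: compact shapes approach hi, elongated ones approach lo."""
    aspect = min(width, height) / max(width, height)  # 1.0 for a square
    return lo + (hi - lo) * aspect


def placement(bbox, canvas=512):
    """Return (scale, x_offset, y_offset) that centers the bbox on the canvas."""
    left, top, right, bottom = bbox
    w, h = right - left, bottom - top
    # Scale so the longer side covers the chosen fraction of the canvas.
    scale = fill_ratio_for(w, h) * canvas / max(w, h)
    return scale, (canvas - w * scale) / 2, (canvas - h * scale) / 2
```

With this rule a square subject has its longer side scaled to about 95% of the canvas, while a very elongated one approaches 85%, leaving the extra margin the text describes.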
+
+ I also added a lightweight **image enhancement pipeline** before inference: noise reduction, luminance sharpening, and edge smoothing after the resize. Using Lanczos resampling (instead of bilinear) for the 512×512 resize made a noticeable difference in preserving fine detail. All of this runs through Core Image with Metal acceleration, so it adds minimal overhead.
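
For readers unfamiliar with Lanczos resampling, here is a minimal one-dimensional sketch in Python. It is not the Core Image implementation the app uses; the window size `a=3`, the edge clamping, and the function names are assumptions chosen for illustration. A 2-D resize would apply the same weights along rows and then along columns.

```python
import math


def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def resample_row(row, new_len, a=3):
    """Resample a 1-D list of samples to new_len with Lanczos weights."""
    old_len = len(row)
    scale = old_len / new_len
    stretch = max(scale, 1.0)  # widen the kernel when downscaling
    support = a * stretch
    out = []
    for i in range(new_len):
        center = (i + 0.5) * scale - 0.5  # output sample position in input space
        lo = math.floor(center - support) + 1
        hi = math.floor(center + support)
        total = weight_sum = 0.0
        for j in range(lo, hi + 1):
            w = lanczos_kernel((j - center) / stretch, a)
            src = row[min(max(j, 0), old_len - 1)]  # clamp at the edges
            total += w * src
            weight_sum += w
        out.append(total / weight_sum)  # normalize so flat regions stay flat
    return out
```

Unlike bilinear interpolation, which averages only the two nearest samples, the wider windowed-sinc kernel keeps more high-frequency content, which is consistent with the detail preservation noted above.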
+
+ The full pipeline (background removal, crop, center, enhance, infer) runs entirely on-device in [Haplo AI](https://apps.apple.com/us/app/haplo-ai-offline-private-ai/id6746702574). No server, no internet required.
+
+ ---
+
  ## Quick Start
 
  <details open>