---
license: apache-2.0
library_name: onnx
tags:
- depth-estimation
- panoramic
- 360-degree
- webgpu
- onnx
pipeline_tag: depth-estimation
---
# DA-2: Depth Anything in Any Direction (ONNX WebGPU Version)
This repository contains the ONNX weights for DA-2: Depth Anything in Any Direction, optimized for WebGPU inference in the browser.
## Model Details

- **Original Model:** [haodongli/DA-2](https://huggingface.co/haodongli/DA-2)
- **Framework:** ONNX (opset 17)
- **Precision:** FP32 (full precision)
- **Input Resolution:** 1092×546
- **Size:** ~1.4 GB
## Conversion Details
This model was converted from the original PyTorch weights to ONNX to enable client-side inference using onnxruntime-web.
- **Optimization:** Constant folding applied.
- **Compatibility:** Verified with the WebGPU backend.
- **Modifications:**
  - Replaced `clamp` operators with `Max`/`Min` combinations to ensure WebGPU kernel compatibility.
  - Removed internal normalization layers to allow raw 0-1 input from the browser.
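The `clamp` rewrite relies on the identity `clamp(x, lo, hi) = min(max(x, lo), hi)`. A minimal JavaScript sketch of that equivalence (illustrative only, not part of the exported graph):

```javascript
// clamp(x, lo, hi) expressed with Max/Min, mirroring the graph rewrite:
// Clamp(x, lo, hi) -> Min(Max(x, lo), hi)
function clampViaMaxMin(x, lo, hi) {
  return Math.min(Math.max(x, lo), hi);
}

console.log(clampViaMaxMin(1.5, 0, 1));  // 1    (above range, clipped to hi)
console.log(clampViaMaxMin(-0.2, 0, 1)); // 0    (below range, clipped to lo)
console.log(clampViaMaxMin(0.42, 0, 1)); // 0.42 (in range, unchanged)
```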
## Usage (Transformers.js)
You can also run this model using Transformers.js.
```js
import { pipeline } from '@xenova/transformers';

// Initialize the pipeline
const depth_estimator = await pipeline('depth-estimation', 'phiph/DA-2-WebGPU', {
  device: 'webgpu',
  dtype: 'fp32', // Use FP32 as exported
});

// Run inference
const url = 'path/to/your/panorama.jpg';
const output = await depth_estimator(url);
// output.predicted_depth is the raw tensor
// output.depth is the visualized depth map (RawImage)
```
## Usage (ONNX Runtime Web)
You can run this model in the browser using onnxruntime-web.
```js
import * as ort from 'onnxruntime-web/webgpu';

// 1. Initialize session
// Note: the model is now in the 'onnx' subdirectory
const session = await ort.InferenceSession.create(
  'https://huggingface.co/phiph/DA-2-WebGPU/resolve/main/onnx/model.onnx',
  {
    executionProviders: ['webgpu'],
    preferredOutputLocation: { last_hidden_state: 'gpu-buffer' },
  }
);

// 2. Prepare input (float32, 0-1 range, NCHW)
// Note: do NOT apply ImageNet mean/std normalization; the model expects raw 0-1 floats.
const tensor = new ort.Tensor('float32', float32Data, [1, 3, 546, 1092]);

// 3. Run inference
const results = await session.run({ images: tensor });
const depthMap = results.depth; // Access output
```
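The `float32Data` buffer above must be planar NCHW in the 0-1 range. A sketch of converting interleaved RGBA canvas pixels into that layout — the helper name `rgbaToNCHW` is illustrative, not part of any library:

```javascript
// Convert interleaved RGBA bytes (e.g. from canvas getImageData) into
// planar NCHW float32 data in the 0-1 range, as the model expects.
function rgbaToNCHW(rgba, width, height) {
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let i = 0; i < plane; i++) {
    out[i]             = rgba[i * 4]     / 255; // R plane
    out[plane + i]     = rgba[i * 4 + 1] / 255; // G plane
    out[2 * plane + i] = rgba[i * 4 + 2] / 255; // B plane (alpha dropped)
  }
  return out;
}

// Hypothetical usage with a 1092x546 canvas context:
// const { data } = ctx.getImageData(0, 0, 1092, 546);
// const float32Data = rgbaToNCHW(data, 1092, 546);
```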
## License
This model is a derivative work of DA-2 and is distributed under the Apache License 2.0.
Please cite the original authors if you use this model:
```bibtex
@article{li2025depth,
  title={DA$^{2}$: Depth Anything in Any Direction},
  author={Li, Haodong and Zheng, Wangguangdong and He, Jing and Liu, Yuhao and Lin, Xin and Yang, Xin and Chen, Ying-Cong and Guo, Chunchao},
  journal={arXiv preprint arXiv:2509.26618},
  year={2025}
}
```